Polynomial-Sized Topological Approximations Using the Permutahedron
Abstract
Classical methods to model topological properties of point clouds, such as the Vietoris–Rips complex, suffer from the combinatorial explosion of complex sizes. We propose a novel technique to approximate a multiscale filtration of the Rips complex with improved bounds for size: precisely, for n points in \(\mathbb {R}^d\), we obtain an O(d)-approximation whose k-skeleton has size \(n2^{O(d \log k)}\) per scale and \(n2^{O(d\log d)}\) in total over all scales. In conjunction with dimension reduction techniques, our approach yields a \(O(\mathrm {polylog} (n))\)-approximation of size \(n^{O(1)}\) for Rips filtrations on arbitrary metric spaces. This result stems from high-dimensional lattice geometry and exploits properties of the permutahedral lattice, a well-studied structure in discrete geometry. Building on the same geometric concept, we also present a lower bound result on the size of an approximation: we construct a point set for which every \((1+\varepsilon )\)-approximation of the Čech filtration has to contain \(n^{\Omega (\log \log n)}\) features, provided that \(\varepsilon <\frac{1}{\log ^{1+c} n}\) for \(c\in (0,1)\).
Keywords
Persistent homology, Topological data analysis, Simplicial approximation, Permutahedron, Approximation algorithms
Mathematics Subject Classification
55U10, 11H06, 68W25
1 Introduction
1.1 Motivation and Previous Work
Topological data analysis aims at finding and reasoning about the underlying topological features of metric spaces. The idea is to represent a data set by a set of discrete structures on a range of scales and to track the evolution of homological features as the scale varies. The theory of persistent homology allows for a topological summary, called the persistence diagram, which summarizes the lifetimes of topological features in the data as the scale under consideration varies monotonically. A major step in the computation of this topological signature is the question of how to compute a multiscale representation of a given data set.
For data in the form of finite point clouds, two frequently used constructions are the (Vietoris–)Rips complex \(\mathcal {R}_\alpha \) and the Čech complex \(\mathcal {C}_\alpha \) which are defined with respect to a scale parameter \(\alpha \ge 0\). Both are simplicial complexes capturing the proximity of points at scale \(\alpha \), with different levels of accuracy. Increasing \(\alpha \) from 0 to \(\infty \) yields a nested sequence of simplicial complexes called a filtration.
Unfortunately, Rips and Čech filtrations can be uncomfortably large to handle. For homological features in low dimensions, it suffices to consider the k-skeleton of the complex, that is, all simplices of dimension at most k. Still, the k-skeleton of Rips and Čech complexes can be as large as \(n^{k+1}\) for n points, which is already impractical for small k when n is large.
One remedy is to approximate the topological signature of the filtration. A common way to do this is to construct a tower, which is a sequence of simplicial complexes connected by simplicial maps. The size of a tower is the number of simplices which are included in the sequence of simplicial complexes. A tower is said to approximate a filtration if it yields a topological signature similar to that of the original filtration, while having a significantly smaller size. The notion of “similarity” in this context can be made formal through a distance measure on persistence diagrams. The most frequently used similarity measure is the bottleneck distance, which finds correspondences between topological features of two towers, such that the lifetimes of each pair of matched features are as close as possible. A related notion is the log-scale bottleneck distance, which allows a larger discrepancy for larger scales and thus can be seen as a relative approximation, with the usual bottleneck distance as its absolute counterpart. We call an approximate tower a c-approximation of the original, if their persistence diagrams have log-scale bottleneck distance at most c.
Sheehy [28] gave the first such approximate tower for Rips complexes with a formal guarantee. For \(0<\varepsilon \le 1/3\), he constructs a \((1+\varepsilon )\)-approximate tower of the Rips filtration. The approximation tower of [28] consists of a filtration of simplicial complexes, whose k-skeleton has size only \(n({1}/{\varepsilon })^{O(\lambda k)}\), where \(\lambda \) is the doubling dimension of the metric. Since then, several alternative techniques have been explored for Rips [12] and Čech complexes [5, 22], all arriving at the same complexity bound.
While the above approaches work well for instances where \(\lambda \) and k are small, we focus on high-dimensional point sets. This has two reasons: first, one might simply want to analyze data sets for which the intrinsic dimension is high, but the existing methods do not succeed in reducing the complex size sufficiently. Second, even for medium-size dimensions, one might not want to restrict one's scope to the low-homology features, so that \(k=\lambda \) is not an unreasonable parameter choice. To adapt the aforementioned schemes to high-dimensional point clouds, it makes sense to use dimension reduction results to eliminate the dependence on \(\lambda \). Indeed, it has been shown, in analogy to the famous Johnson–Lindenstrauss Lemma [18], that an orthogonal projection of a point set of \(\mathbb {R}^d\) to a \(O(\log n/\varepsilon ^2)\)-dimensional subspace yields a \((1+\varepsilon )\)-approximation of the Čech filtration [20, 29]. Combining these two approximation schemes, however, yields an approximation of size \(O(n^{k+1})\) (ignoring \(\varepsilon \)-factors) and does not improve upon the exact case.
1.2 Our Contributions
We present two results about the approximation of Rips and Čech filtrations. First, we give a scheme for approximating the Rips filtration with smaller complex size than existing approaches, at the price of guaranteeing only an approximation quality of \(\mathrm {polylog}(n)\). Since Rips and Čech filtrations approximate each other by a constant factor, our result also extends to the Čech filtration, with an extra constant factor in the approximation quality. Second, we prove that any approximation scheme for the Čech filtration has superpolynomial size in n if high accuracy is required. For this result, our proof technique does not extend to Rips complexes. In more detail, our results are as follows:
Upper bound. We present a \(6(d+1)\)-approximation of the Rips filtration for n points in \(\mathbb {R}^d\). The construction scheme is output-sensitive in the size of the approximate tower. We show that the k-skeleton of the approximate tower has size \(n2^{O(d\log k)}\) per scale. Through a randomized selection of the origin at each scale, we show that the tower has size \(n2^{O(d\log d)}\) over all scales, in expectation. Here, the expectation is over the random choice of the origin, and not over the input point set. This shows that by using a coarser approximation, we can achieve asymptotic improvements on the complex size. We present two algorithms for computing the approximation, whose expected runtimes are upper bounded by \(n2^{O(d)}\log \Delta +n2^{O(d\log d)}\) and \((O(nd^2)+\mathrm{poly}(d))\log \Delta + n\log n 2^{O(d)}+n2^{O(d\log d)}\), where \(\Delta \) is the spread of the point set. The expected space required by the algorithms is upper bounded by \(n2^{O(d\log d)}\). The real power of our approach reveals itself in high dimensions, in combination with dimension reduction techniques. In conjunction with the lemma of Johnson and Lindenstrauss [18], we obtain a \(O(\log n)\)-approximation of expected size \(n^{O(\log \log n)}\), which is much smaller than the original tower; however, the bound is still superpolynomial in n. Combined with a different dimension reduction result of Matoušek [25], we obtain a \(O(\log ^{3/2} n )\)-approximation of expected size \(n^{O(1)}\). This is the first polynomial bound in n of an approximate tower, independent of the dimensionality of the point set. For inputs from arbitrary metric spaces (instead of points in \(\mathbb {R}^d\)), the same results hold with an additional \(O(\log n)\) factor in the approximation quality.
Lower bound. We construct a point set of n points in \(d=\Theta (\log n)\) dimensions whose Čech filtration has \(n^{\Omega (\log \log n)}\) persistent features with “relatively long” lifetime. Precisely, that means that any \((1+\delta )\)-approximation has to contain a bar of nonzero length for each of those features if \(\delta <O(\frac{1}{\log ^{1+c}n})\) with \(c\in (0,1)\). This shows that it is impossible to define an approximation scheme that yields an accurate approximation of the Čech complexes as well as polynomial size in n.
Methods. Our results follow from a link to lattice geometry: the \(A^*\)-lattice is a configuration of points in \(\mathbb {R}^d\) which realizes the thinnest known coverings for low dimensions [11]. The dual Voronoi polytope of a lattice point is the permutahedron, whose vertices are obtained by all coordinate permutations of a fixed point in \(\mathbb {R}^d\).
Our technique resembles the perhaps simplest approximation scheme for point sets: if we digitize \(\mathbb {R}^d\) with d-dimensional pixels, we can take the union of pixels that contain input points as our approximation. Our approach does the same, except that we use a tessellation of permutahedra for digitization. In \(\mathbb {R}^2\), our approach corresponds to the common approach of replacing the square tiling by a hexagonal tiling. We utilize the fact that the permutahedral tessellation is in generic position, that is, no more than \(d+1\) polytopes have a common intersection. At the same time, permutahedra are still relatively round, that is, they have small diameter and non-adjacent polytopes are well-separated. These properties ensure good approximation quality and a small complex. In comparison, a cubical tessellation yields a \(O(\sqrt{d})\)-approximation of the Rips filtration, which looks like an improvement over our O(d)-approximation, but the highly degenerate configuration of the cubes yields a complex size of \(n2^{O(dk)}\), and therefore does not constitute an improvement over Sheehy’s approach [28].
For the lower bound, we arrange n points in a way that one center point has the permutahedron as Voronoi polytope, and we consider simplices incident to that center point in a fixed dimension. We show a superpolynomial number of these simplices create or destroy topological features of nonnegligible persistence.
1.3 Updates from the Conference Version
An earlier version of this work appeared at the 32nd International Symposium on Computational Geometry [10]. There, we gave a naive upper bound on the size of the tower: if \(\Delta \) denotes the spread of the point set, then a simple upper bound on the size of the tower is \(n2^{O(d\log k)}\log \Delta \). In the current version, we show that the dependence on spread can be removed, by a randomized translation of the \(A^*_d\) lattice at each scale of the tower. With this technique, we show that there are \(n2^{O(d\log d)}\) simplices in the tower in expectation, which is independent of the spread \(\Delta \). In the conference version, an upper bound for the time to compute the approximate tower was shown to be \(n2^{O(d\log k)}\log \Delta \). In this version, we present two algorithms to compute the approximation tower, with better upper bounds for the runtime. Also, the current version of the paper contains the proofs which were missing from the conference version.
1.4 Outline of the Paper
We begin by reviewing basics of persistent homology in Sect. 2. Next, we study several relevant properties of the \(A^*\) lattice in Sect. 3. An approximation scheme based on concepts from Sect. 3 is presented in Sect. 4. We present the computational aspects of the scheme from Sect. 4 in Sect. 5. In Sect. 6, we present the effects of dimension reduction techniques on our approximation scheme. In Sect. 7, we present the lower bound result on the size of approximations of the Čech filtration. We conclude in Sect. 8.
2 Topological Background
We review some topological concepts needed in our argument. More extensive treatments covering most of the material can be found in the textbooks [13, 17, 26].
2.1 Simplicial Complexes
For an arbitrary set V, called vertices, a simplicial complex over V is a collection of non-empty subsets which is closed under taking non-empty subsets. The elements of a simplicial complex K are called simplices of K. A simplex \(\sigma \) is a face of \(\tau \) if \(\sigma \subseteq \tau \). A facet is a face of codimension 1. The dimension of \(\sigma \) is \(k:=|\sigma |-1\); we also call \(\sigma \) a k-simplex in this case. The k-skeleton of K is the collection of all simplices of dimension at most k. For instance, the 1-skeleton of K is a graph defined by its 0- and 1-simplices.
We discuss two ways of generating simplicial complexes. In the first one, take a collection \(\mathcal {S}\) of sets over a common universe (for instance, polytopes in \(\mathbb {R}^d\)), and define the nerve of \(\mathcal {S}\) as the simplicial complex whose vertex set is \(\mathcal {S}\), and a k-simplex \(\sigma \) is in the nerve if the corresponding \(k+1\) sets have a non-empty common intersection. The nerve theorem [4] states that if all sets in \(\mathcal {S}\) are convex subsets of \(\mathbb {R}^d\), their nerve is homotopy equivalent to the union of the sets (the statement can be generalized significantly; see [17, Sect. 4.G]). The second construction that we consider is the flag complex: given a graph \(G=(V,E)\), we define a simplicial complex \(K_G\) over the vertex set V such that a k-simplex \(\sigma \) is in \(K_G\) if for every distinct pair of vertices \(v_1, v_2\in \sigma \), the edge \((v_1,v_2)\) is in E. In other words, \(K_G\) is the maximal simplicial complex with G as its 1-skeleton. In general, a complex K is called a flag complex, if \(K=K_G\) with G being the 1-skeleton of K.
Given a set of points P in \(\mathbb {R}^d\) and a parameter r, the Čech complex at scale r, \(\mathcal {C}_r\) is defined as the nerve of the balls centered at the elements of P, each of radius r. This is a collection of convex sets. Therefore, the nerve theorem is applicable and it asserts that the nerve agrees homotopically with the union of balls. In the same setup, we can as well consider the intersection graph G of the balls (that is, we have an edge between two points if their distance is at most 2r). The flag complex of G is called the (Vietoris–)Rips complex at scale r, denoted by \(\mathcal {R}_r\). The relation \(\mathcal {C}_r\subseteq \mathcal {R}_r\subseteq \mathcal {C}_{\sqrt{2}r}\) follows from Jung’s Theorem [19].
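As an illustration of the Rips definition above, the following is a minimal brute-force sketch, not an algorithm from this paper. The function name `rips_skeleton` and the representation of simplices as sorted index tuples are assumptions made for this example. It enumerates all candidate vertex subsets, so its cost reflects the \(n^{k+1}\) growth mentioned in the introduction.

```python
from itertools import combinations
from math import dist


def rips_skeleton(points, r, k):
    """Return the k-skeleton of the Rips complex at scale r.

    Simplices are sorted tuples of point indices. Two points are joined by
    an edge if their distance is at most 2r (balls of radius r intersect),
    and a simplex is included if all of its edges are present (flag complex).
    """
    n = len(points)
    edges = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) <= 2 * r
    }
    simplices = set()
    for size in range(1, k + 2):  # a j-simplex has j + 1 vertices
        for sigma in combinations(range(n), size):
            # For a single vertex there are no pairs, so all() is True.
            if all(pair in edges for pair in combinations(sigma, 2)):
                simplices.add(sigma)
    return simplices


# Four corners of the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
small = rips_skeleton(square, 0.5, 2)   # only the four sides are edges
large = rips_skeleton(square, 0.75, 2)  # diagonals appear, triangles fill in
```

At scale 0.5 only the sides of the square are edges, so the complex is a cycle; at scale 0.75 the diagonals are also edges and every triple of corners spans a triangle.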
2.2 Persistence Modules and Simplicial Towers
A persistence module \((V_\alpha )_{\alpha \in G}\) for a totally ordered index set \(G\subseteq \mathbb {R}\) is a sequence of vector spaces with linear maps \(F_{\alpha ,\alpha '}:V_\alpha \rightarrow V_{\alpha '}\) for any \(\alpha \le \alpha '\), satisfying \(F_{\alpha ,\alpha }=\mathrm{id}\) and \(F_{\alpha ',\alpha ''}\circ F_{\alpha ,\alpha '}=F_{\alpha ,\alpha ''}\). Persistence modules can be decomposed into indecomposable intervals giving rise to a persistent barcode which is a complete discrete invariant of the corresponding module.
A distance measure between persistence modules is defined through interleavings: we call two modules \((V_\alpha )\) and \((W_\alpha )\) with linear maps \(F_{\cdot ,\cdot }\) and \(G_{\cdot ,\cdot }\) additively \(\varepsilon \)-interleaved, if there exist linear maps \(\phi :V_\alpha \rightarrow W_{\alpha +\varepsilon }\) and \(\psi :W_\alpha \rightarrow V_{\alpha +\varepsilon }\) such that the maps \(\phi \) and \(\psi \) commute with F and G (see [9]). We call the modules multiplicatively c-interleaved with \(c\ge 1\), if there exist linear maps \(\phi :V_\alpha \rightarrow W_{c\alpha }\) and \(\psi :W_\alpha \rightarrow V_{c\alpha }\) with the same commuting properties. Equivalently, this means that the modules are additively \((\log c)\)-interleaved when switching to a logarithmic scale. In this case, we also call the module \((W_\alpha )\) a c-approximation of \((V_\alpha )\) (and vice versa). Note that the case \(c=1\) implies that the two modules give rise to the same persistent barcode, which is usually referred to as the persistence equivalence theorem [13].
The most common way to generate persistence modules is through the homology of sequences of simplicial complexes. Let K, L be two simplicial complexes with a vertex map \(f:K\rightarrow L\), such that for each simplex \(\sigma =(v_0,\ldots ,v_k)\in K\), there is a simplex \((f(v_0),\ldots ,f(v_k))\in L\). The linear extension of f to simplices of K is called a simplicial map induced by f. A (simplicial) tower \((K_\alpha )_{\alpha \in G}\) over a totally ordered index set \(G\subseteq \mathbb {R}\) is a sequence of simplicial complexes connected by simplicial maps \(f_{\alpha ,\alpha '}:K_{\alpha }\rightarrow K_{\alpha '}\) for any \(\alpha \le \alpha '\), such that \(f_{\alpha ,\alpha }=\mathrm{id}\) and \(f_{\alpha ',\alpha ''}\circ f_{\alpha ,\alpha '}=f_{\alpha ,\alpha ''}\). By the functorial properties of homology (using some fixed field \(\mathbb {F}\) and some fixed dimension \(p\ge 0\)), such a tower gives rise to a persistence module \((H_p(K_\alpha ,\mathbb {F}))_{\alpha \in G}\) [26]. We call a tower a c-approximation of another tower if the corresponding persistence modules induced by homology are c-approximations of each other.
The standard way of obtaining a tower is through a nested sequence of simplicial complexes, where the simplicial maps are induced by inclusion. Such towers are called filtrations. Examples are the Čech filtration \((\mathcal {C}_\alpha )_{\alpha \in \mathbb {R}}\) and the Rips filtration \((\mathcal {R}_\alpha )_{\alpha \in \mathbb {R}}\). By the relation of Rips and Čech complexes from above, the Rips filtration is a \(\sqrt{2}\)-approximation of the Čech filtration.
Any simplicial tower can be written in the form \( K_1\rightarrow \cdots \rightarrow K_M \) where each \(K_i\) differs from \(K_{i-1}\) in an elementary fashion [12, 21]. Either \(K_i\) contains one more simplex than \(K_{i-1}\) (which is called an elementary inclusion), or a pair of vertices of \(K_{i-1}\) collapse into one in \(K_i\) (elementary contraction). In particular, if \(K_i{\setminus }K_{i-1}\) is a vertex, then we call the elementary inclusion a vertex inclusion. Each contraction collapses simplices which were introduced at an earlier scale. A simplex which contracts does not reappear in the tower. As a result, the number of inclusions is at least the number of contractions in the tower. The size of a tower is then defined as the total number of elementary inclusions in this sequence of simplicial complexes.
2.3 Simplex-Wise Čech Filtration and (Co)Face Distances
Lemma 2.1
If \(\sigma \) is negative, the barcode interval associated to \(\sigma \) has persistence at least \(L_\sigma \).
Proof
\(\sigma \) kills a \((k-1)\)-homology class by assumption, and this class is represented by the cycle \(\partial \sigma \). However, this cycle came into existence when the last facet \(\tau \) of \(\sigma \) was added. Therefore, the lifetime of the cycle destroyed by \(\sigma \) is at least \(\alpha _\sigma -\alpha _\tau \). \(\square \)
Lemma 2.2
If \(\sigma \) is positive, the homology class created by \(\sigma \) has persistence at least \(L^*_\tau \).
Proof
\(\sigma \) creates a k-homology class; every representative cycle of this class is non-zero for \(\sigma \). To turn such a cycle into a boundary, we have to add a \((k+1)\)-simplex \(\tau \) with \(\sigma \) in its boundary (otherwise, any \((k+1)\)-chain formed will be zero for \(\sigma \)). Therefore, the cycle created at \(\sigma \) persists for at least \(\alpha _\tau -\alpha _\sigma \). \(\square \)
3 The \(A^*\)-Lattice and the Permutahedron
A lattice L in \(\mathbb {R}^d\) is the set of all integer-valued linear combinations of d independent vectors, called the basis of the lattice. Note that the origin belongs to every lattice. The Voronoi polytope of a lattice L is the closed set of all points in \(\mathbb {R}^d\) for which the origin is among the closest lattice points. Since lattices are invariant under translations, the Voronoi polytopes for other lattice points are just translations of the one at the origin, and these polytopes tile \(\mathbb {R}^d\). An elementary example is the integer lattice, spanned by the unit vectors \((e_1,\ldots ,e_d)\), whose Voronoi polytope is the unit d-cube, shifted by \(-1/2\) in each coordinate direction.
We are interested in a different lattice, called the \(A_d^*\)-lattice, whose properties are also well-studied [11]. First, we define the \(A_d\) lattice as the set of points \((x_1,\ldots ,x_{d+1})\in \mathbb {Z}^{d+1}\) satisfying \(\sum _{i=1}^{d+1} x_i=0\). \(A_d\) is spanned by vectors of the form \((e_i,-1)\), \(i=1,\ldots ,d\). While it is defined in \(\mathbb {R}^{d+1}\), all points lie on the hyperplane H defined by \(\sum _{i=1}^{d+1} y_i = 0\). After a suitable change of basis, we can express \(A_d\) by d vectors in \(\mathbb {R}^d\); thus, it is indeed a lattice. In low dimensions, \(A_2\) is the hexagonal lattice, and \(A_3\) is the FCC lattice that realizes the best sphere packing configuration in \(\mathbb {R}^3\) [15].
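The spanning set of \(A_d\) can be checked mechanically. The sketch below is only an illustration; the helper names `a_d_basis` and `combine` are hypothetical. It takes the spanning vectors to be \((e_i,-1)\), the sign for which the coordinates sum to zero, and verifies that an integer point with zero coordinate sum is recovered from its first d coordinates.

```python
def a_d_basis(d):
    """Spanning vectors (e_i, -1) of A_d, as integer tuples in Z^(d+1)."""
    return [
        tuple(1 if j == i else 0 for j in range(d)) + (-1,)
        for i in range(d)
    ]


def combine(coeffs, basis):
    """Integer linear combination sum_i coeffs[i] * basis[i]."""
    dim = len(basis[0])
    return tuple(
        sum(c * b[j] for c, b in zip(coeffs, basis)) for j in range(dim)
    )


basis = a_d_basis(3)
# Every spanning vector lies on the hyperplane sum(x) = 0.
on_hyperplane = all(sum(b) == 0 for b in basis)
# A point of A_3 is determined by its first d coordinates.
x = (2, -1, 3, -4)
recovered = combine(x[:3], basis)
```

Since the last coordinate of the combination is minus the sum of the coefficients, `recovered` equals `x` exactly when the coordinates of `x` sum to zero.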
The dual lattice \(L^*\) of a lattice L is defined as the set of points \((y_1,\ldots ,y_{d})\) in \(\mathbb {R}^{d}\) such that \(y\cdot x\in \mathbb {Z}\) for all \(x\in L\) [11]. Both the integer lattice and the hexagonal lattice are self-dual, while the dual of \(A_3\) is the BCC lattice that realizes the thinnest sphere covering configuration among lattices in \(\mathbb {R}^3\) [2].
3.1 Combinatorics
The k-faces of \(\Pi \) correspond to ordered partitions of the coordinate indices \([d+1]:=\{1,\ldots ,d+1\}\) into \((d+1-k)\) non-empty ordered subsets \(\{S_1,\ldots ,S_{d+1-k}\}\) such that all coordinates in \(S_i\) are smaller than all coordinates in \(S_j\) for \(i<j\) [32]. For example, with \(d=3\), the partition \((\{1,3\},\{2,4\})\) is the 2-face spanned by all points for which the two smallest coordinates appear at the first and the third position. This is an example of a facet of \(\Pi \), for which we need to partition the indices in exactly two subsets; equivalently, the facets of \(\Pi \) are in one-to-one correspondence to non-empty proper subsets of \([d+1]\), so \(\Pi \) has \(2^{d+1}-2\) facets. The vertices of \(\Pi \) are the \((d+1)\)-fold ordered partitions which correspond to permutations of \([d+1]\), confirming the fact that \(\Pi \) has \((d+1)!\) vertices. Moreover, two faces \(\sigma \), \(\tau \) of \(\Pi \) with \(\dim \sigma < \dim \tau \) are incident if the partition of \(\sigma \) is a refinement of the partition of \(\tau \). Continuing our example from before, the four 1-faces bounding the 2-face \((\{1,3\},\{2,4\})\) are \((\{1\},\{3\},\{2,4\})\), \((\{3\},\{1\},\{2,4\})\), \((\{1,3\},\{2\},\{4\})\), and \((\{1,3\},\{4\},\{2\})\). Vice versa, we obtain cofaces of a face by combining consecutive partitions into one larger partition. For instance, the two cofacets of \((\{1,3\},\{4\},\{2\})\) are \((\{1,3\},\{2,4\})\) and \((\{1,3,4\},\{2\})\).
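The correspondence between faces and ordered partitions can be explored by direct enumeration. The following is a small sketch for illustration only; the names `ordered_partitions` and `face_count` are introduced here and are not from the paper. It counts the k-faces of \(\Pi \) as the ordered partitions of \([d+1]\) into \(d+1-k\) non-empty blocks.

```python
from itertools import combinations


def ordered_partitions(items, parts):
    """Yield all ordered partitions of `items` into `parts` non-empty blocks."""
    items = tuple(items)
    if parts == 1:
        if items:
            yield (frozenset(items),)
        return
    # Choose the first block, leaving at least parts - 1 elements for the rest.
    for size in range(1, len(items) - parts + 2):
        for first in combinations(items, size):
            rest = tuple(x for x in items if x not in first)
            for tail in ordered_partitions(rest, parts - 1):
                yield (frozenset(first),) + tail


def face_count(d, k):
    """Number of k-faces of the permutahedron whose index set is [d+1]."""
    indices = range(1, d + 2)
    return sum(1 for _ in ordered_partitions(indices, d + 1 - k))


# Face numbers for d = 3, listed from vertices (k = 0) to the polytope itself.
f_vector = [face_count(3, k) for k in range(4)]
```

For \(d=3\) the enumeration reproduces the \((d+1)!=24\) vertices and \(2^{d+1}-2=14\) facets stated above, and the face numbers satisfy Euler's relation for a 3-dimensional polytope.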
Lemma 3.1
Let \(\sigma , \tau \) be two facets of \(\Pi \), defined using the two partitions \((S_\sigma ,[d+1]{\setminus }S_\sigma )\) and \((S_\tau ,[d+1]{\setminus }S_\tau )\), respectively. Then \(\sigma \) and \(\tau \) are adjacent in \(\Pi \) iff \(S_\sigma \subseteq S_\tau \) or \(S_\tau \subseteq S_\sigma \).
Proof
Two facets are adjacent if they share a common face. By the properties of the permutahedron, this means that the two facets are adjacent if and only if their partitions permit a common refinement, which is only possible if one set is contained in the other. \(\square \)
We have already established that \(\Pi \) has “few” (\(2^{d+1}-2=O(2^d)\)) \((d-1)\)-faces and “many” (\((d+1)!=O(2^{d\log d})\)) 0-faces. We give an interpolating bound for all intermediate dimensions.
Lemma 3.2
The number of \((d-k)\)-faces of \(\Pi \) is bounded by \(2^{3 (d+1)\log _2 (k+1)}\).
Proof
3.2 Geometry
All vertices of \(\Pi \) are equidistant from the origin, and it can be checked with a simple calculation that this distance is \(\sqrt{\frac{d(d+2)}{12(d+1)}}\). Using the triangle inequality, we obtain:
Lemma 3.3
The diameter of \(\Pi \) is at most \(\sqrt{d}\).
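The vertex norm and the diameter bound can be sanity-checked numerically for small d. The sketch below is an illustration under one assumption: the vertices of \(\Pi \) are taken to be all coordinate permutations of \(\frac{1}{2(d+1)}(d,d-2,\ldots ,-d)\) in \(\mathbb {R}^{d+1}\), the representative vertex used in the proof of Lemma 3.4. The function names are hypothetical.

```python
from itertools import combinations, permutations
from math import dist, isclose, sqrt


def permutahedron_vertices(d):
    """Vertices of the permutahedron, embedded in the hyperplane sum(x) = 0."""
    base = [(d - 2 * j) / (2 * (d + 1)) for j in range(d + 1)]
    return list(permutations(base))  # base entries are distinct, so no repeats


def vertex_norm(d):
    """Closed-form distance of every vertex from the origin."""
    return sqrt(d * (d + 2) / (12 * (d + 1)))


def diameter(vertices):
    """Largest pairwise distance; for a polytope it is attained at vertices."""
    return max(dist(u, w) for u, w in combinations(vertices, 2))


d = 3
verts = permutahedron_vertices(d)
origin = (0.0,) * (d + 1)
norms_match = all(isclose(dist(u, origin), vertex_norm(d)) for u in verts)
diam = diameter(verts)
```

By the triangle inequality the diameter is at most twice the vertex norm, which in turn is at most \(\sqrt{d}\); the computed value for \(d=3\) is consistent with both bounds.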
The permutahedra centered at all lattice points of \(A^*\) define the Voronoi tessellation of \(A^*\). Its nerve is the Delaunay triangulation \(\mathcal {D}\) of \(A^*\). An important property of \(A^*\) is that, unlike for the integer lattice, \(\mathcal {D}\) is nondegenerate – this will ultimately ensure small upper bounds for the size of our approximation scheme.
Lemma 3.4
Each vertex of a permutahedral cell has precisely \(d+1\) cells adjacent to it. In other words, the \(A^*_d\) lattice points are in general position. As a consequence, we can identify Delaunay simplices incident to the origin with faces of \(\Pi \).
Proof
To prove the claim, the idea is to look at any vertex of the Voronoi cell and argue that it has precisely \(d+1\) equidistant lattice points. See [1, Thm. 2.5] for a concise argument. Here, we rephrase the proof idea of [1, Thm. 2.5] in slightly simplified terms.
Now, we show that the Delaunay cells of the \(A^*_d\) lattice are all d-simplices, which will prove our claim. Let \(\mathbf {v}\) be a vertex of the permutahedron which is the Voronoi cell of the origin. Without loss of generality, we can assume that \(\mathbf {v}=\frac{1}{2(d+1)}(d,d-2,\ldots ,-d)\). The \(A^*_d\) lattice points closest to \(\mathbf {v}\) define the Delaunay cell of \(\mathbf {v}\). We have seen that the lattice points have the form \(\mathbf {y}=\frac{1}{d+1}(\mathbf {m}(d+1)-k\mathbf {1})\), where \(\mathbf {m}\in \mathbb {Z}^{d+1}\). Also, \(\mathbf {m}\cdot \mathbf {1}=k\) because \(\mathbf {y}\cdot \mathbf {1}=0\).
Recall that any other vertex \(\mathbf {u}\) of \(\Pi \) can be written as some permutation \(\pi \) of \(\mathbf {v}\), that is, \(\mathbf {u}=\pi (\mathbf {v})\). Following the above derivation, the nearest lattice points for \(\mathbf {u}\) can be found by simply applying the permutation \(\pi \) on the nearest lattice points for \(\mathbf {v}\). As a result, the vertex \(\mathbf {u}\) also has \(d+1\) nearest lattice points, and the corresponding d-simplices are congruent for all \(\mathbf {u}\). This proves the claim. \(\square \)
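The count of \(d+1\) nearest lattice points can be confirmed by brute force in small dimensions. The following sketch is an illustration, not part of the proof. It assumes the lattice points have the form \(\mathbf {m}-\frac{k}{d+1}\mathbf {1}\) with \(k=\mathbf {m}\cdot \mathbf {1}\), and it searches only integer vectors \(\mathbf {m}\) with entries in \(\{-1,0,1\}\); that finite search window is an assumption of this example. All coordinates are scaled by \(2(d+1)\) so that the arithmetic is exact.

```python
from itertools import product


def nearest_lattice_points(d, window=(-1, 0, 1)):
    """Return the A*_d lattice points (scaled by 2(d+1)) closest to v.

    v is the vertex (d, d-2, ..., -d) / (2(d+1)); after scaling by
    s = 2(d+1) it becomes the integer vector (d, d-2, ..., -d).
    """
    s = 2 * (d + 1)
    v = [d - 2 * j for j in range(d + 1)]
    best, nearest = None, set()
    for m in product(window, repeat=d + 1):
        k = sum(m)
        # s * (m - k/(d+1) * 1); different m can give the same lattice point,
        # which the set below deduplicates.
        y = tuple(s * mi - 2 * k for mi in m)
        dist2 = sum((a - b) ** 2 for a, b in zip(v, y))
        if best is None or dist2 < best:
            best, nearest = dist2, {y}
        elif dist2 == best:
            nearest.add(y)
    return nearest


counts = {d: len(nearest_lattice_points(d)) for d in (1, 2, 3, 4)}
```

Within this window, each tested dimension yields exactly \(d+1\) equidistant lattice points, one of which is the origin.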
Proposition 3.5
The \((k-1)\)-simplices in \(\mathcal {D}\) that are incident to the origin are in one-to-one correspondence to the \((d-k+1)\)-faces of \(\Pi \) and, hence, in one-to-one correspondence to the ordered k-partitions of \([d+1]\).
Let V denote the set of lattice points that share a Delaunay edge with the origin. The following statement shows that the point set V is in convex position, and the convex hull encloses \(\Pi \) with some “safety margin”. The proof is a mere calculation, deriving an explicit equation for each hyperplane supporting the convex hull and applying it to all vertices of V and of \(\Pi \).
Lemma 3.6
For each d-simplex attached to the origin, the facet \(\tau \) opposite to the origin lies on a hyperplane which is at distance at least \(\frac{1}{\sqrt{2}(d+1)}\) from \(\Pi \), and all points of V are either on the hyperplane or on the same side as the origin.
Proof
We can verify at once that all these points lie on the hyperplane \(-x_1+x_{d+1}+1=0\), so this plane supports \(\tau \). The origin lies on the positive side of the plane. All points in V either lie on the plane or are on the positive side as well, as one can easily check. For the vertices of \(\Pi \), observe that the value \(-x_1+x_{d+1}\) is minimized for the point v above, for which \(-x_1+x_{d+1}+1=1/(d+1)\) is obtained. It follows that v as well as any vertex of \(\Pi \) is at distance at least \(\frac{1}{\sqrt{2}(d+1)}\) from the plane (the \(\sqrt{2}\) comes from the length of the normal vector). This proves the claim for the simplex dual to v.
Any other choice of \(\sigma \) is dual to a permuted version of v. Let \(\pi \) denote the permutation on v that yields the dual vertex. The vertices of \(\tau \) are obtained by applying the same permutation on the points \(\ell _k\) from above. Consequently, the plane equation changes to \(-x_{\pi (1)}+x_{\pi (d+1)}+1=0\). The same reasoning as above applies, proving the statement in general. \(\square \)
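The calculation in this proof can be replayed with exact rational arithmetic. The sketch below is an illustration under stated assumptions, with hypothetical function names. The points \(\ell _k\) are taken to be \(\mathbf {m}-\frac{k}{d+1}\mathbf {1}\) with \(\mathbf {m}\) consisting of k ones followed by zeros. The set V is taken to be the lattice points obtained from 0/1 vectors \(\mathbf {m}\) other than the all-zeros and all-ones vectors, matching the correspondence of facets with non-empty proper subsets of \([d+1]\). The plane is \(-x_1+x_{d+1}+1=0\).

```python
from fractions import Fraction
from itertools import permutations, product


def plane_value(x):
    """Evaluate -x_1 + x_{d+1} + 1; zero on the plane, positive at the origin."""
    return -x[0] + x[-1] + 1


def lattice_point(m, d):
    """The point m - (k / (d+1)) * 1 with k = sum(m), as exact fractions."""
    k = sum(m)
    return tuple(Fraction(mi) - Fraction(k, d + 1) for mi in m)


def check_plane(d):
    """Return (ells on plane, V on non-negative side, min value over Pi)."""
    ells = [
        lattice_point((1,) * k + (0,) * (d + 1 - k), d)
        for k in range(1, d + 1)
    ]
    on_plane = all(plane_value(p) == 0 for p in ells)
    neighbours = [
        lattice_point(m, d)
        for m in product((0, 1), repeat=d + 1)
        if 0 < sum(m) < d + 1
    ]
    same_side = all(plane_value(p) >= 0 for p in neighbours)
    base = [Fraction(d - 2 * j, 2 * (d + 1)) for j in range(d + 1)]
    min_value = min(plane_value(u) for u in permutations(base))
    return on_plane, same_side, min_value


result = check_plane(3)
```

The third entry of `result` is the smallest plane value over the vertices of \(\Pi \); dividing it by \(\sqrt{2}\), the length of the normal vector, gives the distance bound of the lemma.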
Lemma 3.7
If two lattice points are not adjacent in \(\mathcal {D}\), then the corresponding Voronoi polytopes have a distance of at least \(\frac{\sqrt{2}}{d+1}\).
Proof
Lemma 3.6 shows that \(\Pi \) is contained in a convex polytope C and the distance of \(\Pi \) to the boundary of C is at least \(\frac{1}{\sqrt{2}(d+1)}\). Moreover, if \(\Pi '\) is the Voronoi polytope of a non-adjacent lattice point \(o'\), the corresponding polytope \(C'\) is interior-disjoint from C. To see that, note that the simplices in \(\mathcal {D}\) incident to the origin o triangulate the interior of C, and likewise for \(o'\) and \(C'\). Any interior intersection would be covered by a simplex incident to o and one incident to \(o'\), and since o and \(o'\) are not connected, the simplices are distinct, contradicting the fact that \(\mathcal {D}\) is a triangulation. Having established that C and \(C'\) are interior-disjoint, the distance between \(\Pi \) and \(\Pi '\) is at least \(\frac{2}{\sqrt{2}(d+1)}=\frac{\sqrt{2}}{d+1}\), as required. \(\square \)
Recall the definition of a flag complex as the maximal simplicial complex one can form from a given graph. We next show that \(\mathcal {D}\) is of this form. While our proof exploits certain properties of \(A^*\), we could not exclude the possibility that the Delaunay triangulation of any lattice is a flag complex.
Lemma 3.8
\(\mathcal {D}\) is a flag complex.
Proof
The proof is based on two claims: consider two facets \(f_1\) and \(f_2\) of \(\Pi \) that are disjoint, that is, do not share a vertex. In the tessellation, there are permutahedra \(\Pi _1\) and \(\Pi _2\) that are adjacent to \(\Pi \), such that \(\Pi \cap \Pi _1=f_1\) and \(\Pi \cap \Pi _2=f_2\). The first claim is that \(\Pi _1\) and \(\Pi _2\) are disjoint. We prove this explicitly by constructing a hyperplane separating \(\Pi _1\) and \(\Pi _2\). See the Appendix for further details.
The second claim is that if k facets of \(\Pi \) are pairwise intersecting, they also have a common intersection. Another way to phrase this statement is that the link of any vertex in \(\mathcal {D}\) is a flag complex. This is a direct consequence of Lemma 3.1. See the Appendix for more details.
The lemma follows directly from these two claims: consider \(k+1\) vertices of \(\mathcal {D}\) whose Voronoi polytopes pairwise intersect. We can assume that one point is the origin, and the other k points are the centers of permutahedra that intersect \(\Pi \) in a facet. By the contrapositive of the first claim, all these facets have to intersect pairwise, because all vertices have pairwise Delaunay edges. By the second claim, there is some common vertex of \(\Pi \) to all these facets, and the dual Delaunay simplex contains the k-simplex spanned by the vertices. \(\square \)
Lemma 3.9
The shortest lattice vector of the \(A^*_d\) lattice has length \(\sqrt{\frac{d}{d+1}}\).
Proof
For any \(\beta >0\), by scaling the lattice vectors of the \(A_d^*\) lattice by \(\beta \), we get a scaled \(A^*_d\) lattice. The Voronoi cells of this scaled lattice are scaled permutahedra. For scaled permutahedra we show an additional property:
Lemma 3.10

\(\pi '\subset \pi \), and

the minimum distance between any facet of \(\pi '\) and any facet of \(\pi \) is at least \(\frac{(\beta -\beta ')}{2}\sqrt{\frac{d}{d+1}}\).
Proof
The first claim, \(\pi '\subset \pi \), follows since both permutahedra are scalings of a convex object centered at the origin.
For the second claim, consider any lattice vector v of the standard \(A^*_d\) lattice. The corresponding vectors at scales \(\beta \) and \(\beta '\) are \(v\beta \) and \(v\beta '\), respectively. Let f and \(f'\) be facets of \(\pi \) and \(\pi '\), corresponding to \(v\beta \) and \(v\beta '\), respectively. Then f and \(f'\) lie in parallel hyperplanes, which are separated by distance \((\Vert v\beta \Vert -\Vert v\beta '\Vert )/2=\Vert v\Vert (\beta -\beta ')/2\). From Lemma 3.9, we know that the shortest lattice vector has length \(\sqrt{\frac{d}{d+1}}\) for the standard \(A^*_d\) lattice. This quantity scales linearly for any scaling of the lattice. This means that the minimal distance between facets of the form \(f,f'\) is \(\delta :=\frac{(\beta -\beta ')}{2}\sqrt{\frac{d}{d+1}}\). Now let \(f'\) be any facet of \(\pi '\) and g be any facet of \(\pi \). Then there is a facet \(g'\) of \(\pi '\) which is a scaled version of g. Let H be the supporting hyperplane of \(g'\). Since \(\pi '\) is convex, \(f'\) lies in the closed halfspace of H containing the origin (on H if \(f'=g'\)). On the other hand, g lies in the other halfspace. Moreover, g is at a distance at least \(\delta \) from \(g'\). Therefore, \(f'\) is separated from g by distance at least \(\delta \). This is true for any choice of \(f'\) and g, so the second claim follows. \(\square \)
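The two quantities used above can be checked numerically for small d. The following is a minimal sketch, assuming that \(A^*_d\) is realized as the orthogonal projection of \(\mathbb {Z}^{d+1}\) onto the hyperplane of vectors with coordinate sum zero (a standard model, not necessarily the coordinates used elsewhere in this paper). The function names `shortest_vector_length` and `facet_gap` are hypothetical.

```python
import itertools
import math


def shortest_vector_length(d, box=1):
    """Brute-force the shortest nonzero vector of A_d^*.

    Assumption: A_d^* is the projection of Z^(d+1) onto the hyperplane
    {x : sum(x) = 0}. Only integer vectors with coordinates in
    [-box, box] are enumerated, which suffices for the minimum.
    """
    n = d + 1
    best = math.inf
    for z in itertools.product(range(-box, box + 1), repeat=n):
        mean = sum(z) / n
        norm = math.sqrt(sum((zi - mean) ** 2 for zi in z))
        if norm > 1e-12:  # multiples of (1, ..., 1) project to the origin
            best = min(best, norm)
    return best


def facet_gap(d, beta, beta_prime):
    """The bound delta of Lemma 3.10 for scales beta > beta_prime."""
    return (beta - beta_prime) / 2 * math.sqrt(d / (d + 1))
```

For d = 2, 3, 4 the brute-force minimum agrees with \(\sqrt{d/(d+1)}\) up to floating-point error, and `facet_gap` simply evaluates \(\delta \) for a given pair of scales.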
4 Approximation Scheme
Given a point set P of n points in \(\mathbb {R}^d\), we describe our approximation complex \(X_\beta \) for a fixed scale \(\beta >0\). For that, let \(L_\beta \) denote the \(A_d^*\) lattice in \(\mathbb {R}^d\), with each lattice vector scaled by \(\beta \). Recall that the Voronoi cells of the lattice points are scaled permutahedra which tile \(\mathbb {R}^d\). The bounds for the diameter (Lemma 3.3) as well as for the distance between nonintersecting Voronoi polytopes (Lemma 3.7) remain valid when multiplied by the scale factor. Hence, any cell of \(L_\beta \) has diameter at most \(\beta \sqrt{d}\). Moreover, any two nonadjacent cells have a distance of at least \(\beta \frac{\sqrt{2}}{d+1}\).
4.1 Interleaving
To prove that \(X_\beta \) approximates the Rips filtration, we define simplicial maps connecting the complexes on related scales.
Let \(V_{\beta }\) denote the subset of \(L_\beta \) corresponding to full permutahedra. To construct \(X_\beta \), we use a map \(v_\beta :P\rightarrow V_\beta \), which maps each point \(p\in P\) to its closest lattice point. Vice versa, we define \(w_\beta :V_\beta \rightarrow P\) to map a vertex in \(V_\beta \) to the closest point of P. Note that \(v_\beta \circ w_\beta \) is the identity map, while \(w_\beta \circ v_\beta \) is not.
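The map \(v_\beta \) can be illustrated with a short sketch. It assumes the same model of \(A^*_d\) as the projection of \(\mathbb {Z}^{d+1}\) onto the hyperplane of zero coordinate sum, and it uses a simple sweep over the rounding breakpoints rather than the algorithm of [11] that is cited later in the paper. The names `project` and `closest_lattice_point` are hypothetical.

```python
import math


def project(z):
    """Project z onto the hyperplane of vectors whose coordinates sum to 0."""
    mean = sum(z) / len(z)
    return [zi - mean for zi in z]


def closest_lattice_point(x, beta=1.0):
    """Closest point of the beta-scaled A_d^* lattice to x.

    Assumptions: x lies in the hyperplane sum(x) = 0 of R^(d+1), and the
    lattice is the projection of Z^(d+1) onto that hyperplane, scaled by beta.
    """
    y = [xi / beta for xi in x]
    # The distance from y to project(z) equals the minimum over t of
    # |y + t*(1,...,1) - z|, so it suffices to try z = round(y + t) for
    # t in [0, 1]; the rounding only changes at these breakpoints.
    breaks = sorted({(0.5 - yi) % 1.0 for yi in y})
    cuts = [0.0] + breaks + [1.0]
    best, best_dist = None, math.inf
    for lo, hi in zip(cuts, cuts[1:]):
        t = (lo + hi) / 2
        z = [math.floor(yi + t + 0.5) for yi in y]
        p = project(z)
        dist = math.dist(p, y)
        if dist < best_dist:
            best, best_dist = p, dist
    return [beta * pi for pi in best]
```

The sweep evaluates at most \(d+2\) candidates, each in O(d) time, so it runs in \(O(d^2)\) time per input point. The set of full cells at scale \(\beta \) is then the set of distinct outputs over all \(p\in P\).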
Lemma 4.1
The map \(v_\beta \) induces a simplicial map \(\phi _\beta :\mathcal {R}_{\frac{\beta }{\sqrt{2}(d+1)}} \rightarrow X_{\beta }\).
Proof
Because \(X_\beta \) is a flag complex, it is enough to show that for any edge (p, q) in \(\mathcal {R}_{\frac{\beta }{\sqrt{2}(d+1)}}\), \((v_\beta (p),v_\beta (q))\) is an edge of \(X_{\beta }\). This follows at once from the contrapositive of Lemma 3.7. \(\square \)
Lemma 4.2
The map \(w_\beta \) induces a simplicial map \(\psi _\beta :X_{\beta } \rightarrow \mathcal {R}_{\beta 2\sqrt{d}}\).
Proof
It is enough to show that for any edge (p, q) in \(X_\beta \), \((w_\beta (p),w_\beta (q))\) is an edge of \(\mathcal {R}_{\beta 2\sqrt{d}}\). Note that \(w_\beta (p)\) lies in the permutahedron of p and similarly, \(w_\beta (q)\) lies in the permutahedron of q, so their distance is bounded by twice the diameter of the permutahedron. The statement follows from Lemma 3.3. \(\square \)
Lemma 4.3
Diagram (1) commutes on the homology level, that is, \(\theta _*=\phi _*\circ \psi _*\) and \(g_*=\psi _*\circ \phi _*\), where the asterisk denotes the homology map induced by the simplicial map.
Proof
For the first statement, \(\theta \) is defined as \(\phi \circ \psi \), so the maps commute already at the simplicial level. The second identity does not hold at the simplicial level; instead, we show that the maps g and \(h:=\psi \circ \phi \) are contiguous, that is, for every simplex \((x_0,\ldots ,x_k)\in \mathcal {R}_{\beta 2(d+1)}\), the simplex \((g(x_0),\ldots ,g(x_k),h(x_0),\ldots ,h(x_k))\) forms a simplex in \(\mathcal {R}_{\beta 8(d+1)^3}\). Contiguity implies that the induced homology maps \(g_*\) and \(h_*=\psi _*\circ \phi _*\) are equal [26, Sect. 12].
Theorem 4.4
The persistence module \(\left( H_{*}(X_{\beta (2(d+1))^{2k}})\right) _{k\in \mathbb {Z}}\) approximates the persistence module \((H_{*}(\mathcal {R}_{\beta }))_{\beta \ge 0}\) by a factor of \(6(d+1)\).
Proof
Lemma 4.3 proves that on the logarithmic scale, the two towers are weakly \(\varepsilon \)-interleaved with \(\varepsilon =2(d+1)\), in the sense of [9]. Theorem 4.3 of [9] asserts that the bottleneck distance of the towers is at most \(3\varepsilon \). \(\square \)
Remark 4.5
The simplicial maps \(\phi \) and \(\psi \) are still valid when the lattices \(L_{\beta }\) are subject to arbitrary unitary transformations independently at each scale. Consequently, Lemma 4.3 and Theorem 4.4 remain valid in such settings. For instance, a random translation can be applied to \(L_\beta \) without altering its approximation qualities.
With a minor modification in the construction, we can improve the approximation factor by \(O(d^{1/4})\). The main observation is that the simplicial maps \(\phi :\mathcal {R}_{\frac{\beta }{\sqrt{2}(d+1)}} \rightarrow X_\beta \) and \(\psi :X_{\beta } \rightarrow \mathcal {R}_{\beta 2\sqrt{d}}\) do not increase the scale parameters of the Rips and the approximate complexes by the same amount: \(\phi \) increases the scale by a factor of \(\sqrt{2}(d+1)\) while \(\psi \) increases it by \(2\sqrt{d}\). We balance this jump in scales by redefining the approximation complex.
For simplicity, we consider the original definition of the approximation complexes in the rest of the paper.
5 Computational Aspects
We utilize the nondegenerate configuration of the permutahedral tessellation to prove that \(X_\beta \) is not too large. We let \(X_\beta ^{(k)}\) denote the k-skeleton of \(X_\beta \). In the rest of the section, we make no distinction between a vertex of \(X_\beta \) and the corresponding permutahedron, when it is clear from the context.
Theorem 5.1
For any scale \(\beta \), each vertex of \(X_\beta \) has at most \(2^{O( d\log k)}\) incident k-simplices. This means that \(X_\beta ^{(k)}\) has at most \(n2^{O( d\log k)}\) simplices.
Proof
We fix k and a vertex v of \(V_\beta \). Recall that v represents a permutahedron, which we also denote by \(\Pi (v)\). By definition, any k-simplex containing v corresponds to an intersection of \(k+1\) permutahedra, involving \(\Pi (v)\). By Proposition 3.5, such an intersection corresponds to a \((d-k)\)-face of \(\Pi (v)\). Therefore, the number of k-simplices involving v is bounded by the number of \((d-k)\)-faces of the permutahedron, which is \(2^{O( d\log k)}\) using Lemma 3.2. The bound follows because \(X_\beta \) has at most n vertices. \(\square \)
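The face count behind this bound can be evaluated exactly for small parameters. The sketch below relies on the classical correspondence between \((d-k)\)-faces of the d-dimensional permutahedron and ordered partitions of \(\{1,\ldots ,d+1\}\) into \(k+1\) nonempty blocks, which may be phrased differently in Lemma 3.2; the function name `faces_of_codim` is hypothetical.

```python
from math import comb


def faces_of_codim(d, k):
    """Number of (d-k)-faces of the d-dimensional permutahedron.

    Counted as the number of surjections from a (d+1)-element set onto a
    (k+1)-element set, via inclusion-exclusion. These are exactly the
    ordered partitions of d+1 elements into k+1 nonempty blocks.
    """
    n, m = d + 1, k + 1
    return sum((-1) ** j * comb(m, j) * (m - j) ** n for j in range(m + 1))


# The truncated octahedron (d = 3) has 14 facets, 36 edges and 24 vertices.
counts = [faces_of_codim(3, k) for k in (1, 2, 3)]
```

Since every surjection is in particular a function, the count is at most \((k+1)^{d+1}=2^{(d+1)\log (k+1)}\), which matches the \(2^{O(d\log k)}\) bound. For \(k=1\) the formula gives \(2^{d+1}-2\) facets.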
5.1 Range of Scales
To mitigate this undesirable dependence, we introduce a slight modification in the construction: at each scale \(\beta \) of the tower, we apply a random translation to the \(A^*_d\) lattice. More specifically, let \(\pi \) be the permutahedron at the origin at scale \(\beta \). We translate the origin uniformly at random inside \(\pi \), so that the lattice and the cells translate by the same amount. With this randomization, we show that the expected size of the tower is independent of the spread. Specifically, we use random translations to bound the expected number of vertex inclusions in the tower, which then leads to the main result. The expectation is taken over the random translation of the origin, and does not depend on the choice of the input. We emphasize that the selection of the origin is the only randomized part of our construction.
5.2 Well-Separated Pair Decomposition
Given a set of n points P in \(\mathbb {R}^d\), a well-separated pair decomposition (WSPD) [8] of P is a collection of pairs of subsets of P, such that for each pair, the diameter (denoted as \(\mathrm{diam}(\,)\)) of the subsets is much smaller than the distance between the subsets. More formally, given a parameter \(\varepsilon >0\), an \(\varepsilon \)-WSPD consists of pairs \((A_i,B_i)\) with \(A_i,B_i\subset P\) such that \(\mathrm{max}(\mathrm{diam}(A_i),\mathrm{diam}(B_i))\le \varepsilon d(A_i,B_i)\), where \(d(A_i,B_i)\) is the minimum separation between points of \(A_i\) and points of \(B_i\). Additionally, for each pair of points \(p,q\in P\), there exists a pair \((A_j,B_j)\) such that either \((p\in A_j,q\in B_j)\) or \((p\in B_j,q\in A_j)\). In other words, a WSPD covers each pair of points of P. An \(\varepsilon \)-WSPD of size at most \(n(1/\varepsilon )^{O(d)}\) can be computed in time \(n\log n 2^{O(d)}+n(1/\varepsilon )^{O(d)}\) (see, for instance [8, 16, 30]).
Let W be an \(\varepsilon \)-WSPD on P with \(\varepsilon =\frac{1}{6d^2}\). For each pair \((A,B)\in W\), let \(P_A\subset P\) denote the set of points of A and \(P_B\subset P\) denote the set of points of B. We select a representative point for A, which we call \(\mathrm{rep}(A)\), by taking an arbitrary point \(\mathrm{rep}(A)\in P_A\). Similarly, we select a representative \(\mathrm{rep}(B)\in P_B\) for B. For the pair (A, B), we denote the distance between the representatives by \( \hat{d}(A,B) :=\Vert \mathrm{rep}(A)-\mathrm{rep}(B)\Vert \). We have that \(d(A,B)\le \hat{d}(A,B) \le d(A,B)+\mathrm{diam}(A)+\mathrm{diam}(B)\). Since both diameters are at most \(\varepsilon d(A,B)\), this simplifies to \(d(A,B)\le \hat{d}(A,B) \le d(A,B)(1+2\varepsilon )\) or, equivalently, \(\frac{ \hat{d}(A,B) }{1+2\varepsilon }\le d(A,B)\le \hat{d}(A,B) \).
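The inequality chain for \(\hat{d}(A,B)\) can be verified on a toy pair. This is a brute-force sketch for illustration only: it does not construct a WSPD, and the helper names (`diam`, `separation`, `is_well_separated`, `rep_distance`) are hypothetical. The first point of each set is taken as its representative.

```python
import math
from itertools import combinations, product


def diam(points):
    """Largest pairwise distance (0 for a single point)."""
    return max((math.dist(p, q) for p, q in combinations(points, 2)), default=0.0)


def separation(A, B):
    """d(A, B): minimum distance between a point of A and a point of B."""
    return min(math.dist(a, b) for a, b in product(A, B))


def is_well_separated(A, B, eps):
    return max(diam(A), diam(B)) <= eps * separation(A, B)


def rep_distance(A, B):
    """d-hat(A, B), with the first point of each set as representative."""
    return math.dist(A[0], B[0])


A = [(0.0, 0.0), (0.1, 0.0)]
B = [(10.0, 0.0), (10.0, 0.1)]
eps = 0.05
d_AB, d_hat = separation(A, B), rep_distance(A, B)
```

For this pair, `d_AB` is 9.9 and `d_hat` is 10, which lies between `d_AB` and `d_AB * (1 + 2 * eps)` as the inequality predicts.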
5.3 Critical Scales
For any pair \((A,B)\in W\), let i be the largest integer such that \( \hat{d}(A,B) >(1+2\varepsilon )\beta _i2\sqrt{d}\). We say that the scales \(\{\beta _{i+1},\beta _{i+2}\}\) are critical for (A, B). All higher scales are noncritical for (A, B).
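The definition of the critical scales can be turned into a short computation. The sketch below assumes that the scales have the form \(\beta _i=\beta _0c^i\) with \(c=(2(d+1))^2\), as stated in Sect. 5.5.1, and that indices may be any integers. The function name `critical_scale_indices` and the default \(\beta _0=1\) are hypothetical.

```python
import math


def critical_scale_indices(d_hat, d, beta0=1.0, eps=None):
    """Indices (i+1, i+2) of the two critical scales of a WSPD pair.

    i is the largest integer with d_hat > (1 + 2*eps) * beta_i * 2*sqrt(d),
    where beta_i = beta0 * c**i and c = (2*(d+1))**2. Requires d_hat > 0.
    """
    if eps is None:
        eps = 1.0 / (6 * d * d)  # the WSPD parameter used in this section
    c = (2 * (d + 1)) ** 2
    threshold = d_hat / ((1 + 2 * eps) * 2 * math.sqrt(d))
    i = math.floor(math.log(threshold / beta0, c))
    # Guard against floating-point error in the logarithm, so that i is
    # the largest index with beta0 * c**i strictly below the threshold.
    while beta0 * c ** i >= threshold:
        i -= 1
    while beta0 * c ** (i + 1) < threshold:
        i += 1
    return i + 1, i + 2
```

For example, with \(d=2\), \(\beta _0=1\) and \(\hat{d}(A,B)=100\), the threshold is roughly 32.6, which lies between \(c^0=1\) and \(c^1=36\), so the critical scales are \(\beta _1\) and \(\beta _2\).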
For any permutahedron \(\pi \), let \(\mathcal {NBR}(\pi )\) denote the union of \(\pi \) and its neighboring cells.
Lemma 5.2

At scale \(\beta \), let \(\pi ,\pi '\) denote the permutahedra containing \(\mathrm{rep}(A),\mathrm{rep}(B)\), respectively. Then \(P_A\) lies in \(\mathcal {NBR}(\pi )\). Similarly, \(P_B\) lies in \(\mathcal {NBR}(\pi ')\).

At scale \(\delta \), let \(\Pi \) denote the permutahedron that contains \(\mathrm{rep}(A)\). Then, \(P_A\cup P_B\) lies in \(\mathcal {NBR}(\Pi )\).
Proof
Lemma 5.3
Let \((A,B)\in W\) be any WSPD pair. Let \((\beta <\delta )\) denote the critical scales for (A, B). Consider any arbitrary pair of points \((a\in P_A,b\in P_B)\). Let \(\alpha '<\alpha \) be a pair of consecutive scales such that at \(\alpha '\), a and b lie in distinct nonadjacent permutahedra but at \(\alpha \), they lie in adjacent (or the same) permutahedra. Then \(\alpha \) is a critical scale for (A, B), that is, \(\alpha =\beta \) or \(\alpha =\delta \).
Proof

\(\alpha <\beta \): From the definition of critical scales, we have that \(d(A,B)\ge \frac{ \hat{d}(A,B) }{1+2\varepsilon }>\alpha (2\sqrt{d})\), that is, the minimum distance between points of \(P_A\) and \(P_B\) is more than twice the diameter of the cells at scale \(\alpha \). This means that for all \((a\in P_A, b\in P_B)\), the cells containing a and b are not adjacent. This contradicts our assumption that at \(\alpha \), there exists a pair of points \((a\in P_A, b\in P_B)\), such that they lie in adjacent (or the same) cells.

\(\alpha >\delta \): In such a case, we have \(\alpha '\ge \delta \). From Lemma 5.2, we know that if \(\mathrm{rep}(A)\) lies in cell \(\pi \) at scale \(\delta \) or higher, then \(P_A\cup P_B\) lies in \(\mathcal {NBR}(\pi )\). This contradicts our assumption that at \(\alpha '\), there exists a pair of points \((a\in P_A, b\in P_B)\) which lies in distinct nonadjacent cells.
5.4 Size of the Tower
5.4.1 Splits
In the permutahedral tessellation, a cell at a given scale may not be entirely contained within a single cell at larger scales. This can lead to cases where the input points contained in a single cell map to several distinct cells at a higher scale. Formally, at a given scale \(\beta \), let \(\pi \) be a nonempty cell and denote by \(P_\pi \subset P\) the set of input points contained in \(\pi \). At the next scale \(\beta '\), let \(\{\pi _0,\pi _1,\ldots ,\pi _m\}\) be the collection of cells to which \(P_\pi \) maps, with \(\pi \) mapping to \(\pi _0\). We call each pair \((\pi _0,\pi _i)\) for \(1\le i\le m\) a split at scale \(\beta '\). For each split \((\pi _0,\pi _i)\), there exists at least one pair of points \((a,b)\subset P\) such that \(a,b\in \pi , a\in \pi _0, b\in \pi _i\). We call such a pair a split inducing pair (SIP). Each split is induced by some SIP. Also, several SIPs may induce the same split.
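The definition of a split can be made concrete as follows. This is a minimal sketch with hypothetical names: `cell_lo[i]` and `cell_hi[i]` are the cells containing input point i at scales \(\beta \) and \(\beta '\), and `rep[cell]` is the index of the input point closest to the center of `cell`, which is assumed to determine the cell \(\pi _0\) that \(\pi \) maps to.

```python
from collections import defaultdict


def find_splits(cell_lo, cell_hi, rep):
    """Return the set of splits (pi_0, pi_i) at the higher scale."""
    members = defaultdict(list)
    for idx, cell in enumerate(cell_lo):
        members[cell].append(idx)
    splits = set()
    for cell, idxs in members.items():
        pi0 = cell_hi[rep[cell]]  # image of the cell itself
        for idx in idxs:
            if cell_hi[idx] != pi0:
                # points idx and rep[cell] form a split inducing pair
                splits.add((pi0, cell_hi[idx]))
    return splits


cell_lo = ["u", "u", "u", "w"]
cell_hi = ["a", "b", "a", "c"]
splits = find_splits(cell_lo, cell_hi, rep={"u": 0, "w": 3})
```

Here the three points of cell `u` land in cells `a` and `b`, and `u` itself maps to `a`, so the only split is `(a, b)`. Several SIPs inducing the same split are collapsed by the set.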
Let \((A,B)\in W\) be a WSPD pair. We upper bound the number of splits induced by SIPs of the form (a, b) where \((a\in P_A, b\in P_B)\) over all scales. Counting this for each pair of W gives an upper bound on the number of splits for all SIPs, since each pair of points is covered by some pair of W.
Lemma 5.4
Let \((A,B)\in W\) be a pair of the WSPD. The expected number of splits for SIPs of the form \((a\in P_A, b\in P_B)\) is upper bounded by \(2^{O(d)}\).
Proof
First, we count the number of scales at which splits can be induced by pairs of points of the form \((a\in P_A,b\in P_B)\). From Lemma 5.3, at scales below the critical scales for (A, B), there are no splits induced by such SIPs, so we ignore those scales. There are two relevant cases:
1. Critical scales. Let \(\beta <\delta \) be the two critical scales for (A, B). Suppose there is a split at scale \(\beta \). Then there exists a SIP \((a\in P_A, b\in P_B)\) which was in a single cell at the scale immediately lower than \(\beta \), but is in different cells at scale \(\beta \). By Lemma 5.3, this is not possible. Therefore, there are no splits at \(\beta \).
At the next critical scale \(\delta \), if \(\mathrm{rep}(A)\) lies in cell \(\pi \), then the points of \(P_A\cup P_B\) lie in \(\mathcal {NBR}(\pi )\), using Lemma 5.2. An upper bound on the number of full cells occupied by points \(P_A\cup P_B\) at scale \(\delta \) is therefore the number of cells in \(\mathcal {NBR}(\pi )\), which is \(2^{O(d)}\).
Lemma 5.5
The expected number of vertex inclusions in the tower is upper bounded by \(n2^{O(d\log d)}\).
Proof
At scale \(\beta _0\), there are n vertex inclusions in the tower due to n full permutahedra. First, we show that each vertex inclusion at higher scales is caused by a split.
Let \(\alpha '<\alpha \) be any two consecutive scales in the tower, with the set of full vertices being \(V'\) and V, respectively and let \(\theta \) be the simplicial map from the complex at \(\alpha '\) to the complex at \(\alpha \). Let \(v\in V{\setminus }\theta (V')\) denote a full cell. There is an input point \(p\in v\), since v is full. Let u denote the full cell at scale \(\alpha '\), which contains p. Since \(\theta (u)\ne v\), there exists another input point \(p'\in u\) at scale \(\alpha '\) such that \(p'\) is the closest input point to u’s center. Then (u, v) is a split induced by the SIP \((p,p')\), implying that v was created from a split.
There are at most \(n(6d^2)^{O(d)}=n2^{O(d\log d)}\) pairs in the WSPD, so the total number of expected splits is upper bounded by \(n2^{O(d\log d)}\cdot 2^{O(d)}\), using Lemma 5.4. The claim follows. \(\square \)
Theorem 5.6
The expected size of the tower is upper bounded by \(n2^{O(d\log d)}\).
Proof
Recall that the size of a tower is the number of simplex inclusions involved. From Lemma 5.5, we know that the expected number of vertex inclusions in the tower is upper bounded by \(n2^{O(d\log d)}\). Each simplex included in the filtration is attached to one of these vertices. From Theorem 5.1 we know that each vertex has at most \(2^{O(d\log k)}\) k-simplices attached to it. Therefore, the expected number of simplex inclusions in the tower is upper bounded by \(n2^{O(d\log d)}2^{O(d\log k)}=n2^{O(d\log d)}\). \(\square \)
Note that we do not explicitly construct the WSPD W to argue about the size of the tower. Existence of W suffices for our claims. We next show that any simplex that is included in the tower collapses to a vertex within the next few scales:
Lemma 5.7

\(\theta ^1(\sigma )\) is a vertex with probability greater than 1/2.

Let j denote the smallest integer such that \(\theta ^{j}(\sigma )\) is a vertex. We say that \(\sigma \) survives for j scales. Then, the expected value of j is at most four.
Proof
5.5 Computing the Tower
5.5.1 Determining the Range of Scales
If the range of scales \([\beta _0,\beta _m]\) is provided as an input, we build the tower at each of the relevant scales. If the range is not provided, we calculate the spread of the point set to determine the relevant scales. For our purpose, it suffices to calculate constant-factor approximations of \(\mathrm{diam}(P)\) and \(\mathrm{CP}(P)\). Taking an arbitrary point \(p\in P\) and calculating \(\max _{q\in P}\Vert p-q\Vert \) gives a 1/2-approximation of \(\mathrm{diam}(P)\). \(\mathrm{CP}(P)\) can be computed exactly using a randomized algorithm in \(n2^{O(d)}\) expected time [23]. Using this information, we calculate the range of scales and call them \([\beta _0,\beta _m]\). The scales of the tower can then be written in the form \(\beta _i=\beta _0c^i, i\ge 0\), where \(c=(2(d+1))^2\).
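The ingredients of this step can be sketched in a few lines. The closest-pair routine below is a quadratic brute force standing in for the randomized algorithm of [23], and the function names are hypothetical. How \(\beta _0\) and \(\beta _m\) are derived from \(\mathrm{CP}(P)\) and \(\mathrm{diam}(P)\) is left to the caller, since that choice is not restated here.

```python
import math
from itertools import combinations


def approx_diameter(points):
    """1/2-approximation of diam(P): farthest distance from an arbitrary point."""
    p = points[0]
    return max(math.dist(p, q) for q in points)


def closest_pair_distance(points):
    """CP(P) by brute force, in O(n^2) time (stand-in for [23])."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))


def tower_scales(beta0, beta_max, d):
    """Scales beta_i = beta0 * c**i with c = (2(d+1))**2, up to the first
    scale that reaches beta_max."""
    c = (2 * (d + 1)) ** 2
    scales = [beta0]
    while scales[-1] < beta_max:
        scales.append(scales[-1] * c)
    return scales


P = [(1.0, 0.0), (0.0, 0.0), (4.0, 0.0)]
estimate = approx_diameter(P)
```

By the triangle inequality, the estimate is at least half the true diameter: here it is 3, while the diameter is 4. Because consecutive scales differ by the factor c, the number of scales grows only logarithmically in the spread.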
Algorithm 5.8
We construct the tower scale by scale. Let \(\alpha '<\alpha \) be any two consecutive scales and \(X',X\) the respective complexes, with \(\theta :X'\rightarrow X\) being the induced simplicial map. Suppose we have already constructed \(X'\). There are two steps in constructing the complex X:
Adding vertices and edges. We translate the lattice by picking a point uniformly at random from the cell at the origin, which can be done using random walks in polytopes [24]. We compute the set of full permutahedra by finding the closest lattice point for each point in P [11, Alg. 4, Chap. 20]. Then, for each full cell \(\pi \), we go over \(\mathcal {NBR}(\pi ){\setminus }\pi \) to find neighboring full cells. If a full neighbor is found, we add an edge between \(\pi \) and its neighbor. This completes the 1-skeleton of X.

Those which are images of simplices of \(X'\) under \(\theta \): To construct these, we first construct \(\theta \) for vertices of X. Then we go over each simplex \(\sigma =(v_0,\ldots ,v_k)\in X'\) and add the simplex \(\theta (\sigma )\) on the vertices \((\theta (v_0),\ldots ,\theta (v_k))\) of X.

Those which are not in the image of \(\theta \), that is, those simplices which are included in the tower at scale \(\alpha \): Each such simplex \(\sigma \) must contain at least one edge which is not in the image of \(\theta \), since otherwise all edges of \(\sigma \) and hence \(\sigma \) itself would be in the image of \(\theta \). We first enumerate all the edges which are not in the image of \(\theta \). To do this, for each edge \((u,v)\in X'\), we calculate \((\theta (u),\theta (v))\) and exclude it from the list of edges of X, to get the list of new edges. To complete the k-skeleton, we go over the new edges of the complex in an arbitrary order, and at each step we add the new simplices induced by the current edge. Let (u, v) be the edge under consideration. We construct the simplices incident to (u, v) inductively by dimension. The base case is the 1-skeleton, with simplex (u, v). Assume that we have completed the \((j-1)\)-simplices incident to (u, v). Let \(\sigma \) be a j-simplex incident to (u, v). Then, \(\sigma \) is of the form \(\sigma =w*\gamma \), where \(\gamma \) is a \((j-1)\)-simplex incident to (u, v) and w is a full cell which is a common neighbor of u and v. To find \(\sigma \), we go over each of the \(2^{O(d)}\) common neighbors w of u and v and each \((j-1)\)-simplex \(\gamma \) containing (u, v), and test whether \(w*\gamma \) is a j-simplex in the complex. The test works by checking whether each \(w*\gamma _i\) is a \((j-1)\)-simplex in the complex, where \(\gamma _i\) is a facet of \(\gamma \). Since we enumerate all the simplices attached to each new edge, this step generates all simplices included in the tower at scale \(\alpha \).
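The inductive enumeration around a single new edge can be sketched as follows. The sketch works on an abstract graph and uses the flag property of \(X_\beta \): instead of looking up each facet \(w*\gamma _i\), it checks that w is adjacent to every vertex of \(\gamma \), which is equivalent in a flag complex. The name `simplices_on_edge` and the adjacency-dictionary representation are hypothetical.

```python
def simplices_on_edge(adj, u, v, k):
    """All simplices of dimension at most k that contain the edge (u, v).

    adj maps each vertex to the set of its neighbors; the complex is the
    flag complex of this graph. Simplices are returned as frozensets.
    """
    common = adj[u] & adj[v]           # candidate vertices w
    level = {frozenset((u, v))}        # the (j-1)-simplices incident to (u, v)
    found = set(level)
    for _ in range(2, k + 1):          # build j-simplices for j = 2, ..., k
        nxt = set()
        for gamma in level:
            for w in common - gamma:
                # w * gamma is a simplex iff w is adjacent to all of gamma
                if all(x in adj[w] for x in gamma):
                    nxt.add(gamma | {w})
        found |= nxt
        level = nxt
    return found


# Complete graph on four vertices: the edge (0, 1) lies in two triangles
# and one tetrahedron.
K4 = {i: {0, 1, 2, 3} - {i} for i in range(4)}
result = simplices_on_edge(K4, 0, 1, 3)
```

Running this once per new edge and taking the union yields every simplex that contains at least one new edge. A full implementation would additionally discard simplices that were already produced as images under \(\theta \).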
Theorem 5.9
Algorithm 5.8 takes \(n2^{O(d)}\log \Delta +M2^{O(d)}\) time in expectation and M space to compute the k-skeleton, where M is the size of the tower. Additionally, the expected runtime is upper bounded by \(n2^{O(d)}\log \Delta +n2^{O(d\log d)}\) and the expected space is upper bounded by \(n2^{O(d\log d)}\).
Proof
At each scale, picking the origin takes \(\mathrm{poly}(d)\) time [24]. Finding the closest lattice vertex for any given input point takes \(O(d^2)\) time [11, Alg. 4, Chap. 20]. Therefore, finding the full vertices at each scale takes \(O(nd^2)\) time per scale, and in total \(O(nd^2\log \Delta )\) time. Each cell has \(2^{O(d)}\) neighbors, so finding the full neighbors and adding the edges takes \(n2^{O(d)}\) per scale. Computing the map \(\theta \) for the vertices of \(X_1\) takes time \(O(nd^2)\) per scale. In total, these steps take \(n2^{O(d)}\log \Delta \) time.
For each simplex of \(X'\), we compute the image under \(\theta \). This takes time O(d) per simplex of \(X'\), since the vertex map has already been established. From Lemma 5.7, each simplex in the tower survives at most four scales (in expectation) under \(\theta \), until it collapses to a vertex. Therefore, for each simplex in the tower, we compute its related images four times in expectation. This step takes 4MO(d) time over the tower, in expectation.
Computing \(\theta \) for the edges of \(X'\) takes O(1) time per edge, since we already computed the vertex map. Finding new edges takes \(n2^{O(d)}\) time, since that is the maximum number of edges at any scale. In total, finding new edges takes \(n2^{O(d)}\log \Delta \) time. To complete the k-skeleton, the testing technique requires an overhead of \(k^22^{O(d)}=2^{O(d)}\) for each simplex in the tower. Since we do the k-completion only for newly added edges, the test is not repeated for any simplex. The time bound follows.
The space complexity follows by storing the tower. The expected size of the tower is upper bounded by \(n2^{O(d\log d)}\), from Theorem 5.6. The claims follow. \(\square \)
It is possible to compute the persistence barcode of towers in a streaming setting [21], where instead of storing the entire tower in memory, the complex is constructed at each scale and fed to the output stream. In this setting, Algorithm 5.8 only needs to store the k-skeleton of the current scale in memory, to complete the k-skeleton of the next scale. So, the maximum memory consumption of Algorithm 5.8 comes down to \(n2^{O(d\log k)}\), which is the maximum size of the k-skeleton per scale, using Theorem 5.1.
In Algorithm 5.8, to construct the edges of the complex at each scale, we scan the neighborhood of each full cell. By adding the edges more carefully, we can reduce the complexity of this step. Let W denote a \(\frac{1}{6d^2}\)-WSPD on P. Let \(\alpha '<\alpha \) be any two consecutive scales of the tower, with \(X',X\) being the complexes at the respective scales. Let \(\theta :X'\rightarrow X\) be the induced simplicial map. For any permutahedron \(\pi \), let \(\mathcal {NBR}(\mathcal {NBR}(\pi ))\) denote the union of the collections of cells \(\mathcal {NBR}(\pi _i)\) for each cell \(\pi _i\in \mathcal {NBR}(\pi )\).
Lemma 5.10

There exist adjacent full cells \(u,v\in X'\) such that \(\theta (u,v)=(\pi _1,\pi _2)\), that is, \((\pi _1,\pi _2)\) is the image of an edge from the previous scale. In such a case, we call \((\pi _1,\pi _2)\) an inherited edge.

There exist full cells \(u,v\in X'\) such that \(\theta (u)=\pi _1\) and \(\theta (v)=\pi _2\), but (u, v) is not an edge in \(X'\). We call \((\pi _1,\pi _2)\) an interactive edge.

At least one of \(\{\pi _1,\pi _2\}\) has no preimage in \(X'\) under \(\theta \), that is, there do not exist \(u,v\in X'\), such that \(\theta (u)=\pi _1\) and \(\theta (v)=\pi _2\) both hold. In such a case we call \((\pi _1,\pi _2)\) a split edge.

There exists a pair \((A,B)\in W\) such that \(\alpha \) is a critical scale for (A, B).

Let \(\pi _3\) be the permutahedron containing \(\mathrm{rep}(A)\) at \(\alpha \). Then, \(\pi _1\) and \(\pi _2\) are cells in \(\mathcal {NBR}(\mathcal {NBR}(\pi _3))\).
Proof
Since the three edge classes are exhaustive, each edge of X is either an inherited edge, or an interactive edge, or a split edge.

Let u, v be distinct full cells at \(\alpha '\) such that \(\theta (u)=\pi _1\) and \(\theta (v)=\pi _2\). Since u and v are full cells, there exist points \(p_1,p_2\in P\) such that \(p_1\in u,p_2\in v\), and \(p_1\) and \(p_2\) are the closest points to centers of u and v, respectively. At \(\alpha \), \(p_1\in \pi _1\) and \(p_2\in \pi _2\). Let (A, B) be a WSPD pair which covers \((p_1,p_2)\), that is, \(p_1\in P_A,p_2\in P_B\). Using Lemma 5.3, \(\alpha \) is a critical scale for (A, B).

Using Lemma 5.2, points of \(P_A\) lie in \(\mathcal {NBR}(\pi _3)\), so \(\pi _1\in \mathcal {NBR}(\pi _3) \). Since \(\pi _2\in \mathcal {NBR}(\pi _1)\), the claim follows. \(\square \)
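The three edge classes can be told apart with a small routine. This is a sketch with hypothetical names: `theta` is the vertex map from \(X'\) to X given as a dictionary, and `prev_edges` is the edge set of \(X'\) stored as frozensets. When an edge has both an adjacent and a nonadjacent pair of preimages, the sketch labels it inherited, which is an assumption made here so that the classes do not overlap.

```python
def classify_edge(pi1, pi2, theta, prev_edges):
    """Classify the edge (pi1, pi2) of X as inherited, interactive or split."""
    pre1 = [u for u, img in theta.items() if img == pi1]
    pre2 = [v for v, img in theta.items() if img == pi2]
    if not pre1 or not pre2:
        return "split"        # some endpoint has no preimage under theta
    for u in pre1:
        for v in pre2:
            if frozenset((u, v)) in prev_edges:
                return "inherited"   # image of an edge of X'
    return "interactive"      # preimages exist, but none are adjacent


theta = {"u1": "a", "u2": "b", "u3": "b", "u4": "c"}
prev_edges = {frozenset(("u1", "u2"))}
labels = [classify_edge("a", x, theta, prev_edges) for x in ("b", "c", "d")]
```

In this toy example the edge `(a, b)` is inherited, `(a, c)` is interactive, and `(a, d)` is a split edge because `d` has no preimage.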
Algorithm 5.11
There are two stages in the algorithm.
Stage 1. We compute a \(\frac{1}{6d^2}\)-WSPD W on P. For each WSPD pair \((A,B)\in W\), the two critical scales are determined using \( \hat{d}(A,B) \). For each scale, we store the WSPD pairs for which the scale is critical.
Stage 2. We construct the complex scale by scale. For this, let \(\alpha '<\alpha \) be any two consecutive scales. Suppose we have constructed the complex \(X'\) at \(\alpha '\). We choose the origin at \(\alpha \) as in Algorithm 5.8. To construct the complex X at \(\alpha \), we start by finding the full vertices by mapping points of P to their closest lattice point. Then we calculate the vertex map from \(X'\) to X which induces the simplicial map \(\theta :X'\rightarrow X\).
The simplices in X are of two kinds: those which are images of \(\theta \) and those which are not. For simplices of the former kind, we use the vertex map to compute the image under \(\theta \), and add it to X. For the latter case, each simplex must contain a new edge, since otherwise the simplex was already in the image of \(\theta \). To compute these new edges at \(\alpha \), we use Lemma 5.10: the only new edges at this scale are the interactive and split edges.
Step 1. We process all WSPD pairs which are critical at \(\alpha \). Let (A, B) be the current pair and let \(\pi \) denote the permutahedron which contains \(\mathrm{rep}(A)\). For each cell \(\pi '\in \mathcal {NBR}(\pi )\), we add edges of \(\pi '\) with full cells of \(\mathcal {NBR}(\pi '){\setminus }\pi '\). This amounts to adding edges between all pairs of adjacent full cells in \(\mathcal {NBR}(\mathcal {NBR}(\pi ))\). By Lemma 5.10, all interactive edges are added by this procedure.
Step 2. We collect the full cells which do not have a preimage under \(\theta \). This is done by excluding the images of the vertices of \(X'\) under \(\theta \), from the set of vertices of X. For each such full cell \(\pi \), we go over \(\mathcal {NBR}(\pi ){\setminus }\pi \) and add edges with full cells. This step enumerates all split edges (Lemma 5.10).
Step 3. Steps 1 and 2 generate the new edges of X. With this information, we enumerate the k-skeleton of X, using the technique from Algorithm 5.8.
Theorem 5.12
Proof
In Stage 1, we compute a \(\frac{1}{6d^2}\)-WSPD. This takes time \(n\log n 2^{O(d)}+W\). For each WSPD pair we calculate two critical scales. This takes O(1) time per pair, so O(W) in total. Stage 1, therefore, takes \(n\log n 2^{O(d)}+O(W)\) time.

In Step 1, we add edges between adjacent full cells of \(\mathcal {NBR}(\mathcal {NBR}(\pi ))\). There are \(2^{O(d)}\) such cells, so it takes \(2^{O(d)}\) time per WSPD pair per critical scale. Since there are O(W) such instances, in total this step takes \(2^{O(d)}W\) time.

In Step 2, we inspect the neighbors of full cells which do not have a preimage under \(\theta \). The number of such full cells is the number of vertex inclusions in the tower, which is upper bounded by M. Per cell, this takes \(2^{O(d)}\) time, so this step takes no more than \(2^{O(d)}M\) time in total.

In Step 3, the new edges are the interactive and split edges. Each such edge survives four scales in expectation, from Lemma 5.7, so the expected number of new edges in the tower is upper bounded by 4M. This is also the time required to find the new edges. Completing the k-skeleton has an overhead of \(k^22^{O(d)}\) per simplex in the tower as in Algorithm 5.8, so it takes \(M2^{O(d)}\) time in total.
Storing the critical scales for each WSPD pair takes O(1) space per pair. Additionally, we store the tower. The space bound follows.
In the worst case, \(W=n(6d^2)^{O(d)}=n2^{O(d\log d)}\) and M is upper bounded by \(n2^{O(d\log d)}\) in expectation. The claim follows. \(\square \)
Algorithm 5.11 can be used in a streaming setting, similar to Algorithm 5.8. In this case, the memory consumption of Algorithm 5.11 is \(O(W)+M_i\), where \(M_i\) is the size of the complex at any scale. Since W can be as large as \(n2^{O(d\log d)}\) and \(M_i\) can be at most \(n2^{O(d\log k)}\) (Theorem 5.1), the space requirement is at most \(n2^{O(d\log d)}\).
If the spread is a constant, then Algorithm 5.8 has a better runtime, since it does not compute the WSPD. Also, Algorithm 5.8 does not have to store the critical scales of the WSPD, neither in the normal setting nor in the streaming environment, so it is more space-efficient. However, if the spread is large, then Algorithm 5.11 achieves better runtime, since it avoids the \(n2^{O(d)}\log \Delta \) factor in the complexity of Algorithm 5.8.
6 Dimension Reduction
For large d, our approximation complex plays nicely together with dimension reduction techniques. We start with noting that interleavings satisfy the triangle inequality. This result is folklore; see [7, Thm. 3.3] for a proof in a generalized context.
Lemma 6.1
Let \((A_\beta )\), \((B_\beta )\), and \((C_\beta )\) be persistence modules. If \((A_\beta )\) is a \(t_1\)-approximation of \((B_\beta )\) and \((B_\beta )\) is a \(t_2\)-approximation of \((C_\beta )\), then \((A_\beta )\) is a \((t_1t_2)\)-approximation of \((C_\beta )\).
The following statement is a simple application of interleaving distances from [9].
Lemma 6.2
Proof
As a first application, we show that we can shrink the approximation size from Theorem 5.6 for the case \(d\gg \log n\), only worsening the approximation quality by a constant factor.
Theorem 6.3
Let P be a set of n points in \(\mathbb {R}^d\). There exists a constant c and a discrete tower of the form \((\overline{X}_{(c\log n)^{2k}})_{k\in \mathbb {Z}}\) that is \((3c\log n)\)-interleaved with the Rips filtration of P and has only \(n^{O(\log \log n)}\) simplices in expectation. With high success probability, we can compute the tower in deterministic expected running time \(n(\log n)^2O(\log \Delta )+n^{O(\log \log n)}\) using Algorithm 5.11.
Proof
The famous lemma of Johnson and Lindenstrauss [18] asserts the existence of a map f as in Lemma 6.2 for \(m=\lambda \log n/\varepsilon ^2\) with some absolute constant \(\lambda \) and \(\xi _1=(1-\varepsilon )\), \(\xi _2=(1+\varepsilon )\). Choosing \(\varepsilon =1/2\), we obtain that \(m=O(\log n)\) and \(\xi _2/\xi _1=3\). With \(\overline{\mathcal {R}}_{\alpha }\) the Rips complex of the Johnson–Lindenstrauss transform, we have therefore that \((H_{*}(\overline{\mathcal {R}}_\alpha ))_{\alpha \ge 0}\) is a 3-approximation of \((H_{*}(\mathcal {R}_\alpha ))_{\alpha \ge 0}\). Moreover, using the approximation scheme from this section, we can define a tower \((\overline{X}_\beta )_{\beta \ge 0}\) whose induced persistence module \((H_{*}(X_{\beta }))_{\beta \ge 0}\) is a \(6(m+1)\)-approximation of \((H_{*}(\overline{\mathcal {R}}_\alpha ))_{\alpha \ge 0}\), and its expected size is \(n2^{O(\log n\log \log n)}=n^{O(\log \log n)}\). The first half of the result follows using Lemma 6.1.
The Johnson–Lindenstrauss lemma further implies that an orthogonal projection to a randomly chosen subspace of dimension m will yield an f as above, with high probability. Our algorithm picks such a subspace, projects all points into this subspace (this requires \(O(dn\log n)\) time) and applies the approximation scheme for the projected point set. The runtime bound follows from Theorem 5.12. \(\square \)
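The projection step can be illustrated with a small experiment. The following is a minimal sketch, not the procedure analyzed above: it swaps in a scaled Gaussian random matrix (a common Johnson–Lindenstrauss variant) for the orthogonal projection onto a random subspace, and the names `gaussian_projection` and `dist`, the target dimension 64, and the seeds are assumptions chosen only for illustration.

```python
import math
import random

def gaussian_projection(points, m, seed=0):
    """Map points from R^d to R^m with a random Gaussian matrix scaled by 1/sqrt(m).

    Assumption: this Gaussian variant stands in for the orthogonal projection
    onto a random m-dimensional subspace described in the text.
    """
    rng = random.Random(seed)
    d = len(points[0])
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(d)]
              for _ in range(m)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in matrix]
            for p in points]

def dist(p, q):
    """Euclidean distance between two points given as lists of floats."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical input: 10 random points in R^200, projected to R^64.
rng = random.Random(1)
points = [[rng.uniform(-1.0, 1.0) for _ in range(200)] for _ in range(10)]
projected = gaussian_projection(points, m=64)

# Ratio of projected to original distance for every pair of points.
ratios = [dist(projected[i], projected[j]) / dist(points[i], points[j])
          for i in range(10) for j in range(i + 1, 10)]
print(min(ratios), max(ratios))
```

The quotient of the largest and smallest ratio plays the role of \(\xi _2/\xi _1\) in Lemma 6.2; the approximation scheme would then be run on `projected` instead of `points`.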
Note that the approximation complex from the previous theorem has size \(n^{O(\log \log n)}\) which is superpolynomial in n. Using a slightly more elaborate dimension reduction result by Matoušek [25], we can get a size bound polynomial in n, at the price of an additional \(\log n\)-factor in the approximation quality. Let us first state Matoušek’s result (whose proof follows a similar strategy as for the Johnson–Lindenstrauss lemma):
Theorem 6.4
Let P be an n-point set in \(\mathbb {R}^d\). Then, a random orthogonal projection into \(\mathbb {R}^k\) for \(3\le k\le C\log n\) distorts pairwise distances in P by at most \(O(n^{2/k}\sqrt{\log n/k})\). The constants in the bound depend only on C.
By setting \(k:=\frac{4\log n}{\log \log n}\) in Matoušek’s result, we see that this results in a distortion of at most \(O(\sqrt{\log n \log \log n})\).
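The substitution can be spelled out as follows. This is a sketch that assumes logarithms to base 2; a different base only changes the constants.

```latex
\[
n^{2/k} = 2^{\frac{2\log n}{k}} = 2^{\frac{\log\log n}{2}} = \sqrt{\log n},
\qquad
\sqrt{\frac{\log n}{k}} = \sqrt{\frac{\log\log n}{4}} = \frac{1}{2}\sqrt{\log\log n},
\]
\[
n^{2/k}\sqrt{\frac{\log n}{k}} = \frac{1}{2}\sqrt{\log n\,\log\log n}
= O\big(\sqrt{\log n\,\log\log n}\big).
\]
```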
Theorem 6.5
Proof
The proof follows the same pattern as the proof of Theorem 6.3, with a few changes. We use Matoušek’s dimension reduction result described in Theorem 6.4 with the projection dimension being \(m:=\frac{4\log n}{\log \log n}\). Hence, \(\xi _2/\xi _1=O(\sqrt{\log n \log \log n})\) for the Rips construction. The final approximation factor is \(6(m+1)\xi _2/\xi _1\) which simplifies to \(O(\log n \big (\frac{\log n}{\log \log n}\big )^{1/2})\). The size and runtime bounds follow by substituting the value of m in the respective bounds. \(\square \)
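For reference, the simplification of the approximation factor and of the size bound can be written out as below. The second line is a sketch under the assumption that the total size bound \(n2^{O(d\log d)}\) stated in the abstract is applied with \(d=m\); it uses \(\log m=O(\log \log n)\).

```latex
\[
6(m+1)\,\frac{\xi_2}{\xi_1}
= O\Big(\frac{\log n}{\log\log n}\Big)\cdot O\big(\sqrt{\log n\,\log\log n}\big)
= O\Big(\frac{(\log n)^{3/2}}{(\log\log n)^{1/2}}\Big)
= O\Big(\log n\,\Big(\frac{\log n}{\log\log n}\Big)^{1/2}\Big),
\]
\[
n\,2^{O(m\log m)}
= n\,2^{O\big(\frac{\log n}{\log\log n}\cdot\log\log n\big)}
= n\,2^{O(\log n)} = n^{O(1)}.
\]
```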
Finally, we consider the important generalization that P is not given as an embedding in \(\mathbb {R}^d\), but as a point sample from a general metric space. We use the classical result by Bourgain [6] to embed P in Euclidean space with small distortion. In the language of Lemma 6.2, Bourgain’s result permits an embedding into \(m=O(\log ^2 n)\) dimensions with a distortion \(\xi _2/\xi _1=O(\log n)\), where the constants are independent of n. Our strategy for approximating a general metric space consists of first embedding it into \(\mathbb {R}^{O(\log ^2 n)}\), then reducing the dimension, and finally applying our approximation scheme on the projected embedding. The results are similar to Theorems 6.3 and 6.5, except that the approximation quality further worsens by a factor of \(\log n\) due to Bourgain’s embedding. We only state the generalized version of Theorem 6.5, omitting the corresponding generalization of Theorem 6.3. The proof is straightforward with the same techniques as before.
Theorem 6.6
7 A Lower Bound for Approximation Schemes
We describe a point configuration for which the Čech filtration gives rise to a large number, say N, of features with “large” persistence, relative to the scale on which the persistence appears. Any \(\varepsilon \)-approximation of the Čech filtration, for \(\varepsilon \) small enough, has to contain at least one interval per such feature in its persistent barcode, yielding a barcode of size at least N. This constitutes a lower bound on the size of the approximation itself, at least if the approximation stems from a simplicial tower: in this case, the introduction of a new interval in the barcode requires at least one simplex to be added to the tower; also more generally, it makes sense to assume that any representation of a persistence module is at least as large as the size of the resulting persistence barcode.
To formalize what we mean by a “large” persistent feature, we call an interval \((\alpha ,\alpha ')\) of \((H_*(\mathcal {C}_\alpha ))_{\alpha \ge 0}\) \(\delta \)-significant if \(0<\delta <\frac{\alpha '-\alpha }{2\alpha '}\). Our approach from above translates into the following statement:
Lemma 7.1
For \(\delta >0\) and a point set P, let N denote the number of \(\delta \)-significant intervals of \((H_*(\mathcal {C}_\alpha ))_{\alpha \ge 0}\). Then, any persistence module \((X_\alpha )_{\alpha \ge 0}\) that is a \((1+\delta )\)-approximation of \((H_*(\mathcal {C}_\alpha ))_{\alpha \ge 0}\) has at least N intervals in its barcode.
Proof
If \((\alpha ,\alpha ')\) is \(\delta \)-significant, that means that there exist some \(\varepsilon >0\) and \(c\in (\alpha ,\alpha ')\) such that \(\alpha /(1-\varepsilon )\le c/(1+\delta )<c(1+\delta )\le \alpha '\). Any persistence module that is a \((1+\delta )\)-approximation of \((H_*(\mathcal {C}_\alpha ))_{\alpha \ge 0}\) needs to represent an approximation of the interval in the range \((c(1-\varepsilon )/2,c)\); in other words, there is an interval corresponding to \((\alpha ,\alpha ')\) in the approximation.
We first argue that \(\delta \)-significance implies the existence of \(\varepsilon >0\) and \(c\in (\alpha ,\alpha ')\) such that \(\alpha /(1-\varepsilon )\le c/(1+\delta )<c(1+\delta )\le \alpha '\): We choose \(c:=\alpha '/(1+\delta )\), so that the last inequality is satisfied. For the first inequality, we note first that \((1-2\delta )<{1}/{(1+\delta )^2}\) for all \(\delta <1/2\). By assumption, \(\alpha '-\alpha >2\alpha '\delta \), so \(\alpha<\alpha '(1-2\delta )<{\alpha '}/{(1+\delta )^2}={c}/({1+\delta })\). Since the inequality is strict, we can choose some small \(\varepsilon >0\), such that \(\alpha /(1-\varepsilon )\le {c}/({1+\delta })\).
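The elementary inequality used in this argument can be checked by expanding the product; the following short verification is added for convenience and holds for every \(\delta >0\):

```latex
\[
(1-2\delta)(1+\delta)^2 = (1-2\delta)(1+2\delta+\delta^2) = 1-3\delta^2-2\delta^3 < 1 ,
\]
\[
\text{hence}\quad 1-2\delta < \frac{1}{(1+\delta)^2},
\qquad\text{and}\qquad
\alpha'-\alpha > 2\alpha'\delta \;\Longleftrightarrow\; \alpha < \alpha'(1-2\delta).
\]
```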
7.1 Setup
We next define our point set for a fixed dimension d. Consider the \(A_d^*\) lattice with origin o. Recall that o has \(2^{d+1}-2\) neighbors in the Delaunay triangulation \(\mathcal {D}\) of \(A_d^*\), because its dual Voronoi polytope, the permutahedron \(\Pi \), has that many facets. We define P as the union of o with all its Delaunay neighbors, yielding a point set of cardinality \(2^{d+1}-1\). As usual, we set \(n:=|P|\), so that \(d=\Theta (\log n)\).
We write \(\mathcal {D}_P\) for the Delaunay triangulation of P. Since P contains o and all its neighbors, the Delaunay simplices of \(\mathcal {D}_P\) incident to o are the same as the Delaunay simplices of \(\mathcal {D}\) incident to o. Thus, according to Proposition 3.5, a \((k-1)\)-simplex of \(\mathcal {D}_P\) incident to o corresponds to a \((d-k+1)\)-face of \(\Pi \) and thus to an ordered k-partition of \([d+1]\).
Fix an integer parameter \(\ell \ge 3\), to be defined later. We call an ordered k-partition \((S_1,\ldots ,S_k)\) good if \(|S_i|\ge \ell \) for every \(i=1,\ldots ,k\). We define good Delaunay simplices and good permutahedron faces accordingly using Proposition 3.5.
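For small d, the correspondence between faces and ordered partitions can be explored by brute force. The sketch below is only an illustration: the helper names `ordered_partitions` and `is_good` are hypothetical, and the enumeration is exponential in d. It lists the ordered 2-partitions of \([d+1]\) (the facets of \(\Pi \), and hence the Delaunay neighbors of o) and filters the good ones.

```python
from itertools import product

def ordered_partitions(n, k):
    """Yield all ordered k-partitions (S_1, ..., S_k) of {1, ..., n} into non-empty blocks."""
    for labels in product(range(k), repeat=n):
        if len(set(labels)) == k:  # every block must be non-empty
            yield tuple(
                frozenset(i + 1 for i in range(n) if labels[i] == b)
                for b in range(k)
            )

def is_good(partition, ell):
    """A partition is good if every block has at least ell elements."""
    return all(len(block) >= ell for block in partition)

d = 5  # illustrative choice, so [d+1] = {1, ..., 6}
facets = list(ordered_partitions(d + 1, 2))        # ordered 2-partitions
good_2 = [p for p in facets if is_good(p, 3)]      # good ones for ell = 3

print(len(facets))  # 2^(d+1) - 2 = 62
print(len(good_2))  # both blocks have size exactly 3: C(6,3) = 20
```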
Our proof has two main ingredients: First, we show that a good Delaunay simplex either gives birth to or kills an interval in the Čech module that has a lifetime of at least \(\frac{\ell }{8(d+1)^2}\). This justifies our notion of “good”, since good k-simplices create features that have to be preserved by a sufficiently precise approximation. Second, we show that there are \(2^{\Omega (d\log \ell )}\) good k-partitions, so good faces are abundant in the permutahedron.
7.2 Persistence of Good Simplices
Let us consider our first statement. Recall that \(\alpha _\sigma \) is the tower value of \(\sigma \) in the Čech filtration. It will be convenient to have an upper bound for \(\alpha _\sigma \). Clearly, such a value is given by the diameter of P. It is not hard to see the following bound (compare Lemma 3.3), which we state for reference:
Lemma 7.2
The diameter of P is at most \(2\sqrt{d}\). Consequently, \(\alpha _\sigma \le 2\sqrt{d}\) for each simplex \(\sigma \) of the Čech filtration.
Recall that by fixing a simplex-wise tower of the Čech filtration, it makes sense to talk about the persistence of an interval associated to a simplex. Fix a \((k-1)\)-simplex \(\sigma \) of \(\mathcal {D}_P\) incident to o (which also belongs to the Čech filtration).
Lemma 7.3
Let \(f_\sigma \) be the \((d-k+1)\)-face of \(\Pi \) dual to \(\sigma \), and let \(o_\sigma \) denote its barycenter. Then, \(\alpha _\sigma \) is the distance of \(o_\sigma \) from o.
Proof
The point \(o_\sigma \) is the closest point to o on \(f_\sigma \) because the vector \(\overrightarrow{o\,o_\sigma }\) is orthogonal to \(\overrightarrow{p\,o_\sigma }\) for any boundary vertex p of \(f_\sigma \). Since \(f_\sigma \) is dual to \(\sigma \), all vertices of \(\sigma \) are at the same distance from \(o_\sigma \). \(\square \)
Recall \(L_\sigma \) and \(L^*_\sigma \) from Sect. 2 as the difference of the alpha value of \(\sigma \) and its (co)facets.
Theorem 7.4
For a good simplex \(\sigma \) of \(\mathcal {D}_P\), both \(L_\sigma \) and \(L^*_\sigma \) are at least \(\frac{\ell }{24(d+1)^{3/2}}\).
Proof
We start with \(L^*_\sigma \). Let \(\sigma \) be a \((k1)\)simplex and let \(S_1,\ldots ,S_k\) be the corresponding partition. We obtain a cofacet \(\tau \) of \(\sigma \) through splitting one \(S_i\) into two nonempty parts.
The main step is to bound the quantity \(\alpha _\tau ^2-\alpha _\sigma ^2\). By Lemma 7.3, the squared alpha values are the squared norms of the barycenters \(o_\tau \) and \(o_\sigma \) of the faces dual to \(\tau \) and \(\sigma \), respectively. It is possible to derive an explicit expression of the coordinates of \(o_\sigma \) and \(o_\tau \). It turns out that almost all coordinates are equal, and thus cancel out in the sum, except at those indices that lie in the split set \(S_i\).
Recall that \(\alpha ^2_\sigma \) is the squared length of the barycenter \(o_{\sigma }\), and an analogous statement holds for \(o_\tau \). Also, recall that \(\tau \) is obtained from \(\sigma \) by splitting one \(S_i\) in the corresponding partition \((S_1,\ldots ,S_k)\) of \(\sigma \). Assume without loss of generality that \(S_k\) is split into \(S'_k\) and \(S'_{k+1}\) (splitting any other \(S_i\) yields the same bound) and that \(S_k\) is of size exactly \(\ell \) (a larger cardinality only leads to a larger difference).
As a consequence of Theorem 7.4, the interval associated with a good simplex has length at least \(\frac{\ell }{24(d+1)^{3/2}}\) using Lemmas 2.1 and 2.2. Moreover, the interval cannot persist beyond the scale \(2\sqrt{d}\) by Lemma 7.2. It follows
Corollary 7.5
The interval associated to a good simplex is \(\delta \)-significant for \(\delta <\frac{\ell }{96(d+1)^2}\).
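The constant in the corollary can be traced as follows. This is a sketch that writes \((\alpha ,\alpha ')\) for the interval of a good simplex and combines the length bound from Theorem 7.4 with the bound \(\alpha '\le 2\sqrt{d}<2\sqrt{d+1}\) from Lemma 7.2:

```latex
\[
\frac{\alpha'-\alpha}{2\alpha'}
\;\ge\; \frac{\ell}{24(d+1)^{3/2}}\cdot\frac{1}{2\cdot 2\sqrt{d+1}}
\;=\; \frac{\ell}{96(d+1)^{2}} ,
\]
```

so every \(\delta \) below this value satisfies the defining condition \(\delta <\frac{\alpha '-\alpha }{2\alpha '}\).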
7.3 The Number of Good Simplices
We assume for simplicity that \(d+1\) is divisible by \(\ell \). We call a good partition \((S_1,\ldots ,S_k)\) uniform if each set consists of exactly \(\ell \) elements. This implies that \(k=(d+1)/\ell \).
Lemma 7.6
The number of uniform good partitions is exactly \(\frac{(d+1)!}{\ell !^{(d+1)/\ell }}\).
Proof
Choose an arbitrary permutation and place the first \(\ell \) entries in \(S_1\), the second \(\ell \) entries in \(S_2\), and so forth. In each \(S_i\), we can interchange the elements and obtain the same k-simplex. Thus, we have to divide out \(\ell !\) choices for each of the \((d+1)/\ell \) bins. \(\square \)
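The counting argument can be replayed for small parameters. In the sketch below, the function names `uniform_good_count` and `uniform_good_bruteforce` are hypothetical; the brute-force version cuts every permutation of \([d+1]\) into consecutive bins of size \(\ell \) and counts the distinct ordered partitions that arise, which is feasible only for very small d.

```python
from itertools import permutations
from math import factorial

def uniform_good_count(d, ell):
    """Closed form (d+1)! / (ell!)^((d+1)/ell); assumes ell divides d + 1."""
    assert (d + 1) % ell == 0
    return factorial(d + 1) // factorial(ell) ** ((d + 1) // ell)

def uniform_good_bruteforce(d, ell):
    """Count distinct ordered partitions obtained by cutting permutations into bins of size ell."""
    seen = set()
    for perm in permutations(range(1, d + 2)):
        bins = tuple(frozenset(perm[i:i + ell]) for i in range(0, d + 1, ell))
        seen.add(bins)
    return len(seen)

print(uniform_good_count(5, 3), uniform_good_bruteforce(5, 3))  # 20 20
print(uniform_good_count(5, 2), uniform_good_bruteforce(5, 2))  # 90 90
```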
We use this result to bound the number of good k-simplices in the following theorem. To obtain the bound, we use estimates for the factorials using Stirling’s approximation. Moreover, we fix some constant \(\rho \in (0,1)\) and set \(\ell =(d+1)^\rho \). After some calculations (see Appendix), we obtain:
Theorem 7.7
For any constant \(\rho \in (0,1)\), \(\ell =(d+1)^\rho \), \(k=(d+1)/\ell \) and d large enough, there exists a constant \(\lambda \in (0,1)\) that depends only on \(\rho \), such that the number of good k-simplices is at least \((d+1)^{\lambda (d+1)}=2^{\Omega (d\log d)}\).
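To see where the exponent comes from, here is a leading-order sketch based on Lemma 7.6 and Stirling’s approximation \(\ln m!=m\ln m-m+O(\ln m)\). It is not the detailed calculation of the Appendix, and it only suggests that any \(\lambda <1-\rho \) works for large d.

```latex
\[
\ln\frac{(d+1)!}{(\ell!)^{(d+1)/\ell}}
= \big[(d+1)\ln(d+1)-(d+1)+O(\ln d)\big]
- \frac{d+1}{\ell}\big[\ell\ln\ell-\ell+O(\ln\ell)\big]
\]
\[
= (d+1)\ln\frac{d+1}{\ell} + O\big((d+1)^{1-\rho}\ln d\big)
= (1-\rho)\,(d+1)\ln(d+1) + O\big((d+1)^{1-\rho}\ln d\big).
\]
```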
Putting everything together, we prove our lower bound theorem:
Theorem 7.8
There exists a point set of n points in \(d=\Theta (\log n)\) dimensions, such that any \((1+\delta )\)-approximation of its Čech filtration contains \(2^{\Omega (d\log d)}\) intervals in its persistent barcode, provided that \(\delta <\frac{1}{96(d+1)^{1+\varepsilon }}\) with an arbitrary constant \(\varepsilon \in (0,1)\).
Proof
Setting \(\rho :=1-\varepsilon \), Theorem 7.7 guarantees the existence of \(2^{\Omega (d\log d)}\) good simplices, all in a fixed dimension k. In particular, the intervals of the Čech persistence module associated to these simplices are all distinct. Since \(\ell =(d+1)^{1-\varepsilon }\), Corollary 7.5 states that all these intervals are significant because \(\delta <\frac{1}{96(d+1)^{1+\varepsilon }}=\frac{\ell }{96(d+1)^2}\). Therefore, by Lemma 7.1, any \((1+\delta )\)-approximation of the Čech filtration has \(2^{\Omega (d\log d)}\) intervals in its barcode. \(\square \)
Replacing d by \(\Theta (\log n)\) in the bounds of the theorem, we see that the number of intervals appearing in any approximation is superpolynomial in n if \(\delta \) is small enough.
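Written out with \(n=2^{d+1}-1\), that is, \(d+1=\log (n+1)\), the substitution reads:

```latex
\[
2^{\Omega(d\log d)} = 2^{\Omega(\log n\,\log\log n)} = n^{\Omega(\log\log n)},
\qquad
\frac{1}{96(d+1)^{1+\varepsilon}} = \Theta\Big(\frac{1}{\log^{1+\varepsilon} n}\Big).
\]
```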
8 Conclusion
We presented upper and lower bound results on approximating Rips and Čech filtrations of point sets in arbitrarily high dimensions. For Čech complexes, the major result can be summarized as: for a dimension-independent bound on the complex size, there is no way to avoid a superpolynomial complexity for fine approximations of about \(O(\log ^{-1} n)\), while polynomial size can be achieved for rough approximation of about \(O(\log ^2 n)\).
Filling in the large gap between the two approximation factors is an attractive avenue for future work. A possible approach is to look at other lattices. It seems that lattices with good covering properties are correlated with a good approximation quality, and it may be worthwhile to study lattices in higher dimension which improve largely on the covering density of \(A^*\) (e.g., the Leech lattice [11]).
Further research directions include approximations using small-sized triangulations of cubes, such as the barycentric subdivision. Since the ratio of the diameter to the shortest distance between nonadjacent cells is less for cubes compared to permutahedra, this approach can yield superior quality approximations of comparable size. Another possibility for approximating Čech filtrations is to approximate the union of balls with small permutahedra and to take its nerve as the approximation complex. This amounts to replacing the original input points with a fine sample of the union of balls. The approach shows potential for \((1+\varepsilon )\)-approximations.
Footnotes
1. Often, a scaled, translated and rotated version is considered, in which all permutations of the point \((1,\ldots ,d+1)\) are taken.
Notes
Acknowledgements
Open access funding provided by the Max Planck Society. Sharath Raghvendra acknowledges support of NSF CRII Grant CCF-1464276. Michael Kerber is supported by the Austrian Science Fund (FWF) grant number P 29984-N35.
References
1. Baek, J., Adams, A.: Some useful properties of the permutohedral lattice for Gaussian filtering. Stanford University. http://graphics.stanford.edu/papers/permutohedral/permutohedral_techreport.pdf (2009)
2. Bambah, R.P.: On lattice coverings by spheres. Proc. Indian Natl. Sci. Acad. 20(1), 25–52 (1954)
3. Bernoulli’s inequality. https://en.wikipedia.org/wiki/Bernoulli’s_inequality
4. Borsuk, K.: On the imbedding of systems of compacta in simplicial complexes. Fundam. Math. 35, 217–234 (1948)
5. Botnan, M.B., Spreemann, G.: Approximating persistent homology in Euclidean space through collapses. Appl. Algebra Eng. Commun. Comput. 26(1–2), 73–101 (2015)
6. Bourgain, J.: On Lipschitz embedding of finite metric spaces in Hilbert space. Isr. J. Math. 52(1–2), 46–52 (1985)
7. Bubenik, P., Scott, J.A.: Categorification of persistent homology. Discrete Comput. Geom. 51(3), 600–627 (2014)
8. Callahan, P.B., Kosaraju, S.R.: A decomposition of multidimensional point sets with applications to \(k\)-nearest-neighbors and \(n\)-body potential fields. J. ACM 42(1), 67–90 (1995)
9. Chazal, F., Cohen-Steiner, D., Glisse, M., Guibas, L.J., Oudot, S.Y.: Proximity of persistence modules and their diagrams. In: Proceedings of the 25th Annual Symposium on Computational Geometry (SoCG’09), pp. 237–246. ACM, New York (2009)
10. Choudhary, A., Kerber, M., Raghvendra, S.: Polynomial-sized topological approximations using the permutahedron. In: Proceedings of the 32nd International Symposium on Computational Geometry (SoCG’16). Leibniz International Proceedings in Informatics, vol. 51, pp. 1–16. Schloss Dagstuhl, Dagstuhl (2016)
11. Conway, J.H., Sloane, N.J.A.: Sphere Packings, Lattices, and Groups. Grundlehren der Mathematischen Wissenschaften, vol. 290. With additional contributions by Bannai, E. et al. Springer, New York (1988)
12. Dey, T.K., Fan, F., Wang, Y.: Computing topological persistence for simplicial maps. In: Proceedings of the 30th Annual Symposium on Computational Geometry (SoCG’14), pp. 345–354. ACM, New York (2014)
13. Edelsbrunner, H., Harer, J.L.: Computational Topology: An Introduction. American Mathematical Society, Providence (2010)
14. Edelsbrunner, H., Mücke, E.P.: Simulation of simplicity: a technique to cope with degenerate cases in geometric algorithms. ACM Trans. Graph. 9(1), 66–104 (1990)
15. Hales, T.C.: A proof of the Kepler conjecture. Ann. Math. 162(3), 1065–1185 (2005)
16. Har-Peled, S.: Geometric Approximation Algorithms. Mathematical Surveys and Monographs, vol. 173. American Mathematical Society, Providence (2011)
17. Hatcher, A.: Algebraic Topology. Cambridge University Press, Cambridge (2002)
18. Johnson, W.B., Lindenstrauss, J., Schechtman, G.: Extensions of Lipschitz maps into Banach spaces. Isr. J. Math. 54(2), 129–138 (1986)
19. Jung, H.: Über die kleinste Kugel, die eine räumliche Figur einschliesst. J. Reine Angew. Math. 123, 241–257 (1901)
20. Kerber, M., Raghvendra, S.: Approximation and streaming algorithms for projective clustering via random projections. In: Proceedings of the 27th Canadian Conference on Computational Geometry (CCCG’15), pp. 179–185 (2015)
21. Kerber, M., Schreiber, H.: Barcodes of towers and a streaming algorithm for persistent homology. In: Accepted to Proceedings of the 33rd International Symposium on Computational Geometry (SoCG’17), pp. 57:1–57:15 (2017)
22. Kerber, M., Sharathkumar, R.: Approximate Čech complex in low and high dimensions. In: Cai, L., Cheng, S.-W., Lam, T.-W. (eds.) Algorithms and Computation (ISAAC’13). Lecture Notes in Computer Science, vol. 8283, pp. 666–676. Springer, Heidelberg (2013)
23. Khuller, S., Matias, Y.: A simple randomized sieve algorithm for the closest-pair problem. Inform. and Comput. 118(1), 34–37 (1995)
24. Lovász, L., Simonovits, M.: Random walks in a convex body and an improved volume algorithm. Random Struct. Algorithms 4(4), 359–412 (1993)
25. Matoušek, J.: Bi-Lipschitz embeddings into low-dimensional Euclidean spaces. Commentat. Math. Univ. Carol. 31(3), 589–600 (1990)
26. Munkres, J.R.: Elements of Algebraic Topology. Westview Press, Boulder (1984)
27. Rennie, B.C., Dobson, A.J.: On Stirling numbers of the second kind. J. Comb. Theory 7(2), 116–121 (1969)
28. Sheehy, D.: Linear-size approximations to the Vietoris–Rips filtration. Discrete Comput. Geom. 49(4), 778–796 (2013)
29. Sheehy, D.: The persistent homology of distance functions under random projection. In: Proceedings of the 30th Annual Symposium on Computational Geometry (SoCG’14), pp. 328–334. ACM, New York (2014)
30. Smid, M.H.M.: The well-separated pair decomposition and its applications. In: Gonzalez, T.F. (ed.) Handbook of Approximation Algorithms and Metaheuristics, pp. 53-1–53-12. Chapman and Hall/CRC, Boca Raton (2007)
31. Stirling’s approximation for factorials. https://en.wikipedia.org/wiki/Stirling’s_approximation
32. Ziegler, G.M.: Lectures on Polytopes. Graduate Texts in Mathematics, vol. 152. Springer, New York (1995)
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.