Divisive cover

The aim of this paper is to present a method for computation of persistent homology that performs well at large filtration values. To this end we introduce the concept of filtered covers. We show that the persistent homology of a bounded metric space obtained from the Čech complex is the persistent homology of the filtered nerve of the filtered Čech cover. Given a parameter $\delta$ with $0<\delta \le 1$ we introduce the concept of a $\delta$-filtered cover and show that its filtered nerve is interleaved with the Čech complex. Finally, we introduce a particular $\delta$-filtered cover, the divisive cover. The special feature of the divisive cover is that it is constructed top-down. If we disregard fine scale structure and $X$ is a finite subspace of Euclidean space, then we obtain a filtered simplicial complex whose size is bounded by an upper bound independent of the cardinality of $X$. The time needed to compute this filtered simplicial complex depends linearly on the cardinality of $X$.


Introduction
The concept of persistent homology was introduced in Edelsbrunner et al. [2000] and has since been used in a wide range of applications. The persistent homology of a finite metric space X can be approached by using several different constructions of filtered simplicial complexes, such as the Čech complex, the Vietoris-Rips complex or the witness complex. Several approximations of the Vietoris-Rips complex have recently been proposed to speed up calculations Sheehy [2013], Oudot and Sheehy [2015], Dey et al. [2016].
In this paper we construct a new approximation to the Čech complex that computes persistent homology down to a predefined threshold, which can be chosen arbitrarily. The complexity of our algorithm grows with the ratio between the radius of X and the threshold. We also present a version with theoretical guarantees on size and time. If X is a subset of d-dimensional Euclidean space then the size of our approximation is bounded by an upper bound that is independent of the cardinality n of X and the required computation time is linear in n. However, the constants are so large that this is no improvement in practice.
The method presented here is fundamentally different from existing algorithms for persistent homology. Most approximations to the Vietoris-Rips complex are fundamentally bottom-up Hudson et al. [2010], Sheehy [2013], Oudot and Sheehy [2015], whereas our approach is top-down.
We introduce the notion of filtered nerve of a filtered cover and we give an example of a filtered cover whose filtered nerve has a filtered chain complex which is computationally tractable. Moreover we show that the resulting nerve is interleaved with the Čech nerve in a multiplicative sense, similar to the Vietoris-Rips complex. In Section 2 we introduce the notion of filtered and δ-filtered covers and show that δ-filtered covers are interleaved with the Čech filtration. Section 3 introduces divisive covers, a particular class of δ-filtered covers. Complexity estimates for divisive covers are presented in Section 4. In Section 5 we show how divisive covers can be applied to synthetic and to real world data sets and in Section 6 we discuss our results.

Filtered covers
We introduce the notions of a filtered and δ-filtered cover of a bounded metric space. Throughout this section X = (X, d) will be a fixed but arbitrary bounded metric space. First recall the definition of a cover.
Definition 2.1. A cover of X is a set U of subsets of X such that every point in X is contained in a member of U.
Recall that a simplicial complex K consists of a vertex set V and a set K of subsets of V with the property that if σ is a member of K and if τ is a subset of σ, then τ is a member of K. Also recall that every simplicial complex K has an underlying topological space |K|. The book Lee [2011] may serve as a gentle introduction and reference to abstract simplicial complexes.
Definition 2.2. Let U be a cover of X. The nerve N(U) of U is the simplicial complex with vertex set U defined as follows: A finite subset $\sigma = \{U_0, \dots, U_n\}$ of U is a member of N(U) if and only if the intersection $U_0 \cap \dots \cap U_n$ is non-empty.
Note that the nerve construction U → N(U) is functorial in the sense that if U ⊆ V is an inclusion of covers of X, then we have an induced inclusion N(U) ⊆ N(V) of nerves.
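For a finite cover, the nerve of Definition 2.2 can be computed by brute force. The following is a minimal Python sketch, not part of the method of this paper: the function name `nerve`, the index-tuple encoding of simplices and the cut-off `max_dim` are illustrative assumptions.

```python
from itertools import combinations

def nerve(cover, max_dim=2):
    """Simplices of the nerve of a finite cover, up to dimension max_dim.

    `cover` is a list of sets. A simplex is returned as a tuple of indices
    into `cover` whose members have a non-empty common intersection.
    """
    simplices = []
    for size in range(1, max_dim + 2):
        for idx in combinations(range(len(cover)), size):
            common = set.intersection(*(set(cover[i]) for i in idx))
            if common:
                simplices.append(idx)
    return simplices

# Four overlapping arcs covering a four-point "circle" {0, 1, 2, 3}.
arcs = [{0, 1}, {1, 2}, {2, 3}, {3, 0}]
print(nerve(arcs))
```

In this example the nerve has four vertices, four edges and no triangles, so it is a combinatorial circle. Enumerating all subsets is exponential in the size of the cover, which is why the size of the cover matters in the complexity estimates.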
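If B ⊆ A is an inclusion of partially ordered sets, we say that B is cofinal in A if for every a ∈ A, there exists b ∈ B so that a ≤ b. Given a cover U, we consider it as a partially ordered set with partial order given by inclusion. We will need the following result several times:
Lemma 2.3. Let U ⊆ V be an inclusion of covers of X. If U is cofinal in V, then the geometric realization of the inclusion N(U) ⊆ N(V) is a homotopy equivalence.
Proof. Write i: U → V for the inclusion. Since U is cofinal in V, we can choose a map f: V → U with V ⊆ f(V) for all V ∈ V. If an intersection $V_0 \cap \dots \cap V_n$ of members of V is non-empty, then so is $f(V_0) \cap \dots \cap f(V_n)$, so f induces a simplicial map Nf: N(V) → N(U). Since V ⊆ f(V) for all V ∈ V, the composite Nf ∘ Ni is contiguous with the identity map on N(U) in the sense that for every face σ of N(U), the set σ ∪ (Nf ∘ Ni(σ)) is a face of N(U). It follows that the geometric realization of Nf ∘ Ni is homotopic to the identity on the geometric realization of N(U) [Spanier, 1966, Lemma 2 p. 130]. Similarly the geometric realization of Ni ∘ Nf is homotopic to the identity on the geometric realization of N(V).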
We are now ready to define and establish some basic properties of filtered bases and filtered nerves.
Definition 2.4. A filtered basis of X is a basis U for the metric topology on X with the property that X is a member of U. Given t > 0 we write $U_t$ for the cover of X consisting of members of U contained in an open ball in X of radius t.
Definition 2.5. Let U be a filtered basis of X. The filtered nerve of U is the collection $N(U) = (N(U_t))_{t>0}$ of the nerves of the covers $U_t$.
Definition 2.6. Let U be a filtered basis of X and let δ be a parameter satisfying 0 < δ ≤ 1. We say that U is a δ-filtered basis of X if for every x ∈ X and every r > 0, there exists a member A of $U_r$ containing B(x, δr).
Example 2.7. The Čech cover $C = \check{C}(X)$ consisting of all balls in X is 1-filtered.
Example 2.8. Let 0 < δ < 1 and choose x ∈ X and R > 0 so that X is contained in the open ball B(x, R). We claim that the subset $U = \check{C}(X, \delta)$ of the Čech cover $\check{C}(X)$ consisting of balls of radius $\delta^k R$, where k is a nonnegative integer, is a δ-filtered basis of X. Indeed, let k be the nonnegative integer with $\delta^{k+1} R \le r \le \delta^k R$. Since $\delta r \le \delta^{k+1} R$, the set B(p, δr) is contained in the member $B(p, \delta^{k+1} R)$ of $U_{\delta^{k+1} R}$ for every p ∈ X. We can finish the argument by noting that since $\delta^{k+1} R \le r$ the cover $U_{\delta^{k+1} R}$ is a subcover of $U_r$.
We now introduce some notation regarding persistent homology. For the rest of this section F will denote a fixed but arbitrary field.
A persistence module $V = (V_t)_{t>0}$ consists of an F-vector space $V_t$ for each positive real number t together with homomorphisms $V_{s<t} \colon V_s \to V_t$ for all real numbers s < t, such that for all r < s < t the relation $V_{s<t} \circ V_{r<s} = V_{r<t}$ holds.
Given a simplicial complex K, we write $H_*(K)$ for the homology of K with coefficients in the field F.
The following example justifies working with the intrinsic Čech complex instead of the relative Čech complex.
Example 2.9. Let X be a subspace of a metric space M, and let C(X) be the filtered basis from Example 2.7. Let C(X, M) be the relative Čech cover consisting of balls in M with center in X, that is, with $C(X, M)_t$ consisting of balls in M with center in X of radius at most t.
The homology of the intrinsic Čech chain complex $C_*(X)_t$ consisting of linear combinations of subsets σ ⊆ X with the property that σ ⊆ B(x, t) for some x ∈ X is isomorphic to the homology of $N(C(X)_t)$. Similarly, the homology of the ambient Čech chain complex $C_*(X, M)_t$ consisting of linear combinations of subsets σ ⊆ X with the property that σ ⊆ B(p, t) for some p ∈ M is isomorphic to the homology of $N(C(X, M)_t)$. By construction $C_*(X)_t \subseteq C_*(X, M)_t$, and by the triangle inequality $C_*(X, M)_t \subseteq C_*(X)_{2t}$.
By the Nerve Theorem [Hatcher, 2002, Corollary 4G.3], if all non-empty intersections of balls in M are contractible, the geometric realization of the nerve $N(C(X, M)_t)$ of the cover $C(X, M)_t$, consisting of balls in M with center in X of radius at most t, is homotopy equivalent to the union of all balls in M of radius t with center in X. This is the interior of the t-thickening of X in M.
Theorem 2.10 (Relationship between δ-filtered basis and Čech complex). Let C be the Čech cover from Example 2.7, let U be a δ-filtered basis of X and let N(C) and N(U) be their filtered nerves. Then the persistent homology of N(U) is multiplicatively (1, 1/δ)-interleaved with the persistent homology of N(C).
Proof. By definition, the partially ordered set $C_r$ is cofinal in $C_r \cup U_r$ and $U_r$ is cofinal in $C_{\delta r} \cup U_r$. Thus, by Lemma 2.3, the homology $H_*(N(C_r))$ is isomorphic to the homology $H_*(N(C_r \cup U_r))$ and the homology $H_*(N(U_r))$ is isomorphic to the homology $H_*(N(C_{\delta r} \cup U_r))$. Now the result follows from functoriality of the nerve construction by considering the composites $H_*(N(C_{\delta r})) \to H_*(N(C_{\delta r} \cup U_r)) \cong H_*(N(U_r))$ and $H_*(N(U_r)) \to H_*(N(C_r \cup U_r)) \cong H_*(N(C_r))$.
An easy diagram chase now gives:
Corollary 2.11. If t > 0 can be chosen so that in the situation of Theorem 2.10, the F-linear maps $H_*(N(C_{\delta t < t}))$ and $H_*(N(C_{t < t/\delta}))$ are both isomorphisms, then $H_*(N(C_t))$ is isomorphic to the image of the homomorphism $H_*(N(U_{t < t/\delta}))$.
In [Chazal and Lieutier, 2005, Theorem 1] it is shown that if X is open in $M = \mathbb{R}^d$, then the conditions of Corollary 2.11 are satisfied when 2t/δ is smaller than the weak feature size of X. In Chazal and Oudot [2008] these considerations have been extended to similar results when X is a finite subset of a compact subset M of $\mathbb{R}^d$. Moreover, [Cohen-Steiner et al., 2005, Homological Inference Theorem] shows similar results for the homological feature size of X.
Next we introduce δ-filtered covers, which do not require the cover to be a basis.
Definition 2.12. Let U be a cover of X, and let δ and r be parameters satisfying 0 < δ ≤ 1 and r ≥ 0. We say that U is a δ-filtered cover of X of resolution r if there exists a δ-filtered basis V such that $U_s$ is cofinal in $V_s$ for all s ≥ r.
Corollary 2.13. Let X be a bounded metric space, r ≥ 0 and U and V be as in Definition 2.12. Then the persistent homology of $N(U_t)$ and the persistent homology of $N(V_t)$ are isomorphic for t ≥ r.
Proof. This is a direct consequence of Lemma 2.3.

Divisive Covers
In this section we discuss an algorithm to construct a δ-filtered cover of a bounded metric space X. First it divides X into two smaller sets. It continues by dividing the biggest of the resulting two sets into two, and then iteratively divides the biggest of the remaining sets in two.
In order to describe the algorithm, we first define diameter and relative radius of a subset of a metric space.
Definition 3.1. Let X be a metric space and let Y be a subset of X.
1. The diameter of Y is defined as $\sup\{d(y_1, y_2) \mid y_1, y_2 \in Y\}$.
2. The radius of Y relative to X is defined as $r(Y) = \inf\{r > 0 \mid Y \subseteq B(x, r) \text{ for some } x \in X\}$.
Definition 3.2. A δ-division of a subset Y of radius r relative to a bounded metric space X consists of a cover $\{Y_1, Y_2\}$ of Y consisting of proper subsets of Y with the property that for every y ∈ Y the intersection Y ∩ B(y, δr) is contained in at least one of the sets $Y_1$ and $Y_2$.
Lemma 3.4. Let U be a δ-divisive cover of resolution r ≥ 0 of a bounded metric space X. If every non-empty subset of U has a minimal element with respect to inclusion, then U is a δ-filtered cover of resolution r.
Proof. Let x ∈ X and let s > r. Let Y ∈ U be minimal under the condition that we have that B(x, δs) is contained in either $Y_1$ or $Y_2$ and $Y_1$ and $Y_2$ are proper subsets of Y. This contradicts the minimality of Y.
Corollary 3.5. If U is a finite δ-divisive cover of X, then U is a δ-filtered cover.
There exist many ways to construct δ-divisions. Here is an elementary one:
Lemma 3.6. Let Y be a subset of a bounded metric space X and suppose that $y_1$ and $y_2$ are points in Y of maximal distance. Given δ with 0 < δ < 1/2 let f = (1 − 2δ)/(1 + 2δ) and let $Y_1$ consist of the points y ∈ Y satisfying $f\,d(y, y_1) \le d(y, y_2)$. Similarly, let $Y_2$ consist of the points y ∈ Y satisfying $f\,d(y, y_2) \le d(y, y_1)$. Then $\{Y_1, Y_2\}$ is a δ-division of Y.
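Proof. Let x ∈ X and let r = r(Y) be the relative radius of Y. By symmetry we may without loss of generality assume that $d(x, y_1) \le d(x, y_2)$. We will show that if z ∈ B(x, δr) ∩ Y, then $z \in Y_1$, that is, that $f\,d(z, y_1) \le d(z, y_2)$. Since the radius of Y is smaller than or equal to the diameter $d(y_1, y_2)$ of Y it suffices to show that $d(x, z) \le \delta d(y_1, y_2)$ implies that $f\,d(z, y_1) \le d(z, y_2)$. However, the triangle inequality gives $d(z, y_1) \le d(x, y_2) + \delta d(y_1, y_2)$ and $d(z, y_2) \ge d(x, y_2) - \delta d(y_1, y_2)$, so it suffices that $(1 - 2\delta)(d(x, y_2) + \delta d(y_1, y_2)) \le (1 + 2\delta)(d(x, y_2) - \delta d(y_1, y_2))$. Expanding both sides, this inequality is equivalent to $d(y_1, y_2) \le 2 d(x, y_2)$, which holds since $d(y_1, y_2) \le d(x, y_1) + d(x, y_2) \le 2 d(x, y_2)$.
Given a bounded metric space X, a method for δ-division and r ≥ 0, we construct in Algorithm 1 a δ-divisive cover $U^r$ of X of resolution r. Thus the persistent homology of $(U^r_s)_{s \ge r}$ is δ-interleaved with the persistent homology of $(C_s)_{s \ge r}$.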
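The δ-division of Lemma 3.6 only needs pairwise distances, so it can be sketched for a finite point set in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used for the experiments: the function name `lemma_3_6_division` is hypothetical, points are coordinate tuples, the Euclidean distance `math.dist` is the default metric, and the pair of maximal distance is found by a quadratic search.

```python
import math
from itertools import combinations

def lemma_3_6_division(points, delta, dist=math.dist):
    """Split `points` into the two overlapping parts Y1, Y2 of Lemma 3.6.

    Returns (Y1, Y2, (y1, y2)), where y1 and y2 realise the diameter.
    Assumes 0 < delta < 1/2 and at least two distinct points.
    """
    if not 0 < delta < 0.5:
        raise ValueError("delta must satisfy 0 < delta < 1/2")
    # Quadratic search for a pair of points at maximal distance.
    y1, y2 = max(combinations(points, 2), key=lambda pair: dist(*pair))
    f = (1 - 2 * delta) / (1 + 2 * delta)
    part1 = [y for y in points if f * dist(y, y1) <= dist(y, y2)]
    part2 = [y for y in points if f * dist(y, y2) <= dist(y, y1)]
    return part1, part2, (y1, y2)

square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
Y1, Y2, extremes = lemma_3_6_division(square, 0.25)
print(len(Y1), len(Y2), extremes)
```

Since f < 1, every point satisfies at least one of the two inequalities, so the two parts cover the input, and each part misses one of the two extremal points. Points roughly equidistant from $y_1$ and $y_2$ land in both parts, which is the overlap that makes small balls fall entirely into one side.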

Complexity of the Divisive Cover Algorithm
For the study of complexity of Algorithm 1 we will restrict attention to the situation where X is a finite subset of $\mathbb{R}^d$ with the $L^\infty$-metric $d_\infty$. For 1 ≤ i ≤ d, we let $\mathrm{pr}_i \colon \mathbb{R}^d \to \mathbb{R}$ be the coordinate projection taking $(v_1, \dots, v_d) \in \mathbb{R}^d$ to $v_i$.

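Algorithm 1: Divisive cover algorithm
Input: A bounded metric space X, a method for δ-division and r ≥ 0.
Lemma 4.1. Let X be a finite subset of $\mathbb{R}^d$ equipped with the $L^\infty$-metric $d_\infty$ and let $x_1$ and $x_2$ be points in X of maximal distance. Choose a coordinate projection $\mathrm{pr}_i$ with $|\mathrm{pr}_i(x_1) - \mathrm{pr}_i(x_2)| = d_\infty(x_1, x_2)$, write $y_1 = \mathrm{pr}_i(x_1)$ and $y_2 = \mathrm{pr}_i(x_2)$ and assume that $y_1 < y_2$. Let $X_1$ consist of the points x ∈ X with $\mathrm{pr}_i(x) \in [y_1, y_1 + (1 + \delta)(y_2 - y_1)/2]$ and let $X_2$ consist of the points x ∈ X with $\mathrm{pr}_i(x) \in [y_2 - (1 + \delta)(y_2 - y_1)/2, y_2]$. Then $\{X_1, X_2\}$ is a δ-division of X, which we call a δ-decision division.
Proof. Let p ∈ X and let r be the relative radius of X. Note that $d(x_1, x_2) = 2r$ in the situation of the asserted statement. We have to show that the intersection of X with the ball centered at p of radius δr is contained in one of $X_1$ and $X_2$. Let us for convenience write $y_1 = \mathrm{pr}_i(x_1)$ and $y_2 = \mathrm{pr}_i(x_2)$ and let us assume that $y_1 < y_2$. It suffices by construction to show that the interval $[\mathrm{pr}_i(p) - \delta r, \mathrm{pr}_i(p) + \delta r]$ is contained in one of the intervals $[y_1, y_1 + (1 + \delta)(y_2 - y_1)/2]$ and $[y_2 - (1 + \delta)(y_2 - y_1)/2, y_2]$. This follows from the fact that the intersection $[y_2 - (1 + \delta)(y_2 - y_1)/2, y_1 + (1 + \delta)(y_2 - y_1)/2]$ of these intervals has length $\delta(y_2 - y_1) = 2r\delta$.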
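The top-down strategy described at the start of Section 3, combined with the two overlapping intervals used in the proof above, can be sketched in Python as follows. This is a minimal sketch and not a transcription of Algorithm 1, whose steps are not reproduced here. It assumes that the cover consists of X together with every part produced, that division stops once the largest undivided part has radius at most the resolution, that radii are taken relative to $\mathbb{R}^d$ in the $L^\infty$-metric, and that 0 < δ < 1. The names `linf_radius`, `decision_division` and `divisive_cover` are hypothetical.

```python
import heapq

def linf_radius(points):
    """L-infinity radius of a finite point set relative to R^d:
    half of the largest coordinate-wise extent."""
    dims = range(len(points[0]))
    return max(max(p[i] for p in points) - min(p[i] for p in points)
               for i in dims) / 2

def decision_division(points, delta):
    """Split along the coordinate of largest extent into two overlapping
    slabs, each spanning the fraction (1 + delta) / 2 of that extent."""
    dims = range(len(points[0]))
    i = max(dims, key=lambda j: max(p[j] for p in points) - min(p[j] for p in points))
    lo, hi = min(p[i] for p in points), max(p[i] for p in points)
    width = (1 + delta) * (hi - lo) / 2
    left = [p for p in points if p[i] <= lo + width]
    right = [p for p in points if p[i] >= hi - width]
    return left, right

def divisive_cover(points, delta, resolution):
    """All sets obtained by repeatedly dividing the largest undivided set
    until every undivided set has radius at most `resolution`."""
    cover = [list(points)]
    heap = [(-linf_radius(points), 0)]  # max-heap on radius via negated keys
    while heap:
        neg_radius, index = heapq.heappop(heap)
        if -neg_radius <= resolution:
            break  # the largest undivided set is already small enough
        for part in decision_division(cover[index], delta):
            cover.append(part)
            heapq.heappush(heap, (-linf_radius(part), len(cover) - 1))
    return cover
```

Because the two slabs overlap, the same subset can appear more than once in the returned list; a practical implementation would probably deduplicate before building the nerve.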
Theorem 4.2. Let X be a finite subset of $\mathbb{R}^d$ equipped with the $L^\infty$-metric $d_\infty$ and let t > 0. If X has cardinality n, then the cover V of X obtained from Algorithm 1 is constructed in $O(2^{kd} dn)$ time, where $k = \lceil \log_{\frac{1+\delta}{2}}(t/r) \rceil$. The size of the cover V is at most $2^{kd}$. The nerve of V can be constructed in $O(2^{2^{kd}} dn)$ time.
Note that for fixed d and δ, the term $2^{kd}$ is polynomial in the ratio r/t between the radius r of X and the threshold radius t.
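To get a feeling for the size of the constants in Theorem 4.2, the exponent k and the bound $2^{kd}$ can be evaluated directly. The helper names below are hypothetical, and the sketch assumes 0 < t ≤ r so that k is nonnegative.

```python
import math

def division_depth(radius, threshold, delta):
    """k = ceil(log_{(1 + delta) / 2}(threshold / radius)) from Theorem 4.2."""
    base = (1 + delta) / 2
    return math.ceil(math.log(threshold / radius) / math.log(base))

def cover_size_bound(radius, threshold, delta, dim):
    """Upper bound 2^(k * d) on the number of members of the cover."""
    return 2 ** (division_depth(radius, threshold, delta) * dim)

# Radius 1, threshold 0.25, delta = 0.5 in the plane.
print(division_depth(1.0, 0.25, 0.5), cover_size_bound(1.0, 0.25, 0.5, 2))
```

For radius 1, threshold 0.25 and δ = 0.5 this gives k = 5, and hence the bound $2^{10} = 1024$ in dimension 2. The bound does not depend on n, but it grows quickly with d and as δ approaches 1.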
Let V be as in Theorem 4.2. Given s ≥ t we write $V_s$ for the cover of X given by members of V of radius less than s. By construction, for s ≥ t, the inclusion of $V_s$ in $U_s$ is cofinal. Thus by Lemma 2.3, for filtration values greater than t, the persistent homology of the cover V coincides with the persistent homology of U.
Proof of Theorem 4.2. Note that in the $L^\infty$-metric, the radius of a subset of $\mathbb{R}^d$ is given by the maximum of the radii of its coordinate projections to $\mathbb{R}$. A δ-decision division (4.1) reduces the radius of this coordinate projection by the factor $\frac{1+\delta}{2}$. Thus the radius of any d-fold δ-divided part of X is at most $r \frac{1+\delta}{2}$, where r is the radius of X. If we let $k = \lceil \log_{\frac{1+\delta}{2}}(t/r) \rceil$, then the radius of any kd-fold δ-divided part of X is at most $r \left(\frac{1+\delta}{2}\right)^k \le t$.
Since each δ-decision division consists of two parts, we conclude that V can be produced by making at most $2^{kd}$ δ-decision divisions. Since we work in the $L^\infty$-metric, extremal points can be found by computing min- and max-values for the coordinate projections of points in X. Similarly, a δ-decision division can be made by computing min- and max-values for the coordinate projections of points in X. Each of these steps requires O(nd) time, so the cover is of size at most $2^{kd}$ and it can be constructed in $O(2^{kd} nd)$ time.
Finally, the nerve of the cover V is constructed by calculating intersections of members of V. Calculating the intersection of i ≤ d subsets of X can be done by, for each element x of X, deciding if x is a member of the intersection. The complexity of this is O(ni). Since the cardinality of V is at most $2^{kd}$, independently of n, the time of calculating the nerve is $O(2^{2^{kd}} n)$.
We shall use the following result to show that a δ-decision division of $X \subseteq \mathbb{R}^d$ gives a $d^{-1/p}\delta$-divisive cover of X in the $L^p$-metric. This stems from the fact that all $L^p$-metrics are equivalent.
Proposition 4.3. Let $d_1$ and $d_2$ be metrics on X and let α and β be positive numbers such that for all x, y ∈ X the inequality $\alpha d_1(x, y) \le d_2(x, y) \le \beta d_1(x, y)$ holds. Then every δ-filtered cover of $(X, d_1)$ is a $\delta\alpha/\beta$-filtered cover of $(X, d_2)$.
Proof. We emphasize the metrics $d_1$ and $d_2$ in the notation by writing $U^{d_1}_t$ and $U^{d_2}_t$ for the covers of X consisting of members of U contained in a closed ball of radius t in $(X, d_1)$ and $(X, d_2)$ respectively.
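By assumption, there are inclusions of balls $B_{d_1}(x, t/\beta) \subseteq B_{d_2}(x, t) \subseteq B_{d_1}(x, t/\alpha)$. Given a point x ∈ X and a radius t > 0, we can find a set $A \in U^{d_1}_{t/\beta}$ such that $B_{d_2}(x, \delta\alpha t/\beta) \subseteq B_{d_1}(x, \delta t/\beta) \subseteq A$. Since A is contained in a ball of radius t/β in $(X, d_1)$, it is contained in a ball of radius t in $(X, d_2)$, so $A \in U^{d_2}_t$. Thus U is a $\delta\alpha/\beta$-filtered cover of $(X, d_2)$.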
In the case where $d_1$ is the $L^\infty$-metric and $d_2$ is the $L^p$-metric on $\mathbb{R}^d$ the inequalities in Proposition 4.3 hold for α = 1 and $\beta = d^{1/p}$. Thus, if U is a δ-filtered cover of X with respect to the $L^\infty$-metric, then it is a $d^{-1/p}\delta$-filtered cover of X with respect to the $L^p$-metric. In particular it is $\delta/\sqrt{d}$-filtered with respect to the Euclidean metric.
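The comparison between the $L^\infty$- and $L^p$-metrics used here is easy to check numerically. The following Python sketch uses hypothetical helper names and only illustrates the constants α = 1 and $\beta = d^{1/p}$ and the resulting rescaling of δ; it is not a proof.

```python
def dist_inf(x, y):
    """L-infinity distance between two coordinate tuples."""
    return max(abs(a - b) for a, b in zip(x, y))

def dist_p(x, y, p):
    """L^p distance between two coordinate tuples, for p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def rescaled_delta(delta, dim, p):
    """Parameter d^(-1/p) * delta obtained when passing from L-infinity to L^p."""
    return delta * dim ** (-1 / p)

x, y = (0.0, 0.0, 0.0, 0.0), (1.0, -2.0, 0.5, 2.0)
print(dist_inf(x, y), dist_p(x, y, 2), 4 ** 0.5 * dist_inf(x, y))
print(rescaled_delta(0.3, 4, 2))
```

For these two points in $\mathbb{R}^4$ the Euclidean distance lies between the $L^\infty$-distance 2 and the upper bound $\sqrt{4} \cdot 2 = 4$, as the inequalities require.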

Examples
Generated data

Sphere
We used divisive cover with the δ-division of Lemma 3.6 to calculate the persistent homology of a generated sphere. We generated 1000 data points with a radius normally distributed with a mean of 1 and a standard deviation of 0.1 and uniform angle. The top panel of Figure 1 shows the resulting persistence barcodes.

Torus
We calculated the persistent homology of a generated torus using divisive cover with the δ-division of Lemma 3.6. We generated 400 data points on a torus. The torus was generated as the product space of 20 points each on two circles of radius 1 with uniformly distributed angles. The second panel of Figure 1 shows the persistence barcodes of the generated torus.
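A data set of this shape can be generated along the following lines. The sketch is an assumption about the sampling procedure, not the script used for the experiment: it reads "uniformly distributed angles" as independent uniform random angles on each circle, embeds the product in $\mathbb{R}^4$, and uses a hypothetical function name `sample_torus`.

```python
import math
import random
from itertools import product

def sample_torus(n_per_circle=20, radius=1.0, seed=None):
    """Product of two random samples of circles, as points in R^4."""
    rng = random.Random(seed)

    def circle():
        angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_per_circle)]
        return [(radius * math.cos(a), radius * math.sin(a)) for a in angles]

    first, second = circle(), circle()
    # Concatenate coordinates of every pair (p, q) in the product.
    return [p + q for p, q in product(first, second)]

torus = sample_torus(seed=0)
print(len(torus), len(torus[0]))
```

With the default arguments this yields 20 × 20 = 400 points, matching the sample size reported above.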

Natural images
The space of 3 by 3 high-contrast patches of natural images has been analysed using witness complexes before [Carlsson et al., 2008]. The authors analysed high-density subsets of 50,000 random 3 by 3 patches from a collection of $4 \times 10^6$ patches presented in Hateren and Schaaf [1998]. They denote by X(k, p) the space of the p percent highest-density patches, where density is estimated using the k nearest neighbours, and find that X(300, 30) has the topology of a circle. We repeat this analysis using divisive cover with the δ-division of Lemma 3.6 and show that calculating persistent homology without landmarks is possible for real world data sets. The bottom panel of Figure 1 shows the persistence barcodes of X(300, 30).

Conclusion
Filtered covers as the underlying structure for filtered complexes provide new insights into topological data analysis. They can be used as a basis for new constructions of simplicial complexes that are interleaved with the Čech nerve. We are not aware of any previous literature that made use of covers in such a way. Divisive covers are just one possible way to create δ-filtered covers. Many other constructions are available, for example optimized versions of the δ-filtered Čech cover we have presented. The idea of a divisive cover is conceptually simple and easy to implement. Compared to the Čech nerve, the nerve of a divisive cover can be substantially smaller. On the other hand, the witness complex is often considerably smaller than the divisive cover complex. Although we give theoretical guarantees that are linear in n, in practice persistent homology calculations using the divisive cover algorithm proposed here are not competitive with state of the art approximations to the Vietoris-Rips complex Oudot and Sheehy [2015], Dey et al. [2016]. We see divisive covers as a new class of simplicial complexes that can be studied in a fashion similar to Vietoris-Rips filtrations. It is possible to reduce the size of the divisive cover complex, for example by using landmarks. We did not address such improvements in the present paper. It might also be possible to combine a version of the Vietoris-Rips complex for low filtration values and a version of the divisive cover complex for high filtration values. The version of divisive cover we have presented is easy to implement and performs well at large filtration values.

Figure 1: Persistence barcodes using divisive cover. All barcodes are shown for relative diameter between 0.4 and 1. The first panel shows the persistence barcodes of a sphere using divisive cover with δ = 0.05 and the second panel shows persistence barcodes of a torus with δ = 0.06. The third panel shows persistence barcodes of X(300, 30), with δ = 0.025.