Curvature Sets Over Persistence Diagrams

We study a family of invariants of compact metric spaces that combines the Curvature Sets defined by Gromov in the 1980s with Vietoris-Rips Persistent Homology. For given integers $k\geq 0$ and $n\geq 1$ we consider the dimension $k$ Vietoris-Rips persistence diagrams of \emph{all} subsets of a given metric space with cardinality at most $n$. We call these invariants \emph{persistence sets} and denote them as $\mathbf{D}_{n,k}^\textrm{VR}$. We establish that (1) computing these invariants is often significantly more efficient than computing the usual Vietoris-Rips persistence diagrams, (2) these invariants have very good discriminating power and, in many cases, capture information that is imperceptible through standard Vietoris-Rips persistence diagrams, and (3) they enjoy stability properties. We precisely characterize some of them in the case of spheres and surfaces with constant curvature using a generalization of Ptolemy's inequality. We also identify a rich family of metric graphs for which $\mathbf{D}_{4,1}^\textrm{VR}$ fully recovers their homotopy type by studying split-metric decompositions. Along the way we prove some useful properties of Vietoris-Rips persistence diagrams using Mayer-Vietoris sequences. These yield a geometric algorithm for computing the Vietoris-Rips persistence diagram of a space $X$ with cardinality $2k+2$ with quadratic time complexity as opposed to the much higher cost incurred by the usual algebraic algorithms relying on matrix reduction.

The Gromov-Hausdorff (GH) distance, a notion of distance between compact metric spaces, was introduced by Gromov in the 1980s and was eventually adapted into data/shape analysis by the second author [Mém05,MS04,MS05] as a tool for measuring the dissimilarity between shapes/datasets. Despite its usefulness in providing a mathematical model for shape matching procedures [MS04,MS05,BBBK08], the Gromov-Hausdorff distance leads to NP-hard problems: [Mém12b] relates it to the well-known Quadratic Assignment Problem, which is NP-hard, and Schmiedl in his PhD thesis [Sch17] (see also [AFN+18]) directly proves the NP-hardness of computing the Gromov-Hausdorff distance even for ultrametric spaces. Recent work has also identified certain fixed-parameter tractable algorithms for the GH distance between ultrametric spaces [MSW19].
These hardness results have motivated research in other directions: (I) finding suitable relaxations of the Gromov-Hausdorff distance which are more amenable to computations and (II) finding lower bounds for the Gromov-Hausdorff distance which are easier to compute, yet retain good discriminative power.
Related to the first thread, and based on ideas from optimal transport, the notion of Gromov-Wasserstein distance was proposed in [Mém07,Mém11]. This notion of distance leads to continuous quadratic optimization problems (as opposed to the combinatorial nature of the problems induced by the Gromov-Hausdorff distance) and, as such, it has benefited from the wealth of continuous optimization techniques available in the literature [PCS16,PC+19] and has seen a number of applications in data analysis and machine learning [VCF+20,DSS+20,AMJ18,KM21,BCM+20] in recent years.
The second thread mentioned above is that of obtaining computationally tractable lower bounds for the usual Gromov-Hausdorff distance. Several such lower bounds were identified in [Mém12b] by the second author, and then in [CM08,CM10a] and [CCSG+09] it was proved that hierarchical clustering dendrograms and persistence diagrams or barcodes, metric invariants which arose in the Applied Algebraic Topology community, provide lower bounds for the GH distance. These persistence diagrams will eventually become central to the present paper, but before reviewing them, we will describe the notion of curvature sets introduced by Gromov.
Gromov's curvature sets and curvature measures. Given a compact metric space $(X, d_X)$, in the book [Gro07] Gromov identified a class of invariants of metric spaces indexed by the natural numbers that solves the classification problem for $X$. In more detail, Gromov defines for each $n \in \mathbb{N}$ the $n$-th curvature set of $X$, denoted by $K_n(X)$, as the collection of all $n \times n$ matrices that arise from restricting $d_X$ to all possible $n$-tuples of points chosen from $X$, possibly with repetitions. The terminology curvature sets is justified by the observation that these sets contain, in particular, metric information about configurations of closely clustered points in a given metric space. This information is enough to recover the curvature of a manifold; see Figure 1.
These curvature sets have the property that $K_n(X) = K_n(Y)$ for all $n \in \mathbb{N}$ is equivalent to the statement that the compact metric spaces $X$ and $Y$ are isometric. Constructions similar to the curvature sets of Gromov were also identified by Peter Olver in [Olv01] in his study of invariants for curves and surfaces under different group actions (including the group of Euclidean isometries).
In [Mém12b] it is then noted that the GH distance admits lower bounds based on these curvature sets:
\[ d_{GH}(X,Y) \geq \widehat{d}_{GH}(X,Y) := \frac{1}{2} \sup_{n\in\mathbb{N}} d_H\big(K_n(X), K_n(Y)\big) \tag{1} \]
for all compact metric spaces $X, Y$. Here, $d_H$ denotes the Hausdorff distance on $\mathbb{R}^{n\times n}$ with the $\ell^\infty$ distance. As we mentioned above, the computation of the Gromov-Hausdorff distance leads in general to NP-hard problems, whereas the lower bound in the equation above can be computed in polynomial time when restricted to a fixed value of $n$. In [Mém12b] it is argued that work of Peter Olver [Olv01] and Boutin and Kemper [BK04a] leads to identifying rich classes of shapes where these lower bounds permit full discrimination.
In the category of compact mm-spaces, that is, triples $(X, d_X, \mu_X)$ where $(X, d_X)$ is a compact metric space and $\mu_X$ is a fully supported probability measure on $X$, Gromov also discusses the following parallel construction: for an mm-space $(X, d_X, \mu_X)$ let $\Psi^{(n)}_X : X^{\times n} \to \mathbb{R}^{n\times n}$ be the map that sends the $n$-tuple $(x_1, x_2, \dots, x_n)$ to the matrix $M$ with elements $M_{ij} = d_X(x_i, x_j)$. Then, the $n$-th curvature measure of $X$ is defined as
\[ \mu_n(X) := \big(\Psi^{(n)}_X\big)_\# \mu_X^{\otimes n}. \]
Clearly, curvature measures and curvature sets are related as follows: $\mathrm{supp}(\mu_n(X)) = K_n(X)$ for all $n \in \mathbb{N}$. Gromov then proves in his mm-reconstruction theorem that the collection of all curvature measures permits reconstructing any given mm-space up to isomorphism. Similarly to (1), [MNO21] proves for each $p \geq 1$ that
\[ d_{GW,p}(X,Y) \geq \widehat{d}_{GW,p}(X,Y) := \frac{1}{2} \sup_{n\in\mathbb{N}} d_{W,p}\big(\mu_n(X), \mu_n(Y)\big), \]
where $d_{W,p}$ denotes the $p$-Wasserstein distance [Vil03] on $\mathcal{P}_1(\mathbb{R}^{n\times n})$ with the $\ell^\infty$ distance.

Figure 2: The pipeline to compute a persistence diagram. Starting with a distance matrix, we compute the Vietoris-Rips complex and its homology, and produce an interval decomposition. Together, we call these three steps $\mathrm{PH}^\textrm{VR}_k$.
Persistent Homology. Ideas related to what is nowadays known as persistent homology appeared already in the late 1980s and early 1990s in the work of Patrizio Frosini [Fro90b,Fro99,Fro90a], then in the work of Vanessa Robins [Rob99], in the work of Edelsbrunner and collaborators [ELZ00], and then in the work of Carlsson and Zomorodian [ZC04]. Some excellent references for this topic are [EH10,Ghr08,Car14,Wei11]. In a nutshell, persistent homology (PH) assigns to a given compact metric space $X$ and an integer $k \geq 0$ a multiset of points $\mathrm{dgm}^\textrm{VR}_k(X)$ in the plane, known as the $k$-th (Vietoris-Rips) persistence diagram of $X$. The standard PH pipeline is shown in Figure 2.
These diagrams indicate the presence of $k$-dimensional multi-scale topological features in the space $X$, and can be compared via the bottleneck distance (which is closely related to, but stronger than, the Hausdorff distance in $(\mathbb{R}^2, \ell^\infty)$).
Following work by Cohen-Steiner et al. [CSEH07], in [CCSG+09] it is proved that the map $X \mapsto \mathrm{dgm}^\textrm{VR}_k(X)$ sending a given compact metric space to its $k$-th persistence diagram is 2-Lipschitz under the GH and bottleneck distances.
Algorithmic work by Edelsbrunner and collaborators [ELZ00] and more recent developments [Bau19] guarantee that $\mathrm{dgm}^\textrm{VR}_k(X)$ can be computed in polynomial time (in the cardinality of $X$), and it is well known that the bottleneck distance can also be computed in polynomial time [EH10]. This means that persistence diagrams provide another source of stable invariants which permit estimating (lower bounding) the Gromov-Hausdorff distance.
It is known that persistence diagrams are not full invariants of metric spaces. For instance, any two tree metric spaces, that is metric spaces satisfying the four point condition [Gro87], have trivial persistence diagrams in all degrees k ≥ 1. It is also not difficult to find two finite tree metric spaces with the same degree zero persistence diagrams. See [LMO20] for more examples and [MZ19] for results about stronger invariants (i.e. persistent homotopy groups).
Despite the fact that persistence diagrams can be computed with effort which depends polynomially on the size of the input metric space [EH10,AW20], the computations are actually quite onerous and, as of today, it is not realistic to compute the degree 1 Vietoris-Rips persistence diagram of a finite metric space with more than a few thousand points even with state of the art implementations such as Ripser [Bau19].
Curvature sets over persistence diagrams. In this paper, we consider a version of the curvature set ideas which arises when combining their construction with Vietoris-Rips persistent homology.
For a compact metric space $X$ and integers $n \geq 1$ and $k \geq 0$, the $(n, k)$-Vietoris-Rips persistence set of $X$ is (cf. Definition 3.9) the collection $\mathbf{D}_{n,k}^\textrm{VR}(X)$ of all persistence diagrams in degree $k$ of subsets of $X$ with cardinality at most $n$.
In a manner similar to how the $n$-th curvature measure $\mu_n(X)$ arose above, we also study the probability measure $\mathbf{U}_{n,k}^\textrm{VR}(X)$ defined as the pushforward of $\mu_n(X)$ under the degree $k$ Vietoris-Rips persistence diagram map (cf. Definition 3.16). We also study a more general version wherein for any stable simplicial filtration functor $\mathfrak{F}$ (cf. Definition 2.22), we consider both the persistence sets $\mathbf{D}_{n,k}^{\mathfrak{F}}(X)$ and the persistence measures $\mathbf{U}_{n,k}^{\mathfrak{F}}(X)$.

Contributions
We provide a thorough study of persistence sets and in particular analyze the following points.
Computational cost, parallelizability, and approximation: One argument for considering the persistence set invariants $\mathbf{D}_{n,k}^\textrm{VR}(X)$ as opposed to the standard degree $k$ Vietoris-Rips persistence diagrams $\mathrm{dgm}_k(X)$ is that while computing the latter incurs cost $O(|X|^{3(k+2)})$ in the worst case, computing the former incurs cost $O(n^{3(k+2)} |X|^n)$, which is in general (when $n \ll |X|$) significantly smaller; moreover, the associated computational tasks are eminently parallelizable. Furthermore, the amount of memory needed for computing persistence sets is also notably smaller than for computing persistence diagrams over the same data set. See Remark 3.15 for a detailed discussion. In fact, persistence sets are useful as an alternative paradigm for the acceleration of the computation of persistent homology based invariants; cf. Figure 3.
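To get a feel for the two worst-case bounds quoted above, the following sketch simply evaluates them for one data size. The function names and the sample values $|X| = 1000$, $n = 4$, $k = 1$ are illustrative assumptions, not taken from the paper, and the constants hidden in the $O$-notation are ignored.

```python
def diagram_cost(num_points: int, k: int) -> int:
    """Bound |X|^(3(k+2)) quoted for dgm_k of the whole space."""
    return num_points ** (3 * (k + 2))


def persistence_set_cost(num_points: int, n: int, k: int) -> int:
    """Bound n^(3(k+2)) * |X|^n quoted for the persistence set D_{n,k}."""
    return n ** (3 * (k + 2)) * num_points ** n


# Hypothetical sizes chosen only for illustration.
size, n, k = 1000, 4, 1
full = diagram_cost(size, k)              # 1000**9
sets = persistence_set_cost(size, n, k)   # 4**9 * 1000**4
ratio = full // sets                      # how many times larger the first bound is
print(full, sets, ratio)
```

For these values the bound for the full diagram exceeds the bound for the persistence set by a factor of roughly $3.8 \times 10^9$; the gap shrinks as $n$ approaches $|X|$.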
Principal persistence sets, their characterization and an algorithm: Persistence sets are defined to be sets of persistence diagrams and, although a single persistence diagram is easy to visualize, large collections of them might not be so. However, when $n = 2k + 2$ we verify in Theorem 4.4 that there can be at most one point in the degree $k$ persistence diagram of any metric space with at most $n$ points. This means that all persistence diagrams in the principal persistence set $\mathbf{D}_{2k+2,k}^\textrm{VR}(X)$ can be stacked on the same axis; see Figure 4.

Figure 3: The pipeline to compute $\mathbf{D}_{n,k}$. Starting with a metric space $(X, d_X)$, we take samples of the distance matrix as elements of $K_n(X)$, apply $\mathrm{PH}_k$ to each, and aggregate the resulting persistence diagrams. For example, Theorem 4.4 guarantees that the VR-persistence diagram in dimension $k$ of a metric space with $n = 2k + 2$ points has at most one point. The aggregation in this case means plotting the set $\mathbf{D}_{n,k}^\textrm{VR}(X)$ by plotting all diagrams simultaneously in one set of axes. In general, the diagrams in $\mathbf{D}_{n,k}(X)$ have more than one point, so one possibility for aggregation is constructing a one-point summary or an average of a persistence diagram (for instance, a Chebyshev center or an $\ell^\infty$ mean) and then plotting all such points simultaneously.

Figure 4: A graphical representation of the principal persistence set $\mathbf{D}_{2k+2,k}^\textrm{VR}(X)$ is obtained by overlaying the persistence diagrams of all samples $Y \subset X$ (with $|Y| \leq 2k + 2$) into a single set of axes. This is possible because, by Theorem 4.4, these diagrams have at most one off-diagonal point.
Our main result, Theorem 4.4, furthermore gives a precise representation of the unique point in the degree $k$ persistence diagram of a metric space with at most $n_k := 2k + 2$ points via a formula which induces an algorithm for computing the principal persistence sets. This algorithm is purely geometric in the sense that it does not rely on analyzing boundary matrices but, in contrast, directly operates at the level of distance matrices. For any $k$, this geometric algorithm has cost $O(n_k^2) \approx O(k^2)$ as opposed to the much larger cost $O(n_k^{3(k+2)}) \approx O(2^{3k} k^{3(k+2)})$ incurred by the standard persistent homology algorithms; see Remark 4.6.
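The formula of Theorem 4.4 is not reproduced in this section, so the sketch below rests on an assumption about its form: for each point $x$, let $t_d(x)$ be the largest entry of its row of the distance matrix and $t_b(x)$ the second largest; the diagram of a space with $2k+2$ points is then taken to be the single point $(\max_x t_b(x), \min_x t_d(x))$ when the first coordinate is strictly smaller than the second, and empty otherwise. The function names are hypothetical. Under that assumption, a principal persistence set of a finite space can be enumerated directly from distance matrices.

```python
from itertools import combinations


def principal_point(dist):
    """Return (t_b, t_d) for a square distance matrix, or None.

    Assumed form of the Theorem 4.4 formula: t_d(x) is the largest and
    t_b(x) the second-largest entry in the row of x.
    """
    births, deaths = [], []
    for row in dist:
        ordered = sorted(row, reverse=True)
        deaths.append(ordered[0])
        births.append(ordered[1])
    t_b, t_d = max(births), min(deaths)
    return (t_b, t_d) if t_b < t_d else None


def principal_persistence_set(dist, k):
    """Collect the points contributed by all (2k+2)-point subsets."""
    n = 2 * k + 2
    points = set()
    for idx in combinations(range(len(dist)), n):
        sub = [[dist[i][j] for j in idx] for i in idx]
        point = principal_point(sub)
        if point is not None:
            points.add(point)
    return points


# Four points on a 4-cycle with the path metric.
square = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
print(principal_point(square))
```

Each call to `principal_point` only sorts the rows of one small matrix, which is where the quadratic (up to logarithmic factors from sorting) cost per sample comes from in this sketch.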
Propositions 5.13 and 5.17, and Corollary 5.18 provide additional information about higher dimensional spheres. Section 4.2 provides some computational examples including the case of tori.
Our characterization results are in the same spirit as results pioneered by Adamaszek and Adams related to characterizing the Vietoris-Rips persistence diagrams of circles and spheres [AA17]; see also [LMO20].
Additionally, the arguments used to characterize the persistence set of the sphere can be generalized to other surfaces. More precisely, we obtained: Theorem 5.16. Let $M_\kappa$ be the 2-dimensional model space with constant sectional curvature $\kappa$. Then:

The $(4, 1)$-persistence set of $\mathbb{S}^1$ (with geodesic distance) is the shaded triangular area where the top left and top right points have coordinates $(\frac{\pi}{2}, \pi)$ and $(\pi, \pi)$, respectively, whereas the lowest diagonal point has coordinates $(\frac{2\pi}{3}, \frac{2\pi}{3})$. The figure also shows exemplary configurations $X \subset \mathbb{S}^1$ with $|X| \leq 4$ together with their respective persistence diagrams inside of $\mathbf{D}_{4,1}^\textrm{VR}(\mathbb{S}^1)$.

A similar result first appeared in [BHPW20]. The authors explore the question of whether persistent homology can detect the curvature of the ambient $M_\kappa$. They found a geometric formula to compute $\mathrm{dgm}^{\text{Čech}}_1(T)$ of a sample $T \subset M_\kappa$ with three points, much in the same vein as our Theorem 4.4. They used it to find the logarithmic persistence $P_a(\kappa) = t_d(T_{\kappa,a})/t_b(T_{\kappa,a})$ for an equilateral triangle $T_{\kappa,a}$ of fixed side length $a > 0$, and proved that $P_a$, when viewed as a function of $\kappa$, is invertible.
The present Theorem 5.16 is in the same spirit. Instead of equilateral triangles, we can use squares with a given $t_d$ and minimal $t_b$ to find $\kappa$. Qualitatively, we can detect the sign of the curvature by looking at the boundary of $\mathbf{D}_{4,1}^\textrm{VR}(M_\kappa)$: it is concave up when $\kappa > 0$, a straight line when $\kappa = 0$, and concave down when $\kappa < 0$. See Figure 14.
Unlike us, they also detected curvature experimentally. They sampled 1000 points from a unit disk in M κ and were able to approximate κ using the average VR death vectors in dimension 0 and average persistence landscapes in dimension 1 of 100 such samples. For example, one method consisted in finding a collection of landscapes L κ labeled with a known curvature κ, and estimating κ * for an unlabeled L * with the average curvature of the three nearest neighbors of L * . They were also able to approximate κ * without labeled examples by using PCA. See their paper [BHPW20] for more details.
The key ingredient in the proof of Theorem 5.16 is Ptolemy's inequality and its analogues in non-Euclidean geometries [Val70a,Val70b]. We generalize these inequalities to CAT($\kappa$) spaces, albeit with some constraints when $\kappa > 0$. In contrast, when $\kappa \leq 0$, we show that the persistence set $\mathbf{D}_{4,1}^\textrm{VR}(X)$ of a CAT($\kappa$) space $X$ is contained in the persistence set $\mathbf{D}_{4,1}^\textrm{VR}(M_\kappa)$ of the corresponding model space $M_\kappa$.
Stability. We establish the stability of persistence sets and measures under Gromov-Hausdorff and Gromov-Wasserstein distances in Theorems 3.12 and 3.17. Such results permit estimating these distances in polynomial time. As an application, we show that the Gromov-Hausdorff distance between $\mathbb{S}^1$ and $\mathbb{S}^m$ is bounded below by $\frac{\pi}{15}$ when $m = 2$ (Example 5.19) and by $\frac{\pi}{8}$ when $m \geq 3$ (Example 5.20). These bounds are not tight.

Concentration results for $\mathbf{U}_{n,k}^\textrm{VR}$. Another consequence of the stability of persistence measures is the concentration of $\mathbf{U}_{n,k}^{\mathfrak{F}}(X)$ as $n \to \infty$. More precisely, Theorem 6.3. Let $(X, d_X, \mu_X)$ be an mm-space and take any stable filtration functor $\mathfrak{F}$. For any $n, k \in \mathbb{N}$, consider the random variable $\mathbf{D}$ valued in $\mathbf{D}_{n,k}^{\mathfrak{F}}(X)$ distributed according to $\mathbf{U}_{n,k}^{\mathfrak{F}}(X)$. Then:
• As a consequence, the mm-space $\mathcal{D}_{n,k}^{\mathfrak{F}}(X) = \big(\mathbf{D}_{n,k}^{\mathfrak{F}}(X), d_B, \mathbf{U}_{n,k}^{\mathfrak{F}}(X)\big)$ concentrates to a one-point mm-space as $n \to \infty$.
Similar results appear in [BGMP12] and [CFL+15]. The approach in [CFL+15] is to study the average persistence landscape $\lambda^m_n$ of $n$ samples $S^m_1, \dots, S^m_n \subset X$ of size $m$. They show that this procedure is stable under perturbations of the Gromov-Wasserstein distance, and provide a bound on the expected $\ell^\infty$ distance between the persistence landscape of $X$ and $\lambda^m_n$. These two results are analogous, respectively, to our Theorem 3.17 and to item 1 of Theorem 6.3 above. As for [BGMP12], the authors study the statistical robustness of persistent homology invariants. They have two results similar to ours. One is the stability of the measures $\mathbf{U}_{n,k}^{\mathfrak{F}}(X)$ (they write $\Phi^n_k(X)$ instead) under the Gromov-Prokhorov distance (instead of the Gromov-Hausdorff distance). The second is a central limit theorem, where the measures $\mathbf{U}_{n,k}^{\mathfrak{F}}(S_i)$ corresponding to an increasing sequence of finite samples $S_1 \subset S_2 \subset \cdots \subset X$ converge in probability to $\mathbf{U}_{n,k}^{\mathfrak{F}}(X)$.
Persistence sets and measures capture more information than $\mathrm{dgm}^\textrm{VR}_k$. Evidently, $\mathrm{dgm}^{\mathfrak{F}}_k(X) \in \mathbf{D}_{n,k}^{\mathfrak{F}}(X)$ when $n \geq |X|$. What is interesting is that persistence sets can detect more information than the Vietoris-Rips persistence diagram. See Example 8.5, where $G$ is a graph that consists of a cycle $C$ with 4 edges attached such that $\mathbf{D}_{4,1}^\textrm{VR}(G)$ contains more points than $\mathbf{D}_{4,1}^\textrm{VR}(C)$. As far as the Vietoris-Rips complex is concerned, though, $\mathrm{VR}_r(G) \simeq \mathrm{VR}_r(C)$.
An application to detecting homotopy type of graphs: In Section 8, as an application, we study a class of metric graphs for which $\mathbf{D}_{4,1}^\textrm{VR}$, a rather coarse invariant which is fairly easy to estimate and compute in practice, is able to characterize the homotopy type of graphs in this class. In fact, $\mathbf{D}_{4,1}^\textrm{VR}$ detects more features than the Vietoris-Rips persistence diagram of $G$. See Figure 19 for an example. There, $G$ is a cycle $C$ with 4 edges attached and $\mathbf{D}_{4,1}^\textrm{VR}(G)$ is different from $\mathbf{D}_{4,1}^\textrm{VR}(C)$. In contrast, the Vietoris-Rips complexes of both graphs are homotopy equivalent.

Related work
The measures $\mathbf{U}_{n,k}^\textrm{VR}$ first appeared in the work of Blumberg et al. [BGMP12] in 2012 and then in print in [BGMP14]. These measures were also exploited a couple of years later by Chazal et al. in the articles [CFL+14,CFL+15] in order to devise bootstrapping methods for the estimation of persistence diagrams.
The connection to Gromov's curvature sets and measures was not mentioned in either of these two papers. [Mém12b] studied curvature sets and their role in shape comparison and, as a natural follow-up, some results regarding the persistence sets $\mathbf{D}_{n,k}^\textrm{VR}$ and the measures $\mathbf{U}_{n,k}^\textrm{VR}$ (as well as the more general objects $\mathbf{D}_{n,k}^{\mathfrak{F}}$ and $\mathbf{U}_{n,k}^{\mathfrak{F}}$) were first described by the second author at a conference in Banff in 2012 [Mém12a]. Subsequent developments were described in 2013 at ACAT 2013 in Bremen [Mém13a] and in Bedlewo [Mém13b], and then at IMA [Mém14a] and at SAMSI in 2014 [Mém14b]. In these presentations the second author proposed the invariants $\mathbf{D}_{n,k}^\textrm{VR}$ as a Gromov-Hausdorff stable, computationally easier alternative to the usual Vietoris-Rips persistence diagrams of metric spaces [Mém14c].
In January 2021 Bendich et al. uploaded a paper to the arXiv [SWB21] with some ideas related to our construction of $\mathbf{D}_{n,k}^{\mathfrak{F}}$. The authors pose questions about the discriminative power of a certain labeled version of the persistence sets $\mathbf{D}_{n,k}^\textrm{VR}$ (even though they do not call them that) and also mention some stability and computational properties similar to those mentioned in [Mém12a,Mém13a,Mém14a,Mém14b].
The second author together with Needham [MN18] has recently explored the classificatory power of $\mu_2$ as well as that of certain localizations of $\mu_2$. In [CCM+20] the authors identify novel classes of simplicial filtrations arising from curvature sets together with suitable notions of locality. Ongoing work is exploring the classificatory power of $\mu_n$ for general $n$ [MNO21].
In terms of data intensive applications, the paper [SMI+08] made use of ideas related to $\mathbf{U}_{n,k}^\textrm{VR}$ and $\mathbf{D}_{n,k}^\textrm{VR}$ in the analysis of neuroscience data.

Background
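For us, $\mathcal{M}$ and $\mathcal{M}^{\mathrm{fin}}$ will denote, respectively, the category of compact and finite metric spaces. The morphisms in both categories will be 1-Lipschitz maps, that is, functions $\varphi : X \to Y$ such that $d_Y(\varphi(x), \varphi(x')) \leq d_X(x, x')$ for all $x, x' \in X$. We say that two metric spaces are isometric if there exists a surjective isometry $\varphi : X \to Y$, i.e. a surjective map such that $d_Y(\varphi(x), \varphi(x')) = d_X(x, x')$ for all $x, x' \in X$.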

Metric geometry
In this section, we define the tools that we'll use to quantitatively compare metric spaces [BBI01].
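Definition 2.1. For any subset $A$ of a metric space $X$, its diameter is $\mathrm{diam}_X(A) := \sup_{a,a' \in A} d_X(a, a')$, and its radius is $\mathrm{rad}_X(A) := \inf_{p \in X} \sup_{a \in A} d_X(p, a)$.
Definition 2.2 (Hausdorff distance). Let $A, B$ be subsets of a compact metric space $(X, d_X)$. The Hausdorff distance between $A$ and $B$ is defined as
\[ d^X_H(A, B) := \inf\{\varepsilon > 0 : A \subset B^\varepsilon \text{ and } B \subset A^\varepsilon\}, \]
where $A^\varepsilon := \{x \in X : \inf_{a \in A} d_X(x, a) < \varepsilon\}$ is the $\varepsilon$-thickening of $A$. It is known that $d^X_H(A, B) = 0$ if, and only if, their closures are equal: $\bar{A} = \bar{B}$.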
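For finite subsets the suprema and infima in Definitions 2.1 and 2.2 become maxima and minima, and the Hausdorff distance reduces to the familiar max-min formula. The following is a minimal sketch under that finiteness assumption; the function names and the sample points on the real line are illustrative only.

```python
def diameter(d, A):
    """diam(A): largest distance between two points of A."""
    return max(d(a, b) for a in A for b in A)


def radius(d, X, A):
    """rad_X(A): smallest radius of a ball centred in X that covers A."""
    return min(max(d(p, a) for a in A) for p in X)


def hausdorff(d, A, B):
    """Hausdorff distance between finite sets A and B (max-min form)."""
    a_to_b = max(min(d(a, b) for b in B) for a in A)
    b_to_a = max(min(d(a, b) for a in A) for b in B)
    return max(a_to_b, b_to_a)


def line_metric(s, t):
    return abs(s - t)


X = [0, 1, 2, 3, 4]   # ambient finite space, a subset of the real line
A = [0, 4]
B = [1, 2]
print(diameter(line_metric, A), radius(line_metric, X, A), hausdorff(line_metric, A, B))
```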
We will use an alternative definition that is useful for calculations, but is not standard in the literature. It relies on the concept of a correspondence.
Definition 2.3. A correspondence between two sets $X$ and $Y$ is a set $R \subset X \times Y$ such that $\pi_1(R) = X$ and $\pi_2(R) = Y$, where $\pi_i$ are the projections. We will denote the set of all correspondences between $X$ and $Y$ as $\mathcal{R}(X, Y)$.
Definition 2.4 (Proposition 2.1 of [Mém11]). For any compact metric space $(X, d_X)$ and any closed $A, B \subset X$,
\[ d^X_H(A, B) = \inf_{R \in \mathcal{R}(A,B)} \sup_{(a,b) \in R} d_X(a, b). \]
The standard method for comparing two metric spaces is a generalization of the Hausdorff distance.
Definition 2.5. For any correspondence $R$ between $(X, d_X), (Y, d_Y) \in \mathcal{M}$, we define its distortion as
\[ \mathrm{dis}(R) := \sup_{(x,y),(x',y') \in R} |d_X(x, x') - d_Y(y, y')|. \]
Then the Gromov-Hausdorff distance between $X$ and $Y$ is defined as
\[ d_{GH}(X, Y) := \frac{1}{2} \inf_{R \in \mathcal{R}(X,Y)} \mathrm{dis}(R). \]
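For very small finite spaces, the distortion of a correspondence and the resulting Gromov-Hausdorff distance can be computed by exhaustive search. The sketch below is only meant to make Definition 2.5 concrete: it enumerates every correspondence, so its running time is exponential and it is unusable beyond a handful of points. All names are hypothetical.

```python
from itertools import combinations, product


def distortion(R, dX, dY):
    """dis(R): largest mismatch |d_X(x,x') - d_Y(y,y')| over pairs in R."""
    return max(abs(dX[x][xp] - dY[y][yp]) for (x, y) in R for (xp, yp) in R)


def correspondences(nX, nY):
    """Yield every subset of X x Y that projects onto both X and Y."""
    pairs = list(product(range(nX), range(nY)))
    for size in range(max(nX, nY), len(pairs) + 1):
        for R in combinations(pairs, size):
            if ({x for x, _ in R} == set(range(nX))
                    and {y for _, y in R} == set(range(nY))):
                yield R


def gromov_hausdorff(dX, dY):
    """Brute-force d_GH for tiny finite metric spaces."""
    return 0.5 * min(distortion(R, dX, dY)
                     for R in correspondences(len(dX), len(dY)))


one_point = [[0]]
two_points = [[0, 3], [3, 0]]
print(gromov_hausdorff(one_point, two_points))
```

The exponential size of the search space is consistent with the hardness results recalled in the introduction.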

Metric measure spaces
To model the situation in which points are endowed with a notion of weight (signaling their trustworthiness), we will also consider finite metric spaces enriched with probability measures [Mém11]. Recall that the support supp(ν) of a Borel measure ν defined on a topological space Z is defined as the minimal closed set Z 0 such that ν(Z \ Z 0 ) = 0. If ϕ : Z → X is a measurable map from a measure space (Z, Σ Z , ν) into the measurable space (X, Σ X ), then the pushforward measure of ν induced by ϕ is the measure ϕ # ν on X defined by ϕ # ν(A) = ν(ϕ −1 (A)) for all A ∈ Σ X .
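On a finite set a measure is just a table of point masses, and the pushforward adds up the masses of all points sharing the same image. A minimal sketch, assuming measures are stored as dictionaries from points to exact fractions (the representation and the names are illustrative choices):

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(nu, phi):
    """Pushforward of the finite measure nu under the map phi."""
    image = defaultdict(Fraction)
    for z, mass in nu.items():
        image[phi(z)] += mass
    return dict(image)


def support(nu):
    """Points carrying positive mass (the support, in the finite case)."""
    return {z for z, mass in nu.items() if mass > 0}


nu = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}
parity = pushforward(nu, lambda z: z % 2)
print(parity)
```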
Definition 2.6. A metric measure space is a triple $(X, d_X, \mu_X)$ where $(X, d_X)$ is a compact metric space and $\mu_X$ is a Borel probability measure on $X$ with full support, i.e. $\mathrm{supp}(\mu_X) = X$. Two mm-spaces $(X, d_X, \mu_X)$ and $(Y, d_Y, \mu_Y)$ are isomorphic if there exists an isometry $\varphi : X \to Y$ such that $\varphi_\# \mu_X = \mu_Y$. We define the category of mm-spaces $\mathcal{M}^w$, where the objects are mm-spaces and the morphisms are 1-Lipschitz maps $\varphi : X \to Y$ such that $\varphi_\# \mu_X = \mu_Y$.
Many tools in metric geometry have been adapted to study mm-spaces. Our first step is the following definition.
Definition 2.7. Given two measure spaces $(X, \Sigma_X, \mu_X)$ and $(Y, \Sigma_Y, \mu_Y)$, a coupling between $\mu_X$ and $\mu_Y$ is a measure $\mu$ on $X \times Y$ such that $(\pi_1)_\# \mu = \mu_X$ and $(\pi_2)_\# \mu = \mu_Y$. We denote the set of couplings between $\mu_X$ and $\mu_Y$ as $\mathcal{M}(\mu_X, \mu_Y)$.
Remark 2.8 (The support of a coupling is a correspondence). Notice that, since $\mu_X$ is fully supported and $X$ is finite, $\mu(\pi_1^{-1}(x)) = \mu_X(\{x\}) \neq 0$ for any fixed coupling $\mu \in \mathcal{M}(\mu_X, \mu_Y)$ and any $x \in X$. Thus, the set $\pi_1^{-1}(x) \cap \mathrm{supp}(\mu)$ is non-empty for every $x \in X$. The same argument on $Y$ shows that $\mathrm{supp}(\mu)$ is a correspondence between $X$ and $Y$. In that regard, couplings are a probabilistic version of correspondences.
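Remark 2.8 can be checked mechanically on finite examples. The sketch below assumes measures are dictionaries of exact fractions, with a coupling keyed by pairs $(x, y)$; the helper names and the two sample couplings (the product coupling and a diagonal one) are illustrative.

```python
from fractions import Fraction


def marginal(mu, axis):
    """Marginal of a measure on X x Y along axis 0 (X) or 1 (Y)."""
    out = {}
    for pair, mass in mu.items():
        out[pair[axis]] = out.get(pair[axis], Fraction(0)) + mass
    return out


def is_coupling(mu, mu_X, mu_Y):
    return marginal(mu, 0) == mu_X and marginal(mu, 1) == mu_Y


def support(mu):
    return {pair for pair, mass in mu.items() if mass > 0}


def is_correspondence(R, X, Y):
    return {x for x, _ in R} == set(X) and {y for _, y in R} == set(Y)


mu_X = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
mu_Y = {"u": Fraction(1, 3), "v": Fraction(2, 3)}
product_coupling = {(x, y): mx * my
                    for x, mx in mu_X.items() for y, my in mu_Y.items()}
diagonal_coupling = {("a", "a"): Fraction(1, 2), ("b", "b"): Fraction(1, 2)}
print(is_coupling(product_coupling, mu_X, mu_Y))
```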
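There is also a version of the diameter that considers the measure. The $p$-diameter of a subset $A$ of an mm-space $X$ is defined as
\[ \mathrm{diam}_{X,p}(A) := \Big( \int_A \int_A d_X(a, a')^p \, \mu_X(da) \, \mu_X(da') \Big)^{1/p} \]
for $1 \leq p < \infty$, and we set $\mathrm{diam}_{X,\infty}(A) := \mathrm{diam}_X(A)$. We use these concepts to define a probabilistic version of the Hausdorff distance.
Definition 2.9. Given two probability measures $\alpha, \beta$ on $(Z, d_Z)$ and $1 \leq p < \infty$, the Wasserstein distance of order $p$ is defined as [Vil03]:
\[ d_{W,p}(\alpha, \beta) := \inf_{\mu \in \mathcal{M}(\alpha,\beta)} \Big( \int_{Z \times Z} d_Z(z, z')^p \, \mu(dz \times dz') \Big)^{1/p}. \]
In the same spirit, there is a generalization of Gromov-Hausdorff.
Given mm-spaces $X, Y$ and a coupling $\mu \in \mathcal{M}(\mu_X, \mu_Y)$, define its $p$-distortion for $1 \leq p < \infty$ as
\[ \mathrm{dis}_p(\mu) := \Big( \iint |d_X(x, x') - d_Y(y, y')|^p \, \mu(dx \times dy) \, \mu(dx' \times dy') \Big)^{1/p}, \]
and for $p = \infty$ as $\mathrm{dis}_\infty(\mu) := \sup\{|d_X(x, x') - d_Y(y, y')| : (x, y), (x', y') \in \mathrm{supp}(\mu)\}$.
Then the Gromov-Wasserstein distance of order $p \in [1, \infty]$ between $X$ and $Y$ is defined as [Mém11]:
\[ d_{GW,p}(X, Y) := \frac{1}{2} \inf_{\mu \in \mathcal{M}(\mu_X, \mu_Y)} \mathrm{dis}_p(\mu). \]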

Simplicial complexes
Definition 2.12. Let V be a set. An abstract simplicial complex K with vertex set V is a collection of finite subsets of V such that if σ ∈ K, then every τ ⊂ σ is also in K. We also use K to denote its geometric realization.
Here we define the simplicial complexes that we will focus on.
Definition 2.14. Fix n ≥ 1. Let e i = (0, . . . , 1, . . . , 0) be the i-th standard basis vector in R n and V = {±e 1 , . . . , ±e n }. Let B n be the collection of subsets σ ⊂ V that don't contain both e i and −e i . This simplicial complex is called the n-th cross-polytope.
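Definition 2.14 is easy to enumerate for small $n$. In the sketch below the vertex $\pm e_i$ is encoded as the integer $\pm i$ (an encoding chosen here for convenience), and only non-empty simplices are listed. Since each index $i$ contributes $e_i$, $-e_i$, or nothing, one expects $3^n - 1$ simplices and $2^n$ maximal ones.

```python
from itertools import combinations


def cross_polytope(n):
    """Non-empty simplices of the complex of Definition 2.14.

    Vertex +e_i is encoded as the integer i and -e_i as -i.
    """
    vertices = [sign * i for i in range(1, n + 1) for sign in (1, -1)]
    simplices = []
    for size in range(1, n + 1):  # a simplex uses each index at most once
        for sigma in combinations(vertices, size):
            if all(-v not in sigma for v in sigma):
                simplices.append(frozenset(sigma))
    return simplices


octahedron = cross_polytope(3)
facets = [s for s in octahedron if len(s) == 3]
print(len(octahedron), len(facets))
```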

Persistent homology
The idea behind persistent homology is to construct a filtration of topological spaces $(X_t)_{t>0}$ and compute the homology at each time $t$. We will adopt definitions from [Mém17].
Definition 2.15. A filtration on a finite set $X$ is a function $F_X : \mathrm{pow}(X) \to \mathbb{R}$ such that $F_X(\sigma) \leq F_X(\tau)$ whenever $\sigma \subset \tau$, and we call the pair $(X, F_X)$ a filtered set. $\mathcal{F}$ will denote the category of finite filtered sets, where objects are pairs $(X, F_X)$ and the morphisms $\varphi : (X, F_X) \to (Y, F_Y)$ are set maps $\varphi : X \to Y$ such that $F_Y(\varphi(\sigma)) \leq F_X(\sigma)$.
Functoriality of a filtration functor $\mathfrak{F}$ means that for any 1-Lipschitz map $\varphi : X \to Y$, we have $F_Y(\varphi(\sigma)) \leq F_X(\sigma)$ for all $\sigma \subset X$. In particular, if $X$ and $Y$ are isometric, $F_X = F_Y$ as filtrations.
Definition 2.18. Given $(X, d_X) \in \mathcal{M}^{\mathrm{fin}}$, define the Vietoris-Rips filtration $F^\textrm{VR}_X$ by setting $F^\textrm{VR}_X(\sigma) = \mathrm{diam}(\sigma)$ for $\sigma \subset X$. It is straightforward to check that this construction is functorial, so we define the Vietoris-Rips filtration functor $\mathfrak{F}^\textrm{VR} : \mathcal{M}^{\mathrm{fin}} \to \mathcal{F}$ by $(X, d_X) \mapsto (X, F^\textrm{VR}_X)$.
Our pipeline for persistent homology starts with a filtration functor $\mathfrak{F}$. Given a finite (pseudo)metric space $(X, d_X)$, let $(X, F^{\mathfrak{F}}_X) = \mathfrak{F}(X, d_X)$. For every $r > 0$, we construct the simplicial complex $L_r := \{\sigma \subset X : F^{\mathfrak{F}}_X(\sigma) \leq r\}$, and we get a nested family of simplicial complexes
\[ L_{r_0} \subset L_{r_1} \subset \cdots \subset L_{r_m}, \]
where $\mathrm{range}(F_X) = \{r_0 < r_1 < r_2 < \cdots < r_m\}$, and each $L_{r_i}$ is, by construction, finite. Taking homology with field coefficients $H_k(\cdot, \mathbb{F})$ of the family above gives a sequence of vector spaces and linear maps
\[ V_{r_0} \to V_{r_1} \to \cdots \to V_{r_m}, \qquad V_{r_i} := H_k(L_{r_i}, \mathbb{F}), \]
which is called a persistence vector space. Note that each $V_{r_i}$ is finite dimensional in our setting.
One particular type of persistence vector space is the interval module
\[ I[b, d) := 0 \to \cdots \to 0 \to \mathbb{F} \to \cdots \to \mathbb{F} \to 0 \to \cdots \to 0, \]
where the first $\mathbb{F}$ appears at time $b$, and the last one, at time $d$. The maps between different occurrences of $\mathbb{F}$ are identities, whereas the other maps are 0. Persistence vector spaces admit a classification up to isomorphism wherein a persistence vector space $V$ is decomposed as a sum of interval modules $V = \bigoplus_{\alpha \in A} I[b_\alpha, d_\alpha)$ [CdS10]. These collections of intervals are sometimes referred to as barcodes or persistence diagrams, depending on the graphical representation that is adopted [EH10]. We prefer the term persistence diagrams in the present work, and denote by $\mathcal{D}$ the collection of all finite persistence diagrams. An element $D \in \mathcal{D}$ is a multiset of points of the form $D = \{(b_\alpha, d_\alpha)\}_{\alpha \in A}$ for some (finite) index set $A$. In short, starting with any filtration functor $\mathfrak{F}$, we assign a persistence diagram to $(X, d_X)$ via the composition $\mathrm{dgm}^{\mathfrak{F}}_k : \mathcal{M}^{\mathrm{fin}} \to \mathcal{D}$ defined by
\[ (X, d_X) \mapsto (X, F^{\mathfrak{F}}_X) \mapsto (L_{r_i})_i \mapsto (V_{r_i})_i \mapsto \{(b_\alpha, d_\alpha)\}_{\alpha \in A}. \]
Notice that we could have also started with just a filtered set $(X, F_X)$, instead of a (pseudo)metric space, and obtained a persistence diagram. We will denote that diagram by $\mathrm{dgm}_k(X, F_X)$.
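The first steps of this pipeline (the Vietoris-Rips filtration value of a simplex, the complex at a scale $r$, and the finite list of critical scales) can be sketched for a finite space given by its distance matrix. The helper names and the cap `max_dim` on simplex dimension are assumptions made for this illustration; the homology step is not included.

```python
from itertools import combinations


def vr_value(dist, sigma):
    """Vietoris-Rips filtration value of a simplex: its diameter."""
    return max(dist[i][j] for i in sigma for j in sigma)


def complex_at_scale(dist, r, max_dim):
    """Simplices of dimension <= max_dim whose diameter is at most r."""
    n = len(dist)
    return [sigma
            for size in range(1, max_dim + 2)
            for sigma in combinations(range(n), size)
            if vr_value(dist, sigma) <= r]


def critical_values(dist):
    """Scales at which the complex can change: the distinct distances."""
    return sorted({value for row in dist for value in row})


# Four points on a 4-cycle with the path metric.
square = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
family = [complex_at_scale(square, r, 2) for r in critical_values(square)]
print([len(L) for L in family])
```

At scale 1 the complex is the 4-cycle itself, and at scale 2 every simplex is present, which is why a one-dimensional class appears and then disappears.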

Stability
The most useful filtration functors enjoy a property known as stability. Intuitively, it means that the persistence diagrams they produce are resistant to noise: if the input (pseudo)metric space is perturbed, the persistence diagram will not change too much. In this section, we will describe the metrics on filtrations and persistence diagrams that we use to measure stability. We start with the bottleneck distance between persistence diagrams D 1 , D 2 ∈ D. Define the persistence of a point P = (x, y) with x ≤ y as pers(P ) := y − x. The total persistence of a persistence diagram D ∈ D is the maximal persistence of its points: pers(D) := max P ∈D pers(P ).
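Let $D_1 = \{P_\alpha\}_{\alpha \in A_1}$ and $D_2 = \{Q_\alpha\}_{\alpha \in A_2}$ be two persistence diagrams indexed over the finite index sets $A_1$ and $A_2$ respectively. Consider subsets $B_i \subseteq A_i$ with $|B_1| = |B_2|$ together with a bijection $\varphi : B_1 \to B_2$, and set
\[ J(\varphi) := \max\Big( \max_{\beta \in B_1} \|P_\beta - Q_{\varphi(\beta)}\|_\infty, \; \max_{\alpha \in A_1 \setminus B_1} \tfrac{1}{2}\mathrm{pers}(P_\alpha), \; \max_{\alpha \in A_2 \setminus B_2} \tfrac{1}{2}\mathrm{pers}(Q_\alpha) \Big). \]
The bottleneck distance between $D_1$ and $D_2$ is
\[ d_B(D_1, D_2) := \min_{(B_1, B_2, \varphi)} J(\varphi), \]
where $(B_1, B_2, \varphi)$ ranges over all $B_1 \subset A_1$, $B_2 \subset A_2$, and bijections $\varphi : B_1 \to B_2$.
We can also measure the difference between two finite filtered sets $(X, F_X)$ and $(Y, F_Y)$. The idea is to pull back and compare the filtrations in a common set $Z$. To that end, we define a tripod, which is a triplet $(Z, \varphi_X, \varphi_Y)$ consisting of a finite set $Z$ and a pair of surjective maps $\varphi_X : Z \to X$ and $\varphi_Y : Z \to Y$. The filtration distance is
\[ d_{\mathcal{F}}\big((X, F_X), (Y, F_Y)\big) := \inf_{(Z, \varphi_X, \varphi_Y)} \max_{\tau \in \mathrm{pow}(Z)} |F_X(\varphi_X(\tau)) - F_Y(\varphi_Y(\tau))|, \]
where the infimum ranges over all tripods $(Z, \varphi_X, \varphi_Y)$.
In other words, we pull back the filtrations $F_X$ and $F_Y$ to a common set $Z$, where we can compare them using the $\ell^\infty$ norm on $\mathrm{pow}(Z)$. The filtration distance is the infimum of this quantity over all choices of tripods $(Z, \varphi_X, \varphi_Y)$.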
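The bottleneck distance between two small diagrams can be evaluated by trying every partial matching: matched points are charged their $\ell^\infty$ distance, and unmatched points are charged half their persistence. The brute-force sketch below is exponential in the diagram size and is meant only as an illustration; it is not how production libraries compute this distance, and all names are hypothetical.

```python
from itertools import combinations, permutations
from math import inf


def pers(point):
    return point[1] - point[0]


def linf(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def bottleneck(D1, D2):
    """Brute-force bottleneck distance between two small diagrams."""
    idx1, idx2 = range(len(D1)), range(len(D2))
    best = inf
    for m in range(min(len(D1), len(D2)) + 1):
        for B1 in combinations(idx1, m):
            for B2 in permutations(idx2, m):  # B2 also encodes the bijection
                cost = 0.0
                for i, j in zip(B1, B2):
                    cost = max(cost, linf(D1[i], D2[j]))
                for i in idx1:
                    if i not in B1:
                        cost = max(cost, pers(D1[i]) / 2)
                for j in idx2:
                    if j not in B2:
                        cost = max(cost, pers(D2[j]) / 2)
                best = min(best, cost)
    return best


print(bottleneck([(0, 4)], [(0, 5)]))
```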
With these tools at hand, we define what we mean by stable functors.
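Definition 2.22 (Stable filtration functors). For a given filtration functor $\mathfrak{F}$, define its Lipschitz constant $L(\mathfrak{F})$ as the infimal $L > 0$ such that
\[ d_{\mathcal{F}}\big(\mathfrak{F}(X), \mathfrak{F}(Y)\big) \leq L \cdot d_{GH}(X, Y) \]
for all $X, Y \in \mathcal{M}^{\mathrm{fin}}$. If $L(\mathfrak{F}) < \infty$, we say that $\mathfrak{F}$ is stable. In this case, we also say that $\mathfrak{F}$ is $L$-stable for all constants $L \geq L(\mathfrak{F})$.
[Mém17] proved the following theorem and its corollary.
Theorem 2.23. For all finite filtered sets $(X, F_X)$ and $(Y, F_Y)$ and all $k \in \mathbb{N}$,
\[ d_B\big(\mathrm{dgm}_k(X, F_X), \mathrm{dgm}_k(Y, F_Y)\big) \leq d_{\mathcal{F}}\big((X, F_X), (Y, F_Y)\big). \]
Corollary 2.24. For any stable filtration functor $\mathfrak{F}$,
\[ d_B\big(\mathrm{dgm}^{\mathfrak{F}}_k(X), \mathrm{dgm}^{\mathfrak{F}}_k(Y)\big) \leq L(\mathfrak{F}) \cdot d_{GH}(X, Y) \]
for all $X, Y \in \mathcal{M}^{\mathrm{fin}}$ and $k \in \mathbb{N}$.
Example 2.25. The Lipschitz constant of $\mathfrak{F}^\textrm{VR}$ is 2. Pick any pair of finite (pseudo)metric spaces $X$ and $Y$ and let $\eta > 0$ and $R \in \mathcal{R}(X, Y)$ be such that $\mathrm{dis}(R) < 2\eta$. Consider the joint parametrization $Z = R$, $\varphi_X = \pi_1$ and $\varphi_Y = \pi_2$ of $X$ and $Y$. For any $\tau \subset Z$,
\[ |F^\textrm{VR}_X(\varphi_X(\tau)) - F^\textrm{VR}_Y(\varphi_Y(\tau))| = |\mathrm{diam}_X(\pi_1(\tau)) - \mathrm{diam}_Y(\pi_2(\tau))| \leq \mathrm{dis}(R) < 2\eta. \]
The constant 2 is tight.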
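The key estimate behind this example is that, for a correspondence $R$ used as the common set $Z$, the diameters of the two projections of any $\tau \subset R$ differ by at most $\mathrm{dis}(R)$. The sketch below checks this on one small pair of spaces by enumerating all non-empty $\tau$; the two distance matrices, the correspondence, and the function names are made up for illustration.

```python
from itertools import combinations


def distortion(R, dX, dY):
    return max(abs(dX[x][xp] - dY[y][yp]) for (x, y) in R for (xp, yp) in R)


def diam(d, points):
    return max(d[a][b] for a in points for b in points)


def max_filtration_gap(R, dX, dY):
    """Largest |diam_X(pi_1(tau)) - diam_Y(pi_2(tau))| over non-empty tau in R."""
    gap = 0
    for size in range(1, len(R) + 1):
        for tau in combinations(R, size):
            left = diam(dX, {x for x, _ in tau})
            right = diam(dY, {y for _, y in tau})
            gap = max(gap, abs(left - right))
    return gap


dX = [[0, 2], [2, 0]]
dY = [[0, 3, 1], [3, 0, 3], [1, 3, 0]]
R = [(0, 0), (0, 2), (1, 1)]   # a correspondence between X and Y
print(max_filtration_gap(R, dX, dY), distortion(R, dX, dY))
```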

Curvature sets and Persistence diagrams
Given a compact metric space (X, d X ), Gromov identified a class of full invariants called curvature sets [Gro07]. Intuitively, the n-th curvature set contains the metric information of all possible samples of n points from X. In this section, we define persistence sets, an analog construction that captures the persistent homology of all n-point samples of X. We start by recalling Gromov's definition with some examples, and an analogue of the Gromov-Hausdorff distance in terms of curvature sets. We then define persistence sets and study their stability with respect to this modified Gromov-Hausdorff distance. Additionally, when dealing with metric measure spaces, we can define measures on curvature and persistence sets via the pushforward of the product measure on X n . We also study these measures and prove an appropriate notion of stability.
Definition 3.1. Let (X, d X ) be a metric space. Given a positive integer n, let Ψ (n) X : X n → R n×n be the map that sends an n-tuple (x 1 , . . . , x n ) to the distance matrix M , where , the collection of all distance matrices of n points from X.
Remark 3.2 (Functoriality of curvature sets). Observe that curvature sets are functorial in the sense that if X is isometrically embedded in Y , then K n (X) ⊂ K n (Y ).
For n ≥ 2 and 0 < k < n, let x 1 = · · · = x k = p and x k+1 = · · · = x n = q. Define where 1 r×s is the r × s matrix with all entries equal to 1. If we make another choice of x 1 , . . . , x n , the resulting distance matrix will change only by a permutation of its rows and columns. Thus, if we define M Π k (δ) := Ψ (n) X (x 1 , . . . , x n ) = Π T ·M k (δ)·Π, for some permutation matrix Π ∈ S n , then

Example 3.5. In this example we describe K 3 (S 1 ), where S 1 = [0, 2π]/(0 ∼ 2π) is equipped with the geodesic metric. Depending on the position of x 1 , x 2 , x 3 , we distinguish two cases. If the three points are not contained in the same semicircle, then d 12 + d 23 + d 31 = 2π. If they are, then there exists a point, say x 2 , that lies in the shortest path joining the other two, so that d 13 = d 12 + d 23 . The other possibilities are d 12 = d 13 + d 32 and d 23 = d 21 + d 13 .
Figure 7: The curvature set K 3 (S 1 ).
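The two cases of Example 3.5 can be checked numerically. The following is a minimal Python sketch, not taken from the paper's repository; the helper names `geodesic` and `classify_triple` and the tolerance `tol` are assumptions made for illustration. It samples triples on S 1 and verifies that each one satisfies either the perimeter relation d 12 + d 23 + d 31 = 2π or one of the additive relations.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def geodesic(a, b):
    """Geodesic distance on S^1 modeled as [0, 2*pi] with 0 ~ 2*pi."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)

def classify_triple(x1, x2, x3, tol=1e-9):
    """Return which relation of Example 3.5 the three distances satisfy."""
    d12, d23, d31 = geodesic(x1, x2), geodesic(x2, x3), geodesic(x3, x1)
    if abs(d12 + d23 + d31 - TWO_PI) < tol:
        # The points are not contained in a common semicircle.
        return "perimeter"
    small, middle, large = sorted([d12, d23, d31])
    if abs(large - (small + middle)) < tol:
        # One point lies on the shortest path joining the other two.
        return "additive"
    return "neither"

random.seed(0)
labels = set()
for _ in range(10_000):
    triple = [random.uniform(0.0, TWO_PI) for _ in range(3)]
    labels.add(classify_triple(*triple))
print(sorted(labels))
```

Under these assumptions the label "neither" should never appear, which matches the description of K 3 (S 1 ) as the union of the two families of matrices.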
As we mentioned earlier, curvature sets are a full invariant of compact metric spaces, which means that X ≅ Y (that is, X and Y are isometric) if, and only if, K n (X) = K n (Y ) for all n ≥ 1. It makes sense to quantitatively measure the difference between two metric spaces by comparing their curvature sets. The following definition of [Mém12b] does what we need.
Here d H denotes the Hausdorff distance on R n×n with the ℓ∞ distance.
A benefit of the modified Gromov-Hausdorff distance over the standard Gromov-Hausdorff distance is that computing the latter leads in general to NP-hard problems [Sch17], whereas computing the lower bound in the equation above for certain values of n leads to polynomial-time problems. In [Mém12b] it is argued that work of Peter Olver [Olv01] and Boutin and Kemper [BK04b] leads to identifying rich classes of shapes where these lower bounds permit full discrimination.
The analogous definitions for mm-spaces are the following.
We also define the modified Gromov-Wasserstein distance between X, Y ∈ M w as

Clearly, supp(µ n (X)) = K n (X) for all n ∈ N, and similarly to equation (4), [MNO21]

Remark 3.8 (Interpretation as "motifs"). In network science [MP20], it is of interest to identify substructures of a dataset (network) X which appear with high frequency. The interpretation of the definitions above is that the curvature sets K n (X) for different n ∈ N capture the information of those substructures whose cardinality is at most n, whereas the curvature measures µ n (X) capture their frequency of occurrence.

F-persistence sets
The idea behind curvature sets is to study a metric space by taking the distance matrix of a sample of n points. This is the inspiration for the next definition: we want to study the persistence of a compact metric space X by looking at the persistence diagrams of samples with n points.
Definition 3.9. Fix n ≥ 1 and k ≥ 0. Let (X, d X ) ∈ M and F : M fin → F be any filtration functor. The (n,k)-F persistence set of X is $\mathbf{D}^{F}_{n,k}(X) := \{\mathrm{dgm}^{F}_k(X') : X' \subset X,\ |X'| \leq n\}$, the collection of degree-k persistence diagrams of all samples of at most n points of X.

Remark 3.10 (Functoriality of persistence sets). Notice that, similarly to curvature sets (cf. Remark 3.2), persistence sets are functorial. If X ↪ Y isometrically, then K n (X) ⊂ K n (Y ), and consequently, D F n,k (X) ⊂ D F n,k (Y ) for all n, k ∈ N.
Remark 3.11. Recall that filtration functors are by definition isometry invariants (see 2.17). This means that we can define the F-persistence diagram of a distance matrix as the diagram of the underlying pseudometric space. More explicitly, let (X, d X ) ∈ M, and take X ∈ X n and M = Ψ . For that reason, we can view the persistence set D F n,k (X) as the image of the map dgm F k : K n (X) → D.

Persistence sets inherit the stability of the filtration functor. Given their definition in terms of curvature sets, the modified Gromov-Hausdorff distance is a natural metric to use.
Theorem 3.12. Let F be a stable filtration functor with Lipschitz constant L(F). Then for all compact metric spaces X and Y , n ≥ 1, and k ≥ 0, one has is an upper bound for the right-hand side, the theorem will follow. Assume The definition of dgm F k on curvature sets (see Remark 3.11) states that D 1 = dgm F k (X ) and In summary, for every D 1 ∈ D F n,k (X), we can find /2, and the same argument works when swapping X and Y . Thus, we let η → d H (K n (X), K n (Y )) to conclude as desired.
Remark 3.14 (Persistence sets are isometry invariant). Note that the persistence sets D F n,k are themselves isometry invariants of metric spaces. As such, they can be regarded, in principle, as signatures that can be used to gain insight into datasets or to discriminate between different shapes.
Remark 3.15 (Computational cost). One thing to keep in mind is that computing the single diagram dgm VR 1 (X) when X has, say, 1000 points is likely to be much more computationally expensive than computing 10,000 VR one-dimensional persistence diagrams obtained by randomly sampling points from X, i.e. approximating D VR n,1 (X) with small n. More specifically, computing the degree k VR persistence diagram of a finite metric space with N points requires knowledge of the (k + 1)-skeleton of the full simplex over X, whose top-dimensional simplices are subsets of size k + 2, so the complexity is $c(N, k) \approx O(N^{\omega(k+2)})$ [MMS11]. Here, we are assuming that multiplication of m × m matrices has cost $O(m^{\omega})$. Since there are $N^n$ possible n-tuples of points of X, the complexity of computing D VR n,k (X) is bounded by $O(c(n, k) \cdot N^n) = O(n^{\omega(k+2)} \cdot N^n)$.

Another point which lends flexibility to the approximate computation of persistence sets is that one can easily cap the number of n-tuples to be considered by a parameter M max , and in this case the complexity associated to estimating D VR n,k will be $O(n^{\omega(k+2)} M_{\max})$. One can then select random n-tuples from the dataset up to the upper limit M max ; this is the pragmatic approach we have followed in the experiments reported in this paper and in the code on our github repository [GM21].
Furthermore, these calculations are of course eminently parallelizable. Moreover, for n ≪ N , the memory requirements for computing an estimate to D VR n,k (X) are substantially more modest than what computing dgm VR k (X) would require since the boundary matrices that one needs to store in memory are several orders of magnitude smaller.
Finally, if one is only interested in the principal persistence set, a much faster geometric algorithm is available, cf. Remark 4.6.
See our github repository [GM21] for a parfor based Matlab implementation.
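The capped sampling strategy of Remark 3.15 can be sketched in a few lines. This is a hedged illustration in Python rather than the Matlab code of [GM21]: the function names `sample_persistence_set` and `principal_diagram`, the `seed` argument, and the 60-point discretization of the circle are all assumptions. The diagram routine is passed in by the caller; for the demo we use n = 4 and k = 1 together with the closed-form rule for samples of size 2k + 2 discussed later in the paper, as a stand-in for a call to a general persistence library.

```python
import math
import random

def geodesic(a, b):
    """Geodesic distance on the circle of circumference 2*pi."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def principal_diagram(dist):
    """Diagram of a sample with 2k+2 points: [(t_b, t_d)] or [] (assumed rule)."""
    n = len(dist)
    t_b, t_d = 0.0, math.inf
    for i in range(n):
        row = sorted(dist[i][j] for j in range(n) if j != i)
        t_d = min(t_d, row[-1])                      # largest distance from point i
        t_b = max(t_b, row[-2] if n > 2 else 0.0)    # second largest distance
    return [(t_b, t_d)] if t_b < t_d else []

def sample_persistence_set(points, metric, n, m_max, diagram_fn, seed=0):
    """Approximate a persistence set from at most m_max random n-tuples."""
    rng = random.Random(seed)
    diagrams = []
    for _ in range(m_max):
        sample = [rng.choice(points) for _ in range(n)]  # n-tuple, repeats allowed
        dist = [[metric(p, q) for q in sample] for p in sample]
        diagrams.append(diagram_fn(dist))
    return diagrams

circle = [2.0 * math.pi * i / 60 for i in range(60)]
diagrams = sample_persistence_set(circle, geodesic, n=4, m_max=5000,
                                  diagram_fn=principal_diagram)
nonempty = [d[0] for d in diagrams if d]
print(len(diagrams), len(nonempty))
```

Each n-tuple is processed independently, so the loop body is the natural unit to distribute across workers, and only one small distance matrix is held in memory at a time.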

F-Persistence measures
Much in the same way as curvature measures define probability measures supported over curvature sets, one can consider measures supported on D, called persistence measures, which encode the way mass is distributed on persistence sets.
Definition 3.16. For each filtration functor F, integers n ≥ 1, k ≥ 0, and X ∈ M w , define the (n, k)-persistence measure of X as the pushforward $\mathbf{U}^{F}_{n,k}(X) := (\mathrm{dgm}^{F}_k)_{\#}\, \mu_n(X)$ of the curvature measure µ n (X) (cf. Def. 3.7).

We also have a stability result for these measures in terms of the Gromov-Wasserstein distance.
Theorem 3.17. Let F be a given filtration functor with Lipschitz constant L(F). For all X, Y ∈ M w and integers n ≥ 1 and k ≥ 0, and, in consequence,

Proof. This proof follows roughly the same outline as that of Theorem 3.12.
where ‖·‖ ∞ denotes the ℓ∞ norm on R n×n . It's a basic fact of measure theory that the Thus, a change of variables gives Recall from the proof of Theorem 3.12 that

Thus, the previous integral is bounded above by
Taking the p-th root and letting η

VR-persistence sets
From this point on, we focus on the Vietoris-Rips persistence sets D VR n,k with n = 2k + 2. The reason to do so is Theorem 4.4, which states that the k-dimensional persistence diagram of VR * (X) is empty if |X| < 2k + 2 and has at most one point if |X| = 2k + 2. What this means for persistence sets D VR n,k (X) is that given a fixed k, the first interesting choice of n is n = 2k + 2. We prove this fact in Section 4.1 and then use it to construct a graphical representation of D VR 2k+2,k (X). Section 4.2 presents computational examples.

Some properties of Vietoris-Rips complexes
Let X be a finite metric space with n points. The highest dimensional simplex of VR * (X) has dimension n − 1, but even if VR * (X) contains k-dimensional simplices, it won't necessarily produce persistent homology in dimension k. A good example is n = 3 and k = 1. The only simplicial 1-cycle in a triangle is the union of its three edges. In order for VR r (X) to contain all three edges, we must have r ≥ d X (x i , x j ) for all i ≠ j. However, this condition is equivalent to r ≥ diam(X), which makes VR r (X) isomorphic to the 2-simplex, a contractible complex. In other words, either VR r (X) doesn't contain any 1-cycle (when r < diam(X)) or it is contractible (when r ≥ diam(X)), so the persistence module PH VR 1 (X) is 0. Among other things, X needs more points to produce persistent homology in dimension 1.
The first definition of this section is inspired by the structure of the cross-polytope B m ; see Figure 6. Recall that a set σ ⊂ V = {±e 1 , . . . , ±e m } is a face if it doesn't contain both e i and −e i . In particular, there is an edge between e i and every other vertex except −e i . The next definition tries to emulate this phenomenon in VR * (X). and

In a few words, t d (x) ≥ t b (x) are the two largest distances between x and any other point of X. The motivation behind these choices is that if r satisfies t b (x) ≤ r < t d (x), then VR r (X) contains all edges between x and all other points of X, except the edge to v d (x). If this holds for all x ∈ X, then VR r (X) is isomorphic to a cross-polytope. Also, note that t d (X) is the radius rad(X) of X, cf. Definition 2.1, and that according to [LMO20,Proposition 9.6], the death time of any interval in dgm * (X) is bounded by rad(X).
Of course, as defined above, v d (x) is not unique in general, but it is well defined in the case that interests us, as we see next.
For the second claim, suppose that v 2 Once v d is well defined, we can produce the desired isomorphism between VR r (X) and a cross-polytope.
Both cross-polytopes and Vietoris-Rips complexes are flag complexes, so it's enough to verify that f induces an isomorphism of their 1-skeleta. Indeed, for any i = 1, . . . , k + 1, ).

It turns out that n = 2k + 2 is the minimum number of points that X needs to have in order to produce persistent homology in dimension k, which is what we prove next. The proof is inspired by the use of the Mayer-Vietoris sequence to find H k (S k ) by splitting S k into two hemispheres that intersect in an equator S k−1 . Since the hemispheres are contractible, the Mayer-Vietoris sequence produces an isomorphism H k (S k ) ≅ H k−1 (S k−1 ). We emulate this by splitting VR r (X) into two halves which, under the right circumstances, are contractible and find the k-th persistent homology of VR * (X) in terms of the (k − 1)-dimensional persistent homology of a subcomplex.
Related results appear in [Kah09,Ada14,CCR13]. Case (1) in our Theorem 4.4 is a consequence of Lemma 5.3 in [Kah09] and Proposition 5.4 in [Ada14], and the decomposition VR r (X) = VR r (B 0 )∪VR r (B 1 ) (see the proof for the definition of B 0 and B 1 ) already appears as Proposition 2.2 in the appendix of [CCR13]. The novelty in the next theorem is the characterization of the persistence module PH VR k (X) in terms of t b (X) and t d (X).

Theorem 4.4. Let (X, d X ) be a metric space with n points. Here, PH VR k (X) denotes the reduced homology of the VR-complex: H k (VR * (X)). Then:
1. For all integers k > n/2 − 1, PH VR k (X) = 0.
2. If n is even and k = n/2 − 1, then PH VR k (X) = I[t b (X), t d (X)) if t b (X) < t d (X), and PH VR k (X) = 0 otherwise.

Example 4.5.
Let us consider the case k = 1 and n = 4. Let X = {x 1 , x 2 , x 3 , x 4 } as shown in Figure 8. In order for PH VR 1 (X) to be non-zero, VR r (X) has to contain all the "outer edges" and none of the "diagonals". That is, there exists r > 0 such that max(d 12 , d 23 , d 34 , d 41 ) ≤ r < min(d 13 , d 24 ). In other words, we require that max(d 12 , d 23 , d 34 , d 41 ) < min(d 13 , d 24 ) and, in that case, PH VR However In general, we want to partition X into pairs of "opposite" points, that is, pairs x, y such that v d (x) = y and v d (y) = x. Intuitively, this says that the diagonals are larger than every other edge. If not, as in the second case, then no persistence is produced. As for k = 1 and n = 4, we will generally label the points as x 1 , x 2 , x 3 , x 4 in such a way that t b (X) = max(d 12 , d 23 , d 34 , d 41 ) and t d (X) = min(d 13 , d 24 ).
Proof of Theorem 4.4. The proof is by induction on n. If n = 1, VR r (X) is contractible for all r, and so PH VR k (X) = 0 for all k ≥ 0 > n/2 − 1. If n = 2, let X = {x 0 , x 1 }. The space VR r (X) is two discrete points when r ∈ [0, diam(X)) and an interval when r ≥ diam(X). Then PH VR k (X) = 0 for all k ≥ 1 > n/2 − 1, and PH VR 0 (X) = I[0, diam(X)). Furthermore, this interval module equals I[t b (X), t d (X)).

For the inductive step, assume that the theorem holds for every metric space with fewer than n points. Fix X with |X| = n and an integer k ≥ n/2 − 1. VR r (X) is contractible when r ≥ diam(X), so let r < diam(X) and choose any pair x 0 , x 1 ∈ X such that d X (x 0 , x 1 ) = diam(X). Let B j = X \ {x j } for j = 0, 1 and A = X \ {x 0 , x 1 }. Because of the restriction on r, VR r (X) contains no simplex σ ⊃ [x 0 , x 1 ], so VR r (X) = VR r (B 0 ) ∪ VR r (B 1 ). At the same time, VR r (A) = VR r (B 0 ) ∩ VR r (B 1 ), so we can use the Mayer-Vietoris sequence:

$$H_k(\mathrm{VR}_r(B_0)) \oplus H_k(\mathrm{VR}_r(B_1)) \to H_k(\mathrm{VR}_r(X)) \xrightarrow{\ \partial_*\ } H_{k-1}(\mathrm{VR}_r(A)) \xrightarrow{(\iota_0, \iota_1)} H_{k-1}(\mathrm{VR}_r(B_0)) \oplus H_{k-1}(\mathrm{VR}_r(B_1)), \tag{4.1}$$

where ι j are the maps induced by the inclusions A ⊂ B j . Since |B j | < n, the induction hypothesis implies that PH VR k (B j ) = 0, and so ∂ * is injective for any r. If, in addition, k > n/2 − 1, then PH VR k−1 (A) is also 0 by the induction hypothesis. Thus, H k (VR r (X)) is 0 for r ∈ [0, diam(X)) and, since VR r (X) is contractible when r ≥ diam(X), also for r ∈ [diam(X), ∞). This finishes the proof of case (1).
From this point on, we fix k = n 2 − 1 and focus on case (2). By induction hypothesis, ) or 0 depending on whether t b (A) < t d (A) or not. However, that is not the condition that determines if PH VR k (X) is non-zero. The relevant quantity is the following: We claim that PH In other words, VR r (B 0 ) is a cone C(VR r (A), x 1 ) over VR r (A), so it is contractible. The same holds for VR r (B 1 ), so their homology is 0, and the Mayer-Vietoris sequence gives an isomorphism We now show that H k (VR r (X)) = 0 for any r / . Thus, the composition, which is induced by inclusions, is an isomorphism. This implies that the first map H k−1 (VR r (A)) → H k−1 (VR r (B 1 )) is injective, which, in turn, makes H k−1 (VR r (A)) → H k−1 (VR r (B 0 )) ⊕ H k−1 (VR r (B 1 )) injective.
Since ∂ * in (4.1) is also an injection, H k (VR r (X)) = 0 for r ∈ [t b (A), b). Next, if r < t b (A) or t d (A) ≤ r < diam(X), H k (VR r (A)) = 0, so H k (VR r (X)) = 0 from the Mayer-Vietoris sequence. Lastly, if r ≥ diam(X), then VR r (X) is contractible. Altogether, these cases give PH VR k (X) = I[b, t d (A)). If, on the other hand, b ≥ t d (A), we obtain PH VR k (X) = 0 by using the above cases for r ∈

The last thing left to check is that VR * (X) produces persistent homology precisely when t b (X) < t d (X). So far we have PH VR a, A). In other words, for every A)) for some j = 0, 1. However, we have b ≥ d X (a 0 , x j ) by definition, so b would still be greater than t d (a 0 , X) even if t d (a 0 , X) = t d (a 0 , A). With this in mind, we have two cases depending on whether b = t b (A) or not. If they are equal, notice that t b (a, A) ≤ t b (a, X) for every a ∈ A because t b (a, X) takes the maximum over a larger set than t b (a, A) does. Then

Remark 4.6 (A geometric algorithm for computing PH VR k (X) when |X| = n and k = n/2 − 1). Recall that t b (x) and t d (x) are the two greatest distances from x to every other point in X. Both can be found in at most (n − 1) + (n − 2) = 2n − 3 steps because finding a maximum takes as many steps as the number of entries. We compute both quantities for each of the n points in X, and then find t b (X) = max x∈X t b (x) and t d (X) = min x∈X t d (x) in n steps each. After comparing t b (X) and t d (X), we are able to determine whether PH VR k (X) is equal to I[t b (X), t d (X)) or to 0 in at most $n(2n - 3) + 2n + 1 = 2n^2 - n + 1 = O(n^2)$ steps. This is a significant improvement from the bound $O(n^{\omega(k+2)})$ given in [MMS11] (cf. Remark 3.15). Indeed, using n = 2k + 2, our custom-tailored algorithm incurs a cost $O(k^2)$ whereas the standard algorithm incurs the much larger cost $\approx O((2k)^{\omega(k+2)})$. A parfor-based Matlab implementation is available in our github repository [GM21].
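The procedure of Remark 4.6 is short enough to sketch directly. The Python code below is an illustrative reading of the remark, not the Matlab implementation of [GM21]; the names `principal_interval` and `distance_matrix` are hypothetical, and we take t b (X) as the maximum of the second largest distances and t d (X) as the minimum of the largest distances, as in Example 4.5.

```python
import math

def principal_interval(dist):
    """Return (t_b, t_d) when PH_k is the interval I[t_b, t_d), else None.

    dist is a symmetric n x n distance matrix with n = 2k + 2 points.
    """
    n = len(dist)
    if n < 2 or n % 2 != 0:
        raise ValueError("expected an even number of points n = 2k + 2")
    t_b, t_d = 0.0, math.inf
    for i in range(n):
        # One linear pass finds the two largest distances from point i.
        largest = second = 0.0
        for j in range(n):
            if j == i:
                continue
            d = dist[i][j]
            if d > largest:
                largest, second = d, largest
            elif d > second:
                second = d
        t_d = min(t_d, largest)  # t_d(X): minimum over x of t_d(x)
        t_b = max(t_b, second)   # t_b(X): maximum over x of t_b(x)
    return (t_b, t_d) if t_b < t_d else None

def distance_matrix(points):
    """Euclidean distance matrix of a list of coordinate tuples."""
    return [[math.dist(p, q) for q in points] for p in points]

square = [(0, 0), (1, 0), (1, 1), (0, 1)]            # k = 1
line = [(0,), (1,), (2,), (3,)]                      # k = 1, collinear
octahedron = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
              (0, -1, 0), (0, 0, 1), (0, 0, -1)]     # k = 2
for name, pts in [("square", square), ("line", line), ("octahedron", octahedron)]:
    print(name, principal_interval(distance_matrix(pts)))
```

For the unit square the sides have length 1 and the diagonals √2, so the sketch returns the interval [1, √2); four collinear points give t b (X) = t d (X) and therefore no interval. The double loop visits each matrix entry once, which is where the quadratic cost comes from.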

Computational examples
Theorem 4.4 has two consequences for VR-persistence sets. The first is the following corollary.
Corollary 4.7. Let X be any metric space and k ≥ 0. D VR n,k (X) is empty for all n < 2k + 2. This means that the first interesting choice of n is n = 2k+2, and in that case, any sample Y ⊂ X with |Y | = n will produce at most one point in its persistence diagram. What's more, this allows us to visualize D VR 2k+2,k (X) by taking all possible such samples Y ⊂ X and plotting their persistence diagrams in the same axis; see Figure 4. In other words, we plot D VR 2k+2,k (X) as a subset of R 2 where each point (t b , t d ) ∈ D VR 2k+2,k (X) corresponds to a possibly non-unique n-point sample Y ⊂ X such that dgm VR k (Y ) = {(t b , t d )}; see Figure  5 for an example. We can take this one step further and color the graph according to the density of the points to obtain a plot of the persistence measure U VR 4,1 (X). For these reasons, we give a name to this particular persistence set.
Notation: D VR 2k+2,k (X) and U VR 2k+2,k (X) are called, respectively, the principal persistence set and the principal persistence measure of X in dimension k.

Figure 9 shows computational approximations to the principal persistence measure U VR 4,1 of S 1 , S 2 , and T 2 = S 1 × S 1 . The spheres are equipped with their usual Riemannian metrics d S 1 and d S 2 respectively. As for the torus, we used the ℓ2 product metric defined as $d_{T^2}\big((\theta_1, \theta_2), (\theta_1', \theta_2')\big) = \sqrt{d_{S^1}(\theta_1, \theta_1')^2 + d_{S^1}(\theta_2, \theta_2')^2}$ for all (θ 1 , θ 2 ), (θ′ 1 , θ′ 2 ) ∈ T 2 . The diagrams were computed using a MATLAB wrapper for Ripser [Bau19] developed by C. Tralie using over 1,000,000 random 4-tuples of points. It should be noted that only about 11% of those configurations generated a non-diagonal point.
We can observe the functoriality property D VR n,k (X) ⊂ D VR n,k (Y ) whenever X ↪ Y in these graphs. Notice that S 1 embeds into S 2 as the equator, and as slices S 1 × {x 0 } and {x 0 } × S 1 in T 2 . The effect on the persistence sets is that a copy of D VR 4,1 (S 1 ) appears in both D VR 4,1 (S 2 ) and D VR 4,1 (T 2 ).

Figure 9: From left to right: computational approximations to the 1-dimensional persistence measures U VR 4,1 (S 1 ), U VR 4,1 (S 2 ), and U VR 4,1 (T 2 ). The colors represent the density of points in the diagram. The support of each measure (that is, the colored region) is the persistence set D VR 4,1 of the corresponding metric space. Notice how these results agree with the functoriality property (cf. Remark 3.10): namely, that the persistence set of S 1 is a subset of the respective persistence sets of S 2 and T 2 .

VR-Persistence sets of spheres
In this section, we will describe the principal persistence sets D VR 2k+2,k (S 1 ) for all k ≥ 0. After that, we will take advantage of functoriality to find some of the persistence sets of the higher dimensional spheres S m , m ≥ 2, and describe the limitations (if any) to obtain higher principal persistence sets. We begin with a general technical lemma.
Lemma 5.1. Let k ≥ 0 and n = 2k + 2. Let (X, d X ) be a metric space with n points. Then:
1. t d (X) ≤ 2t b (X).
2. pers(dgm VR k (X)) ≤ sep(X).
3. If X can be isometrically embedded in an interval, then t b (X) ≥ t d (X).

Proof.
1. If t b (X) ≥ t d (X), then pers(dgm VR k (X)) = 0 and items 1 and 2 are trivially true. Suppose, then, t b (X) < t d (X). Choose any Since d X (x 0 , x) ≤ t b (X), we get the coarse bound t d (X) ≤ 2t b (X).
2. The finer bound sep(X) ≥ t d (X) − t b (X) = pers(dgm VR k (X)) follows by taking the minimum of d X (x 0 , x) over x 0 and x.
3. Suppose, without loss of generality, that X ⊂ R and that x 1 < x 2 < · · · < x n . Notice that t d (x k ) = max(x k − x 1 , x n − x k ) and, in particular,

Characterization of t b (X) and t d (X) for X ⊂ S 1
Now we focus on subsets of the circle. We model S 1 as the quotient [0, 2π]/0 ∼ 2π equipped with the geodesic distance, i.e. $d_{S^1}(x, y) = \min(|x - y|,\ 2\pi - |x - y|)$.
We define a cyclic order ≺ on S 1 by saying that x ≺ y ≺ z if, when viewing x, y, z as elements of [0, 2π], we have one of the three choices of 0 ≤ x < y < z, x < 2π and 0 ≤ y < z, or x < y < 2π and 0 ≤ z. In other words, if we define counter-clockwise to be the increasing direction in [0, 2π], then x ≺ y ≺ z means that the counter-clockwise path starting at x meets y before reaching z. We also use ⪯ to allow the points to be equal.
Throughout this section, k ≥ 1 and n = 2k + 2 will be fixed. Let X = {x 1 , x 2 , . . . , x n } ⊂ S 1 such that x i ≺ x i+1 ≺ x i+2 for all i (addition is done modulo n). Write d ij = d S 1 (x i , x j ) for the distances, and assume t b (X) < t d (X).

For any i, and t
2. For any X ⊂ S 1 with |X| = 2k + 2, ). By Proposition 4.3, VR r (X) is a cross-polytope with n points. In particular, VR r (X) contains no simplices of dimension k + 1. We claim that this forces t d (x i ) = d i,i+k+1 for all i. Indeed, the shortest path between x i and x i+k+1 contains either the set {x i+1 , . . . , x i+k−1 } or the set {x i+k+2 , . . . , x i−1 } (see Figure 10). For any x j in that shortest path, In particular, VR r (X) doesn't contain the edge [x i , x i+k+1 ]. According to definition 2.14, cross-polytopes contain all edges incident on a fixed point and t b (x i ) = max j =i+k+1 d i,j . Additionally, the shortest path between x i and x i+k contains the set {x i+1 , . . . , x i+k−1 } rather than {x i+k+2 , . . . , x i−1 }, so d i,i+j ≤ d i,i+k for j = 1, . . . , k −1 (otherwise, VR r (X) would contain the k + 2 simplex [x i+k , x i+k+1 , . . . , x i ]). The analogous statement d i,i−j ≤ d i,i−k holds for j = 1, 2, . . . , k − 1. Thus, t b (x i ) = max(d i,i+k , d i,i−k ).
2. These equations follow by taking the maximum (resp. minimum) over all i of the above expression for t b (x i ) (resp. t d (x i )), as per Definition 4.1.
3. As we saw in the proof of item 1, the shortest path from x i to x i+k contains the set {x i+1 , . . . , x i+k−1 }. The length of this path is d i,i+k = d i,i+1 + · · · + d i+k−1,i+k .
Figure 11: Example of a critical configuration for k = 2. The solid blue lines all have length t b (X) = 2π/3, while the dotted red line has length t d (X). Notice that two regular (k + 1)-gons are formed.

Characterization of D VR 2k+2,k (S 1 ) for k even
Lemma 5.2 shows that every configuration has $t_b(X) \geq \frac{k}{k+1}\pi$. The converse holds in the case that k is even. We obtain the proof by exhibiting configurations such that t b (X) = t b and

Proof. We will first construct what we call the critical configurations, those where $t_b(X) = \frac{k}{k+1}\pi$ and t d (X) = t d ∈ (t b (X), π]. Consider the points for i = 1, . . . , n. If i is odd, clearly x i−1 < x i . If i is even, $x_i - x_{i-1} = -\frac{k\pi}{k+1} + t_d > 0$ because of Lemma 5.2 item 4 and the assumption that t d > t b . Thus, x 1 < x 2 < · · · < x n . Additionally, since t d ≤ diam(S 1 ), we have $x_{2k+2} = \frac{k\pi}{k+1} + t_d \leq \frac{(2k+1)\pi}{k+1} < 2\pi$, so we also have Since k is even, i and i + k have the same parity, To find t d (X) = min i d i,i+k+1 , we have two cases depending on the parity of i. If i ≤ k + 1 is odd (and i + k + 1 ≤ 2k + 2 even), and if i ≤ k + 1 is even, Thus, t d (X) = t d .

Lastly, we can use these critical configurations to construct X′ such that $t_b(X') = t_b > \frac{k}{k+1}\pi$. Let $\varepsilon := t_b - \frac{k}{k+1}\pi > 0$, and take $x'_{k+1} = x_{k+1} + \varepsilon$ and $x'_i = x_i$ for i ≠ k + 1. Write Thus, t b (X′) = max d i,i+k = t b and t d (X′) = min d i,i+k+1 = t d , as desired.

Characterization of D VR 2k+2,k (S 1 ) for k odd
An important difference between even and odd k is that only for even k can we find configurations that have the minimal possible birth time t b (X) = k k+1 π given any t d ∈ (t b (X), π]. The difference is that sequences of the form x i , x i+k , x i+2k , . . . eventually reach all points when k is odd, but only half of them when k is even (see Figure 11). This allows us to separate X ⊂ S 1 into two regular (k + 1)-gons with fixed t b (X) and it still allows control on t d (X), as shown in Proposition 5.3. For odd k, we will instead use an idea from Proposition 5.4 of [AA17]. We won't need the result in its full generality, so we only use part of its argument to provide a bound for t b (X) in terms of t d (X).
Figure 12: Example of a critical configuration for k = 3. Notice that t b (X) = 2L + s and t d (X) = 2L + 2s.
Theorem 5.5. For odd k, $\mathbf{D}^{\mathrm{VR}}_{2k+2,k}(S^1) = \{(t_b, t_d) : (k+1)(\pi - t_b) \leq t_d \text{ and } t_b < t_d \leq \pi\}$.

Proof. We got the inequality (k + 1)(π − t b ) ≤ t d in Theorem 5.4 and showed that equality can be achieved in the preceding paragraph. To get a configuration where (k+1)(π −t b ) < t d , construct the set X as above so that t d (X) = t d and $t_b(X) = \pi - \frac{1}{k+1} t_d(X)$ is the smallest birth time possible with death time t d (X). Pick any t b such that t b (X) < t b < t d (X), and . Because of this, Analogously, x k+2 ≺ x′ k+2 ≺ x k+3 , and the cyclic ordering is maintained. As for the distances, we have

Remark 5.6. The persistence sets of a circle (λ/π) · S 1 with diameter λ are obtained by rescaling the results of this section. For example, D VR 4,1 ((λ/π) · S 1 ) is the set bounded by 2(λ − t b ) ≤ t d and t b < t d ≤ λ.
In general, there are multiple configurations with the same persistence diagram, even among those that minimize the death time. The exception is the configuration that has the minimal birth time, as the following proposition shows.
Proposition 5.7. For any k ≥ 0, let n = 2k + 2. If X ⊂ S 1 has n points and satisfies $t_b(X) = \frac{k}{k+1}\pi$ and t d (X) = π, then X is a regular n-gon. As a consequence, the configuration X with n points such that $\mathrm{dgm}^{\mathrm{VR}}_k(X) = \{(\frac{k}{k+1}\pi, \pi)\}$ is unique up to rotations.
Proof. An application of Lemma 5.2 item 3 and the triangle inequality gives: Thus, all intermediate inequalities become equalities, most notably, $d_{i,i+k} = \frac{k}{k+1}\pi$ for all i, and $d_{j,j+k+1} = \sum_{i=1}^{k+1} d_{i+j-1,i+j} = \pi$ for all j. Then In other words, X is a regular n-gon.

Characterization of U VR 4,1 (S 1 )
The case of k = 1 in Theorem 5.5 allows us to find a probability density function for U VR 4,1 (S 1 ) with respect to the Lebesgue measure.
Proposition 5.8. Consider (S 1 , d S 1 , µ S 1 ) as an mm-space where µ S 1 is the uniform measure. Then, the measure U VR 4,1 (S 1 ) has probability density function $f(t_b, t_d) = \frac{12}{\pi^3}(\pi - t_d)$ for (t b , t d ) ∈ D VR 4,1 (S 1 ), with respect to the Lebesgue measure in R 2 . Here, we view D VR 4,1 (S 1 ) ⊂ R 2 .
Proof. Recall that we are modeling S 1 as the quotient [0, 2π]/0 ∼ 2π. Consider a set X = {x 1 , x 2 , x 3 , x 4 } ⊂ [0, 2π] of four points chosen uniformly at random. Relabel x i as x (j) ∈ [0, 2π] so that x (1) < x (2) < x (3) < x (4) . Consider the image of x (j) under the quotient map [0, 2π] → S 1 , and let γ i be the path between x (i) and x (i+1) that doesn't contain any other point x (j) . Set y i = |γ i |. It can be shown that the pushforward of the uniform measure on [0, 2π] 4 into the set is the uniform measure, and the pushforward of this measure under the map is also the uniform measure. Thus, we will model a configuration of four points in S 1 as the set of distances y 1 , y 2 , y 3 , y 4 instead. We will first find the cumulative distribution function of U VR 4,1 (S 1 ). To do that, we fix a point (t b , t d ) ∈ D VR 4,1 (S 1 ). According to Lemma 5.2, Since ∆ 3 (2π) has the uniform measure, the probability that divided by $\mathrm{Vol}(\Delta_3(2\pi)) = \frac{(2\pi)^3}{3!}$. We will find Vol(R(t b , t d )) using an integral with a suitable parametrization of y 1 , y 2 , y 3 .
Assume that t b (X) = y 1 . There are four choices for t d (X), but to start, let t d (X) = y 1 +y 2 . Since y 3 ≤ y 1 by definition of t b (X), we have y 3 + y 2 ≤ y 1 + y 2 , but since y 1 + y 2 = t d (X), we actually have an equality t d (X) = y 1 + y 2 = y 3 + y 2 . Thus, this case is a subset of the case when t d (X) = y 2 + y 3 . Similarly, the case t d (X) = y 1 + y 4 implies t d (X) = y 3 + y 4 . Hence, we only have two possible choices for t d (X). Since they are symmetric, we can choose one of them and account for the symmetry later. Thus, set t d (X) = y 2 + y 3 .
The condition t b (X) = y 1 is equivalent to having y i ≤ y 1 for i = 2, 3, 4. Also, t d (X) = y 2 + y 3 gives y 2 + y 3 ≤ y 3 + y 4 , and so y 2 ≤ y 4 . It can be verified that the set of inequalities is equivalent to t b (X) = y 1 and t d (X) = y 2 + y 3 . By rewriting y 4 as 2π − y 1 − y 2 − y 3 , the inequalities in (6) become and (7) is equivalent to Since we are assuming that t b (X) < t d (X), we also have y 1 < y 2 + y 3 . If we make the substitution s = y 2 + y 3 , we find that (8)-(10) are equivalent to the following system of inequalities: Call the region defined by this system of inequalities R (t b , t d ). Notice that the Jacobian ∂(y 1 ,y 2 ,y 3 ) ∂(y 1 ,y 2 ,s) is 1. Also, there were four choices for t b (X) (all four y i ) and for each, two choices for t d (X) (y 2 + y 3 and y 3 + y 4 in our case). Thus, there were 8 possible choices for t b (X) and t d (X), so 1 dy 2 ds dy 1 .

The mixed derivatives of both (11) and (12) are 16(π − t d ), so $f(t_b, t_d) = \frac{16(\pi - t_d)}{\mathrm{Vol}(\Delta_3(2\pi))} = \frac{12}{\pi^3}(\pi - t_d)$ regardless of whether t b ≤ 2π/3 or not. This is the desired probability density function of U VR 4,1 (S 1 ).

Example 5.9. Equation (12) gives a total mass of 1/9 ≈ 11.11% on D VR 4,1 (S 1 ). This is the probability that a set {x 1 , x 2 , x 3 , x 4 } ⊂ S 1 chosen uniformly at random produces persistent homology at all in dimension 1. This is consistent with the 10.98% success rate obtained in the simulations; cf. Section 4.2.
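The probability in Example 5.9 can be estimated by simulation. The following Python sketch is an assumption-laden illustration rather than the experiment reported in the paper: the helper names, the seed, and the sample size are hypothetical, and the reference value 1/9 is what one obtains by integrating the density 16(π − t d )/Vol(∆ 3 (2π)) over the region 2(π − t b ) ≤ t d , t b < t d ≤ π. The sketch also checks that every sampled point lies in that region.

```python
import math
import random

def geodesic(a, b):
    """Geodesic distance on S^1 modeled as [0, 2*pi] with 0 ~ 2*pi."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def principal_interval(dist):
    """Return (t_b, t_d) if t_b(X) < t_d(X), else None (rule of Remark 4.6)."""
    n = len(dist)
    t_b, t_d = 0.0, math.inf
    for i in range(n):
        largest = second = 0.0
        for j in range(n):
            if j == i:
                continue
            d = dist[i][j]
            if d > largest:
                largest, second = d, largest
            elif d > second:
                second = d
        t_d = min(t_d, largest)
        t_b = max(t_b, second)
    return (t_b, t_d) if t_b < t_d else None

def simulate(num_samples, seed=0):
    """Sample 4-point subsets of the circle and keep the non-trivial diagrams."""
    rng = random.Random(seed)
    intervals = []
    for _ in range(num_samples):
        pts = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(4)]
        dist = [[geodesic(p, q) for q in pts] for p in pts]
        interval = principal_interval(dist)
        if interval is not None:
            intervals.append(interval)
    return intervals

NUM_SAMPLES = 100_000
intervals = simulate(NUM_SAMPLES)
estimate = len(intervals) / NUM_SAMPLES
print(f"estimated probability: {estimate:.4f}, reference 1/9 = {1 / 9:.4f}")
```

With this many samples the standard error is roughly 0.001, so the estimate should agree with 1/9 to about two decimal places.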

Persistence sets of Ptolemaic spaces
Example 4.5 showed that in a metric space with four points, the birth time of its onedimensional persistent homology is given by the length of the largest side and the death time, by that of the smaller diagonal. In this section, we use Ptolemy's inequality, which relates the lengths of the diagonals and sides of Euclidean quadrilaterals, to bound the first persistence set D VR 4,1 of several spaces and show examples where the bound is attained.
It should be noted that the inequality holds for any permutation of x 1 , x 2 , x 3 , x 4 . Examples of Ptolemaic metric spaces include the Euclidean spaces R n and CAT(0) spaces; see [BFW09] for a more complete list of references. The basic result of this section is the following.
Taking square root gives the result.
Another way to phrase the above proposition is to say that D VR 4,1 (X) is contained in the set A key example where the containment is strict is the following.
Proposition 5.12. Let S 1 E denote the unit circle in R 2 equipped with the Euclidean metric. Then

Proof. Observe that the Euclidean distance d E between two points in S 1 is related to their geodesic distance d by d E = f E (d) = 2 sin(d/2). Since f E is increasing on [−π, π], an interval that contains all possible distances between points in S 1 , a configuration X = {x 1 , x 2 , x 3 , x 4 } ⊂ S 1 produces non-zero persistence if, and only if, its Euclidean counterpart X E ⊂ S 1 E does. For this reason, D VR 4,1 (S 1 E ) = f E (D VR 4,1 (S 1 )). From Theorem 5.5, Applying

Even though D VR 4,1 (S 1 E ) doesn't attain equality in the bound given by Proposition 5.11, it can be used to show that other spaces do. Two examples are S 2 and R 2 .
Both R 2 and S 2 E ⊂ R 3 are Ptolemaic spaces, so Proposition 5.11 gives D VR 4,1 (R 2 ) ⊂ P ∞ and D VR 4,1 (S 2 E ) ⊂ P 2 . To show the other direction, notice that R 2 contains circles R · S 1 E of any radius R > 0. By functoriality of persistence sets (Remark 3.10), D VR 4,1 (R · S 1 E ) ⊂ D VR 4,1 (R 2 ) so, in particular, D VR 4,1 (R 2 ) contains the line [ √ 2R, 2R)×2R that bounds D VR 4,1 (R ·S 1 E ) from above (see Figure 13). The inequality . Thus, P ∞ ⊂ D VR 4,1 (R 2 ). The same argument with the added restriction of R ≤ 1 shows that P 2 ⊂ D VR 4,1 (S 2 E ).

Two observations summarize the proof of Proposition 5.13: Ptolemy's inequality gives a region P ∞ that contains D VR 4,1 (R 2 ), while the circles in R 2 produce enough points to fill P ∞ . It turns out that this technique can be generalized to other spaces, provided that we have a suitable analogue of Ptolemy's inequality. This is explored in the next section.
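The containment of D VR 4,1 (R 2 ) in the Ptolemaic region can be probed numerically. The Python sketch below is illustrative only: it assumes that the bound obtained from Ptolemy's inequality after taking square roots reads t d ≤ √2 · t b (since t d ² ≤ d 13 d 24 ≤ d 12 d 34 + d 23 d 41 ≤ 2 t b ²), and the helper names, seed, and sample size are hypothetical. It samples 4-point subsets of the unit square and records the largest ratio t d /t b among the non-trivial diagrams.

```python
import math
import random

def principal_interval(dist):
    """Return (t_b, t_d) if t_b(X) < t_d(X), else None (rule of Remark 4.6)."""
    n = len(dist)
    t_b, t_d = 0.0, math.inf
    for i in range(n):
        largest = second = 0.0
        for j in range(n):
            if j == i:
                continue
            d = dist[i][j]
            if d > largest:
                largest, second = d, largest
            elif d > second:
                second = d
        t_d = min(t_d, largest)
        t_b = max(t_b, second)
    return (t_b, t_d) if t_b < t_d else None

def largest_ratio(num_samples, seed=0):
    """Largest observed t_d / t_b over random planar 4-point samples."""
    rng = random.Random(seed)
    hits, worst = 0, 0.0
    for _ in range(num_samples):
        pts = [(rng.random(), rng.random()) for _ in range(4)]
        dist = [[math.dist(p, q) for q in pts] for p in pts]
        interval = principal_interval(dist)
        if interval is not None:
            hits += 1
            worst = max(worst, interval[1] / interval[0])
    return hits, worst

hits, worst = largest_ratio(50_000)
print(f"{hits} non-trivial diagrams, largest t_d / t_b = {worst:.6f}")

# A square inscribed in a circle of radius R attains the bound: (sqrt(2) R, 2 R).
R = 3.0
square = [(R, 0.0), (0.0, R), (-R, 0.0), (0.0, -R)]
square_interval = principal_interval(
    [[math.dist(p, q) for q in square] for p in square])
```

The inscribed square gives the point (√2 R, 2R), the endpoint of the segment [√2 R, 2R) × 2R mentioned in the proof, so under the stated assumption the ratio √2 is the extreme case.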

Persistence sets of the surface with constant curvature κ
Consider the model space $M_\kappa$ with constant sectional curvature $\kappa$. In this section, we will characterize $\mathbf{D}_{4,1}^{\textrm{VR}}(M_\kappa)$. Proposition 5.13 already covers the case $\kappa = 0$, so now we deal with $\kappa \neq 0$. To fix notation, let $x, y \in \mathbb{R}^3$. Define $\langle x, y \rangle = x_1 y_1 + x_2 y_2 + x_3 y_3$ and $\langle x | y \rangle = -x_1 y_1 + x_2 y_2 + x_3 y_3$.
We model M κ as In other words, M κ is the sphere of radius 1/ √ κ if κ > 0, and the hyperbolic plane of constant curvature κ < 0 with the hyperboloid model. The geodesic distance in M κ is given by To use the same technique as in Proposition 5.13, we use a version of Ptolemy's inequality for spaces of non-zero curvature.
is non-positive. In particular, Proof. [Val70a] proved that the determinant (14) is non-positive when κ = 1; we obtain the general version by rescaling the distances as follows. Let x i ∈ M κ for i = 1, 2, 3, 4; define y i = √ κx i , and d ij = d M 1 (y i , y j ). Notice that y i , y i = κ x i , x i = 1, so y i ∈ M 1 and, by (13), Then, the determinant is non-positive by Theorem 3.1 of [Val70a] and, by the Corollary following that, we get (15).
Valentine also generalized this result to hyperbolic geometry. The rescaling is analogous to the one in the previous lemma, so we omit the proof.
If κ < 0, let x 1 , x 2 , x 3 , x 4 ∈ M κ , and d ij = d Mκ (x i , x j ). Then the determinant is non-positive. In particular, With these tools, we are ready to prove the main theorem of this section.
Theorem 5.16. Let M κ be the 2-dimensional model space with constant sectional curvature κ. Then: Proof. The case κ = 0 was already done in Proposition 5.13. For κ > 0, let This shows that D VR 4,1 (M κ ) ⊂ P . For the other direction, let 0 < t ≤ 1 and s ∈ [0, π/2], and consider X = {x 1 , x 2 , x 3 , x 4 } where It can be checked that: (1 + sin(s))). Since arccos(t) is decreasing, we have (1 + sin(s))), and Notice that for a fixed t, t b (X) is minimized at s = 0 and the equality in (18) is achieved. Also, t d (X) is maximized at t = 1, at which point t d (X) = π √ κ . Now, let (t b , t d ) ∈ P be arbitrary. If we set t b (X) = t b and t d (X) = t d , we can solve the equations above to get , and Such a t exists because cos( √ κt d ) ≤ 1. As for s, the half-angle identity 1 − cos(x) = 2 sin 2 (x/2) gives the equivalent expression Since (t b , t d ) satisfies (18), the right side is bounded below by 0 and, since t b < t d ≤ π √ κ , it is also bounded above by 1. Thus, there exists an s ∈ [0, π/2] that satisfies the equality. This finishes the proof of P ⊂ D VR 4,1 (M κ ). The proof for κ < 0 proceeds in much the same way. The only major change is in the definition of the points x i when showing P ⊂ D VR 4,1 (M κ ): Other than that, and the fact that M κ is unbounded when κ < 0, the proof is completely analogous.
A related result appears in [BHPW20]. The authors explore the question of whether persistent homology can detect the curvature of the ambient $M_\kappa$. On the theoretical side, they found a geometric formula to compute $\mathrm{dgm}_1^{\mathrm{\check{C}ech}}(T)$ of a sample $T \subset M_\kappa$ with three points, much in the same vein as our Theorem 4.4. They used it to find the logarithmic persistence $P_a(\kappa) = t_d(T_{\kappa,a})/t_b(T_{\kappa,a})$ for an equilateral triangle $T_{\kappa,a}$ of fixed side length $a > 0$, and proved that $P_a$, when viewed as a function of $\kappa$, is invertible. On the experimental side, they sampled 1000 points from a unit disk in $M_\kappa$ and were able to approximate $\kappa$ using the average VR death vectors in dimension 0 and average persistence landscapes in dimension 1 of 100 such samples. For example, one method consisted of finding a collection of landscapes $L_\kappa$ labeled with a known curvature $\kappa$, and estimating $\kappa_*$ for an unlabeled $L_*$ with the average curvature of the three nearest neighbors of $L_*$. They were also able to approximate $\kappa_*$ without labeled examples by using PCA. See their paper [BHPW20] for more details.
Our Theorem 5.16 is in the same spirit. The curvature κ determines the boundary of D VR 4,1 (M κ ), and instead of triangles, we could use squares with a given t d and minimal t b to find κ. Additionally, we can qualitatively detect the sign of the curvature by looking at the boundary of D VR 4,1 (M κ ): it is concave up when κ > 0, a straight line when κ = 0, and concave down when κ < 0. See Figure 14.

Persistence sets of S m for m ≥ 3
General spheres are another example where our strategy provides a characterization of their persistence sets. The next proposition is inspired by the equality condition in Ptolemy's theorem, that is, equality occurs when the four points lie on a circle. We can generalize that argument to higher dimensional spheres.

Proof. $S^m_E$ contains copies of $\lambda \cdot S^{n-2}_E$ for $\lambda \in [0,1]$, so $\bigcup_{\lambda \in [0,1]} \lambda \cdot \mathbf{D}_{n,k}^{\textrm{VR}}(S^{n-2}_E) \subset \mathbf{D}_{n,k}^{\textrm{VR}}(S^m_E)$. For the other direction, notice that a set $X \subset S^m_E \subset \mathbb{R}^{m+1}$ with $n$ points generates an $(n-1)$-hyperplane which intersects $S^m_E$ in an $(n-2)$-dimensional sphere of radius $\lambda \leq 1$. Thus, $X \subset \lambda \cdot S^{n-2}_E$, so $\mathbf{D}_{n,k}^{\textrm{VR}}(S^m_E) \subset \bigcup_{\lambda \in [0,1]} \lambda \cdot \mathbf{D}_{n,k}^{\textrm{VR}}(S^{n-2}_E)$.

In particular, this gives a description of the first principal persistence set of all spheres with the Euclidean metric.
Proof. By Proposition 5.17, for every m ≥ 3, However, it is clear that λ · D VR 4,1 (S 2 E ) ⊂ D VR 4,1 (S 2 E ), as the latter is convex. Thus, and now, Proposition 5.13 gives the result. Given that we know several persistence sets of spheres, we can use them, together with the stability in Theorem 3.12, to find lower bounds for the Gromov-Hausdorff distance between the circle and other spheres.
Clearly, this distance is smallest when $D_1$ is on the line $\ell$ with equation $y = 2(\pi - x)$ (case $k = 1$ in Theorem 5.4). Additionally, the maximum is minimized when $|x_1 - x_2| = |y_1 - y_2|$. If both conditions can be achieved, we will have minimized the $\ell^\infty$ distance. The only possibility, though, is $x_2 \leq x_1$ and $y_2 \leq y_1$ (if either inequality is reversed, the $\ell^\infty$ distance would be larger because $\ell$ has negative slope). In that case, the solutions to the system of equations $x_1 - x_2 = y_1 - y_2$ and $y_1 = 2(\pi - x_1)$ are $x_1 = \frac{1}{3}(2\pi + x_2 - y_2)$ and $y_1 = \frac{2}{3}(\pi - x_2 + y_2)$. Thus, $d_\infty(D_2, \ell) = x_1 - x_2 = \frac{1}{3}(2\pi - 2x_2 - y_2)$. This quantity is positive because $(x_2, y_2)$ is below $\ell$, that is, $y_2 \leq 2\pi - 2x_2$.

Now fix $D_1$ as the solution described in the previous paragraph and let $D_2$ vary. The distance $d_B(D_1, D_2)$ can be equal to $\frac{1}{2}\mathrm{pers}(D_i)$ if that quantity is larger than $d_\infty(D_2, \ell)$ for either $i = 1, 2$. Notice, also, that $\mathrm{pers}(D_1) = \mathrm{pers}(D_2)$ because $x_1 - x_2 = y_1 - y_2$. If we can find $D_2$ such that
$$\tfrac{1}{2}\mathrm{pers}(D_2) = d_\infty(D_2, \ell), \tag{19}$$
then the maximum will have been achieved. Equation (19) can be simplified to $y_2 = -\frac{1}{5}x_2 + \frac{4\pi}{5}$. The point $D_2 = (x_2, y_2)$ that realizes the Hausdorff distance will be in the intersection of this line and $\mathbf{D}_{4,1}^{\textrm{VR}}(S^2)$ and have maximal persistence. That is achieved in the intersection

Figure 16: The point $D_2$ that realizes the Hausdorff distance between $\mathbf{D}_{4,1}^{\textrm{VR}}(S^1)$ and $\mathbf{D}_{4,1}^{\textrm{VR}}(S^2)$ with respect to the bottleneck distance. The shaded region is $\mathbf{D}_{4,1}^{\textrm{VR}}(S^1)$ and the black lines outline $\mathbf{D}_{4,1}^{\textrm{VR}}(S^2)$. The blue line is $y_2 = -\frac{1}{5}x_2 + \frac{4\pi}{5}$, the region where $\frac{1}{2}\mathrm{pers}(D_2) = d_\infty(D_2, \ell)$, and $\ell$ is the line $y = 2(\pi - x) \subset \partial(\mathbf{D}_{4,1}^{\textrm{VR}}(S^1))$.
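The closed-form expressions in the argument above are easy to sanity-check. The sketch below is only an illustration under the stated setup: $\ell$ is the line $y = 2(\pi - x)$ and $D_2 = (x_2, y_2)$ lies below it. The function names `closest_point_on_ell`, `linf_distance_to_ell` and `half_persistence_line` are hypothetical.

```python
import math

def closest_point_on_ell(x2, y2):
    """Point D1 = (x1, y1) on y = 2(pi - x) with x1 - x2 = y1 - y2."""
    x1 = (2 * math.pi + x2 - y2) / 3
    y1 = 2 * (math.pi - x2 + y2) / 3
    return x1, y1

def linf_distance_to_ell(x2, y2):
    """l-infinity distance from D2 = (x2, y2) to the line, for D2 below it."""
    x1, y1 = closest_point_on_ell(x2, y2)
    return max(abs(x1 - x2), abs(y1 - y2))

def half_persistence_line(x2):
    """Value y2 on the line y2 = -x2/5 + 4*pi/5, where pers(D2)/2 equals the distance."""
    return -x2 / 5 + 4 * math.pi / 5
```

For instance, for $D_2 = (1, 2)$ the distance is $\frac{1}{3}(2\pi - 4)$, and for any point $(x_2, y_2)$ on the line returned by `half_persistence_line` that lies below $\ell$, half of the persistence $y_2 - x_2$ coincides with `linf_distance_to_ell(x2, y2)` up to rounding.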

Concentration of persistence measures
By pairing $\mathbf{D}_{n,k}^{F}(X)$ with the persistence measure $\mathbf{U}_{n,k}^{F}(X)$, we can view persistence sets as an mm-space. The main result in this section is that $\mathbf{D}_{n,k}^{F}(X)$ concentrates to a one-point mm-space $*$ as $n \to \infty$. Since $*$ is generic, we also prove that the expected bottleneck distance between a random diagram $D \in \mathbf{D}_{n,k}^{F}(X)$ and $\mathrm{dgm}_k^{F}(X)$, the degree-$k$ persistence diagram of the space $X$, goes to 0 as $n \to \infty$, effectively showing that $\mathbf{D}_{n,k}^{F}(X)$ concentrates to $\mathrm{dgm}_k^{F}(X)$ when the latter is viewed as a one-point mm-space equipped with the trivial choices of metric and probability measure.
Example 6.1 (The case of an mm-space with two points). Let X = {x 1 , x 2 } be a metric space with two points at distance ε and mass µ X (x 1 ) = α, µ X (x 2 ) = 1−α for some α ∈ (0, 1). For each n ∈ N, the matrices in K n (X) are of the form M 0 = 0 ∈ R n×n + or M Π = Π T M 1 Π for some Π ∈ S n , where For the curvature measure µ n on K n (X), we have w n := µ n (M 0 ) = α n + (1 − α) n . This comes from choosing either all n points to be x 1 or all to be x 2 . The rest of the mass is distributed among the non-zero matrices of K n (X). Notice that w n → 0 as n → ∞.
As for the persistence sets D VR n,k (X), the only interesting case is at k = 0. Here, U VR n,0 is supported on the two point set D VR n,0 (X) = {0 D , (0, ε)}, where 0 D is the empty diagram of D. From the computations above, U VR n,0 (0 D ) = w n and U VR n,0 ((0, ε)) = 1 − w n .
The fact that $w_n \to 0$ as $n \to \infty$ means that the mass concentrates on $(0, \varepsilon)$, so, as an mm-space, $\mathbf{D}_{n,0}^{\textrm{VR}}(X)$ converges to the 1-point mm-space whose only point is $(0, \varepsilon)$, equipped with $\delta_{(0,\varepsilon)}$, the Dirac delta measure concentrated on $(0, \varepsilon)$. This is the persistence diagram $\mathrm{dgm}_0^{\textrm{VR}}(X)$ viewed as a 1-point mm-space.
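The decay of $w_n = \alpha^n + (1-\alpha)^n$ in Example 6.1 can be illustrated with a short computation. This is a sketch only: `w` evaluates the closed formula, and `empirical_w` (a hypothetical helper) estimates the mass of the empty diagram by drawing $n$ points from the two-point space and recording how often all of them coincide.

```python
import random

def w(n, alpha):
    """Mass w_n = alpha^n + (1 - alpha)^n of the empty diagram."""
    return alpha ** n + (1 - alpha) ** n

def empirical_w(n, alpha, trials, seed=0):
    """Monte Carlo estimate of w_n: fraction of n-samples with all points equal."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # True stands for x_1 (mass alpha), False for x_2 (mass 1 - alpha).
        sample = [rng.random() < alpha for _ in range(n)]
        if all(sample) or not any(sample):
            hits += 1
    return hits / trials
```

With $\alpha = 0.3$, for example, $w_n$ decreases strictly in $n$ and is already below $10^{-7}$ at $n = 50$, so almost all of the mass of $U^{\textrm{VR}}_{n,0}$ sits on $(0, \varepsilon)$.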
We now generalize this result.

A concentration theorem
Let (X, d X , µ X ) be an mm-space. Using terminology from [CM10b, Section 5.3], we define the functions f X : R + → R + given by ε → inf x∈X µ X (B ε (x)). Note that f X (ε) > 0 for every ε > 0 since supp(µ X ) = X is compact. Define also The relevant result from that paper is the following: Theorem 6.2 (Covering theorem [CM10b,Theorem 34]). Let (X, d X , µ X ) be an mm-space. For a given n ∈ N and ε > 0 consider the set Then µ ⊗n X (Q X (n, ε)) ≤ C X (n, ε).
We now prove our concentration result.
Theorem 6.3. Let (X, d X , µ X ) be an mm-space and take any stable filtration functor F. For any n, k ∈ N, consider the random variable D valued in D F n,k (X) distributed according to U F n,k (X). Then: • As a consequence, the mm-space D F n,k (X) = D F n,k (X), d B , U F n,k (X) concentrates to a one-point mm-space as n → ∞.
Proof. Fix ε > 0. Let X = (x 1 , . . . , x n ) ∈ X n be a random variable distributed according to µ ⊗n X . Since U F n,k (X) is the push-forward of the product measure µ ⊗n X under the map X (X) . Then, we can make a change of variables to rewrite the expected value of d B (D, dgm F k (X)) as follows: Remark 6.4. We can give an explicit upper bound for E U F n,k (X) d B D, dgm F k (X) in the case that µ X is Ahlfors regular. Given d ≥ 0, µ X is Ahlfors d-regular if there exists a constant C ≥ 1 such that To get the upper bound, set ε = 4C 1/d ln n n 1/d . If µ X is Ahlfors d-regular, and Then,

Coordinates
The objects $\mathbf{U}_{n,k}^{F}(X)$ can be complex, so it is important to find simple representations of them. Since these objects are probability measures on the space of persistence diagrams $\mathcal{D}$, we follow the statistical mechanics intuition and probe them via functions. In order to accomplish this, one should concentrate on families of functions $\zeta_\alpha : \mathcal{D} \to \mathbb{R}$, for $\alpha$ in some index set $A$. One example of a family that is compatible with this is the so-called maximal persistence of a persistence diagram, that is, the largest value of $t_d - t_b$ over the points $(t_b, t_d)$ of the diagram. In general, one may desire to obtain a class of coordinates [ACC16,Kal19] that are able to more or less canonically exhaust all the information contained in a given persistence diagram. A further desire is to design the class $\{\zeta_\alpha\}_{\alpha \in A}$ in such a manner that it provides stable information about a given measure $U \in \mathcal{P}_1(\mathcal{D})$.
A first step in this direction is a stability result. To set up notation, let F be a filtration functor. Let n ≥ 0, k ≥ 0 be integers, and take an mm-space (X, d X , µ X ). Consider a coordinate function ζ : D → R. The pushforward ζ # U F n,k (X) is a probability measure on R. We denote its distribution function by H X (t; n, k, F, ζ) Theorem 7.1. Let ζ : D → R be an L(ζ)-Lipschitz coordinate function, and suppose F is a stable filtration functor. Write H X (t) = H X (t; n, k, F, ζ) to simplify the notation. Then, for any two mm-spaces X and Y , Proof. According to [Mém11,Lemma 6.1], where M U is the set of couplings between U F n,k (X) and U F n,k (Y ). Since ζ is Lipschitz, the right side is bounded above by Theorem 3.17 gives the last bound.
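In practice, a coordinate function and the distribution function of its pushforward can be approximated from a finite sample of diagrams. The sketch below is an illustration rather than the authors' implementation. It assumes that a diagram is given as a list of (birth, death) pairs, takes maximal persistence as the coordinate $\zeta$, and adopts the convention (an assumption) that the empty diagram has value 0. The names `max_persistence` and `empirical_distribution` are hypothetical.

```python
from bisect import bisect_right

def max_persistence(diagram):
    """Largest value of death - birth over the points of a diagram.

    Assumed convention: the empty diagram has maximal persistence 0.
    """
    return max((death - birth for birth, death in diagram), default=0.0)

def empirical_distribution(values):
    """Return t -> fraction of values <= t, an empirical stand-in for H_X(t)."""
    ordered = sorted(values)

    def H(t):
        return bisect_right(ordered, t) / len(ordered)

    return H
```

Given diagrams sampled according to $\mathbf{U}_{n,k}^{F}(X)$, calling `empirical_distribution([max_persistence(D) for D in diagrams])` yields a step function that approximates $H_X(t; n, k, F, \zeta)$.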

Persistence sets of metric graphs
Let G be a metric graph; see [BBI01,Mug19,MO18] for a definition. The central question in this section is what features of G are detected by D VR 2k+2,k (G). A first setting is the one when G is a tree.
Lemma 8.1. Let k ≥ 1. For any metric tree T and any X ⊂ T with |X| = n, PH k (X) = 0 and, thus, D VR n,k (T ) is empty. In particular, if n = 2k + 2, then t b (X) ≥ t d (X). Proof. Any subset X ⊂ T is a tree-like metric space. By Theorem 2.1 of the appendix of [CCR13], the persistence module PH k (X) is 0 for any k ≥ 1. In particular, if n = 2k + 2, Theorem 4.4 implies that t b (X) ≥ t d (X).
In other words, a metric graph G must have a cycle if D VR n,k (G) is to be non-empty, and even if it does, not all configurations X ⊂ G with |X| = n produce persistence. As we will see in Example 8.4, even if there is no tree X → T → G, X can still be a tree-like metric space. For this reason, it would be useful to have a notion of a minimal graph containing X. At least in the case n = 4, split metric decompositions provide a nice framework for our questions.

Split metric decompositions
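We follow the exposition in [BD92]. Let $(X, d_X)$ be a finite pseudo-metric space. Given a partition $X = A \cup B$, the split metric $\delta_{A,B}$ is defined as the function
$$\delta_{A,B}(x, y) := \begin{cases} 0, & \text{if } x, y \in A \text{ or } x, y \in B, \\ 1, & \text{otherwise.} \end{cases}$$
Let
$$\beta_{\{a,a'\},\{b,b'\}} := \tfrac{1}{2}\Big(\max\big\{d_X(a,b) + d_X(a',b'),\ d_X(a,b') + d_X(a',b),\ d_X(a,a') + d_X(b,b')\big\} - d_X(a,a') - d_X(b,b')\Big),$$
and define the isolation index $\alpha_{A,B}$ as
$$\alpha_{A,B} := \min_{a,a' \in A,\ b,b' \in B} \beta_{\{a,a'\},\{b,b'\}}.$$
Notice that both $\alpha_{A,B}$ and $\beta_{\{a,a'\},\{b,b'\}}$ are non-negative. If the isolation index $\alpha_{A,B}$ is non-zero, then the unordered partition $A, B$ is called a $d_X$-split. The main theorem regarding isolation indices and split metrics is the following.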
Theorem 8.2 ( [BD92]). Any (pseudo-)metric d X on a finite space can be written uniquely as where d 0 is a (pseudo-)metric that has no d 0 -splits (also called split-prime metric), and the sum runs over all d X -splits A, B.
In what follows, we may write δ A,B and α A,B as δ a 1 ,...,an and α a 1 ,...,an , respectively, when A = {a 1 , . . . , a n }, and X and B = X \ A are clear from the context.
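Theorem 8.2 can be illustrated on a small example. The sketch below assumes the standard definition of the isolation index from [BD92], namely half the minimum over $a, a' \in A$ and $b, b' \in B$ of $\max\{d(a,b) + d(a',b'),\ d(a,b') + d(a',b),\ d(a,a') + d(b,b')\} - d(a,a') - d(b,b')$. All function names are hypothetical, and the metric is given as a symmetric matrix indexed by `0, ..., n-1`.

```python
from itertools import combinations, product

def isolation_index(d, A, B):
    """Isolation index alpha_{A,B} of the split A|B for the metric d."""
    best = None
    for a, a2 in product(A, repeat=2):
        for b, b2 in product(B, repeat=2):
            beta = (max(d[a][b] + d[a2][b2],
                        d[a][b2] + d[a2][b],
                        d[a][a2] + d[b][b2])
                    - d[a][a2] - d[b][b2]) / 2
            best = beta if best is None else min(best, beta)
    return best

def all_splits(n):
    """Unordered partitions {A, B} of range(n) into two non-empty parts."""
    points = tuple(range(n))
    for size in range(1, n // 2 + 1):
        for A in combinations(points, size):
            if 2 * size == n and 0 not in A:
                continue  # {A, B} and {B, A} describe the same split
            yield A, tuple(p for p in points if p not in A)

def split_decomposition(d):
    """Map each d-split (A, B) to its non-zero isolation index."""
    indices = {(A, B): isolation_index(d, A, B) for A, B in all_splits(len(d))}
    return {split: alpha for split, alpha in indices.items() if alpha > 0}

def split_part(decomposition, n):
    """Matrix of the sum of alpha_{A,B} * delta_{A,B} over the given splits."""
    return [[sum(alpha for (A, _), alpha in decomposition.items()
                 if (x in A) != (y in A))
             for y in range(n)] for x in range(n)]
```

For a metric that has no split-prime component, `split_part(split_decomposition(d), len(d))` reproduces `d`; in general the difference between the two matrices is the split-prime part $d_0$.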
Let's focus on the case in which $X$ has 4 points. It has been shown that $d_X$ has no split-prime component and, when $A = \{a, a'\}$ and $B = \{b, b'\}$, $\alpha_{A,B} = \beta_{A,B}$. Furthermore, there is at least one such partition for which $d_X(a, a') + d_X(b, b')$ is maximal, which implies $\alpha_{A,B} = 0$. In that case, $X$ can be isometrically embedded in the graph $\Gamma_X$ shown in Figure 17. We now study the persistence diagram of $X$.
x 1 x 2 Figure 17: The graph Γ X resulting from the split-metric decomposition of a metric space with 4 points. In this case, α x 1 , Proposition 8.3. Let X ⊂ Γ X be the metric space shown in Figure 17, and let a i = α x i , b = α x 1 ,x 4 and c = α x 1 ,x 2 .
(20) min(a, b), regardless of whether t b (X) < t d (X) or not.
Proof. 1. The desired conclusion is equivalent to v d (x 1 ) = x 3 and v d (x 2 ) = x 4 . Suppose, though, that v d (x 1 ) = x 4 and v d (x 2 ) = x 3 . In particular, this means that d 13 < d 14 and d 24 < d 23 , and these inequalities are equivalent to After rearranging them, we get b < a 4 − a 3 < −b, a contradiction. The case v d (x 1 ) = x 4 and v d (x 2 ) = x 3 follows analogously, so v d (x 1 ) = x 3 and v d (x 2 ) = x 4 . 2. Notice that the inequalities d 23 < d 13 and d 14 < d 24 are equivalent to which, after rearranging terms, result in −b < a 2 − a 1 < b. Using similar combinations, we find that max(d 12 , d 23 , d 34 , d 41 ) < min(d 13 , d 24 ) is equivalent to the system of inequalities in (20). If these hold, then t b (X) = max(d 12 , d 23 , d 34 , d 41 ) < min(d 13 , d 24 ) = t d (X). Conversely, if t b (X) < t d (X), then item 1 and the reasoning above imply (20). 3. If t b (X) ≥ t d (X), the bound is trivially satisfied. Suppose then, without loss of generality, that t b (X) = d 12 . Since a 3 + b + a 4 = d 34 ≤ d 12 = a 1 + b + a 2 , we have On the other hand, d 14 ≤ d 12 and d 23 ≤ d 12 give a 4 + c ≤ a 2 + b and a 3 + c ≤ a 1 + b. Then In min(a, b).
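The equivalence in item 2 can be tested exhaustively on small integer parameters. The sketch below makes several assumptions that should be read as such: the distances in $\Gamma_X$ are taken to be $d_{12} = a_1 + b + a_2$, $d_{34} = a_3 + b + a_4$, $d_{14} = a_1 + c + a_4$, $d_{23} = a_2 + c + a_3$, $d_{13} = a_1 + b + c + a_3$ and $d_{24} = a_2 + b + c + a_4$, as used in the proof. Only $-b < a_2 - a_1 < b$ is stated explicitly in the proof, so the other three inequalities are reconstructed in the same way. All function names are hypothetical.

```python
from itertools import product

def gamma_distances(a, b, c):
    """Distance matrix of x_1, ..., x_4 in Gamma_X, with a = (a_1, a_2, a_3, a_4)."""
    a1, a2, a3, a4 = a
    d12, d34 = a1 + b + a2, a3 + b + a4
    d14, d23 = a1 + c + a4, a2 + c + a3
    d13, d24 = a1 + b + c + a3, a2 + b + c + a4
    return [[0, d12, d13, d14],
            [d12, 0, d23, d24],
            [d13, d23, 0, d34],
            [d14, d24, d34, 0]]

def tb_td(d):
    """(t_b, t_d): max of second-largest and min of largest distance per point."""
    t_b, t_d = 0, float("inf")
    for i, row in enumerate(d):
        others = sorted(row[:i] + row[i + 1:])
        t_b = max(t_b, others[-2])
        t_d = min(t_d, others[-1])
    return t_b, t_d

def inequalities_hold(a, b, c):
    """Reconstructed system: |a1-a2| < b, |a3-a4| < b, |a1-a4| < c, |a2-a3| < c."""
    a1, a2, a3, a4 = a
    return (abs(a1 - a2) < b and abs(a3 - a4) < b
            and abs(a1 - a4) < c and abs(a2 - a3) < c)

def equivalence_holds(limit):
    """Check t_b < t_d <=> inequalities for all integer parameters below limit."""
    for a in product(range(limit), repeat=4):
        for b, c in product(range(limit), repeat=2):
            t_b, t_d = tb_td(gamma_distances(a, b, c))
            if (t_b < t_d) != inequalities_hold(a, b, c):
                return False
    return True
```

Integer parameters are used so that all comparisons are exact. For example, `tb_td(gamma_distances((1, 1, 1, 1), 2, 2))` gives `(4, 6)`, a configuration that satisfies the inequalities and therefore produces persistence.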
The following examples illustrate uses of Proposition 8.3.
If either X i = ∅, then X is contained in the other λ j π · S 1 , and dgm VR 1 (X) ∈ D VR 4,1 ( λ j π · S 1 ). Suppose, then, that X 1 and X 2 are non-empty. Two cases follow.
First, assume X 2 = {x 4 }, and set X = {p 0 , x 1 , x 2 , x 3 }. Let t = d G (p 0 , x 4 ). For i = 1, 2, 3, and t d (X) ≤ t d (x 1 ) = d 13 ≤ λ 1 , regardless of the position of x 4 . In other words, if the point (t b (X ), t d (X )) ∈ D VR 4,1 ( λ 1 π · S 1 ), then either t b (X) is still smaller than t d (X) and (t b (X), t d (X)) ∈ D VR 4,1 ( λ 1 π · S 1 ), or t b (X) becomes larger than t d (X) and dgm VR 1 (X) is empty. Even if t b (X ) ≥ t d (X ), it might be possible that t b (X) < t d (X). However, several conditions must be met. First, d 13 must be larger than d 12 and d G (x 1 , p 0 ) in order to have v d (x 1 ) = x 3 . In that case, t d (X) ≤ d 13 , so we also need t b (X ) < d 13 . Also, t b (X ) cannot be This point is also contained in D VR 4,1 ( λ 1 π · S 1 ), so no new persistence is generated. For the second case, let X 1 = {x 1 , x 2 } and X 2 = {x 3 , x 4 }. Let a i = d G (x i , p 0 ). Notice that d G (x i , x j ) = a i + a j for i ∈ {1, 2} and j ∈ {3, 4}. Then: In consequence, Analogously, α x 1 ,x 4 = 0 ≤ α x 1 ,x 2 . Then b = 0 in Proposition 8.3 and item 3 gives that dgm VR 1 (X) is the empty diagram. Note, in particular, that Γ X is a tree.
In summary, we've shown that if X 1 and X 2 are both non-empty, then either dgm VR 1 (X) is empty or it is in the union D VR 4,1 ( λ 1 π · S 1 ) ∪ D VR 4,1 ( λ 2 π · S 1 ). Thus, D VR 4,1 (G) = D VR 4,1 ( λ 1 π · S 1 ) ∪ D VR 4,1 ( λ 2 π · S 1 ). Example 8.4 shows that a configuration X ⊂ G produces persistence only if it is close to a cycle. That can happen when either X is contained in a circle λ i π · S 1 , or only one point is outside of λ i π · S 1 . In both cases, the graph Γ X contains a cycle since both a and b are non-zero. If |X 1 | = |X 2 | = 2, on the other hand, then Γ X is a tree. This might lead to conjecture that D VR 4,1 (G) decomposes as the union of D VR 4,1 (C), where C ⊂ G is a cycle. However, the following examples show otherwise.
Example 8.5. Let $G$ be a graph formed by attaching edges of length $L$ to a circle $S^1$ at the points $y_1 \prec y_2 \prec y_3 \prec y_4$. Let $X = \{x_1, x_2, x_3, x_4\} \subset G$. If $X \subset S^1$, then no new persistence is produced, so the points in $X$ have to be in the attached edges. Also, if $t_b(X)$ is to be smaller than $t_d(X)$, then each $x_i$ must be on a different edge. For example, if $x_1$ and $x_2$ are on the edge attached to $y_1$, and $x_3$ and $x_4$ are on the edges attached to $y_3$ and $y_4$, respectively, let $X' = \{x_1, x_2, y_3, y_4\}$. This $X'$ consists of two points inside of a cycle and two points outside, so as we saw in Example 8.4 when $|X_1| = |X_2| = 2$, $X'$ is a tree-like metric space, and attaching edges at $y_3$ and $y_4$ doesn't change that. Thus, $X$ is also a tree-like metric space.
Observe that $\mathbf{D}_{4,1}^{\textrm{VR}}(G)$ can have more points than $\mathbf{D}_{4,1}^{\textrm{VR}}(C)$. For example

Figure 19: A cycle $C$ with four edges of length $L = 1$ attached. This figure was obtained by sampling 100,000 configurations of 4 points from $G$. About 7.6% of those configurations produced a non-diagonal point.
Remark 8.6. It is curious to note that, in the last example, $\mathrm{VR}(G) \simeq \mathrm{VR}(C)$, but $\mathbf{D}_{4,1}^{\textrm{VR}}(G) \neq \mathbf{D}_{4,1}^{\textrm{VR}}(C)$. In other words, $\mathbf{D}_{4,1}^{\textrm{VR}}$ detects a feature of $G$ that the Vietoris-Rips complex doesn't. See Figure 19 for an example.
Example 8.7. Let $G$ be the graph with edges of length 1 shown in Figure 20. Let $C$ be the cycle that passes through the vertices 1, 2, 6, 5, 8, 7, 3, 4. $C$ has length 8, but there is no point $(2, 4)$ in $\mathbf{D}_{4,1}^{\textrm{VR}}(G)$. The reason is that the shortest path between points in $C$ is often not contained in $C$, and so $C$ is not isometric to a circle. For example, the edge $[1, 5]$ is not contained in $C$ despite it being the shortest path from 1 to 5.

A family of metric graphs whose homotopy type can be characterized via $\mathbf{D}_{4,1}^{\textrm{VR}}$
Given an isometric embedding λ π · S 1 → G, the upper corner (λ/2, λ) ∈ D VR 4,1 ( λ π · S 1 ) also appears in D VR 4,1 (G). Moreover, (λ/2, λ) is the only point in D VR 4,1 ( λ π · S 1 ) that has t d = 2t b . Compare that to the recent examples in which there is a triangle in D VR 4,1 (G), coming from such an embedding, that is clearly distinguishable from the rest of the diagram. At this point, we can ask if there are conditions on G so that a point (t b , t d ) ∈ D VR 4,1 (G) with t d = 2t b must come from a configuration X ⊂ λ π · S 1 → G. It turns out that this is the right question, but the condition that G has to satisfy is elaborate. Before describing it, we prove a preliminary result.
Proof. If (λ/2, λ) ∈ D VR n,k (X), there exists Y ⊂ X with |Y | = n such that t b (Y ) = λ/2 and t d (Y ) = λ. For any i and x ∈ Y , x = v d (x i ), the definition of t b (Y ) and t d (Y ) gives Hence, d X (x i , v d (x i )) = λ and d X (x i , x) = d X (x, v d (x i )) = λ/2.
In particular, if (λ/2, λ) ∈ D VR 4,1 (G) for a metric graph G, then there exists a "square" X ⊂ G. By square, we mean that if X = {x 1 , x 2 , x 3 , x 4 }, then d G (x i , x i+1 ) = λ/2 and d G (x i , x i+2 ) = λ. It is tempting to suggest that X is contained in a cycle C ⊂ G isometric to λ π · S 1 , but this is not always the case. An example is shown in Figure 21. However, if G satisfies the hypothesis of Theorem 8.10, then at least we can ensure that X lies in a specific subgraph. Before that, we need one more preparatory result which was inspired by Theorem 3.15 in [AAG + 20].
Lemma 8.9. Let $G = G_1 \cup_A G_2$ be a metric gluing of the graphs $G_1$ and $G_2$ such that $A = G_1 \cap G_2$ is a closed path of length $\alpha$. Let $\ell_j$ be the length of the shortest cycle contained in $G_j$ that intersects $A$, and set $\ell = \min(\ell_1, \ell_2)$. Assume that $\alpha < \ell/2$. Then the shortest path $\gamma_{uv}$ between any two points $u, v \in A$ is contained in $A$. As a consequence, if $\frac{\lambda}{\pi} \cdot S^1 \hookrightarrow G$ is an isometric embedding, then $\frac{\lambda}{\pi} \cdot S^1$ is contained in either $G_1$ or $G_2$.

Figure 21: A graph $G$ and a set $X \subset G$ such that $t_b(X) = \pi/2$ and $t_d(X) = \pi$. Notice that the outer black cycle $C$ contains $X$ but is not isometric to a circle. If it were, the shortest path in $G$ between $p_1$ and $p_2$ would be contained in $C$, but that path is the blue edge of length $\pi - \varepsilon$.

This contradicts the assumption that $d_G(x_2, x_4) = \lambda$.
Case 2: |X 1 | = |X 2 | = 2. In this case, we have two ways to distribute the points of X, depending on whether we pair together the points that are at distance λ/2 or λ. If we choose the second option, we can write X 1 = {x 1 , x 3 } and X 2 = {x 2 , x 4 }. The path γ 12 ∪γ 23 ∪γ 31 is a cycle in G that intersects both G 1 and G 2 . Let u = γ (1) 12 ∩ A and v = γ (1) 23 ∩ A, and let γ uv ⊂ A be a path between them. By Lemma 8.9, d G (u, v) < |γ 12 ∪ γ uv ∪ γ (1) 23 is a path between x 1 and x 3 with length less than |γ 12 | + |γ 23 | = λ. This is again a contradiction.
On the other hand, since γ 23 is the shortest path between x 2 and x 3 and it passes through v, If there existed a path between v and x 3 of length smaller than d G (v, x 3 ), then the concatenation of that path and γ (1) 23 would give a path between x 2 and x 3 shorter than γ 23 . The same reasoning applies to x 2 and v, so the above equality holds. By a similar argument, we get λ/2 = d G (x 1 , u) + d G (u, x 4 ). Adding these two equations gives and combining this last equation with (21) and (22) produces, respectively, Then, using 24 and 21, we obtain λ ≤ 2d G (u, v).
If u = w 1 , then γ (1) 14 ∪ γ w 1 ∪ γ (1) 12 is a cycle that intersects A of length at most 2λ ≤ 2α. Then, is smaller than 2α by definition. However, this is a contradiction because 3α < by hypothesis. Thus, w 1 = u, and an analogous argument shows that w 2 = v. Since γ 12 is a shortest path between x 1 and x 2 , Thus, x 1 = u and x 2 = v. In other words, X 1 ⊂ A ⊂ G 2 , so X = X 1 ∪ X 2 ⊂ G 2 . Naturally, if γ 34 intersected A instead of γ 12 , then an analogous argument would give X ⊂ G 1 .
This means that there exists a particular number 0 ≤ t ≤ λ/2 such that the distances between points of X and u and v are given by the equations above. With this tool at hand, we now claim that at least one of the paths γ 1 := γ has length less than λ. This would imply that either d G (x 1 , x 3 ) or d G (x 2 , x 4 ) is less than λ, violating the assumption that t d (X) = λ.
Recall that $\gamma_{uv} \subset A$, the latter of which is a path of length $\alpha < \ell/3$, and that $\ell$ is the length of the smallest cycle contained in either $G_1$ or $G_2$ that intersects $A$. Since $C_i \subset G_i$, we have $\nu \leq \alpha < \frac{\ell}{3} \leq \frac{L_1 + L_2}{6}$, as desired. This forces $d_G(x_1, x_3) \leq |\gamma_1| < \lambda$, violating the assumption that $t_d(X) = \lambda$. This concludes the proof of Case 3.2, and gives the Theorem.
To close up this section, we explore a consequence of Theorem 8.10. Once more, this application is inspired by [AAG + 20], specifically Proposition 4.1. We will assume that all edges have length 1 for the sake of simplicity.
Theorem 8.11. Let $T_1, \dots, T_m$ be a set of trees, and for each $k = 1, \dots, n$, let $C_k$ be a cycle of length $L_k = 2\lambda_k$. Suppose that all $\lambda_k$ are distinct. Let $G$ be a graph formed by iteratively attaching either a tree $T_i$ or a cycle $C_k$ along an edge or a vertex. Then, the number of points $(\lambda/2, \lambda) \in \mathbf{D}_{4,1}^{\textrm{VR}}(G)$ is equal to the number of cycles $C_k$ that were attached. Furthermore, if $X \subset G$ is a set of 4 points such that $t_b(X) = \lambda/2$ and $t_d(X) = \lambda$, then $X$ is contained in a cycle $C_k$ and $L_k = 2\lambda$.

Figure 26: Two examples of admissible graphs $G$ as in Corollary 8.12 and their persistence set $\mathbf{D}_{4,1}^{\textrm{VR}}(G)$. The red triangles are the boundaries of the sets $\mathbf{D}_{4,1}^{\textrm{VR}}(C)$ for every cycle $C \subset G$. Left: Two cycles of lengths $\ell_1 = 3.5$ and $\ell_2 = 4.5$ pasted over an edge of length $\alpha = 0.5 < \frac{1}{3}\min(\ell_1, \ell_2)$. Right: A tree of cycles. Each persistence set was found by sampling 100,000 uniform configurations from $G$.
• Describe $\mathbf{D}_{2k+2,k}^{\textrm{VR}}(S^m_E)$ for all $k$ and $m$: Propositions 5.13 and 5.17 are a step in that direction. In fact, the latter implies that we only need to find $\mathbf{D}_{2k+2,k}^{\textrm{VR}}(S^{2k}_E)$ to determine $\mathbf{D}_{2k+2,k}^{\textrm{VR}}(S^m_E)$ for all spheres with $m \geq 2k + 1$.
• Stabilization of $\mathbf{D}_{2k+2,k}^{\textrm{VR}}(S^n_E)$: When $k = 1$, Corollary 5.18 shows that $\mathbf{D}_{4,1}^{\textrm{VR}}(S^m)$ stabilizes at $m = 2$ instead of $m = 3$, as given by Proposition 5.17. The key to the reduction was the use of Ptolemy's inequality as in Theorem 5.16. A natural follow-up question, even if it is subsumed by the previous one, is when $\mathbf{D}_{2k+2,k}^{\textrm{VR}}(S^m_E)$ really stabilizes for general $k$.