Stratifying the space of barcodes using Coxeter complexes

Embeddings of the space of barcodes in Euclidean spaces are unstable due to the permutation of the bars of a barcode. We use tools from geometric group theory to produce a stratification of the space $\mathcal{B}_n$ of barcodes with $n$ bars that takes into account these permutations. This gives insights into the combinatorial structure of $\mathcal{B}_n$. The top-dimensional strata are indexed by permutations associated to barcodes as defined by Kanari, Garin and Hess. More generally, the strata correspond to marked double cosets of parabolic subgroups of the symmetric group $\mathrm{Sym}_n$. This subdivides $\mathcal{B}_n$ into regions that consist of barcodes with the same averages and standard deviations of birth and death times and the same permutation type. We obtain coordinates that form a new invariant of barcodes, extending the one of Kanari–Garin–Hess.
This description also gives rise to metrics on $\mathcal{B}_n$ that coincide with modified versions of the bottleneck and Wasserstein metrics.


Introduction
Barcodes [10,18,20] are topological summaries of the persistent homology of a filtered space. The barcode B associated to a filtration {X t } t∈R is a multiset of points (b, d) ∈ R 2 . It summarises the creation and destruction of homology classes while varying the parameter t, which is often interpreted as "time". A bar (b, d) ∈ B corresponds to a homology cycle appearing in X b and becoming a boundary in X d . The first element of the pair (b, d) is called the birth and the second one the death.
Persistent homology has applications in many fields, from biology [9,19,26,33] to material science [16,28,34], astronomy [22] and climate science [30]. In many of these applications, it is necessary to study statistics on barcodes. Unfortunately, the space of barcodes is not a Hilbert space, which means that it can be difficult to apply statistical methods to it. Several ways to overcome this issue exist, such as the creation of kernels to map barcodes into a Hilbert space [8,11,2,17].
In this paper, we tackle this issue from a different perspective. We use combinatorial tools from geometric group theory to define new coordinates for describing barcodes. These coordinates divide the space of barcodes into regions indexed by the averages and the standard deviations of births and deaths and by the permutation type of a barcode as defined in [27,14]. By associating to a barcode the coordinates of its region, we define a new invariant of barcodes. This opens the door to doing statistics on barcodes using methods from the field of permutation statistics.
Figure 1: The permutohedron [32] of order 4 is a polyhedral decomposition of the sphere where each vertex corresponds to an element of the symmetric group Sym 4 . Its 1-skeleton is the Cayley graph of Sym 4 (see also Fig. 5).
Motivation. The motivation for this work is to understand the space of barcodes from a combinatorial and geometric point of view. We call a barcode strict if there are no two pairs in it that have the same birth or death. It was observed in [27] that to a strict barcode B = {(b i , d i )} i∈{1,...,n} with n bars, one can associate a permutation σ B ∈ Sym n . It is the permutation such that the bar with the i-th smallest death has the σ B (i)-th smallest birth. This divides the set of strict barcodes with n bars into n! equivalence classes, one for each element of the symmetric group Sym n . Based on this observation, one can study the combinatorial properties of strict barcodes by describing these equivalence classes (or equivalently, the elements of Sym n ) and the relations between them.
A first approach to this, taken in [27,14], is to consider the Cayley graph of the symmetric group with respect to the generating set given by adjacent transpositions (i, i + 1). This yields a combinatorial representation of the elements of Sym n . It tells us how a pair of permutations can be transformed into one another using adjacent transpositions one step at a time. However, it yields no information about "higher order relations" that exist among larger sets of permutations.
A way to resolve this is to add higher dimensional cells to the Cayley graph and to consider it more geometrically as a cell complex instead of as a (combinatorial) graph. A first approach would be to use that the Cayley graph of Sym n is the 1-skeleton of the permutohedron [32] of order n, see Fig. 1. This observation embeds the Cayley graph into a polyhedral decomposition of the (n − 2)-sphere. As this is a more geometric object, it allows one to continuously "walk" from one permutation to another. The problem is that only the vertices (and not the higher dimensional cells) of the permutohedron have an interpretation in terms of elements of the symmetric group. Furthermore, this representation lacks a notion of "size" for barcodes. For instance, the two barcodes depicted in Fig. 2 lie in the same equivalence class, i.e. have the same associated permutation.
The alternative that we suggest to overcome these problems is to work with Coxeter complexes instead of permutohedra. The Coxeter complex associated to Sym n is the dual of the permutohedron of order n (see Fig. 3). It forms a simplicial decomposition of the (n − 2)-sphere and is well-studied in the context of reflection groups and Tits buildings. For us, it has the advantage that its top-dimensional simplices correspond in a natural way to permutations and only passing through a face of lower dimension changes such a permutation. This allows for a better description of continuous changes between different permutations. It also has the advantage that it comes with an embedding in R n , where the additional two real parameters that are needed to describe positions relative to this (n − 2)-dimensional space have a natural interpretation in terms of the "size" of barcodes. Moreover, using the Coxeter complex description for barcodes allows one to define the permutation type of any barcode. For non-strict barcodes, it is defined only up to parabolic subgroups of Sym n , i.e. subgroups that are generated by sets of adjacent transpositions.

Contributions
In this paper, we use Coxeter complexes to develop a description of the set B n of barcodes with n bars with coordinates that have natural interpretations when doing statistics with barcodes. These coordinates define a stratification of B n where the top-dimensional strata are indexed by the symmetric group Sym n . Our main contributions can be summarised as follows.
Theorem 1.1. Let B n denote the set of barcodes with n bars.
1. B n can in a natural way be seen as a subset of a quotient Sym n \R 2n .
2. B n is stratified over the poset of marked double cosets of parabolic subgroups of Sym n .
3. Using this description, one obtains a decomposition of B n into different regions. Each region is characterised as the set of all barcodes having the same average birth and death, the same standard deviation of births and deaths and the same permutation type σ B ∈ Sym n .
4. This description gives rise to metrics on B n that coincide with modified versions of the bottleneck and Wasserstein metrics.
For more detailed and formal statements of these results, see Proposition 4.2, Theorem 4.9, Corollary 4.10 and Proposition 5.2.
To obtain this description of B n we proceed as follows. A barcode is an (unordered) multiset of n pairs of real numbers (births and deaths). It can hence be seen as a point in the quotient space Sym n \(R n × R n ), where the action of Sym n permutes the coordinate pairs. Since the birth is smaller than the death for every bar, B n is a proper subset of this quotient of R 2n .
The Coxeter complex $\Sigma(\mathrm{Sym}_n)$ associated to $\mathrm{Sym}_n$ is a simplicial complex whose geometric realisation is homeomorphic to an $(n-2)$-sphere. Hence, we can decompose $\mathbb{R}^n$ as $\mathbb{R}^n \cong \mathrm{cone}(\Sigma(\mathrm{Sym}_n)) \times \mathbb{R}$. This decomposition allows one to describe each point $x \in \mathbb{R}^n$ via coordinates $x_\theta$, $\bar{x}$, $\|v_x\|$, where $x_\theta$ specifies a point on the Coxeter complex, $\|v_x\|$ is the "cone parameter" and $\bar{x}$ parametrises the remaining $\mathbb{R}$ (for details, see Proposition 3.2, where the naming becomes clear as well). In summary, this describes $\mathcal{B}_n$ as a subset of $\mathrm{Sym}_n \backslash \big(\mathrm{cone}(\Sigma(\mathrm{Sym}_n)) \times \mathbb{R}\big)^2$. We call the coordinates that we obtain from this description Coxeter coordinates. [27,14]. The stratification one obtains is induced by the simplicial structure of $\Sigma(\mathrm{Sym}_n)$.
The advantages of these new coordinates are two-fold: Firstly, using points in Coxeter complexes, one obtains coordinates that uniquely specify barcodes and are yet compatible with the combinatorial structure of B n given by permutation equivalence classes.Secondly, one resolves the earlier-mentioned problem that permutation equivalence classes themselves carry no notion of "size": The decomposition of B n into regions subdivides these equivalence classes by also taking into account the averages and standard deviations of births and deaths.This makes these regions a finer invariant than the permutation type.Therefore, they offer a new way to study statistics of barcodes by using both the average and standard deviation of births and deaths, which are commonly used summaries in Topological Data Analysis (TDA), and permutation statistics tools.The latter include the number of descents for instance, or the inversion numbers, which have proven useful for the study of the inverse problem for trees and barcodes [27,14].

Related work
This paper is a follow-up of the work started in [27,14] to study the space of barcodes from a combinatorial point of view. It extends the approach of considering permutations to classify barcodes to a finer classification that also takes into account the average and standard deviation of births and deaths. In [35], the author also observes a connection between barcodes and the symmetric group in a different setting, by studying the space of barcode bases using Schubert cells. Similarly, [24] also studies the space of barcode bases.
The idea of giving coordinates to the space of barcodes is not new [17,25]. For example, the space of barcodes was given tropical coordinates in [25]. In [3], it is mentioned that the space of barcodes can be identified with the n-fold symmetric product of R 2 , and the authors study the corresponding algebra of polynomials associated to the variety.
Finally, defining a polyhedral structure on a space to study statistics has been done for spaces of (phylogenetic) trees [4,21]. The connection between phylogenetic trees, merge trees and barcodes is studied in [14]. The polyhedral structures defined in this paper and in [4] seem to be related, but we leave this as future work.

Overview
In Section 2 we review the necessary background on barcodes and on Coxeter complexes. We use a standard way of realising Sym n as a reflection group to explain what we mean by "Coxeter coordinates" on R n in Section 3. We then describe the space B n of barcodes with n bars in terms of Sym n \(R n × R n ) in Section 4.1, before adapting the coordinates of R n to B n in Section 4.2. In Section 4.3, we describe the stratification of B n induced by these coordinates. Corollary 4.10 decomposes the space of barcodes into regions indexed by the average and standard deviation of the births and deaths and the permutation associated to a barcode. Finally, in Section 5, we show that B n can be given metrics inspired by the bottleneck and Wasserstein distances and that, with respect to these metrics, the identification of B n with a subset of Sym n \(R n × R n ) is an isometry.

Background on TDA
We start by reviewing the necessary background on TDA. For the reader who is completely new to this, we refer to the reviews [10,18,20]. Even though this work focuses on the space of barcodes and could be approached from a purely combinatorial point of view, we briefly mention where barcodes arise in the field of TDA. This section is not necessary for the understanding of this paper, and we will give the combinatorial definition of barcodes that we use in the next section.
Barcodes are topological summaries of a filtered topological space, i.e. a sequence of spaces ordered by inclusion. To obtain a barcode from a filtered space, one computes homology at each step and considers the maps induced by the inclusions. The output is called a persistence module, and it summarises the evolution of the homology at each step of the filtration.
More precisely, let $\{X_t\}_{t \in \mathbb{R}}$ be a filtered topological space, that is, each $X_t$ is a topological space and $X_t \subseteq X_{t'}$ if $t \le t'$. The $k$-th persistence module associated to $\{X_t\}_{t \in \mathbb{R}}$ is given by $H_k(\{X_t\}_{t \in \mathbb{R}})$, where $H_k$ denotes the $k$-th homology functor (over a field $\mathbb{k}$). The Crawley-Boevey Theorem [12] states that under mild tameness conditions on $\{X_t\}_{t \in \mathbb{R}}$, the associated persistence module can be decomposed as a direct sum of interval modules $\bigoplus_{j \in J} \mathbb{k}_{I_j}^{\oplus n_j}$, where the interval module $\mathbb{k}_{I_j}$ is the free $\mathbb{k}$-module of rank 1 on the interval $I_j \subseteq \mathbb{R}$, with identity maps internal to $I_j$, and is 0 elsewhere. This decomposition is unique up to reordering. Each interval represents the lifetime of a cycle in the filtered space. For instance, if a 1-cycle (a loop) appears in the topological space $X_{b_j}$ for the first time and becomes a boundary (gets "filled in") in $X_{d_j}$, then this 1-cycle will be represented by the interval $I_j = [b_j, d_j)$. The barcode associated to the persistence module is the multiset where each interval $I_j$ appears $n_j$ times. Usually, each $I_j$ is a half-open interval $[b_j, d_j)$, where $b_j$ is called the birth of the homological feature corresponding to $I_j$ and $d_j$ is called its death. If the interval $I_j$ is a half-infinite interval, i.e. it is of the form $[b_j, \infty)$, it is called an essential class.
In this paper, we will identify such an interval with the pair $(b_j, d_j)$, since we are mostly interested in the combinatorics of the pairs and not the corresponding persistence module. Moreover, $b_j$ and $d_j$ will always take finite values in $\mathbb{R}$.

The space of barcodes
We introduce here the main definitions used in this paper. We start with a more combinatorial definition of barcodes that we will use in this article.
Remark 2.2. The reader familiar with persistent homology will notice that we suppose that the bars corresponding to essential classes have finite values instead of being half-open intervals. This is usually the case in practical applications, where such essential classes are given finite values for representing them on a computer. We also assume that every barcode consists of only finitely many bars.
Remark 2.3.The definition of strict barcodes was first introduced in [27] to define the bijection between the symmetric group on n elements and some equivalence classes of barcodes that we introduce in the next section.The setting in this paper is slightly different from [27] and [14], because all the barcodes considered there are specific to merge trees and arise from their 0-th persistent homology.This is why the definition of a strict barcode in [27] and [14] assumes the existence of an essential bar (b 0 , d 0 ) that contains all the others.In this paper however, barcodes can come from arbitrary filtrations in arbitrary dimension, and such a bar (b 0 , d 0 ) need not exist.Therefore we slightly adapt the definition of a strict barcode and the relation to the symmetric group in the next sections.
In practice, for finite barcodes, the indexing set J is commonly the set {1, ..., n}, giving the bars in the barcode an arbitrary but fixed ordering. We will also adopt this convention from now on. Note however that reordering the bars might change the indexing, but not the underlying barcode (see Example 2.4). It can sometimes be convenient to assume that the indexing is such that the births are ordered increasingly b 1 < b 2 < ... < b n , but we do not make this assumption in this paper unless specified.
We often represent a barcode by the set of intervals [b i , d i ] ⊂ R (as in Fig. 4). Another common way to represent barcodes is what is called a persistence diagram, where the pairs (b i , d i ) are represented as points in R 2 (as in Fig. 8). These points lie above the diagonal since b i < d i for all i.
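Example 2.4. Fig. 4 shows an example of a strict barcode with two different indexing conventions.
Definition 2.5. The bottleneck distance between two barcodes $B$ and $B'$ is defined as $d_B(B, B') = \min_{\gamma} \max_{x \in B} \|x - \gamma(x)\|_\infty$, where $\gamma$ runs over all possible matchings, i.e. maps that assign to each bar $(b_i, d_i) \in B$ either a bar in $B'$ or a point in the diagonal $\Delta$, such that no point of $B'$ is in the image more than once. Here, $\|\cdot\|_\infty$ denotes the $l_\infty$-norm on $\mathbb{R}^2$.
Remark 2.6. The permutation $\gamma$ acts as a "reindexing" of the indices of $B$ and $B'$, and in particular ensures that $d_B(B, B')$ does not depend on any indexing of the bars.
The Wasserstein distance is defined in a similar way by taking the sum over all $l_2$-distances between $x$ and $\gamma(x)$ instead: $d_W(B, B') = \min_{\gamma} \big( \sum_{x \in B} \|x - \gamma(x)\|_2 \big)$.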
Remark 2.7. Note that in general, the barcodes B and B′ need not have the same number of bars. The diagonal allows matchings between barcodes with different numbers of bars, since "unmatched" bars can be sent to the diagonal. In this paper however, we study the set B n of barcodes with exactly n bars (for arbitrary, but fixed n) and restrict ourselves to this case. We are mainly interested in B n as a set and the main results we prove do not depend on the metric that is chosen on B n . With a slight abuse of notation, we will still mostly talk of B n as a space, without specifying a metric on it. An exception to that is Section 5, where we explain how a metric d B on B n , which is closely related to the bottleneck distance, occurs in an alternative description of the set B n that we work with later on.
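For barcodes with the same number n of bars, the matching-based distances above can be illustrated by brute force. The following is a minimal sketch under a simplifying assumption that is not taken from the text: matchings are restricted to bijections between bars, so the diagonal is ignored entirely. The function names `bijective_bottleneck` and `bijective_wasserstein` are hypothetical, and the search over all n! bijections is only practical for small n.

```python
from itertools import permutations
from math import hypot


def bijective_bottleneck(B, C):
    """Smallest possible value, over all bijections between the bars of B and C,
    of the largest l-infinity distance between matched bars (diagonal ignored)."""
    if len(B) != len(C):
        raise ValueError("both barcodes must have the same number of bars")
    return min(
        max(
            (max(abs(b - c[0]), abs(d - c[1])) for (b, d), c in zip(B, perm)),
            default=0.0,
        )
        for perm in permutations(C)
    )


def bijective_wasserstein(B, C):
    """Smallest possible value, over all bijections, of the sum of the
    l2-distances between matched bars (diagonal ignored)."""
    if len(B) != len(C):
        raise ValueError("both barcodes must have the same number of bars")
    return min(
        sum(hypot(b - c[0], d - c[1]) for (b, d), c in zip(B, perm))
        for perm in permutations(C)
    )


B = [(0, 2), (1, 5)]
C = [(1, 5), (0, 3)]
print(bijective_bottleneck(B, C))   # best bijection pairs (0, 2) with (0, 3)
print(bijective_wasserstein(B, C))
```

Because the minimum runs over all bijections, the result does not depend on the order in which the bars of either barcode are listed, which mirrors the "reindexing" role of the matching.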

Relation to the symmetric group
We write Sym n for the symmetric group on n letters, i.e. the group of all permutations of {1, . . ., n}. We usually use the one-line notation for permutations. That is, we specify σ ∈ Sym n by its image of the ordered set {1, . . ., n}, e.g. we write σ = [132] ∈ Sym 3 if σ(1) = 1, σ(2) = 3 and σ(3) = 2. We make an exception for transpositions to simplify the notation: the transposition that switches i and j is denoted by (i, j).
Definition 2.8. Let $B = \{(b_i, d_i)\}_{i \in \{1,\dots,n\}}$ be a strict barcode. Ordering the births $b_{i_1} < \dots < b_{i_n}$ gives rise to a permutation $\tau_b$ with $\tau_b(k) = i_k$. Similarly, ordering the deaths $d_{j_1} < \dots < d_{j_n}$ gives rise to a permutation $\tau_d$ with $\tau_d(k) = j_k$. The permutation $\sigma_B$ associated to $B$ is defined as $\sigma_B = \tau_b^{-1} \tau_d$; it tracks the ordering of the death values with respect to the birth values.
Remark 2.9. The permutations $\tau_b$ and $\tau_d$ both depend on the indexing choice of the $b_i$ and $d_i$. However, the permutation $\sigma_B$ does not depend on any indexing of the births and deaths; it is intrinsic to the multiset $B$. Indeed, $\sigma_B$ can be defined directly as the permutation that sends the $i$-th death (in increasing order) to the $\sigma_B(i)$-th birth (idem). If we assume that the births are ordered increasingly, then $\tau_b = \mathrm{id}$ and $\sigma_B$ can be defined directly by $\sigma_B = [j_1 j_2 \dots j_n]$, the indices of the deaths when they are ordered increasingly.
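The construction of the permutation of a strict barcode as the composite of the inverse birth ordering with the death ordering can be sketched in a few lines. The function name `barcode_permutation` is hypothetical; the output is in one-line notation with 1-based entries, so entry k is the birth rank of the bar with the k-th smallest death.

```python
def barcode_permutation(bars):
    """Return sigma_B in one-line notation (1-based) for a strict barcode,
    given as a list of (birth, death) pairs in any order."""
    n = len(bars)
    births = [b for b, _ in bars]
    deaths = [d for _, d in bars]
    if len(set(births)) < n or len(set(deaths)) < n:
        raise ValueError("the barcode is not strict")
    # tau_b[k] is the index of the bar with the (k+1)-th smallest birth,
    # tau_d[k] is the index of the bar with the (k+1)-th smallest death.
    tau_b = sorted(range(n), key=lambda i: births[i])
    tau_d = sorted(range(n), key=lambda i: deaths[i])
    # Inverse of tau_b: the birth rank of each bar.
    birth_rank = [0] * n
    for rank, index in enumerate(tau_b):
        birth_rank[index] = rank
    # sigma_B = tau_b^{-1} composed with tau_d.
    return [birth_rank[j] + 1 for j in tau_d]


print(barcode_permutation([(1, 4), (2, 3), (0, 5)]))
print(barcode_permutation([(0, 5), (1, 4), (2, 3)]))  # same bars, reindexed
```

Both calls describe the same multiset of bars with different indexings and return the same permutation, illustrating that the result is intrinsic to the barcode.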

Background on Coxeter groups and complexes
Coxeter groups Coxeter groups form a family of groups that was defined by Tits in its modern form.They are abstract versions of reflection groups; in fact, the family of finite Coxeter groups coincides with the family of finite reflection groups.Besides their close connections to geometry and topology [15], Coxeter groups have a rich combinatorial theory [6].They appear in many areas of mathematics, e.g. as Weyl groups in Lie theory.We will view Sym n as one of the most basic examples of a Coxeter group.
Usually, one does not consider a Coxeter group W by itself but instead a Coxeter system (W, S), where S is a generating set of W that consists of involutions called the simple reflections. In what follows, we will tacitly assume that such a set of simple reflections is always fixed when we talk about a Coxeter group W . In the case where W = Sym n , we will take S to be the set of adjacent transpositions $S = \{(i, i+1) \mid 1 \le i \le n-1\}$.
Coxeter complexes Each Coxeter group W can be assigned a simplicial complex Σ(W ), the Coxeter complex, that is equipped with an action of W . If W is a finite group with set of simple reflections S, the complex Σ(W ) is a triangulation of a sphere of dimension |S|−1. Coxeter complexes have nice combinatorial properties and are in particular colourable flag complexes [1, Section 1.6] that are shellable [5].
The top-dimensional simplices of Σ(W ) are in one-to-one correspondence with the elements of the group W . Furthermore, one recovers the Cayley graph of (W, S) as the chamber graph of Σ(W ), i.e. the graph that has a vertex for each top-dimensional simplex of Σ(W ) and an edge connecting two vertices if the corresponding simplices share a codimension-1 face.
The group W acts simplicially on Σ(W ) by left multiplication on the cosets, γ · (τ P ) := γτ P .
Remark 2.12.With a slight abuse of notation, we will in what follows often use the cosets τ P to also denote simplices in the geometric realisation of the Coxeter complex.To be coherent with the definition of a stratification (Definition 2.13), we will always consider these simplices to be closed.
The Coxeter complex Σ(Sym n ) For the case W = Sym n that we are interested in, the Coxeter complex Σ(Sym n ) is of dimension n − 2 and is isomorphic to the barycentric subdivision of the boundary of an (n − 1)-simplex.It can be realised geometrically as a triangulation of the (n − 2)-sphere.This complex is the dual to the permutohedron of order n (see Fig. 3).Fig. 6 depicts the Coxeter complex Σ(Sym 4 ).The top-dimensional simplices of Σ(Sym n ) are in one-to-one correspondence with the elements of Sym n .Two such simplices share a codimension-1 face if and only if the corresponding permutations differ by precomposing with an adjacent transposition (i, i + 1), i.e. by exchanging two neighbouring entries of the permutation.As a consequence, if x lies in the interior of a maximal simplex of the geometric realisation of Σ(Sym n ), it can be assigned a permutation τ ∈ Sym n .If x lies on a face of dimension k, then τ is well-defined only up to applying an element of a parabolic subgroup P ≤ Sym n that is generated by |S| − 1 − k = n − 2 − k adjacent transpositions.A concrete embedding of Σ(Sym n ) in R n will be described in more detail in Section 3.
Figure 6: The geometric realisation of the Coxeter complex Σ(Sym 4 ).The permutation corresponding to each triangle of the front of the sphere is indicated in black.The hyperplanes x i = x j depicted in colours correspond to the transpositions (i, j).The hyperplanes corresponding to adjacent transpositions (i, i+1) are in boldface.A detailed description of how to obtain such a geometric realisation of the Coxeter complex can be found in Section 3.
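The adjacency of top-dimensional simplices described above (two simplices share a codimension-1 face exactly when the permutations differ by exchanging two neighbouring entries) can be made concrete as follows. This is an illustrative sketch with hypothetical function names; permutations are stored as tuples in one-line notation.

```python
from itertools import permutations


def adjacent_chambers(perm):
    """Permutations obtained from `perm` (one-line notation) by exchanging
    two neighbouring entries, i.e. by precomposing with some (i, i + 1)."""
    neighbours = []
    for i in range(len(perm) - 1):
        swapped = list(perm)
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        neighbours.append(tuple(swapped))
    return neighbours


def chamber_graph_edges(n):
    """Edge set of the chamber graph on the permutations of {1, ..., n}."""
    edges = set()
    for perm in permutations(range(1, n + 1)):
        for other in adjacent_chambers(perm):
            edges.add(frozenset((perm, other)))
    return edges


print(adjacent_chambers((1, 3, 2)))
print(len(chamber_graph_edges(4)))  # each of the 24 chambers has 3 neighbours
```

Each permutation of n letters has n − 1 neighbours, so the graph for n = 4 has 24 · 3 / 2 edges, in line with the count of edges of the permutohedron of order 4.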
For later reference, we note that the identification $S^{n-2} \cong \Sigma(\mathrm{Sym}_n)$ gives a stratification of the sphere by its simplicial decomposition. The strata are the (closed) simplices of the geometric realisation and the stratification is over the partially ordered set (poset) specified by Eq. (2).
Definition 2.13. [7] A set $X$ is stratified over a poset $P$ if there exists a collection of subsets $\{X_i\}_{i \in P}$ of $X$ such that: 3. if $X_i \cap X_j \neq \emptyset$, then it is a union of strata; 4. for every $x \in X$, there exists a unique $i_x \in P$ such that $\bigcap_{X_i \ni x} X_i = X_{i_x}$. Each $X_i$ is called a stratum.

Coxeter complex coordinates on R n
In this section, we describe $\mathbb{R}^n$ as the product of a cone over the Coxeter complex $\Sigma(\mathrm{Sym}_n)$ with a 1-dimensional space orthogonal to it. This description is obtained by describing a standard way for realising $\mathrm{Sym}_n$ as a reflection group [1, Example 1.11]. In terms of Coxeter groups, this is often called the "dual representation".
In what follows, we will consider $\mathbb{R}^n$ with the $l_2$-norm $\|\cdot\|$ that is induced by the standard scalar product $\langle \cdot, \cdot \rangle$. We let $e_1, \dots, e_n$ denote the standard basis. The symmetric group $\mathrm{Sym}_n$ acts on $\mathbb{R}^n$ by permuting this standard basis. This action can be expressed in coordinates as
$$\gamma \cdot (x_1, \dots, x_n) = (x_{\gamma^{-1}(1)}, \dots, x_{\gamma^{-1}(n)}). \tag{3}$$
It is norm-preserving and fixes the 1-dimensional subspace $L = \langle e \rangle$ spanned by $e := e_1 + \dots + e_n = (1, \dots, 1)$. Hence, there is an induced action on the orthogonal complement $V = e^\perp$, which can be described as $V = \{(x_1, \dots, x_n) \in \mathbb{R}^n \mid \sum_{i=1}^n x_i = 0\}$. Note that $L$ is the subspace consisting of all $(x_1, \dots, x_n) \in \mathbb{R}^n$ where $x_i = x_j$ for all $i, j$. So in particular, every $(x_1, \dots, x_n) \in \mathbb{R}^n \setminus L$ has at least two coordinates that are different from one another.
The subspace $V$ has a natural structure of a cone over the Coxeter complex $\Sigma(\mathrm{Sym}_n)$ associated to $\mathrm{Sym}_n$, see Remark 3.3. The transposition $(i, j) \in \mathrm{Sym}_n$ acts on $V$ by orthogonal reflection along the hyperplane $\{x \in V \mid x_i = x_j\}$, permuting the $i$-th and $j$-th coordinates. Let $\mathcal{H}$ be the collection of all these hyperplanes, and let $S_r$ denote the $(n-2)$-sphere of radius $r > 0$ around the origin in $V$ (with respect to the norm induced by the restriction of the standard scalar product on $\mathbb{R}^n$), i.e. $S_r = \{v \in V \mid \|v\| = r\}$. The set of points $x \in \mathbb{R}^n$ such that all coordinates are different is the configuration space $\mathrm{Conf}_n(\mathbb{R}) = \{(x_1, \dots, x_n) \in \mathbb{R}^n \mid x_i \neq x_j \text{ for all } i \neq j\}$.
The previous lemma describes how a permutation in $\mathrm{Sym}_n$ can be associated to each point $x \in \mathrm{Conf}_n(\mathbb{R})$. To understand why this is true, observe that if $C$ is a connected component of $S_r \setminus \mathcal{H}$, then for all $(x_1, \dots, x_n) \in C$ we have $x_i \neq x_j$ for all $i \neq j$. In particular, there is a unique $\tau \in \mathrm{Sym}_n$ such that
$$x_{\tau(1)} < x_{\tau(2)} < \dots < x_{\tau(n)}. \tag{4}$$
In other words, the order of the elements $x_1, \dots, x_n$ is given by $\tau((1, \dots, n))$, see Fig. 6 above for the case $n = 4$. The connected components of $S_r \setminus \mathcal{H}$ are exactly the (interiors of) the maximal simplices of $\Sigma$. Sending each such component $C$ to the facet of $\Sigma(\mathrm{Sym}_n)$ that corresponds to the permutation $\tau$ defined by Eq. (4) gives the desired isomorphism $\Sigma \cong \Sigma(\mathrm{Sym}_n)$.
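Reading off the permutation of a point with pairwise distinct coordinates amounts to sorting its indices by value. The sketch below assumes the convention that the permutation τ lists the indices in increasing order of their coordinates, which is the convention used in Example 3.4, where a point with v 2 < v 3 < v 1 lies in the stratum [231]. The function name `chamber_permutation` is hypothetical.

```python
def chamber_permutation(x):
    """One-line notation (1-based) of the permutation tau with
    x[tau(1)] < x[tau(2)] < ... < x[tau(n)], for pairwise distinct coordinates."""
    n = len(x)
    if len(set(x)) < n:
        raise ValueError("coordinates must be pairwise distinct")
    # Sort the indices 0, ..., n-1 by the value of the coordinate they carry.
    return [i + 1 for i in sorted(range(n), key=lambda i: x[i])]


print(chamber_permutation([3.0, 1.0, 2.0]))        # x2 < x3 < x1
print(chamber_permutation([0.5, 2.0, -1.0, 1.0]))  # x3 < x1 < x4 < x2
```

Only the relative order of the coordinates matters here, so rescaling a point or adding a constant to all coordinates leaves the permutation unchanged.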
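Using spherical coordinates, we can express every point $v \in V$ in terms of a radial component $r > 0$ and an angular component, which is equivalent to specifying a point $v_\theta \in S_r$ (i.e. a point in the geometric realisation of $\Sigma(\mathrm{Sym}_n)$). The upshot of this is that we obtain a new set of coordinates for points in $\mathbb{R}^n \setminus L$.
Proposition 3.2. Let $n \ge 2$. There exist two projection maps $p: \mathbb{R}^n \to \mathbb{R} \times \mathbb{R}_{\ge 0}$ and $q: \mathbb{R}^n \setminus L \to \Sigma(\mathrm{Sym}_n)$ such that $(p|_{\mathbb{R}^n \setminus L}, q): \mathbb{R}^n \setminus L \to \mathbb{R} \times \mathbb{R}_{>0} \times \Sigma(\mathrm{Sym}_n)$ is a bijection. Let $\mathrm{Sym}_n$ act on $\mathbb{R}^n$ by permuting the coordinates (Eq. (3)) and on the product $\mathbb{R} \times \mathbb{R}_{>0} \times \Sigma(\mathrm{Sym}_n)$ by extending the action on $\Sigma(\mathrm{Sym}_n)$ trivially on the first two factors. Then the map $(p|_{\mathbb{R}^n \setminus L}, q)$ is $\mathrm{Sym}_n$-equivariant.
Proof. For every $x \in \mathbb{R}^n$, the orthogonal decomposition $\mathbb{R}^n = \langle e \rangle \oplus V$ gives a unique way to write $x = \bar{x} \cdot e + v_x$ with $\bar{x} \in \mathbb{R}$ and $v_x \in V$, where $\bar{x} = \frac{\langle e, x \rangle}{\langle e, e \rangle} = \frac{1}{n} \sum_{i=1}^n x_i$. We can describe the projection $v_x = x - \bar{x} \cdot e \in V$ in spherical coordinates. Its norm (the radius of the sphere) is $\|v_x\| = \big( \sum_{i=1}^n |x_i - \bar{x}|^2 \big)^{1/2}$, so $v_x$ is determined by this value together with a point $x_\theta$ on the $(n-2)$-sphere $S_{\|v_x\|}$, or equivalently on the geometric realisation of $\Sigma(\mathrm{Sym}_n)$. Notice that $x \in L$ if and only if $v_x = 0$, as the line $L$ intersects $V$ at its origin.
We define the map $p: \mathbb{R}^n \to \mathbb{R} \times \mathbb{R}_{\ge 0}: x \mapsto (\bar{x}, \|v_x\|)$ and the map $q: \mathbb{R}^n \setminus L \to S^{n-2}: x \mapsto x_\theta$. The point $x_\theta$ is well-defined since $x \notin L$ and therefore there exist $i, j$ such that $x_i \neq x_j$. It is easy to see that $(p|_{\mathbb{R}^n \setminus L}, q)$ is a bijection. The fact that $(p|_{\mathbb{R}^n \setminus L}, q)$ is $\mathrm{Sym}_n$-equivariant follows from Lemma 3.1 and because permuting the coordinates of $x \in \mathbb{R}^n$ changes neither the average $\frac{1}{n} \sum_i x_i$ nor the standard deviation $\big( \sum_i |x_i - \bar{x}|^2 \big)^{1/2}$.
To summarise, every point $x = (x_1, \dots, x_n) \in \mathbb{R}^n \setminus L$ determines the following three things:
1. its projection to $L$, given by $\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i \in \mathbb{R}$;
2. the norm of its projection to $V$, given by $\|v_x\| = \big( \sum_{i=1}^n |x_i - \bar{x}|^2 \big)^{1/2} \in \mathbb{R}_{>0}$;
3. a point $x_\theta$ in the geometric realisation of the Coxeter complex $\Sigma(\mathrm{Sym}_n)$ associated to $\mathrm{Sym}_n$.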
Furthermore, x is uniquely determined by these three coordinates.
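The three coordinates of a point (average, norm of the centred vector, and direction) and the reconstruction of the point from them can be sketched numerically. As an assumption made only for this illustration, the point on the Coxeter complex is represented by the unit vector in the direction of the centred vector; the function names are hypothetical.

```python
from math import sqrt


def coxeter_coordinates(x):
    """Split x into (average, norm of the centred vector, unit direction).
    Raises ValueError if all coordinates agree, i.e. if x lies on the line L."""
    n = len(x)
    mean = sum(x) / n
    centred = [xi - mean for xi in x]           # projection of x to V
    radius = sqrt(sum(c * c for c in centred))  # norm of that projection
    if radius == 0.0:
        raise ValueError("x lies on the line L; no direction is defined")
    direction = [c / radius for c in centred]   # stand-in for the point x_theta
    return mean, radius, direction


def from_coxeter_coordinates(mean, radius, direction):
    """Recover x from its three coordinates."""
    return [mean + radius * u for u in direction]


mean, radius, direction = coxeter_coordinates([1.0, 2.0, 6.0])
print(mean, radius, direction)
print(from_coxeter_coordinates(mean, radius, direction))
```

Permuting the entries of the input changes only the direction, not the average or the norm, which is the numerical counterpart of the equivariance statement in Proposition 3.2.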

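Remark 3.3. There is an isomorphism $V \cong \mathrm{cone}(\Sigma(\mathrm{Sym}_n))$, which gives rise to a decomposition $\mathbb{R}^n \cong \mathrm{cone}(\Sigma(\mathrm{Sym}_n)) \times \mathbb{R}$. Indeed, the line $L \subset \mathbb{R}^n$ corresponds to points $x \in \mathbb{R}^n$ with $v_x = 0$, which could be seen as "spheres of radius 0" in the projection $q$.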
Example 3.4. We go through the previous construction in detail for the case of $\mathbb{R}^3$ equipped with the natural action of the symmetric group $\mathrm{Sym}_3$, illustrating the example in Fig. 7. Consider $\mathbb{R}^3 = \langle e_1, e_2, e_3 \rangle$. The symmetric group $\mathrm{Sym}_3$ acts on it by permuting the coordinates of each vector $(x_1, x_2, x_3)$: $\gamma \cdot (x_1, x_2, x_3) = (x_{\gamma^{-1}(1)}, x_{\gamma^{-1}(2)}, x_{\gamma^{-1}(3)})$. Each $\gamma \in \mathrm{Sym}_3$ can be written as a product of transpositions $(i, j)$ and its action on $\mathbb{R}^3$ is given by performing the corresponding sequence of reflections along the hyperplanes $x_i = x_j$. The three (2-dimensional) planes corresponding to the equations $x_1 = x_2$, $x_2 = x_3$ and $x_1 = x_3$ are indicated as lines on the left hand side of Fig. 7 to make the picture clearer. The subspace $L$ that is invariant under this action is spanned by the vector $(1, 1, 1) = e$, shown in red in Fig. 7.
We can define new coordinates on $\mathbb{R}^3$, lying in $\langle e \rangle = L$ and $e^\perp = V$, a 2-dimensional subspace whose affine shift is depicted in green in Fig. 7, reflecting the decomposition of $\mathbb{R}^3$ into a product of $\langle e \rangle$ and $V$. A point $x \in \mathbb{R}^3$ can now be written as $\bar{x} \cdot e + v_x$, where $\bar{x} \in \mathbb{R}$ and $v_x \in V$. We show on the right hand side of Fig. 7 how $V$, represented as $\mathbb{R}^2$, has the structure of a cone over a Coxeter complex. The figure shows the projections of the planes $x_1 = x_2$, $x_2 = x_3$ and $x_1 = x_3$ and the intersection of $V$ with the subspace $\langle e \rangle$ (red dot). To obtain the cone structure on $V$, we give it spherical coordinates (i.e. polar coordinates in this case). The first coordinate is the radius $r$, which determines a 1-sphere centred at the origin (the black circle). On the circle, a point $v_x$ is determined by an angle $x_\theta$. Intersecting the circle with the hyperplanes, we decompose it into $|\mathrm{Sym}_3| = 6$ (coloured) strata indexed by the symmetric group and forget about the angle $x_\theta$. For instance, if $v = (v_1, v_2, v_3)$ with $v_2 < v_3 < v_1$, the point $v$ lies in the stratum indexed by $[231]$; this is the unique region that lies on those sides of the hyperplanes that satisfy $x_1 > x_2$, $x_2 < x_3$ and $x_1 > x_3$.
Let $\gamma = (1, 2)$. It acts on $v$ via $\gamma \cdot (v_1, v_2, v_3) = (v_2, v_1, v_3)$. We denote its image by $v^\gamma := \gamma \cdot v$. The order of the coordinates of $v^\gamma$ satisfies $v^\gamma_1 \le v^\gamma_3 \le v^\gamma_2$, so $v^\gamma$ lies in the stratum indexed by the permutation $[132]$. The image $v^\gamma$ of $v$ through the action of $\gamma$ corresponds to the reflection through the hyperplane $x_1 = x_2$.
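The computation in this example can be replayed numerically. The sketch below assumes that a permutation acts by moving the i-th coordinate to position γ(i), which is the coordinate form of permuting the standard basis, and it checks that the stratum of the reflected point is obtained by left multiplication with γ. All function names and the sample point are chosen for illustration only.

```python
def act(gamma, x):
    """Apply gamma (one-line notation, 1-based) to x: coordinate i moves to
    position gamma(i), so the result y satisfies y[gamma(i)] = x[i]."""
    y = [None] * len(x)
    for i, xi in enumerate(x):
        y[gamma[i] - 1] = xi
    return y


def stratum(x):
    """Permutation listing the indices of x in increasing order of their values."""
    return [i + 1 for i in sorted(range(len(x)), key=lambda i: x[i])]


def compose(gamma, tau):
    """One-line notation of the composite gamma after tau."""
    return [gamma[t - 1] for t in tau]


v = [1.0, -1.0, 0.0]   # a point of V with v2 < v3 < v1
gamma = [2, 1, 3]      # the transposition (1, 2)
v_gamma = act(gamma, v)

print(stratum(v))        # stratum of v
print(v_gamma)           # reflection of v through the hyperplane x1 = x2
print(stratum(v_gamma))  # stratum of the reflected point
print(compose(gamma, stratum(v)))
```

The last two printed permutations agree, matching the statement that the symmetric group acts on the simplices of the Coxeter complex by left multiplication.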
Remark 3.5. There are two special cases in Proposition 3.2: when $x_i = x_j$ for all $i, j$, i.e. $(x_1, \dots, x_n) \in L$, and when $x_i \neq x_j$ for all $i \neq j$, i.e. $(x_1, \dots, x_n) \in \mathrm{Conf}_n(\mathbb{R})$. For the former, we have $p(x) = (\bar{x}, \|v_x\|) = (x_i, 0)$ and $x_\theta$ is not defined. For the latter, $q(x) = x_\theta$ lies in the interior of a top-dimensional simplex of $\Sigma(\mathrm{Sym}_n)$. Hence, it determines a unique element $\tau_x \in \mathrm{Sym}_n$. In fact, these are just the two extremes of a family of situations that can occur: If $x_i = x_j$ for some $i \neq j$, then $x_\theta$ lies on the corresponding hyperplane in $\mathcal{H}$ and hence on a lower-dimensional face of $\Sigma(\mathrm{Sym}_n)$. There exists a permutation $\tau \in \mathrm{Sym}_n$ such that $x_{\tau(1)} \le x_{\tau(2)} \le \dots \le x_{\tau(n)}$, but $\tau$ is not unique. It is defined only up to multiplication by the subgroup $P = \langle (i, i+1) \mid x_{\tau(i)} = x_{\tau(i+1)} \rangle$. Note that $P$ is generated by adjacent transpositions $(i, i+1)$, i.e. it is of the form $\langle T \rangle$, where $T \subset S$ is a subset of the set $S$ of simple reflections of $\mathrm{Sym}_n$. Hence, it is a parabolic subgroup of $\mathrm{Sym}_n$ (see Section 2.2). The number of adjacent transpositions in $P$ depends on how many coordinates of $(x_1, \dots, x_n)$ agree, or, equivalently, the number of hyperplanes in $\mathcal{H}$ it lies on. Intuitively speaking, one could phrase this as "the more of the $x_i$'s take the same value, the less 'permutation information' is left". The coset $\tau P$ corresponds to the lowest dimensional face of $\Sigma(\mathrm{Sym}_n)$ that $x$ lies on. It depends only on the values of the $x_i$, not on the choice of $\tau$. If $x \in L$, we have $\tau P = \mathrm{Sym}_n$. This could be interpreted as the degenerate case where $x_\theta$ lies on the unique $(-1)$-dimensional face of $\Sigma(\mathrm{Sym}_n)$ (see Definition 2.11).
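For a point with repeated coordinates, one admissible permutation and the generators of the parabolic subgroup can be read off from a sorted list of indices. The sketch below makes two assumptions for illustration: ties are broken by index to pick one representative τ, and equality of floating-point coordinates is tested exactly. The face dimension uses the count from Section 2.2, namely that a face of dimension k corresponds to n − 2 − k adjacent transpositions. The function name `face_data` is hypothetical.

```python
def face_data(x):
    """Return (tau, generators, dimension) for a point x:
    tau is one permutation with x[tau(1)] <= ... <= x[tau(n)] (1-based),
    generators lists the adjacent transpositions (k, k + 1) for which
    x[tau(k)] == x[tau(k + 1)], and dimension is n - 2 - len(generators)."""
    n = len(x)
    # Break ties by index so that the chosen representative is deterministic.
    order = sorted(range(n), key=lambda i: (x[i], i))
    generators = [
        (k + 1, k + 2) for k in range(n - 1) if x[order[k]] == x[order[k + 1]]
    ]
    tau = [i + 1 for i in order]
    return tau, generators, n - 2 - len(generators)


print(face_data([2.0, 1.0, 2.0, 5.0]))  # one tie: a face of codimension 1
print(face_data([3.0, 1.0, 2.0]))       # no ties: a top-dimensional simplex
print(face_data([1.0, 1.0, 1.0]))       # a point of L: the (-1)-dimensional face
```

Any other valid choice of τ differs from the returned one by a product of the listed generators, so the coset of τ modulo the subgroup they generate is independent of the tie-breaking rule.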
Remark 4.1. We write $X := \operatorname{Sym}_n \backslash (\mathbb{R}^n \times \mathbb{R}^n)$ to emphasise that $\operatorname{Sym}_n$ acts from the left on this space. We stress this because later on, we will combine the statements here with descriptions of the Coxeter complex. There, the simplices are given by cosets $\tau P$ and the symmetric group acts on them by left multiplication.
There is a map $\varphi$ from the space of barcodes with $n$ bars to $X$ given by $\varphi : \mathcal{B}_n \to X$, $\{(b_i, d_i)\}_{i \in \{1,\dots,n\}} \mapsto [b_1, \dots, b_n, d_1, \dots, d_n]$. The image of a barcode under $\varphi$ is independent of the choice of indices for its bars because the action of $\operatorname{Sym}_n$ is factored out. The map $\varphi$ is clearly injective, but it is not surjective, as the birth time of a homology class is always smaller than its death time. The image of $\varphi$ is the subspace $Y$ of $X$ given by $Y = \{[x_1, \dots, x_n, y_1, \dots, y_n] \in X \mid x_i < y_i \text{ for all } i\}$. For later reference, we record this observation as Proposition 4.2. In Section 5, we equip $\mathcal{B}_n$ with metrics inspired by the bottleneck and Wasserstein distances. The map $\varphi$ is an isometry with respect to these metrics.

Coxeter complexes for birth and death
We now introduce the Coxeter complex coordinates for $\mathcal{B}_n$. These coordinates are obtained by applying the map $(p|_{\mathbb{R}^n \setminus L}, q)$ of Proposition 3.2 to the two copies of $\mathbb{R}^n$ in $Y$.
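As a rough numerical illustration, the sketch below splits a vector into its average, the norm of its centred part and the resulting unit direction, and applies this to the births and the deaths of a barcode separately. This is a minimal sketch under the assumption that the coordinates of Proposition 3.2 are of this form. The function name `coxeter_coordinates` is ours, the example barcode is made up, and the norm computed here may differ from a standard deviation by a normalising factor.

```python
import math


def coxeter_coordinates(x):
    """Split x into (mean, radius, direction).

    mean      : average of the entries (coordinate along the line L)
    radius    : Euclidean norm of the centred vector x - mean * (1, ..., 1)
    direction : the centred vector rescaled to norm 1, or None if x lies on L
    """
    n = len(x)
    mean = sum(x) / n
    centred = [xi - mean for xi in x]
    radius = math.sqrt(sum(c * c for c in centred))
    if radius == 0.0:
        return mean, 0.0, None  # all entries agree: no direction is defined
    return mean, radius, [c / radius for c in centred]


# Hypothetical barcode with three bars, given as (birth, death) pairs.
barcode = [(0.0, 3.0), (1.0, 5.0), (2.0, 4.0)]
births = [b for b, _ in barcode]
deaths = [d for _, d in barcode]
print(coxeter_coordinates(births))
print(coxeter_coordinates(deaths))
```

The original vector is recovered as the mean plus the radius times the direction, which reflects the product decomposition used in the text.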

A stratification of B n
In this section, we describe the stratification that we obtain from the description of $\mathcal{B}_n$ in terms of Coxeter complexes.
We start by extending Definition 2.8, the permutation assigned to a strict barcode, to the general case of $\mathcal{B}_n$. For non-strict barcodes, we cannot uniquely assign a permutation. However, there is a nice description of the set of all possible such permutations in terms of double cosets of parabolic subgroups: the double coset $D_B$ associated to $B$ is the set of all permutations $\sigma$ such that the $j$-th death (in increasing order) is paired with the $\sigma(j)$-th birth.
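The set of permutations described above can be enumerated by brute force for small barcodes. The following Python sketch assumes that "in increasing order" allows every tie-breaking order among equal births and among equal deaths, and that $\sigma(j)$ is the rank of the birth of the bar carrying the $j$-th death. Whether the resulting set agrees in all cases with the formal definition of $D_B$ is an assumption on our part. The function names and example barcodes are hypothetical.

```python
from itertools import permutations


def weak_orders(values):
    """All 0-based orderings p of the bars with values[p[0]] <= ... <= values[p[-1]]."""
    n = len(values)
    return [
        p for p in permutations(range(n))
        if all(values[p[k]] <= values[p[k + 1]] for k in range(n - 1))
    ]


def paired_permutations(barcode):
    """Set of permutations sigma (1-based one-line notation) such that the
    j-th death is paired with the sigma(j)-th birth, over all tie-breakings."""
    births = [b for b, _ in barcode]
    deaths = [d for _, d in barcode]
    result = set()
    for tau_b in weak_orders(births):
        birth_rank = {bar: k for k, bar in enumerate(tau_b)}  # inverse of tau_b
        for tau_d in weak_orders(deaths):
            result.add(tuple(birth_rank[bar] + 1 for bar in tau_d))
    return result


strict = [(0, 4), (1, 2), (2, 5), (3, 3.5)]
print(paired_permutations(strict))              # {(2, 4, 1, 3)}

non_strict = [(0, 3), (0, 2), (1, 3)]           # two equal births, two equal deaths
print(sorted(paired_permutations(non_strict)))  # [(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1)]
```

For a strict barcode the set has a single element, matching the permutation $\sigma_B = \tau_b^{-1} \tau_d$ of Definition 2.8, while ties enlarge it.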
Recall that the Coxeter complex $\Sigma(\operatorname{Sym}_n)$ is a simplicial complex with simplices given by cosets $\tau P$ of parabolic subgroups. This simplicial decomposition gives it the structure of a stratified space over the poset of cosets of parabolic subgroups equipped with reverse inclusion (see Section 2.2). Taking the cone and products of these simplices yields a decomposition of $\mathbb{R}^n \times \mathbb{R}^n$ into strata that are compatible with the action of $\operatorname{Sym}_n$, i.e. each stratum is sent to another stratum of the same dimension by the action of $\operatorname{Sym}_n$. This follows from Remark 3.3 and the fact that $\Sigma(\operatorname{Sym}_n)$ is stratified and the map $(p|_{\mathbb{R}^n \setminus L}, q)$ of Proposition 3.2 is $\operatorname{Sym}_n$-equivariant. The strata in Eq. (5) are indexed by pairs of cosets $(\tau_1 P_1, \tau_2 P_2)$, where $\tau_1, \tau_2 \in \operatorname{Sym}_n$ and $P_1, P_2 \le \operatorname{Sym}_n$ are parabolic subgroups. The partial ordering on these pairs is given component-wise by reverse inclusion (cf. Eq. (2)).
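It follows that the quotient $X = \operatorname{Sym}_n \backslash \mathbb{R}^{2n}$ is stratified over the quotient $\mathcal{P}$ of this poset by the action of $\operatorname{Sym}_n$. More concretely, $\mathcal{P}$ can be described as follows: The elements of $\mathcal{P}$ are orbits of the form $\operatorname{Sym}_n \cdot (\tau_1 P_1, \tau_2 P_2)$, where $\tau_1, \tau_2 \in \operatorname{Sym}_n$ and $P_1, P_2 \le \operatorname{Sym}_n$ are parabolic subgroups. The partial ordering is given by $\operatorname{Sym}_n \cdot (\tau_1 P_1, \tau_2 P_2) \le \operatorname{Sym}_n \cdot (\tau_1' P_1', \tau_2' P_2')$ if there is $\gamma \in \operatorname{Sym}_n$ such that $(\gamma \tau_1 P_1, \gamma \tau_2 P_2) \le (\tau_1' P_1', \tau_2' P_2')$ in the component-wise ordering.
This quotient poset $\mathcal{P}$ has a more explicit description in terms of another poset $\mathcal{Q}$, which consists of "marked" double cosets of parabolic subgroups:
Definition 4.7. Let $\mathcal{Q}$ be the poset consisting of all triples $(P_1, P$ […]
A very similar poset is also studied as a two-sided version of the Coxeter complex by Hultman [23] and Petersen [31]. We remark that $\mathcal{Q}$ is different from the poset of all double cosets of the form $P_1 \sigma P_2$: There can be $P_1 \neq P_1'$, $P_2 \neq P_2'$ such that $P_1 \sigma P_2 = P_1' \sigma P_2'$ (see [31, Remark 4]).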

A metric on B n
In this section, we explain how the description of $\mathcal{B}_n$ given in Section 4.1, with $\mathbb{R}^n$ equipped with the $\ell^\infty$-norm, gives rise to a naturally defined metric $d_B$ on $\mathcal{B}_n$ that is closely related to the bottleneck distance. Similarly, the $\ell^2$-norm on $\mathbb{R}^n$ leads to a modified Wasserstein distance $d_W$ on $\mathcal{B}_n$.
To describe $d_B$, we equip $\mathbb{R}^{2n}$ with the metric $d_\infty$ induced by the $\ell^\infty$-norm. This metric induces a map $d : X \times X \to \mathbb{R}$ on the quotient by taking the minimum value over all representatives of the corresponding equivalence classes: $d([x, y], [x', y']) = \min_{\gamma \in \operatorname{Sym}_n} d_\infty((x, y), \gamma \cdot (x', y'))$ (Eq. (6)). We will show that this map restricted to $Y$ agrees with a modified version of the bottleneck distance.
Note that the difference between the modified bottleneck distance and the original bottleneck distance as defined in Definition 2.5 is that the modified version does not allow points of the barcodes to be matched to the diagonal $\Delta$ (see Fig. 8). Furthermore, $d_B(B, B')$ is well-defined only if both $B$ and $B'$ contain the same number of bars, i.e. if they are both elements of the same $\mathcal{B}_n$. This is not necessary for the definition of the regular bottleneck distance, cf. Remark 5.3.
This follows from simply spelling out the definitions. For points $(x, y)$ and $(x', y')$ in $\mathbb{R}^{2n}$, we have $d_\infty((x, y), (x', y')) = \max_i \|(x_i, y_i) - (x_i', y_i')\|_\infty$, where $\|\cdot\|_\infty$ is the $\ell^\infty$-norm on $\mathbb{R}^2$. Combining this with the definition of $d$ on $X$ (see Eq. (6)), we obtain $d(\varphi(B), \varphi(B')) = \min_{\gamma \in \operatorname{Sym}_n} \max_i \|(b_i, d_i) - (b'_{\gamma(i)}, d'_{\gamma(i)})\|_\infty = d_B(B, B')$.
Remark 5.3. Forgetting about the diagonal as done above opens the door to defining new metrics on barcodes by considering distances on $\mathbb{R}^n \times \mathbb{R}^n$ and then taking the quotient, as was done in this section. This could potentially be extended to barcodes with different numbers of bars. One could for instance imagine a map that forces matchings between as many bars as possible and then adds to the unmatched bars, if there are any, a positive weight equal to their distance to the diagonal. This is different from the bottleneck distance (or Wasserstein distance), which allows as many matchings with the diagonal as needed, see Fig. 8. When using barcodes to study data, bars close to the diagonal are usually considered as related to noise. However, there are cases where all the bars matter, for instance when the barcode is the one of a merge tree [27,14]. In such a case, a new metric that does not take the diagonal into account could turn out to be useful. We leave this for future work.
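The quotient construction of this section can be made concrete for small barcodes. The Python sketch below computes a modified bottleneck distance by minimising, over all bijections between the bars, the largest $\ell^\infty$-distance between matched bars, with no matching to the diagonal. It is a brute-force illustration under the assumption that the modified distance has exactly this min-max form. The function name `modified_bottleneck` and the sample barcodes are ours.

```python
from itertools import permutations


def modified_bottleneck(B1, B2):
    """Min over bijections g of max_i ||B1[i] - B2[g(i)]||_inf.

    Bars are (birth, death) pairs. Matching to the diagonal is not allowed,
    so both barcodes must have the same number of bars.
    Brute force over n! bijections: only suitable for small n.
    """
    if len(B1) != len(B2):
        raise ValueError("both barcodes need the same number of bars")
    n = len(B1)
    if n == 0:
        return 0.0
    return min(
        max(
            max(abs(B1[i][0] - B2[g[i]][0]), abs(B1[i][1] - B2[g[i]][1]))
            for i in range(n)
        )
        for g in permutations(range(n))
    )


B1 = [(0.0, 1.0), (5.0, 6.0)]
B2 = [(5.0, 6.5), (0.0, 1.25)]
print(modified_bottleneck(B1, B2))        # 0.5
print(modified_bottleneck(B1[::-1], B2))  # 0.5, independent of the indexing
```

Taking the minimum over all bijections is what makes the value independent of how the bars are indexed, mirroring the passage to the quotient by $\operatorname{Sym}_n$.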

Future directions
In this paper, we showed that the space $\mathcal{B}_n$ of barcodes with $n$ bars is stratified over the poset of marked double cosets of parabolic subgroups of $\operatorname{Sym}_n$. A question that arises is how this could be extended to the whole space of barcodes, i.e. to the union $\bigcup_{n \in \mathbb{N}} \mathcal{B}_n$. An approach here would be to use appropriate inclusions $\mathcal{B}_m \to \mathcal{B}_n$ for $m \le n$. Note that on the group level, there are natural injections $\operatorname{Sym}_m \to \operatorname{Sym}_n$. On the level of simplicial complexes, $\Sigma(\operatorname{Sym}_n)$ also contains copies of $\Sigma(\operatorname{Sym}_m)$ for $m \le n$.
It was shown in [27,14] that the permutation $\sigma_B$ associated to a strict barcode $B$ gives combinatorial insight into the number of merge trees that have the same barcode. This number, called the tree-realisation number (TRN), is derived directly from the permutation. It can also be used to do statistics on barcodes. Our coordinates (Corollary 4.10) firstly extend this work to any (possibly non-strict) barcode and secondly return a finer invariant than just the permutation. A future direction would be to study this finer invariant defined by $(\bar{b}, \bar{d}, \|v_b\|, \|v_d\|, \sigma_B)$. It might be well-suited for studying statistical questions: The first four elements already have descriptions as averages and standard deviations. The behaviour of the permutation $\sigma_B$ could be studied using tools from permutation statistics, such as the number of inversions or descents.
In a different direction, the description of $\mathcal{B}_n$ in terms of Coxeter complexes allows one to rephrase these combinatorial questions in more geometric terms. Using this geometric perspective might give new ways of studying invariants and statistics on barcodes.
It would be interesting to see if the geometric and combinatorial tools developed here can help to understand inverse problems in TDA such as those in [27,14,13,29]. Since the merge tree to barcode problem is related to the symmetric group [27,14], it is also natural to ask whether the stratification that we obtain in Theorem 4.9 can be extended to the space of merge trees with $n$ leaves.
Lastly, the modified bottleneck and Wasserstein distances seem to behave differently from the usual ones. A deeper study of their properties and their potential extension to the space of barcodes (see Remark 5.3) is a natural next step to consider.
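The two permutation statistics mentioned above are easy to compute. As a small illustration, the following Python sketch counts inversions and descents of a permutation given in one-line notation; the function names are ours, and the sample permutation $[4132]$ is the one appearing in the example after Definition 2.8.

```python
def inversions(sigma):
    """Number of pairs i < j with sigma(i) > sigma(j)."""
    n = len(sigma)
    return sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])


def descents(sigma):
    """Number of positions i with sigma(i) > sigma(i + 1)."""
    return sum(1 for i in range(len(sigma) - 1) if sigma[i] > sigma[i + 1])


sigma_B = [4, 1, 3, 2]
print(inversions(sigma_B))  # 4
print(descents(sigma_B))    # 2
```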

Figure 2: Two barcodes with the same associated permutation (the identity $[1234]$) but with large differences in their birth and death values.
[…] turns out that for each barcode, these coordinates are $b_\theta$, $\bar{b}$, $\|v_b\|$ and $d_\theta$, $\bar{d}$, $\|v_d\|$, where $\bar{b}$ and $\bar{d}$ are the averages of the births and deaths, $\|v_b\|$ and $\|v_d\|$ are their standard deviations and the coordinates $b_\theta$ and $d_\theta$ describe the permutation equivalence class of the barcode […]
Each such pair is called a bar; its first coordinate $b_i$ is called the birth (time) and the second one $d_i$ is called its death (time). A barcode is called strict if $b_i \neq b_j$ and $d_i \neq d_j$ for $i \neq j$. We let $\mathcal{B}_n$ denote the set of barcodes with $n$ bars and $\mathcal{B}^{\mathrm{st}}_n$ the set of strict barcodes with $n$ bars.

Figure 4: (A) A barcode with 4 bars. (B) The same barcode with a different indexing where the bars are ordered by increasing birth times.
Definition 2.8. [27] Let $B = \{(b_i, d_i)\}_{i \in \{1,\dots,n\}} \in \mathcal{B}^{\mathrm{st}}_n$ be a strict barcode. If we order the births increasingly such that $b_{i_1} < \dots < b_{i_n}$, the indexing in $\{1, \dots, n\}$ gives a permutation $\tau_b$ by $\tau_b(k) = i_k$ […]
Fig. 4B shows the same barcode with the bars ordered by birth times. The corresponding permutations $\tau_b = [1234]$ and $\tau_d = [4132]$ are different, but the product $\sigma_B = \tau_b^{-1} \tau_d = [4132]$ is the same, as it does not depend on the indexing of the bars. Further examples are depicted in Fig. 5. We extend Definition 2.8 to non-strict barcodes in Section 4.3.

[…, Corollary 1.75]. More generally, the set of $k$-simplices in $\Sigma(W)$ is in one-to-one correspondence with the cosets of rank-$(|S| - 1 - k)$ parabolic subgroups of $W$:
Definition 2.11. The Coxeter complex $\Sigma(W)$ of the Coxeter system $(W, S)$ is defined as the simplicial complex $\Sigma(W) = \bigcup_{T \subseteq S} W / P_T = \{\tau P_T \mid \tau \in W,\ T \subseteq S\}$, where each simplex $\tau P_T$ has dimension $\dim(\tau P_T) = |S \setminus T| - 1$ and the face relation is defined by the partial order $\tau P_T \le \tau' P_{T'} \Leftrightarrow \tau P_T \supseteq \tau' P_{T'}$.
[…] e.g. [1, Section 2.5.2]. Example 3.4 below goes through the following steps in detail for the case $n = 3$.
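For small $n$, the simplices of $\Sigma(\operatorname{Sym}_n)$ can be enumerated directly from Definition 2.11. The following Python sketch is a brute-force illustration, assuming that $S$ is the set of adjacent transpositions and that $P_T$ is the subgroup generated by $T \subseteq S$. It lists all cosets $\tau P_T$ and groups them by dimension $|S \setminus T| - 1$. All function names are hypothetical.

```python
from itertools import combinations, permutations


def compose(p, q):
    """Composition (p after q) of permutations given as 0-based tuples."""
    return tuple(p[q[i]] for i in range(len(q)))


def generated_subgroup(gens, n):
    """Subgroup of Sym_n generated by the involutions in gens (closure by search)."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def coxeter_complex(n):
    """Map each dimension to the set of cosets tau*P_T of that dimension."""
    simple = []
    for i in range(n - 1):  # adjacent transpositions (i, i+1)
        s = list(range(n))
        s[i], s[i + 1] = s[i + 1], s[i]
        simple.append(tuple(s))
    faces = {}
    for k in range(len(simple) + 1):
        for T in combinations(simple, k):
            P = generated_subgroup(T, n)
            dim = len(simple) - len(T) - 1
            for tau in permutations(range(n)):
                coset = frozenset(compose(tau, p) for p in P)
                faces.setdefault(dim, set()).add(coset)
    return faces


counts = {dim: len(cosets) for dim, cosets in coxeter_complex(3).items()}
print(counts)  # {1: 6, 0: 6, -1: 1}
```

For $n = 3$ this gives six edges, six vertices and the single $(-1)$-dimensional face, i.e. a hexagon, in line with the six strata on the circle in Example 3.4.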

Figure 7: Example of the decomposition of $\mathbb{R}^3$ in Coxeter coordinates.

Proposition 4.2.
The map $\varphi$ defines a bijection $\mathcal{B}_n \to Y \subset \operatorname{Sym}_n \backslash (\mathbb{R}^n \times \mathbb{R}^n)$.

Theorem 4.3.
Every barcode $\{(b_i, d_i)\}_{i \in \{1,\dots,n\}} \in \mathcal{B}_n$ such that at least two of the $b_i$ and two of the $d_i$ are different from each other determines the following five data: 1. its average birth time $\bar{b} = \frac{1}{n} \sum_{i=1}^{n} b_i$ […]

Furthermore, these five data uniquely determine $B$.
Proof. Let $B = \{(b_i, d_i)\}_{i \in \{1,\dots,n\}}$ be such that at least two $b_i$ and two $d_i$ are different. By assumption, both $(b_1, \dots, b_n)$ and $(d_1, \dots, d_n)$ are points in $\mathbb{R}^n \setminus L$. The image of $B$ under $\varphi$ (Proposition 4.2) is $[b_1, \dots, b_n, d_1, \dots, d_n]$ […] The image of $[b_1, \dots, b_n, d_1, \dots, d_n]$ under this bijection is the $\operatorname{Sym}_n$-orbit of $(p|_{\mathbb{R}^n \setminus L}, q)^2 (b_1, \dots, b_n, d_1, \dots, d_n) = (\bar{b}, \|v_b\|, b_\theta, \bar{d}, \|v_d\|, d_\theta)$. The claim now follows since the action of $\operatorname{Sym}_n$ on $(\bar{b}, \|v_b\|, b_\theta, \bar{d}, \|v_d\|, d_\theta)$ is trivial on $\bar{b}$, $\|v_b\|$, $\bar{d}$, $\|v_d\|$ and is given by the action of $\operatorname{Sym}_n$ on the Coxeter complex $\Sigma(\operatorname{Sym}_n)$ for $b_\theta$, $d_\theta$.

Corollary 4.10.
The Coxeter coordinates of Theorem 4.3 decompose the space $\mathcal{B}_n$ of barcodes with $n$ bars into disjoint regions. The region containing the barcode $B = \{(b_i, d_i)\}_{i \in \{1,\dots,n\}} \in \mathcal{B}_n$ is defined as the set of all barcodes $B'$ such that:
1. its average birth time is the same as that of $B$, i.e. $\bar{b}' = \bar{b}$;
2. its average death time is the same as that of $B$, i.e. $\bar{d}' = \bar{d}$;
3. its birth standard deviation is the same as that of $B$, i.e. $\|v_{b'}\| = \|v_b\|$;
4. its death standard deviation is the same as that of $B$, i.e. $\|v_{d'}\| = \|v_d\|$;
5. $P_{B'_b} = P_{B_b}$, $P_{B'_d} = P_{B_d}$ and $D_{B'} = D_B$.
For strict barcodes, the information in Item 5 is equivalent to specifying $\sigma_B$, the permutation associated to barcodes in Definition 2.8.

Proposition 5.2.
The map $d$ defines a metric on $Y$ with respect to which $\varphi : (\mathcal{B}_n, d_B) \to (Y, d)$ is an isometry.
Proof. As observed before in Proposition 4.2, $\varphi$ maps $\mathcal{B}_n$ bijectively onto $Y$. Hence, it is sufficient to show that for arbitrary barcodes $B$ and $B'$, $d_B(B, B') = d(\varphi(B), \varphi(B'))$.

Figure 8: Two barcodes (red and blue) represented as persistence diagrams in $\mathbb{R}^2$. (A) The matching that minimises the bottleneck or Wasserstein distance matches all the bars to the diagonal, as they are all very close to it. (B) If bars are not allowed to be matched with the diagonal, the matching that minimises $\|(b_i, d_i) - (b'_{\gamma(i)}, d'_{\gamma(i)})\|_\infty$ for the bottleneck distance, or $\sum_i \|(b_i, d_i) - (b'_{\gamma(i)}, d'_{\gamma(i)})\|_2$ for the Wasserstein distance, is different.