The Space of Equidistant Phylogenetic Cactuses

An equidistant X-cactus is a type of rooted, arc-weighted, directed acyclic graph with leaf set X that is used in biology to represent the evolutionary history of a set X of species. In this paper, we introduce and investigate the space of equidistant X-cactuses. This space contains, as a subset, the space of ultrametric trees on X that was introduced by Gavryushkin and Drummond. We show that equidistant-cactus space is a CAT(0)-metric space, which implies, for example, that there are unique geodesic paths between points. As a key step in proving this, we present a combinatorial result concerning ranked rooted X-cactuses. In particular, we show that such graphs can be encoded in terms of a pairwise compatibility condition arising from a poset of collections of pairs of subsets of X that satisfy certain set-theoretic properties. As a corollary, we also obtain an encoding of ranked, rooted X-trees in terms of partitions of X, which provides an alternative proof that the space of ultrametric trees on X is CAT(0). We expect that our results will provide the basis for novel ways to perform statistical analyses on collections of equidistant X-cactuses, as well as new directions for defining and understanding spaces of more general, arc-weighted phylogenetic networks.


Introduction
Currently, there is great interest in developing theory and techniques to understand and construct (rooted) phylogenetic networks. Generally speaking, for a set of species, such a network consists of a rooted, directed acyclic graph and a bijective map from the species to the set of sinks of the graph (in case the graph is a tree, the network is called a (rooted) phylogenetic tree). Phylogenetic networks are important as they can be used to represent the evolutionary history of species that cross with one another (through evolutionary processes such as hybridization and recombination). To date, much of the research on phylogenetic networks has focused on understanding the structure of special types of networks and ways to build them (see [33] for a recent overview of the area). More recently, however, as the theory for phylogenetic networks has developed, there has been growing interest in understanding how to equip collections of phylogenetic networks with suitable metrics, giving rise to so-called network spaces. As has been demonstrated for the intensively studied spaces of phylogenetic trees, or tree-spaces (cf., e.g., [8,17] and the review [31]), this point of view is valuable as it provides insights into statistical approaches to analyze and systematically compare networks.
Network spaces essentially come in two types: discrete and continuous. In discrete spaces, the elements of the space are distinct, non-isomorphic networks, and a metric is commonly given by defining the distance between two networks to be the length of a minimal sequence of local network operations that converts one network into the other. In continuous spaces, the arcs in the networks have non-negative, real-valued lengths and one network can be converted into the other by shrinking or lengthening arcs in a continuous manner. To date, nearly all results on network spaces have concerned discrete spaces (see, for example, [9,16,23]; for related results on discrete spaces of unrooted networks see, e.g., [22]). Indeed, to the best of our knowledge, very few results have been presented on continuous network spaces except for the recently introduced spaces of (unrooted) circular split networks [15]. This is probably in part because the study of phylogenetic networks with arc lengths is somewhat less developed than the study of those without.
Figure 1. (a) [...] c, d, e, f} with root ρ that is equidistant since every directed path from ρ to a sink has the same length, namely 13. All incoming arcs at vertices with indegree 2 have length 0 and are drawn horizontally. (b) The rooted X-cactus obtained by lengthening the incoming arc and shrinking the outgoing arcs at vertex v by 1. (c) The rooted X-cactus obtained by continuing the lengthening and shrinking of the arcs at vertex v until both outgoing arcs have length 0, contracting the cycle below v completely.
In this paper, we introduce a new continuous space of phylogenetic networks that can be regarded as a generalization of the τ-space of ultrametric trees that was introduced in [17]. For a set X of species, our network space N(X) consists of equidistant X-cactuses (see Figure 1(a) for an example of such a network). A rooted X-cactus is essentially a rooted phylogenetic network in which no two distinct cycles in the underlying graph have an arc in common. Note that if all vertices of a rooted X-cactus have indegree at most 1, the network is just a rooted phylogenetic X-tree. The extensively studied class of (rooted) level-1 networks (see e.g. [28]) also provides examples of rooted X-cactuses. If a non-negative real-valued length is assigned to each of the arcs in a rooted phylogenetic network N, then N is called equidistant if, for any fixed vertex v of N, all directed paths from v to any sink of N have the same length. Algorithms for constructing equidistant phylogenetic networks have been studied in, e.g., [10] and [13].
Following one of the common approaches used to construct tree-spaces, we define equidistant-cactus space N(X) in terms of an orthant space (see e.g. [24]). Basically, an orthant space is a collection of real orthants that are glued together along their boundaries and that is equipped with the metric induced by using the Euclidean metric within each orthant. That is, the distance between two points in the same orthant is the Euclidean distance between these points, and the distance between two points in different orthants is the length of a shortest path, or geodesic path, between these points. The length of such a path is computed by summing the Euclidean lengths of the restrictions of the path to each orthant. In particular, each pair of points in N(X) represents two equidistant X-cactuses, and moving along a geodesic path between the points continuously converts one X-cactus into the other by shrinking and lengthening arcs (see Figure 1(b) and (c)), which may also result in a change of the length of the paths from the root to the sinks. Note that the points of τ-space correspond bijectively to equidistant X-trees and that it can be constructed by gluing together orthants indexed by ranked phylogenetic trees. We take a similar approach to define N(X), indexing orthants instead by ranked X-cactuses, in which a ranking of the vertices that respects the direction of the arcs in the rooted X-cactus is given. We remark that ranked phylogenetic networks have been recently introduced and that research has focused on counting and enumerating certain classes of such networks (see e.g. [7,12] and the references therein).
A critical aspect that influenced our construction of N(X) was that, as has been shown for τ-space [17], we wanted it to be a CAT(0)-metric space. Being CAT(0) is an important geometrical property that has been exploited in various applications within phylogenetics and beyond (see e.g. [3]). A space being CAT(0) immediately implies that there is a unique geodesic path between any two points, a property that underpins many useful computations that can be performed for tree- and orthant-spaces. More specifically, approximations of the median as well as of the Fréchet mean and variance can be computed in complete CAT(0)-metric spaces, which include CAT(0)-orthant spaces [24,4]; a central limit theorem holds for CAT(0)-orthant spaces [5]; and methods for computing confidence sets [36] and an analogue of partial principal component analysis [26,25] can be directly extended from the unrooted tree space presented in [8] to CAT(0)-orthant spaces. Most of this paper is devoted to proving a crucial combinatorial result concerning rooted X-cactuses (Theorem 11) which implies, via a classical result of Gromov for orthant spaces, that N(X) is CAT(0). In passing, we remark that the space of networks described in [15] is not a CAT(0)-metric space.
The rest of this paper is structured as follows. In Section 2, we formally define rooted X-cactuses as well as some related concepts. In Section 3, we then introduce rankings of rooted X-cactuses and equidistant X-cactuses, which are both defined in terms of so-called time-stamp functions. As well as characterizing when a rooted X-cactus admits a ranking of its vertices that is consistent with the direction of its arcs, we make an important observation concerning ranked X-cactuses (Lemma 2), which implies that the maximal chains in a certain poset mentioned in the next paragraph all have the same length, namely |X| − 1. In Section 4, we use the simpler case of equidistant X-trees to outline our approach for the construction of a network space that is CAT(0), including a new proof that τ-space is CAT(0).
In Section 5, we describe how ranked X-cactuses give rise to set pair systems as defined in [21] and present the properties that characterize set pair systems that arise from ranked X-cactuses. We also define a binary relation on general set pair systems, and, in Section 6, we establish that this relation yields a bounded graded poset on the set pair systems that arise from ranked X-cactuses. In Section 7, we establish our main combinatorial result (Theorem 11), namely that chains in this poset encode ranked X-cactuses. In simpler terms, this can be regarded as a "pairwise compatibility" result for set pair systems, which is analogous to the well-known Splits Equivalence Theorem for unrooted phylogenetic trees (see e.g. [29, Theorem 3.1.4]). Using our encoding for ranked X-cactuses, in Section 8 we construct the space N(X) of equidistant X-cactuses and show that it is a CAT(0)-metric space. We conclude in Section 9 by mentioning some directions for future work.

Preliminaries
In this section, we define rooted X-cactuses and some related concepts that we use later. We begin by recalling some standard concepts from graph theory. A directed graph N = (V, A) consists of a finite non-empty set V and a subset A ⊆ V × V. The elements of V and A are referred to as vertices and arcs of N, respectively. A directed graph N is acyclic if there is no directed cycle in N. Moreover, a directed acyclic graph (DAG) N is rooted if there exists a vertex ρ ∈ V with indegree 0, called the root of N, such that for every u ∈ V there is a directed path from ρ to u. In a rooted DAG, a leaf is a vertex with outdegree 0, an internal vertex is a vertex with outdegree at least 1, a tree vertex is a vertex with indegree at most 1 and a reticulation vertex is a vertex with indegree at least 2. Note that, by definition, the root of a rooted DAG is a tree vertex. Moreover, in a rooted DAG N, we call a vertex v a child of a vertex u and, similarly, u a parent of v if (u, v) is an arc of N. The set of children of a vertex u is denoted by ch(u). A reticulation cycle {P, P′} in a rooted DAG consists of two distinct directed paths P and P′ such that P and P′ have the same start vertex and the same end vertex but no other vertices in common.
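The graph-theoretic notions above are easy to check mechanically. The following Python sketch is purely illustrative: the function names `classify_vertices` and `find_root` and the seven-vertex example graph are assumptions introduced here and do not appear in the paper. It classifies vertices by in- and outdegree and tests whether a directed graph is a rooted DAG.

```python
def classify_vertices(vertices, arcs):
    """Split the vertices into leaves, internal, tree and reticulation vertices."""
    indeg = {v: 0 for v in vertices}
    outdeg = {v: 0 for v in vertices}
    for u, v in arcs:
        outdeg[u] += 1
        indeg[v] += 1
    return {
        "leaves": {v for v in vertices if outdeg[v] == 0},
        "internal": {v for v in vertices if outdeg[v] >= 1},
        "tree": {v for v in vertices if indeg[v] <= 1},
        "reticulation": {v for v in vertices if indeg[v] >= 2},
    }


def find_root(vertices, arcs):
    """Return the root if (vertices, arcs) is a rooted DAG, otherwise None."""
    children = {v: [] for v in vertices}
    indeg = {v: 0 for v in vertices}
    for u, v in arcs:
        children[u].append(v)
        indeg[v] += 1
    sources = [v for v in vertices if indeg[v] == 0]
    if len(sources) != 1:
        return None
    # Kahn's algorithm: repeatedly remove vertices whose remaining indegree is 0.
    # With a unique source, all vertices get removed exactly when the graph is
    # acyclic, and then every vertex is reachable from that source.
    remaining = dict(indeg)
    queue = [sources[0]]
    removed = 0
    while queue:
        u = queue.pop()
        removed += 1
        for v in children[u]:
            remaining[v] -= 1
            if remaining[v] == 0:
                queue.append(v)
    return sources[0] if removed == len(vertices) else None


# Hypothetical example: one reticulation cycle formed by the two directed
# paths rho -> p1 -> h and rho -> p2 -> h.
V = ["rho", "p1", "p2", "h", "a", "b", "c"]
A = [("rho", "p1"), ("rho", "p2"), ("p1", "a"), ("p1", "h"),
     ("p2", "h"), ("p2", "b"), ("h", "c")]
kinds = classify_vertices(V, A)
root = find_root(V, A)
```

In this example the only reticulation vertex is `h`, the leaves are `a`, `b` and `c`, and the root is `rho`.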
Let X be a finite non-empty set. A rooted X-cactus N = (N, ϕ) is a rooted DAG N = (V, A) together with a map ϕ : X → V such that (RC1) all vertices of N have indegree at most 2, (RC2) no two distinct reticulation cycles in N have an arc in common, and (RC3) the image ϕ(X) contains all leaves and all tree vertices of N with outdegree 1.
In Figure 2(a) we give an example of a rooted X-cactus. We remark that if |X| = 1 a rooted X-cactus consists of a single vertex only. For better readability, we will often refer to the vertices and arcs of the underlying rooted DAG as the vertices and arcs of the rooted X-cactus itself. A rooted X-cactus N is phylogenetic if ϕ is a bijection between X and the set of leaves of N. Note that a rooted phylogenetic X-cactus may contain leaves that are reticulation vertices. A rooted X-cactus is binary if it is phylogenetic, all leaves of N are tree vertices, the root has outdegree 2 and every other internal vertex has either indegree 1 and outdegree 2 or indegree 2 and outdegree 1.
A rooted X-cactus N is compressed if ϕ(X) also contains all reticulation vertices with outdegree 1 (see [33, p. 251] for the concept of compression in more general phylogenetic networks). Rooted, compressed, phylogenetic X-cactuses as defined here correspond to 1-nested phylogenetic networks as defined in [21]. Note that a rooted, binary X-cactus that contains at least one reticulation vertex cannot be compressed. A rooted X-cactus without any reticulation vertices is called a rooted X-tree. Note that rooted X-trees as defined here are in one-to-one correspondence with the rooted X-trees as defined in [29] where the root is required to have outdegree 1.
In Section 7, we will need to associate with every rooted X-cactus N = ((V, A), ϕ) a rooted, phylogenetic X-cactus Ñ = ((Ṽ, Ã), ϕ̃) as follows: For every x ∈ X such that ϕ(x) is not a leaf of N or such that there exists some y ∈ X \ {x} with ϕ(y) = ϕ(x) we add a new vertex u to V, add the arc (ϕ(x), u) to A, and put ϕ̃(x) = u. For all other x ∈ X we put ϕ̃(x) = ϕ(x). The resulting sets of vertices and arcs are denoted by Ṽ and Ã, respectively (see Figure 2(b)). In addition, we associate with the resulting rooted, phylogenetic X-cactus Ñ the rooted, compressed, phylogenetic X-cactus N* = ((V*, A*), ϕ*) obtained by contracting all arcs (u, v) where u has outdegree 1 (see Figure 2(c)).
Rankings, time-stamp functions and equidistant X-cactuses
In this section, we consider rankings of the vertices of rooted X-cactuses, which are an important part of defining equidistant-cactus space. It is convenient to start with the more general concept of time-stamp functions, which also naturally leads to the definition of equidistant X-cactuses. A time-stamp function on the vertices in a rooted [...]
An example of a time-stamp function on the vertices of a rooted X-cactus is given in Figure 3. Integer-valued time-stamp functions are also known as temporal labelings (see e.g. [6]). We call a rooted X-cactus N temporal if there exists a time-stamp function on the vertices of N. Note that not every rooted X-cactus is temporal (for example, the rooted X-cactus in Figure 2(a) is not temporal because ϕ(X) contains an internal vertex that is not a parent of a reticulation vertex). The following lemma characterizes rooted X-cactuses that are temporal (see also [6, Theorem 3] for a characterization that applies to general rooted phylogenetic networks).
Lemma 1. A rooted X-cactus N = ((V, A), ϕ) is temporal if and only if for all vertices u ∈ V the following properties hold:
(a) If u ∈ ϕ(X) then u is either a leaf or a parent of a reticulation vertex that is a leaf.
(b) If u has outdegree at least 2 then u is not the parent of a reticulation vertex that is a leaf.
(c) If u is the parent of a reticulation vertex v in a reticulation cycle {P, P′} then neither of the directed paths P, P′ consists of the single arc (u, v).
Proof. First assume that N is temporal. Consider a time-stamp function t on the vertices of N. Assuming that N contains a vertex u that violates one of (a)-(c) immediately yields a contradiction because then t would violate at least one of (TS1)-(TS3). Now assume that (a)-(c) hold for all vertices of N. We construct a time-stamp function t on the vertices of N by first putting t(v) = 0 for all v ∈ ϕ(X). In view of (a) and (b), this does not violate (TS1)-(TS3).
Next, consider an internal vertex u that is not a reticulation vertex and also not the parent of a reticulation vertex. Assume that all children w of u have been assigned time-stamps t(w). Then we put t(u) = 1 + max_{w ∈ ch(u)} t(w). Since N is acyclic this does not violate (TS1)-(TS3).
Finally, consider an internal vertex u that is a reticulation vertex. Let p_1 and p_2 denote the two parents of u and assume that all vertices w in the set M [...] have been assigned time-stamps t(w). Then we put t(u) = t(p_1) = t(p_2) = 1 + max_{w ∈ M} t(w). Since N is acyclic and in view of (c) this does not violate (TS1)-(TS3).
As indicated in Figure 3, a time-stamp function t on the vertices of a rooted X-cactus N = ((V, A), ϕ) induces non-negative lengths on the arcs of N by putting the length of arc (u, v) to be t(u) − t(v). With these arc lengths, all directed paths from a fixed vertex u to a vertex w ∈ ϕ(X) have the same length, namely t(u). In view of this, we call an ordered pair (N, t) consisting of a rooted, temporal X-cactus N and a time-stamp function t on the vertices of N an equidistant X-cactus. Thus, an equidistant X-cactus can be thought of as a rooted, temporal X-cactus with specific arc lengths assigned, whereas a rooted, temporal X-cactus does not have any specific arc lengths assigned.
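The way time stamps induce arc lengths can be checked on a small example. The sketch below is an illustration only: the graph, the time stamps and the helper names `arc_lengths` and `path_lengths_to_leaves` are hypothetical, and the time stamps are chosen under the assumption that labelled leaves have time stamp 0 and that arcs entering a reticulation vertex have length 0 (as in Figure 1). Because the length t(u) − t(v) telescopes along a path, every directed path from u to a leaf has length t(u).

```python
def arc_lengths(arcs, t):
    """Length of arc (u, v) induced by the time stamps: t(u) - t(v)."""
    return {(u, v): t[u] - t[v] for (u, v) in arcs}


def path_lengths_to_leaves(u, children, length):
    """Set of lengths of all directed paths from u to a leaf."""
    if not children[u]:
        return {0}
    return {
        length[(u, v)] + rest
        for v in children[u]
        for rest in path_lengths_to_leaves(v, children, length)
    }


# Hypothetical example with a single reticulation vertex h (parents p1, p2).
arcs = [("rho", "p1"), ("rho", "p2"), ("p1", "a"), ("p1", "h"),
        ("p2", "h"), ("p2", "b"), ("h", "c")]
t = {"rho": 2, "p1": 1, "p2": 1, "h": 1, "a": 0, "b": 0, "c": 0}

children = {v: [] for v in t}
for u, v in arcs:
    children[u].append(v)

length = arc_lengths(arcs, t)
# For every vertex u, all directed paths from u to a leaf have length t(u).
equidistant = all(
    path_lengths_to_leaves(u, children, length) == {t[u]} for u in t
)
```

Here the two arcs into `h` receive length 0, all other arcs receive length 1, and every path from `rho` to a leaf has length 2.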
We conclude this section by shedding some more light on the combinatorial structure of rooted, temporal X-cactuses. The size σ(t) of a time-stamp function t on the vertices of a rooted, temporal [...]
A ranking of a rooted, temporal X-cactus N = ((V, A), ϕ) is a time-stamp function r on the vertices of N with r(V) = {0, 1, 2, ..., σ(r)}. See Figure 4(a) for an example. Note that rankings as defined here are a particular type of temporal labeling and are more general than the rankings considered in [7]. The value r(v) assigned to vertex v by the ranking r will also be referred to as the rank of vertex v if the ranking referred to is clear from the context. A ranked X-cactus (N, r) consists of a rooted, temporal X-cactus N and a ranking r of the vertices of N. The following lemma gives tight bounds on the size of rankings of rooted, temporal X-cactuses (see Figure 4(b) for an example). For its proof, we will use the fact that any rooted binary X-cactus can be transformed into a rooted binary X-tree by deleting, for every reticulation vertex v, one of the arcs (p, v) from a parent p of v to v and then suppressing the two internal vertices v and p. [...]
Proof. By definition, σ(r) ≥ 0. Moreover, if the size of the ranking r is precisely 0 then N must consist of a single leaf v with r(v) = 0 and all elements of X are mapped by ϕ to v.
To establish the upper bound, let i and k denote the number of internal and reticulation vertices, respectively, of the ranked X-cactus (N, r). By definition, σ(r) ≤ i − 2k. Note that, for fixed X, this expression can only be maximal if N is a rooted, binary X-cactus, because otherwise we can always increase i without increasing k. Hence, it suffices to show that for all rooted, binary X-cactuses we have i − 2k = |X| − 1. Since, as described above, we can transform any such X-cactus into a rooted binary X-tree, we immediately obtain this equation as a consequence of the well-known fact that a rooted binary X-tree has |X| − 1 internal vertices (see e.g. [29, Sec. 2.1]).
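The counting identity i − 2k = |X| − 1 for rooted, binary X-cactuses can be sanity-checked on small examples. The following Python sketch is only an illustration; the function name `count_i_minus_2k` and both example graphs are hypothetical and not taken from the paper. It counts internal vertices, reticulation vertices and leaves directly from the arc list.

```python
def count_i_minus_2k(arcs):
    """Return (i - 2k, number of leaves) for a rooted DAG given by its arcs."""
    indeg, outdeg = {}, {}
    for u, v in arcs:
        outdeg[u] = outdeg.get(u, 0) + 1
        indeg[v] = indeg.get(v, 0) + 1
    internal = len(outdeg)                                    # outdegree >= 1
    reticulations = sum(1 for v in indeg if indeg[v] >= 2)    # indegree >= 2
    leaves = sum(1 for v in indeg if v not in outdeg)         # outdegree 0
    return internal - 2 * reticulations, leaves


# Hypothetical binary cactus with one reticulation vertex h and leaves a, b, c.
cactus = [("rho", "p1"), ("rho", "p2"), ("p1", "a"), ("p1", "h"),
          ("p2", "h"), ("p2", "b"), ("h", "c")]
# Hypothetical binary tree with leaves a, b, c.
tree = [("rho", "u"), ("rho", "c"), ("u", "a"), ("u", "b")]

cactus_value, cactus_leaves = count_i_minus_2k(cactus)
tree_value, tree_leaves = count_i_minus_2k(tree)
```

Since both examples are phylogenetic, |X| equals the number of leaves, and in both cases the value i − 2k equals the number of leaves minus 1.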

Equidistant X-trees and τ-space
In this section, we shall briefly recall the concept of an orthant space (see e.g. [24, Sec. 6]) and related concepts. To illustrate the basic idea for constructing our orthant space of equidistant-cactuses, we also consider the simpler case of equidistant-trees (often called ultrametric trees) and explain how the τ-space of ultrametric trees mentioned in the introduction arises as an orthant space. This also yields an alternative proof to the one presented in [17] for the fact that τ-space is a CAT(0)-metric space.

Orthant spaces
An ordered pair (M, F) consisting of a finite non-empty set M and a family F of non-empty subsets of M is called an abstract simplicial complex if A ∈ F implies that all non-empty subsets of A are also contained in F. An abstract simplicial complex is a flag complex if, for all non-empty subsets A ⊆ M such that all 2-element subsets of A are contained in F, we have A ∈ F. For every map ω : M → R≥0 we put supp(ω) = {x ∈ M : ω(x) > 0}. The orthant space associated with the abstract simplicial complex (M, F) is
M_(M,F) = {ω : M → R≥0 : supp(ω) ∈ F ∪ {∅}}.
Recall that a metric on a non-empty set B is a map D : B × B → R≥0 such that
• D(x, y) = 0 if and only if x = y,
• D(x, y) = D(y, x), and
• D(x, z) ≤ D(x, y) + D(y, z)
hold for all x, y, z ∈ B. The ordered pair (B, D) is called a metric space and the elements of B are called the points of the metric space.
A metric D_(M,F) on the orthant space M_(M,F) associated with the abstract simplicial complex (M, F) can be constructed as follows. For every A ∈ F, the set O_A = {ω ∈ M_(M,F) : supp(ω) ⊆ A} is called an orthant of M_(M,F). For all ω, ω′ ∈ M_(M,F) that are contained in a common orthant we put D_(M,F)(ω, ω′) = (Σ_{x ∈ M} (ω(x) − ω′(x))²)^{1/2}, that is, the Euclidean distance between ω and ω′. Then, for all ω, ω′ ∈ M_(M,F) such that there is no orthant O of M_(M,F) that contains both ω and ω′ we consider finite segmented paths from ω to ω′. These are sequences ω_0, ω_1, ω_2, ..., ω_k of elements in M_(M,F) such that ω = ω_0, ω′ = ω_k and, for all i ∈ {1, 2, ..., k}, there exists some orthant O_i of M_(M,F) that contains both ω_{i−1} and ω_i. The length of such a segmented path is Σ_{i=1}^{k} D_(M,F)(ω_{i−1}, ω_i). Note that at least one such segmented path always exists in view of the fact that all orthants of M_(M,F) contain the point ω with supp(ω) = ∅, called the origin of M_(M,F). We define D_(M,F)(ω, ω′) to be the infimum of the length of all segmented paths from ω to ω′. It is known (see [24, Sec. 6]) that this construction yields a metric space (M_(M,F), D_(M,F)).
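To make the construction concrete, here is a minimal Python sketch. It assumes, as described in the introduction, that two points in a common orthant are at Euclidean distance and that the length of a segmented path is the sum of the Euclidean lengths of its segments. The helper names (`supp`, `share_orthant`, `euclid`, `segmented_path_length`) and the small complex on the hypothetical ground set {m1, m2, m3} are illustrative choices, not taken from the paper. Points are dictionaries mapping elements of M to non-negative reals, with missing keys read as 0.

```python
import math


def supp(omega):
    """Support of a point: the elements of M with a positive coordinate."""
    return frozenset(x for x, value in omega.items() if value > 0)


def share_orthant(omega1, omega2, faces):
    """True if some orthant contains both points.

    Since the family of faces is closed under taking non-empty subsets, this
    holds exactly when the union of the two supports is empty or is a face.
    """
    joint = supp(omega1) | supp(omega2)
    return len(joint) == 0 or joint in faces


def euclid(omega1, omega2):
    """Euclidean distance between two points given as coordinate dictionaries."""
    keys = set(omega1) | set(omega2)
    return math.sqrt(
        sum((omega1.get(k, 0.0) - omega2.get(k, 0.0)) ** 2 for k in keys)
    )


def segmented_path_length(points, faces):
    """Length of a segmented path; consecutive points must share an orthant."""
    total = 0.0
    for p, q in zip(points, points[1:]):
        if not share_orthant(p, q, faces):
            raise ValueError("consecutive points must lie in a common orthant")
        total += euclid(p, q)
    return total


# Hypothetical complex: a quadrant spanned by m1, m2 and a separate ray m3.
faces = {frozenset({"m1"}), frozenset({"m2"}), frozenset({"m3"}),
         frozenset({"m1", "m2"})}
omega = {"m1": 3.0, "m2": 4.0}
omega_prime = {"m3": 2.0}
origin = {}

# The two points share no orthant, but the path through the origin is valid.
via_origin = segmented_path_length([omega, origin, omega_prime], faces)
```

The path through the origin always gives an upper bound on the distance. In this particular example the quadrant and the ray meet only at the origin, so the value 7 is the distance itself.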
Next, we describe a useful property that the metric space (M_(M,F), D_(M,F)) may have. A geodesic path between the points p and q in a metric space (B, D) is a map γ : [0, ℓ] → B, for some ℓ ≥ 0, with γ(0) = p, γ(ℓ) = q and D(γ(s), γ(s′)) = |s − s′| for all s, s′ ∈ [0, ℓ]. The metric space (B, D) is geodesic if there exists a geodesic path between p and q for all p, q ∈ B. A geodesic metric space (B, D) is a CAT(0)-metric space if and only if (see e.g. [11, p. 163]) the inequality
D(p, m)² ≤ (D(p, q)² + D(p, r)²)/2 − D(q, r)²/4
holds for all p, q, r ∈ B and all m ∈ B with D(q, m) = D(r, m) = D(q, r)/2. CAT(0)-metric spaces arise in many applications (see e.g. [3]). They have the important property that geodesic paths are unique [11, Proposition 1.4, p. 160].
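The CAT(0) condition can be illustrated numerically. The sketch below uses the standard midpoint form of the inequality, D(p, m)² ≤ (D(p, q)² + D(p, r)²)/2 − D(q, r)²/4 for a midpoint m of q and r, and evaluates it in two settings chosen here for illustration: the Euclidean plane, where equality holds, and a tripod (three rays glued at a common origin), where the inequality can be strict. The function names `cat0_gap` and `tripod_dist` and all sample points are hypothetical.

```python
import math


def cat0_gap(d_pm, d_pq, d_pr, d_qr):
    """Right-hand side minus left-hand side of the CAT(0) inequality.

    A non-negative value means the inequality holds for this configuration.
    """
    return (d_pq ** 2 + d_pr ** 2) / 2 - d_qr ** 2 / 4 - d_pm ** 2


# Euclidean plane: the inequality holds with equality.
p, q, r = (0.0, 3.0), (-2.0, 0.0), (4.0, 0.0)
m = ((q[0] + r[0]) / 2, (q[1] + r[1]) / 2)  # midpoint of q and r
euclidean_gap = cat0_gap(
    math.dist(p, m), math.dist(p, q), math.dist(p, r), math.dist(q, r)
)


def tripod_dist(x, y):
    """Distance on a tripod; a point is (leg, distance from the origin)."""
    (leg_x, a), (leg_y, b) = x, y
    return abs(a - b) if leg_x == leg_y else a + b


# Three points at distance 1 from the origin on three different legs.
tp, tq, tr = (1, 1.0), (2, 1.0), (3, 1.0)
tm = (1, 0.0)  # the origin, which is the midpoint of tq and tr
tripod_gap = cat0_gap(
    tripod_dist(tp, tm), tripod_dist(tp, tq),
    tripod_dist(tp, tr), tripod_dist(tq, tr)
)
```

The tripod can be viewed as the orthant space of a complex with three elements whose only faces are the singletons, which is a flag complex. A single sample configuration of course proves nothing in general; it only shows what the inequality measures.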
It follows from a result in [18] that the orthant space (M_(M,F), D_(M,F)) is a CAT(0)-metric space if and only if F is a flag complex (see also [24, Proposition 6.14]). Furthermore, geodesic paths can be computed in polynomial time in CAT(0)-orthant spaces [24, Corollary 6.19].

τ-space revisited
To describe how the τ-space of ultrametric trees arises as an orthant space, we start with a suitably defined abstract simplicial complex. A partition of X is a set P of non-empty and pairwise disjoint subsets of X with X = ⋃_{A ∈ P} A. We denote the set of all partitions of X by B(X) and define a binary relation ⪯ on B(X) by putting P_1 ⪯ P_2 if for all A_1 ∈ P_1 there exists some A_2 ∈ P_2 with A_1 ⊆ A_2. Intuitively, this means that the partition P_1 refines the partition P_2. It is well-known that ⪯ is a partial ordering. Note that the partial ordering ⪯ is induced by the partial ordering ⊆ on the subsets of X.
Every ranked X-tree with a ranking of size σ gives rise to a sequence of partitions of X. In Figure 5(a) we depict a rooted X-tree with a ranking of size σ = 3 that gives rise to the sequence {{a}, {b}, {c}, {d}} ⪯ {{a, b}, {c}, {d}} ⪯ {{a, b}, {c, d}} ⪯ {{a, b, c, d}} (see also Section 5.1 where we formally define how the partitions arise more generally for ranked X-cactuses). The crucial fact is that this sequence encodes the ranked X-tree. More formally, as we shall prove as a consequence of our results for general ranked X-cactuses in Corollary 14, we have:
Theorem 3. There is a one-to-one correspondence between (isomorphism classes of) ranked X-trees and subsets of B(X) that contain {X} and that consist of partitions of X which are pairwise comparable with respect to the partial ordering ⪯.
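The refinement relation on partitions is straightforward to test in code. The following Python sketch is illustrative only; the helper names `partition` and `refines` are assumptions made here. It checks that the sequence of partitions from the example above is pairwise comparable and exhibits two partitions of the same set that are not comparable.

```python
def partition(*blocks):
    """Build a partition as a frozenset of frozensets, one per block."""
    return frozenset(frozenset(block) for block in blocks)


def refines(p1, p2):
    """True if every block of p1 is contained in some block of p2."""
    return all(any(block1 <= block2 for block2 in p2) for block1 in p1)


# The sequence of partitions of X = {a, b, c, d} from the example above.
chain = [
    partition("a", "b", "c", "d"),
    partition("ab", "c", "d"),
    partition("ab", "cd"),
    partition("abcd"),
]
chain_ok = all(
    refines(chain[i], chain[j])
    for i in range(len(chain))
    for j in range(i + 1, len(chain))
)

# Two partitions that are not comparable: neither refines the other.
left = partition("ab", "c", "d")
right = partition("a", "b", "cd")
incomparable = not refines(left, right) and not refines(right, left)
```

Both `chain_ok` and `incomparable` evaluate to `True`, so the first family could index a common orthant in the construction described in this section while `left` and `right` could not.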
To obtain τ-space as an orthant space, we consider the abstract simplicial complex (B•(X), F(⪯)) with B•(X) = B(X) − {X} and F(⪯) containing all non-empty subsets of B•(X) whose elements are pairwise comparable with respect to ⪯. It follows immediately that (B•(X), F(⪯)) is a flag complex. Note that, more generally, we can associate an abstract simplicial complex that is a flag complex to any partial ordering in an analogous way; for this reason such a complex is known as an order complex (see e.g. [35, p. 248]).
[Figure caption fragment: The origin corresponds to the ranked X-tree that consists of a single vertex.]
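For a three-element set the order complex can be enumerated by brute force. The sketch below is an illustration under stated assumptions: the helper names `set_partitions`, `refines` and `comparable` are hypothetical, the ground set is taken to be all partitions of X = {a, b, c} except the one-block partition {X}, and faces are the non-empty subsets of pairwise comparable partitions. It lists the maximal faces and verifies the flag property directly.

```python
from itertools import combinations


def set_partitions(elements):
    """Yield all partitions of the given elements as lists of frozensets."""
    elements = list(elements)
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [block | {first}] + smaller[i + 1:]
        # ... or into a new block of its own.
        yield smaller + [frozenset({first})]


def refines(p1, p2):
    """True if every block of p1 is contained in some block of p2."""
    return all(any(b1 <= b2 for b2 in p2) for b1 in p1)


def comparable(p, q):
    return refines(p, q) or refines(q, p)


X = "abc"
top = frozenset({frozenset(X)})  # the one-block partition {X}
ground = [frozenset(p) for p in set_partitions(X) if frozenset(p) != top]

# Faces: non-empty sets of pairwise comparable partitions.
faces = {
    frozenset(s)
    for k in range(1, len(ground) + 1)
    for s in combinations(ground, k)
    if all(comparable(p, q) for p, q in combinations(s, 2))
}
maximal = [f for f in faces if not any(f < g for g in faces)]

# Flag property: a set whose 2-element subsets are all faces is itself a face.
is_flag = all(
    frozenset(s) in faces
    for k in range(1, len(ground) + 1)
    for s in combinations(ground, k)
    if all(frozenset(pair) in faces for pair in combinations(s, 2))
)
```

For X = {a, b, c} this yields four ground elements and three maximal faces, each consisting of the partition into singletons together with one two-block partition. The associated orthant space therefore consists of three two-dimensional orthants glued along a common ray.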
In Figure 6, we illustrate the orthant space M_(B•(X),F(⪯)) of equidistant X-trees for X = {a, b, c} (see Figure 7 for an analogous drawing of the resulting orthant space of equidistant X-cactuses). Note that, by construction, the coordinates of a point in any orthant are obtained as differences between consecutive time stamps in the equidistant X-tree that corresponds to the point. The equidistant X-tree in Figure 5(b), for example, corresponds to the point (ω_1, ω_2, ω_3, ω_4) = (0.8, 1.3, 0, 0). More generally, it follows by Theorem 3 that the elements in M_(B•(X),F(⪯)) are in one-to-one correspondence with equidistant X-trees. Moreover, since (B•(X), F(⪯)) is a flag complex, it follows, as mentioned in Section 4.1, that the resulting metric space (M_(B•(X),F(⪯)), D_(B•(X),F(⪯))) is CAT(0). We remark that, by construction, M_(B•(X),F(⪯)) is precisely τ-space, and so we obtain an alternative proof to the one presented in [17] that τ-space is a CAT(0)-metric space.
Before proceeding, we note that in [20] the problem of when a partition of X is compatible with a rooted phylogenetic X-tree is studied. This includes, as a special case, the situation where the vertices of the tree can be ranked in such a way that the partition is among those associated with the resulting ranked X-tree. In addition, in [2] a space called the Bergman fan of the matroid of the complete graph with vertex set X is studied. This space is a polyhedral fan and its points are also in one-to-one correspondence with equidistant X-trees. Although it is not an orthant space, its cones are in one-to-one correspondence with the orthants of M_(B•(X),F(⪯)).

An encoding for ranked X-cactuses
To help the reader navigate the remaining sections of this paper, we now briefly summarize how we shall construct the equidistant-cactus space N(X) by applying an analogue of the process described in Section 4.2.
We shall begin by introducing the concept of a polestar system on the set X, which is a collection of ordered pairs of subsets of X, or set pair system for short, with certain properties. As we shall see in Section 5.2, polestar systems can be associated to ranked X-cactuses in much the same way as partitions can be associated to ranked X-trees. We shall also define a binary relation ⪯ on general set pair systems, and, in Section 6, we will show that ⪯ yields a partial ordering on the set P(X) of polestar systems on X. In Section 7, we then prove an analogue of Theorem 3, namely, we show that ranked X-cactuses are in one-to-one correspondence with subsets of P(X) that contain the maximum element relative to the ordering ⪯ and whose elements are pairwise comparable with respect to ⪯. In other words, we obtain an encoding of ranked X-cactuses in terms of certain collections of polestar systems. In Section 8, we conclude by constructing the network space N(X) as the orthant space associated to the order complex of the poset (P(X), ⪯).

Set pair systems
Before introducing polestar systems, we recall the concept of a set pair system introduced in [21]. To this end, we say that a vertex u in a rooted DAG N is a descendant of a vertex v if there exists a directed path from the root of N to u that contains v. A descendant u of v is a strict descendant if every directed path from the root to u contains v. Otherwise u is called a non-strict descendant of v. Now, given a rooted X-cactus N = ((V, A), ϕ) and a vertex u ∈ V, let C(u) be the set of those x ∈ X with ϕ(x) a descendant of u, S(u) the set of those x ∈ X with ϕ(x) a strict descendant of u and H(u) the set of those x ∈ X with ϕ(x) a non-strict descendant of u. For every vertex u of N we call (S(u), H(u)) the set pair associated to u and put S(N) = {(S(u), H(u)) : u ∈ V}. For later reference, we state some immediate consequences of the definition of the set pairs in S(N) for a rooted X-cactus N (see also [21] where these properties have been considered in the context of the slightly more restrictive 1-nested phylogenetic networks):
(SH1) For all vertices u of N, we have [...] and S(u) is always non-empty while H(u) may be empty.
(SH2) If (S(u), H(u)) = (S(v), H(v)) for two distinct vertices u and v of N then one of these vertices, say u, is a reticulation vertex with outdegree 1 and v is the single child of u. Note that this situation cannot occur if N is compressed.
(SH3) Let C be the set of vertices in a reticulation cycle of N where u and v are the common start and end vertex, respectively, of the two directed paths that form the reticulation cycle. Then we have H(w) = S(v) if w ∈ C − {u, v} and, for all other vertices w′ of N, we have H(w′) ≠ S(v).
Now, given a ranked X-cactus (N = ((V, A), ϕ), r) we collect, for every i ∈ {0, 1, 2, ..., σ(r)}, in S_i(N) first those set pairs from S(N) that correspond to vertices of rank at most i and whose parents (if any) have rank strictly larger than i. We then add some further set pairs that essentially help to keep track of the fact that some of the vertices involved are in a reticulation cycle. More formally, we define V_i to be the set that consists of all vertices u ∈ V with r(u) ≤ i and r(p) > i for all parents p of u. Note that, in view of (TS3), V_i does not contain any reticulation vertices. Thus, all u ∈ V_i have at most one parent. Then we put S_i(N) = {(S(u), H(u)) : u ∈ V_i} ∪ {(H(u), ∅) : u ∈ V_i, H(u) ≠ ∅}. Note that we always have S_σ(r)(N) = {(X, ∅)}. For the rooted X-cactus N in Figure 4(a), for example, we obtain: [...]
A collection of ordered pairs (S, H) of subsets of X such that S ≠ ∅ and S ∩ H = ∅ is called a set pair system on X. Note that, by construction, the sets S(N) and S_i(N), 0 ≤ i ≤ σ(r), associated with a ranked X-cactus (N, r) are non-empty set pair systems.
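The sets C(u), S(u) and H(u) can be computed with two reachability searches. The Python sketch below is a minimal illustration: the names `reachable` and `set_pair`, the example graph and the labelling `phi` are assumptions introduced here. It uses the observation that ϕ(x) is a strict descendant of u exactly when it is a descendant of u and can no longer be reached from the root once u is removed.

```python
def reachable(start, children, removed=None):
    """Vertices reachable from `start` by directed paths avoiding `removed`."""
    if start == removed:
        return set()
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in children.get(u, ()):
            if v != removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def set_pair(u, root, children, phi):
    """Return the set pair (S(u), H(u)) associated to vertex u."""
    below = reachable(u, children)                 # descendants of u
    avoid = reachable(root, children, removed=u)   # reachable without using u
    cluster = {x for x, v in phi.items() if v in below}    # C(u)
    strict = {x for x in cluster if phi[x] not in avoid}   # S(u)
    return frozenset(strict), frozenset(cluster - strict)  # (S(u), H(u))


# Hypothetical rooted X-cactus with one reticulation cycle from rho to h.
children = {
    "rho": ["p1", "p2"],
    "p1": ["a", "h"],
    "p2": ["h", "b"],
    "h": ["c"],
}
phi = {"a": "a", "b": "b", "c": "c"}  # each element of X labels its own leaf

pair_p1 = set_pair("p1", "rho", children, phi)
pair_p2 = set_pair("p2", "rho", children, phi)
pair_h = set_pair("h", "rho", children, phi)
pair_rho = set_pair("rho", "rho", children, phi)
```

In this example `pair_p1` is ({a}, {c}) and `pair_p2` is ({b}, {c}), while `pair_h` is ({c}, ∅). This matches (SH3): the two vertices strictly inside the reticulation cycle have H equal to S of the end vertex `h`.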
It is shown in [21] that, for any set pair system S on X, we obtain a partial ordering ≤ on the set pairs in S by putting (S_1, H_1) ≤ (S_2, H_2) if either (S_1, H_1) = (S_2, H_2) or (S_1, H_1) ≠ (S_2, H_2) and one of the following holds: [...] and the set pairs (S_1, H_1) and (S_2, H_2) are distinct. The partial ordering ≤ on set pairs was defined in such a way that we have (S(u), H(u)) ≤ (S(v), H(v)) for two vertices u and v in a rooted X-cactus if and only if u is a descendant of v (see the proof of Theorem 5 in [21]).
We use the partial ordering ≤ on set pairs to define a binary relation ⪯ on set pair systems. More precisely, for set pair systems 𝒮_1 and 𝒮_2 on X we put 𝒮_1 ⪯ 𝒮_2 if (SP1) for all (S_1, H_1) ∈ 𝒮_1 there exists some (S_2, H_2) ∈ 𝒮_2 with (S_1, H_1) ≤ (S_2, H_2), and (SP2) [...] Again, we write 𝒮_1 ≺ 𝒮_2 if 𝒮_1 ⪯ 𝒮_2 and 𝒮_1 ≠ 𝒮_2. We remark that (SP1) captures the basic idea from Section 4.2 that the partial ordering ≤ on set pairs induces a suitable binary relation ⪯ on set pair systems (in analogy to how the partial ordering ⊆ induced the binary relation ⪯ on partitions). (SP2) is an additional technical requirement that will be crucial in our encoding of ranked X-cactuses.
The relation ⪯ is, in general, not a partial ordering on the set pair systems on a fixed set X because it might be neither antisymmetric nor transitive. For the set pair systems associated with a ranked X-cactus, however, the following holds.
Proof. As noted earlier in this section, S_σ(r)(N) = {(X, ∅)} follows immediately from the definition of the set pair system S_σ(r)(N). Consider 0 ≤ i < j ≤ σ(r). We first show that S_i(N) ⪯ S_j(N). So, consider (S, H) ∈ S_i(N). By definition of S_i(N), there must exist a vertex v in N with r(v) ≤ i, r(p) > i for all parents p of v, and either (S, H) = (S(v), H(v)) or (S, H) = (H(v), ∅). Consider a directed path from the root of N to v. On this path there must exist a vertex u with r(u) ≤ j and r(p) > j for all parents p of u. This implies that (S(u), H(u)) ∈ S_j(N). Moreover, in view of the fact that u lies on a directed path from the root of N to v, we must have (S, H) ≤ (S(v), H(v)) ≤ (S(u), H(u)), as required by (SP1).
To establish that (SP2) is also satisfied for S i (N) and S j (N), consider (S, H) ∈ S j (N) with H ≠ ∅. By definition of S j (N), there must exist a vertex u in N with (S, H) = (S(u), H(u)), r(u) ≤ j and r(p) > j for all parents p of u. Now, if there exists some (S′, H′) ∈ S i (N) with H′ = H then there exists some vertex v in N with (S′, H′) = (S′, H) = (S(v), H(v)), r(v) ≤ i and r(p) > i for all parents p of v. This implies that u and v must be vertices in the same reticulation cycle of N. Moreover, we can choose v such that v is a descendant of u, implying that (S′, H′) = (S(v), H(v)) ≤ (S(u), H(u)) = (S, H), as required.
It remains to show that S i (N) ≠ S j (N). By the definition of a ranked X-cactus, there must exist a vertex u ∈ V with r(u) = j. Without loss of generality we may assume that u is not a reticulation vertex. If (S(u), H(u)) ∉ S i (N) we are done. So, assume for a contradiction that (S(u), H(u)) ∈ S i (N). In view of i < j we have u ∉ V i . Thus, there exists some If Case (i) holds then, in view of (SH3), v must be a vertex in a reticulation cycle with end vertex u′ and (S(u′), H(u′)) = (H(v), ∅) = (S(u), H(u)). Since u is not a reticulation vertex, it follows, by (SH2), that u is the single child of u′. Consequently, i = r(v) > r(u) = j, a contradiction. Similarly, if Case (ii) holds then, again by (SH2), it follows that u is a reticulation vertex and v is the single child of u, a contradiction.

Polestar systems
A set pair system S on X is partition-like if

Lemma 5. Let (N, r) be a ranked X-cactus. Then S i (N) is a polestar system for all 0 ≤ i ≤ σ(r).
Proof. Fix some i ∈ {0, 1, . . . , σ(r)} and consider two distinct vertices u 1 , u 2 ∈ V i . Recall from the definition of the set V i that both u 1 and u 2 have rank at most i while the ranks of their parents are strictly larger than i. Thus, up to switching the roles of u 1 and u 2 , one of the following must hold:
• Neither of u 1 and u 2 is a descendant of the other and there is no reticulation cycle in N that contains both u 1 and u 2 . Consequently, (S(u 1 ) ∪ H(u 1 )) ∩ (S(u 2 ) ∪ H(u 2 )) = ∅. Thus, the sets S(u 1 ), H(u 1 ), S(u 2 ) and H(u 2 ) are pairwise disjoint.
• Both u 1 and u 2 are contained in the same reticulation cycle in N but neither is a descendant of the other. Consequently, H(u 1 ) = H(u 2 ) = H ≠ ∅ and the sets S(u 1 ), S(u 2 ) and H are pairwise disjoint.
It follows from this case analysis that (PL1) and (PL2) hold for S i (N ).
To see that (PL3) also holds, consider a set pair (S, H) ∈ S i (N) with H ≠ ∅. By the definition of S i (N) there must exist a vertex u in N with (S(u), H(u)) = (S, H) such that r(u) ≤ i and r(p) > i for all parents p of u. In view of H(u) = H ≠ ∅, vertex u must be contained in a reticulation cycle C but cannot be the common start or the common end vertex of the two directed paths that form C. Note that C contains a unique vertex v ≠ u with r(v) ≤ i and r(p) > i for all parents p of v. Moreover, v cannot be the common start or the common end vertex of the two directed paths that form C. Since u and v are both contained in C, we have H(u) = H(v) = H. Moreover, by (SH3), there are no other vertices w in N with H(w) = H, r(w) ≤ i and r(p) > i for all parents p of w. Finally, by construction, we also have (H, ∅) = (H(u), ∅) ∈ S i (N).
We denote by P(X) the set of polestar systems on the set X. Note that, even for the set pair systems in P(X), (SP1) in the definition of the binary relation ⪯ does not imply (SP2), as can be seen from the set pair systems

We conclude this section with two technical lemmas stating some properties of the relations ≤ and ⪯ that will be used in Sections 6 and 7. In particular, Lemma 6 establishes that, up to a specific exception, distinct set pairs within a single polestar system are incomparable with respect to the partial ordering ≤ and that the binary relations ≤ and ⪯ are consistent. In our encoding of ranked X-cactuses this exception corresponds to the set pairs associated with reticulation vertices.

Lemma 6. Let S 1 , S 2 ∈ P(X) with S 1 ⪯ S 2 . Then, for all (S 1 , H 1 ) ∈ S 1 and . Thus, we must have H 2 = ∅. Consequently, S 2 ⊆ H 1 , and, therefore, S 2 = H 1 , as required.
In each case, we have S′ ⊆ S″ for some S″ ∈ P(S 2 ) and, in view of (PL1), S″ is unique, as claimed. This implies that we obtain a map q : S 1 → S 2 by assigning to each (S′, H′) ∈ S 1 the unique (S″, H″) ∈ S 2 with S′ ⊆ S″. In particular, we have ) with H ∈ H(S 2 ). Note that, in view of (PL3), for each such H, there exist precisely two set pairs (S 1 , H 1 ), (S 2 , H 2 ) ∈ S 1 with H 1 = H 2 = H and, in view of S 1 ≺ S 2 , there must exist some S′ ∈ P(S 2 ) with S 1 ∪ S 2 ∪ H ⊆ S′. This implies k ≥ 2ℓ 1 . Thus, letting ℓ 2 denote the number of H ∈ H(S 2 ) with H ∈ H(S 1 ), we have ). This implies, in view of S 1 ≺ S 2 , that we cannot have P(S 1 ) = P(S 2 ), that is, we must have k > 0 and, thus, we also obtain |P(

Next consider the case that there exists (S″, H″) ∈ S 2 with |q −1 (S″, H″)| ≥ 3 and H′ = ∅ for all (S′, H′) ∈ q −1 (S″, H″). Then we select two distinct (S 1 , ∅), (S 2 , ∅) ∈ q −1 (S″, H″) and put

The remaining case to consider is that there exists (S″, H″) ∈ S 2 such that |q −1 (S″, H″)| ≥ 4 and there are three distinct (S 1 , H 1 ), (S 2 , H 2 ), (S 3 , H 3 ) ∈ q −1 (S″, H″) with H 1 = ∅ and H 2 = H 3 = S 1 . Then we put

In each case, by construction, we immediately have S 1 ≺ S 3 ≺ S 2 .
The poset (P(X), ⪯)
In this section, we prove that ⪯ is a partial ordering on P(X). We also give a formula for counting the number of elements in the resulting poset (P(X), ⪯).
We first recall some standard poset concepts (see e.g. [34]).

Proof. We first show that (P(X), ⪯) is a poset. It follows immediately from the definition of the binary relation ⪯ that it is reflexive. Moreover, in view of Lemma 7, we cannot have two distinct S 1 , S 2 ∈ P(X) with S 1 ⪯ S 2 and S 2 ⪯ S 1 , implying that ⪯ is also antisymmetric.
That (P(X), ⪯) is a graded poset with height function h is now an immediate consequence of Lemma 7 in view of h({({x}, ∅) : x ∈ X}) = 0.

The next corollary describes the relationship between (P(X), ⪯) and the poset (B(X), ⪯) of partitions of X. Two posets (M 1 , R 1 ) and (M 2 , R 2 ) are isomorphic if there exists a bijective map f : M 1 → M 2 such that, for all a, b ∈ M 1 , we have (a, b) ∈ R 1 if and only if (f(a), f(b)) ∈ R 2 .

Corollary 9. The restriction of the poset (P(X), ⪯) to those S ∈ P(X) with H(S) = ∅ is isomorphic to the poset (B(X), ⪯) of partitions of X.
Proof. We map any S ∈ P(X) with H(S) = ∅ to the partition P(S) ∈ B(X). This map is bijective. Moreover, for S 1 , S 2 ∈ P(X) with H(S 1 ) = H(S 2 ) = ∅ we have S 1 ⪯ S 2 if and only if for all A 1 ∈ P(S 1 ) there exists some A 2 ∈ P(S 2 ) with A 1 ⊆ A 2 , as required.
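The block-inclusion test used in this proof is easy to experiment with. The following is a minimal sketch, assuming partitions are stored as sets of frozensets; the function names `refines`, `is_chain` and `blocks` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def blocks(*parts):
    """Hypothetical helper: build a partition as a frozenset of frozensets."""
    return frozenset(frozenset(part) for part in parts)


def refines(p1, p2):
    """True if every block of p1 is contained in some block of p2."""
    return all(any(a <= b for b in p2) for a in p1)


def is_chain(partitions):
    """True if the given partitions are pairwise comparable under refinement."""
    return all(refines(p, q) or refines(q, p)
               for p, q in combinations(partitions, 2))


# Partitions of X = {a, b, c, d}.
finest = blocks("a", "b", "c", "d")
cut1 = blocks("ab", "c", "d")   # the partition {{a, b}, {c}, {d}}
other = blocks("ac", "b", "d")  # a second partition, not comparable with cut1
coarsest = blocks("abcd")

print(is_chain([finest, cut1, coarsest]))
print(is_chain([cut1, other]))
```

The first call tests three nested partitions, while the second tests two partitions neither of which refines the other.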
In the remaining part of this section, we give a formula for the number λ n = |P(X)| of polestar systems on a set X with n ≥ 1 elements. The values of λ n for n = 1, 2, . . . , 8 are 1, 2, 8, 45, 277, 1853, 14065, 122118. For k ∈ {1, 2, . . . , n}, we denote by α n,k the Stirling number of the second kind, that is, the number of partitions of X into k subsets. In addition, for ℓ ∈ {0, 1, . . . , ⌊k/3⌋}, we denote by β k,ℓ the number of partitions of a set with k elements into ℓ subsets with three elements and k − 3ℓ subsets with one element. It is known [30] that

$$\beta_{k,\ell} = \frac{k!}{\ell!\,(3!)^{\ell}\,(k-3\ell)!}.$$

Proposition 10. For all n ≥ 1 we have

$$\lambda_n = \sum_{k=1}^{n} \alpha_{n,k} \sum_{\ell=0}^{\lfloor k/3 \rfloor} 3^{\ell}\,\beta_{k,\ell}. \qquad (1)$$

Proof. Let X be a set with n ≥ 1 elements. Consider S ∈ P(X) and put k = |P(S)|. By the definition of a polestar system, S arises from P(S) by forming, for some ℓ ∈ {0, 1, . . . , ⌊k/3⌋}, a partition Π(P(S)) of P(S) into ℓ subsets with three elements and k − 3ℓ subsets with one element. Each 1-element set {S} ∈ Π(P(S)) yields the set pair (S, ∅). For each 3-element set {S 1 , S 2 , S 3 } ∈ Π(P(S)) we select i ∈ {1, 2, 3} and obtain the three set pairs (S i , ∅), (S j , S i ), j ∈ {1, 2, 3} − {i}.
Formula (1) directly reflects the process described above for obtaining a polestar system from a fixed partition of X into k subsets.In view of the fact that every partition of X yields a different collection of polestar systems on X, we form the outer sum over the values of k.The inner sum then accounts for the number of polestar systems that arise from any fixed partition of X into k subsets.
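The counting process can be checked numerically against the values of λ n listed above. The sketch below is one reading of the argument in the proof and not code from the paper: for each of the α n,k partitions of X into k subsets, it sums over the number ℓ of three-element groups, counting β k,ℓ ways to form the groups and 3 choices of the distinguished set S i within each group. The closed form used for β k,ℓ is the standard count k! / (ℓ! (3!)^ℓ (k − 3ℓ)!); the function names are hypothetical.

```python
from math import factorial


def stirling2(n, k):
    """Stirling number of the second kind, via the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def beta(k, ell):
    """Partitions of a k-set into ell 3-element subsets and k - 3*ell singletons."""
    return factorial(k) // (factorial(ell) * 6 ** ell * factorial(k - 3 * ell))


def polestar_count(n):
    """Number of polestar systems on an n-set, following the counting argument."""
    return sum(
        stirling2(n, k)
        * sum(3 ** ell * beta(k, ell) for ell in range(k // 3 + 1))
        for k in range(1, n + 1)
    )


print([polestar_count(n) for n in range(1, 9)])
# -> [1, 2, 8, 45, 277, 1853, 14065, 122118]
```

The printed list agrees with the eight values of λ n stated in the text.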

Encoding ranked X-cactuses
In this section, we show in Theorem 11 that we can encode (isomorphism classes of) ranked X-cactuses in terms of the chains in the poset (P(X), ⪯). We begin by giving a precise statement of this result. We call two equidistant X-cactuses

Note that this definition includes isomorphisms between ranked X-cactuses as a special case. For rooted X-cactuses without a time-stamp function to be isomorphic, condition (IC2) is not required. We now state the aforementioned result.
Theorem 11. There is a one-to-one correspondence between chains in the poset (P(X), ⪯) that contain the maximum element {(X, ∅)} and (isomorphism classes of) ranked X-cactuses. The length of the chain equals the size of the ranking of the corresponding ranked X-cactus. Maximal chains correspond to binary ranked X-cactuses with rankings of size |X| − 1.
To prove this theorem, note that by Lemmas 4 and 5, every ranked X-cactus corresponds to a chain C in (P(X), ⪯) with {(X, ∅)} ∈ C. Moreover, by Lemma 2, we have |C| ≤ |X| − 1 for such a chain with equality holding if and only if the ranked X-cactus is binary. Thus, to prove Theorem 11, it suffices to show that for all chains C in (P(X), ⪯) with {(X, ∅)} ∈ C there exists, up to isomorphism, a unique ranked X-cactus (N, r) with C = {S i (N) : 0 ≤ i ≤ σ(r)}. This follows immediately from Lemmas 12 and 13 below, and will be done in two steps. First, for any chain C ⊆ P(X) with {(X, ∅)} ∈ C, we form the set pair system S(C) = ⋃ S∈C S consisting of all set pairs that occur in the polestar systems in C and construct a suitable rooted, compressed, phylogenetic X-cactus N(C) (see Lemma 12). Second, we perform some technical modifications on N(C), if necessary, to obtain N and then construct a suitable ranking r (see Lemma 13).
As an immediate consequence of Theorem 11 we obtain Theorem 3, which we restate in the following corollary using poset terminology.
Corollary 14. There is a one-to-one correspondence between chains in the graded poset (B(X), ⪯) that contain {X} and isomorphism classes of ranked X-trees.
Proof. In view of the fact that a rooted X-cactus N is a rooted X-tree if and only if the associated set pair system S(N) does not contain a set pair (S, H) with H ≠ ∅, it follows by Theorem 11 that ranked X-trees correspond to chains C in the poset (P(X), ⪯) with {(X, ∅)} ∈ C and H(S) = ∅ for all S ∈ C. This implies, by Corollary 9, that ranked X-trees correspond to chains in the poset (B(X), ⪯) that contain the partition {X}.

The space of equidistant X-cactuses
We now define equidistant-cactus space, N(X), and show that it is a CAT(0)-metric space. The construction of N(X) follows the outline presented at the start of Section 5. More specifically, we put P • (X) = P(X) − {{(X, ∅)}} and let F(⪯) denote the set of chains in the subposet (P • (X), ⪯) of the poset (P(X), ⪯). We then define N(X) to be the orthant space of the order complex of (P • (X), ⪯). Figure 7 gives an example of the structure of N(X) for X = {a, b, c}.

Theorem 15. N(X) = (M (P • (X),F(⪯)) , D (P • (X),F(⪯)) ) is a CAT(0)-metric space whose points are in one-to-one correspondence with isomorphism classes of equidistant X-cactuses.
Proof. As an immediate consequence of the definition of a chain as a set of pairwise comparable elements in a poset, we have that (P • (X), F(⪯)) is a flag complex (cf. Section 4.1). Hence, (M (P • (X),F(⪯)) , D (P • (X),F(⪯)) ) is a CAT(0)-metric space.
It remains to show that the points of M (P • (X),F(⪯)) are in one-to-one correspondence with isomorphism classes of equidistant X-cactuses. Every ω ∈ M (P • (X),F(⪯)) corresponds, up to isomorphism, to a unique equidistant X-cactus (N, t) as follows. Put σ = |supp(ω)| and C = supp(ω) ∪ {{(X, ∅)}}. Note that C is a chain in the poset (P(X), ⪯). Consider the sequence of the set pair systems in C. By Theorem 11, there exists, up to isomorphism, a unique ranked X-cactus (N = ((V, A), ϕ), r) with σ(r) = σ and S i = S i (N) for ω(S i ) if r(v) > 0 for all v ∈ V. Note that every ω′ ∈ M (P • (X),F(⪯)) with ω′ ≠ ω and supp(ω′) = supp(ω) yields the same ranked X-cactus (N, r) but a time-stamp function t′ ≠ t on the vertices of N. Also note that every equidistant X-cactus (N, t) arises from some ω ∈ M (P • (X),F(⪯)) as described above.
• It is known that the link of the origin of phylogenetic tree space as defined in [8] has the homotopy type of the wedge of spheres. It would be interesting to work out the homotopy type of the link of the origin of N(X), and also what other properties it might enjoy (for example, is it Cohen-Macaulay as with the tree-space defined in [8]?).
• As was pointed out in [14], there is a connection between the space of circular split collections defined in [15] and a certain type of unrooted phylogenetic networks called level-1 networks. Since these unrooted level-1 networks can be regarded as unrooted X-cactuses, it would be interesting to investigate if there are some connections between N(X) and the space of circular split collections.
• It would be interesting to define and understand the geometry of spaces of more complicated phylogenetic networks with arc lengths.Two obvious candidates for such an investigation are rooted level-2 networks and tree-child, time consistent networks (see [33,Chapter 10] for definitions).Moreover, one could try to relax the requirement that the phylogenetic networks are equidistant.
• How does the distance between equidistant X-cactuses in N(X) compare to other distance measures between phylogenetic networks? For example, it was shown in [1] that the weighted Robinson-Foulds distance between phylogenetic trees [27] is a √2-approximation of the distance between phylogenetic trees in the tree space defined in [8].

Figure 1: (a) An X-cactus for X = {a, b, c, d, e, f} with root ρ that is equidistant since every directed path from ρ to a sink has the same length, namely 13. All incoming arcs at vertices with indegree 2 have length 0 and are drawn horizontally. (b) The rooted X-cactus obtained by lengthening the incoming arc and shrinking the outgoing arcs at vertex v by 1. (c) The rooted X-cactus obtained by continuing the lengthening and shrinking of the arcs at vertex v until both outgoing arcs have length 0, contracting the cycle below v completely.

Figure 4: (a) A ranking of size 4 of a rooted X-cactus with X = {a, b, c, . . . , j}. Vertices of the same rank are drawn on the same horizontal line. (b) A ranking of a rooted, binary X-cactus with X = {a, b, c, d, e}. The ranking has size 4 which is the maximum size over all rooted, temporal X-cactuses with |X| = 5.

Figure 5: (a) A ranked X-tree with X = {a, b, c, d}. Any cut along one of the dotted horizontal lines yields a partition of X (for example, the dotted line labeled with 1 yields the partition {{a, b}, {c}, {d}}). (b) An equidistant X-tree with X = {a, b, c}.

Figure 6: The orthant space M (B • (X),F(⪯)) for X = {a, b, c}. By construction, each axis represents a partition of X distinct from {X}. The axes labeled ω 1 and ω 2 , for example, represent the partitions {{a}, {b}, {c}} and {{a, b}, {c}}, respectively. The three 2-dimensional orthants are drawn shaded. All points in the interior of these 2-dimensional orthants correspond to the same isomorphism class of binary ranked X-trees. The rankings for them are not shown because they are unique. Points on the axes correspond to non-binary ranked X-trees. The origin corresponds to the ranked X-tree that consists of a single vertex.
S 2 )| − |H(S 2 )| < |P(S 1 )| − |H(S 1 )|, as required. Now assume that (|P(S 1 )| − |H(S 1 )|) − (|P(S 2 )| − |H(S 2 )|) ≥ 2. First consider the case that there exist two distinct (S 1 , H 1 ), (S 2 , H 2 ) ∈ S 2 with |q −1 (S i , H i )| ≥ 2, i ∈ {1, 2}. Then we put

A (finite) poset (M, R) consists of a finite non-empty set M and a binary relation R ⊆ M × M on M that is reflexive, transitive and antisymmetric. An element m ∈ M is minimum (maximum) if (m, a) ∈ R ((a, m) ∈ R) holds for all a ∈ M. A poset is bounded if it has a minimum and a maximum element and these elements are then necessarily unique. Two elements a, b ∈ M are comparable if (a, b) ∈ R or (b, a) ∈ R. A chain C is a non-empty subset of M of pairwise comparable elements. The length of a chain C is |C| − 1. A chain is maximal if it is not contained in some strictly longer chain. A poset is graded if every maximal chain has the same length. The height function h of a graded poset (M, R) assigns to every element a ∈ M the length h(a) of a longest chain C with (b, a) ∈ R for all b ∈ C.

Proposition 8. (P(X), ⪯) is a bounded graded poset with minimum element {({x}, ∅) : x ∈ X} and maximum element {(X, ∅)}. The height function of this poset is h : P(X) → {0, 1, . . . , |X| − 1} with h(S) = |X| − |P(S)| + |H(S)|.
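For small X the height formula of Proposition 8 can be explored by brute force. The sketch below rests on several assumptions that are not spelled out in this section: it generates polestar systems by the construction used in the proof of Proposition 10 (a partition of X, some of whose blocks are grouped into triples with one distinguished block per triple), it takes P(S) to be the set of first components of the set pairs in S, and it takes H(S) to be the set of non-empty second components. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def set_partitions(items):
    """Yield all partitions of a list as lists of frozensets."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        yield [frozenset([first])] + partition          # new singleton block
        for i, block in enumerate(partition):           # or join an existing block
            yield partition[:i] + [block | {first}] + partition[i + 1:]


def triple_groupings(blocks):
    """Yield lists of disjoint 3-element groups of blocks; the rest stay single."""
    if len(blocks) < 3:
        yield []
        return
    first, rest = blocks[0], blocks[1:]
    yield from triple_groupings(rest)                   # `first` stays single
    for pair in combinations(rest, 2):                  # `first` joins a triple
        remaining = [b for b in rest if b not in pair]
        for grouping in triple_groupings(remaining):
            yield [(first,) + pair] + grouping


def polestar_systems(items):
    """Yield set pair systems built as in the proof of Proposition 10 (assumed)."""
    for partition in set_partitions(list(items)):
        for grouping in triple_groupings(partition):
            grouped = {b for triple in grouping for b in triple}
            singles = [(b, frozenset()) for b in partition if b not in grouped]
            # choose the distinguished block S_i in every triple
            for choice in product(range(3), repeat=len(grouping)):
                pairs = list(singles)
                for triple, i in zip(grouping, choice):
                    pairs.append((triple[i], frozenset()))
                    pairs.extend((triple[j], triple[i]) for j in range(3) if j != i)
                yield frozenset(pairs)


def height(system, n):
    """h(S) = |X| - |P(S)| + |H(S)| under the assumed reading of P(S) and H(S)."""
    first_components = {s for s, _ in system}
    nonempty_seconds = {h for _, h in system if h}
    return n - len(first_components) + len(nonempty_seconds)


systems = set(polestar_systems("abc"))
print(len(systems))
print(sorted(Counter(height(s, 3) for s in systems).items()))
```

For X = {a, b, c} this produces 8 systems, one of height 0, six of height 1 and one of height 2, which is consistent with the value λ 3 = 8 and with the range {0, 1, . . . , |X| − 1} of the height function.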

Figure 7: The structure of N(X) for X = {a, b, c}. The six 2-dimensional orthants are drawn shaded. Each of these 2-dimensional orthants corresponds to an isomorphism class of binary ranked X-cactuses. Each axis corresponds to the indicated polestar system on X.

Figure 8: The graph that determines the structure of the link of the origin L (P • (X),F(⪯)) for X = {a, b, c, d}. The oval vertex is adjacent to all other vertices. The ranked X-cactus displayed for each vertex corresponds to the chain {S, {(X, ∅)}} for each S ∈ P • (X).
Figure 3: A rooted X-cactus N on X = {a, b, c, d, e} with a time-stamp function t on its vertices. For all vertices v the value t(v) is given by the real number to the left of the horizontal line through v. In addition, for each arc of N, the length of the arc induced by t is given.