Tropical Geometric Variation of Phylogenetic Tree Shapes

We study the behavior of phylogenetic tree shapes in the tropical geometric interpretation of tree space. Tree shapes are formally referred to as tree topologies; a tree topology can also be thought of as a tree combinatorial type, which is given by the tree's branching configuration and leaf labeling. We use the tropical line segment as a framework to define notions of variance as well as invariance of tree topologies: we provide a combinatorial search theorem that describes all tree topologies occurring along a tropical line segment, as well as a setting under which tree topologies do not change along a tropical line segment. Our study is motivated by comparison to the moduli space endowed with a geodesic metric proposed by Billera, Holmes, and Vogtmann (referred to as BHV space); we consider the tropical geometric setting as an alternative framework to BHV space for sets of phylogenetic trees. We give an algorithm to compute tropical line segments which is lower in computational complexity than the fastest method currently available for BHV geodesics and show that its trajectory behaves more subtly: while the BHV geodesic traverses the origin for vastly different tree topologies, the tropical line segment bypasses it.


Introduction
Phylogenetic trees are discrete mathematical objects that capture the evolutionary behavior of biological processes; they have been extensively used and studied both as data structures and as symbolic objects. Phylogenetic trees are able to encode vastly different evolutionary patterns through various branching configurations, which define the tree's shape or combinatorial type (more formally, the tree's topology). In this paper, we focus on phylogenetic trees as discrete geometric objects and study the behavior of their topology using tropical geometry. Specifically, we propose the tropical line segment as a framework to study how tree topologies vary. Our main contributions are a detailed study of the occurrence and behavior of tree topologies and a method to compute tropical line segments in tropical geometric phylogenetic tree space. Our study is conducted relative to the current standard for the space of phylogenetic trees, proposed by Billera, Holmes, and Vogtmann (Billera et al., 2001), and referred to as BHV space (after the authors' initials). BHV space is a moduli space of phylogenetic trees whose geometry is characterized by unique geodesics and whose structure is defined by tree topologies.
We prove a combinatorial theorem that describes all tree topologies that exist along the tropical line segment between any two given trees, and we provide a framework for the notion of invariance of tree topologies on a tropical line segment. We also give an algorithm that computes tropical line segments and compare tropical line segments to BHV geodesics. We find that the tropical setting results in a more subtle behavior of trajectories between vastly different tree topologies.
The remainder of our paper is organized as follows. In Section 2, we give formalities on phylogenetic trees and tree spaces; we also provide the basics of tropical geometry and discuss the coincidence of tropical geometry and tree space. A review of the literature on comparisons between trees is also given, as well as a detailed overview of BHV space and its geometry. In Section 3, we define the tropical line segment and characterize the geometry of phylogenetic tree space in the tropical geometric setting. In Section 4, we provide our main results on tree topologies on tropical line segments, which are a combinatorial theorem that provides all tree topologies occurring along a tropical line segment and a notion of invariance of tree topologies on tropical line segments. We also present an algorithm to compute tropical line segments and compare their behavior relative to BHV geodesics in terms of trajectories across tree space and between different tree topologies. We show that the tropical line segment does not cross the origin, which differs from the trajectory of BHV geodesics between trees with vastly different tree topologies. We close with a discussion in Section 5.

2 Trees, Tree Spaces, and Tropical Geometry
In this section, we present the basics of tropical geometry, provide background on phylogenetic trees and tree spaces, and discuss the coincidence between tropical geometry and tree spaces. We overview how trees are compared within tree spaces, focusing on the setting of BHV space and its geometry. Finally, we define the tropical line segment and discuss some of its geometric properties.

Tropical Algebra and Tropical Geometry
Tropical algebra is based on the following semiring with linearizing operations given as follows.
Definition 1. The tropical (max-plus) semiring is $(\mathbb{R} \cup \{-\infty\}, \boxplus, \odot)$, with addition and multiplication given by $a \boxplus b := \max(a, b)$ and $a \odot b := a + b$.
Existing literature on tropical geometry often specifies the min-plus semiring $(\mathbb{R} \cup \{\infty\}, \oplus, \odot)$, where the addition of two elements is given by their minimum, denoted by $\oplus$, rather than their maximum (Speyer and Sturmfels, 2009). The min-plus and max-plus semirings are isomorphic. In the context of phylogenetic trees, the max-plus semiring is the more appropriate convention and is consistent with existing literature on tropical geometric methods in tree spaces (e.g., Yoshida et al., 2019; Monod et al., 2021; Tang et al., 2020).
The tropical operations are commutative and associative, and multiplication distributes over addition. Tropical subtraction is not defined, which gives a semiring rather than a ring. Tropicalization refers to replacing classical arithmetic operations with the tropical versions. Tropical geometry is the study of the geometry of nonlinear loci of polynomial systems defined in the tropical semiring.
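The max-plus arithmetic described above can be tried out numerically. The following is a minimal Python sketch, assuming tropical addition is the maximum and tropical multiplication is ordinary addition; the names `trop_add`, `trop_mul`, and `NEG_INF` are illustrative and not taken from any library.

```python
import math

# Additive identity of the max-plus semiring: max(-inf, a) = a for every real a.
NEG_INF = -math.inf


def trop_add(a, b):
    """Tropical (max-plus) addition: the maximum of the two arguments."""
    return max(a, b)


def trop_mul(a, b):
    """Tropical multiplication: ordinary addition of the two arguments."""
    return a + b


# Tropical multiplication distributes over tropical addition:
# 2 (.) (3 [+] 5) equals (2 (.) 3) [+] (2 (.) 5).
left = trop_mul(2, trop_add(3, 5))
right = trop_add(trop_mul(2, 3), trop_mul(2, 5))
print(left, right)
```

Note that 0 plays the role of the multiplicative identity here, since `trop_mul(0, a)` returns `a`.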

Defining Phylogenetic Trees
Phylogenetic trees are the fundamental mathematical model for biological evolution. They are constructed from molecular sequence data of a finite number of species, and graphically represent the evolutionary phylogeny of the species.
Definition 2. A phylogenetic tree T = (V, E) is an acyclic connected graph with at most one vertex of degree 2. V is the set of vertices: the labeled terminal nodes of degree 1 are called leaves, and all other vertices, apart from the possible vertex of degree 2, have degree greater than 2. E is the set of edges, or branches, each with a nonnegative length that represents evolutionary time. Edges that connect to leaves are called external or pendant edges; all other edges are called internal edges.
When there is a common ancestor from which all leaves evolve, the tree is called rooted and the root is a unique node of degree 2. In rooted trees, the evolution progresses from the common ancestor (root) by a series of bifurcations (edges, E) and ends in the terminal nodes (leaves, V ). When there is no common ancestor among the leaves, the tree is unrooted.
Remark 3. In the case of rooted trees, one may also imagine an edge extending from the root to a leaf labeled 0. In this case, the interior vertex connecting to the root (leaf label 0) will have degree 3; however, by convention, rooted trees are not depicted this way, which is why the root in such trees appears to be a node of degree 2.
Methods for reconstructing phylogenetic trees from molecular sequence data are generally either distance-based or statistical. Distance-based methods entail specifying distance matrices between the sequences via a genetic distance (such as Hamming distance) and grouping sequences that are closely related under the same node, with branch lengths representing the observed distances between sequences. For a survey on distance-based tree reconstruction methods, see for example Peng (2007).
Statistical reconstruction methods entail specifying some classical statistical criterion, such as likelihood or parsimony, and then optimizing (e.g., Fitch, 1971; Felsenstein, 1981). The criteria are defined on the principle that in DNA evolution, nucleotides are substituted following a continuous-time Markov chain (or, more generally, an independent time-reversible model for finite sites (Tavaré, 1986)). The motivation for statistical methods for tree reconstruction arises from the uncertainty of the "true" phylogenetic tree, since different choices of molecular sequences (which may be due to choice of gene or coding region) lead to different gene trees (e.g., Holmes, 2003). Additionally, there is an extremely high number of possible tree topologies (Schröder, 1870): the number of tree topologies for a rooted, binary tree (i.e., a bifurcating tree with exactly two descendants stemming from each interior node) with N leaves is
$$(2N-3)!! = (2N-3) \times (2N-5) \times \cdots \times 3 \times 1. \tag{1}$$
Previous work has shown that solutions to statistical optimization problems are tractable under certain conditions and assumptions. For example, under uniform distributivity, the optimization of parsimony-based objective functions is known to be an NP-complete Steiner tree problem (Foulds and Graham, 1982). Various restrictions of the Steiner tree problem (e.g., the minimum spanning tree problem) can be solved in polynomial time (Juhl et al., 2018). In biological applications where the data and specific problem of study allow certain distributional assumptions (e.g., identifiability of mixture distributions), statistical methods can be easy and computationally efficient to implement (e.g., Allman and Rhodes, 2008; Long and Sullivant, 2015; Rhodes and Sullivant, 2012). The focus of this paper, however, is on the number of tree topologies (1) and their occurrences within tree spaces.
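To give a sense of how quickly the number of rooted binary tree topologies grows, the following short Python sketch evaluates the standard double-factorial count (2N − 3)!! for a given number of leaves N. The function name is illustrative.

```python
def num_rooted_binary_topologies(n_leaves):
    """Number of rooted binary tree topologies on n_leaves labeled leaves.

    Computes the double factorial (2N - 3)!! = 1 * 3 * 5 * ... * (2N - 3).
    """
    if n_leaves < 2:
        raise ValueError("need at least two leaves")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2N - 3 (empty product when N = 2).
    for odd in range(3, 2 * n_leaves - 2, 2):
        count *= odd
    return count


for n in (3, 4, 5, 10):
    print(n, num_rooted_binary_topologies(n))
```

Already for N = 10 leaves the count exceeds thirty million, which illustrates why exhaustive searches over tree topologies quickly become impractical.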
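Definition 4. For a phylogenetic tree $T$ with $N$ leaves labeled by $[N]$, and with branch length $b_e \in \mathbb{R}_{\geq 0}$ associated to each edge $e$ in $T$, its tree metric is the map
$$w : [N] \times [N] \to \mathbb{R}_{\geq 0}, \qquad w(i,j) = \begin{cases} \sum_{e \in P_{\{i,j\}}} b_e & \text{if } i \neq j, \\ 0 & \text{if } i = j, \end{cases}$$
where $P_{\{i,j\}}$ is the unique path between leaves $i$ and $j$.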
Tree metrics are metric representations of phylogenetic trees in terms of pairwise distances between leaves. Tree metrics may also be represented as cophenetic vectors (Cardona et al., 2013),
$$\big(w(1,2), w(1,3), \ldots, w(N-1,N)\big) \in \mathbb{R}^n_{\geq 0}, \qquad n = \binom{N}{2}, \tag{2}$$
where the entries are sorted lexicographically.
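As an illustration of how a tree metric and its cophenetic vector arise from branch lengths, the sketch below sums edge lengths along the unique path between each pair of leaves. The edge-list input format, the node names, and the example branch lengths are assumptions made for this illustration only.

```python
from itertools import combinations


def cophenetic_vector(edges, leaves):
    """Pairwise leaf-to-leaf path lengths, with pairs in lexicographic order.

    edges:  list of (node_u, node_v, length) describing a tree
    leaves: ordered list of leaf labels
    """
    adjacency = {}
    for u, v, length in edges:
        adjacency.setdefault(u, []).append((v, length))
        adjacency.setdefault(v, []).append((u, length))

    def distances_from(source):
        # Depth-first traversal; paths in a tree are unique, so the first
        # time a node is reached gives its path length from the source.
        dist = {source: 0}
        stack = [source]
        while stack:
            node = stack.pop()
            for neighbor, length in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + length
                    stack.append(neighbor)
        return dist

    dist = {leaf: distances_from(leaf) for leaf in leaves}
    return [dist[i][j] for i, j in combinations(leaves, 2)]


# Hypothetical rooted tree with root "r" and internal nodes "u", "v", "w".
example_edges = [
    ("r", "u", 12), ("r", "v", 10),
    ("u", "A", 8), ("u", "B", 8),
    ("v", "C", 10), ("v", "w", 5),
    ("w", "D", 5), ("w", "E", 5),
]
example_vector = cophenetic_vector(example_edges, ["A", "B", "C", "D", "E"])
print(example_vector)
```

In this hypothetical tree every leaf is at distance 20 from the root, so the tree is equidistant.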
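Definition 5 (Four-point condition, Buneman (1974)). A tree metric $w$ satisfies the four-point condition, and hence defines a tree, if and only if the maximum among the Plücker relations
$$w(i,j) + w(k,\ell), \qquad w(i,k) + w(j,\ell), \qquad w(i,\ell) + w(j,k) \tag{3}$$
is attained at least twice for $1 \leq i < j < k < \ell \leq N$, or, equivalently, if
$$w(i,j) + w(k,\ell) \leq \max\big\{ w(i,k) + w(j,\ell),\; w(i,\ell) + w(j,k) \big\} \tag{4}$$
for all distinct $i, j, k, \ell \in [N]$.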
The above technical condition characterizes phylogenetic trees; in particular, the space of phylogenetic trees with N leaves, T N , is the collection of all n-tuples {w(i, j)} 1≤i<j≤N that satisfy the four-point condition (4), or equivalently, where the maximum among (3) is achieved at least twice. For examples and counterexamples of the four-point condition with illustrations, see Monod et al. (2021).
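The four-point condition can be checked directly on a symmetric matrix of pairwise distances: for every four leaves, the three ways of pairing them give three sums, and the largest sum must be attained at least twice. The following Python sketch does this; the matrix input format, the tolerance parameter, and the two small examples are assumptions chosen for illustration.

```python
from itertools import combinations


def satisfies_four_point(W, tol=1e-9):
    """Check the four-point condition for a symmetric distance matrix W."""
    n_leaves = len(W)
    for i, j, k, l in combinations(range(n_leaves), 4):
        # The three pairings of the four leaves {i, j, k, l}.
        sums = sorted([
            W[i][j] + W[k][l],
            W[i][k] + W[j][l],
            W[i][l] + W[j][k],
        ])
        # The maximum must be attained at least twice.
        if sums[2] - sums[1] > tol:
            return False
    return True


# Quartet tree with unit edge lengths: leaves 0, 1 on one side, 2, 3 on the other.
quartet = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]
# A metric that does not come from any tree: the largest sum is unique.
not_a_tree = [
    [0, 4, 3, 3],
    [4, 0, 3, 3],
    [3, 3, 0, 4],
    [3, 3, 4, 0],
]
print(satisfies_four_point(quartet), satisfies_four_point(not_a_tree))
```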
The coincidence between the space of phylogenetic trees and tropical geometry arises through the four-point condition as follows. Speyer and Sturmfels (2004) identify a homeomorphism between T N and a tropical version of the Grassmannian of 2-planes in N dimensions: the Grassmannian may be mapped to a projective variety via the Plücker embedding, which, when interpreted tropically, recovers the four-point condition defining phylogenetic trees and thereby gives the homeomorphism. This endows the space of phylogenetic trees with a tropical geometric structure; in particular, the space of phylogenetic trees is a tropical variety.
The four-point condition may be strengthened to define an important subclass of trees as follows.
Definition 6 (Three-point condition, Jardine et al. (1967)). A tree metric $w$ satisfies the three-point condition, and hence defines a tree ultrametric, if and only if the maximum among $w(i,j)$, $w(i,k)$, $w(j,k)$ is attained at least twice for $1 \leq i < j < k \leq N$.
The space of tree ultrametrics with N leaves, U N , is the collection of all n-tuples {w(i, j)} 1≤i<j≤N satisfying the three-point condition. A rooted phylogenetic tree T is equidistant if the distance from every leaf to its root is constant.
Proposition 7 (Proposition 12, Monod et al. (2021)). A tree metric w for a phylogenetic tree T is a tree ultrametric if and only if T is equidistant.
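The three-point condition is straightforward to test on a cophenetic vector. The sketch below assumes the vector lists the pairwise distances in lexicographic order of leaf pairs; the function name, the tolerance, and the example vectors are illustrative.

```python
from itertools import combinations


def is_ultrametric(vec, n_leaves, tol=1e-9):
    """Check the three-point condition for a cophenetic vector.

    vec lists w(i, j) for all pairs i < j in lexicographic order.
    """
    pairs = list(combinations(range(n_leaves), 2))
    if len(vec) != len(pairs):
        raise ValueError("vector length must equal the number of leaf pairs")
    w = dict(zip(pairs, vec))
    for i, j, k in combinations(range(n_leaves), 3):
        values = sorted([w[(i, j)], w[(i, k)], w[(j, k)]])
        # The maximum of the three distances must be attained at least twice.
        if values[2] - values[1] > tol:
            return False
    return True


# An equidistant tree on five leaves (every leaf at distance 20 from the root).
equidistant = [16, 40, 40, 40, 40, 40, 40, 20, 20, 10]
# Three leaves with pairwise distances 3, 4, 5: a tree metric, not an ultrametric.
not_equidistant = [3, 4, 5]
print(is_ultrametric(equidistant, 5), is_ultrametric(not_equidistant, 3))
```

Consistent with Proposition 7, the equidistant example passes the check while the second example, whose three distances are all different, does not.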

The Tropical Projective Torus
For equidistant trees, since the distance between the root and every leaf is constant, this distance (or the tree's height) may always be normalized to 1. This idea may be generalized to unrooted trees to normalize the evolutionary time between trees by considering an equivalence relation for tree metrics represented by cophenetic vectors $x, y \in \mathbb{R}^n$: $x \sim y \Leftrightarrow x_1 - y_1 = x_2 - y_2 = \cdots = x_n - y_n$, meaning that $x \sim y$ if and only if all coordinates of their difference $x - y$ are equal. This equivalence relation generates a quotient space known as the tropical projective torus, denoted by $\mathbb{R}^n/\mathbb{R}\mathbf{1}$, which is the ambient space of tree space, $\mathcal{U}_N \subset \mathcal{T}_N \subset \mathbb{R}^n/\mathbb{R}\mathbf{1}$. The tropical projective torus $\mathbb{R}^n/\mathbb{R}\mathbf{1}$ may be embedded into $\mathbb{R}^{n-1}$ by considering representatives of the equivalence classes with first coordinate equal to zero, $(x_2 - x_1, x_3 - x_1, \ldots, x_n - x_1) \in \mathbb{R}^{n-1}$. See Maclagan and Sturmfels (2015); Monod et al. (2021); Lee et al. (2021) for more detail.
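The equivalence relation and the embedding into one fewer dimension can be sketched in a few lines of Python. The function names and the tolerance are assumptions made for illustration.

```python
def equivalent(x, y, tol=1e-12):
    """True if x ~ y, i.e. all coordinates of x - y are equal."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    return max(diffs) - min(diffs) <= tol


def normalize(x):
    """Representative of the class of x whose first coordinate is zero."""
    return [xi - x[0] for xi in x]


def embed(x):
    """Image of the class of x in one fewer dimension: drop the leading zero."""
    return normalize(x)[1:]


print(equivalent([1, 2, 3], [4, 5, 6]))  # the two vectors differ by a constant shift
print(normalize([3, 5, 7]), embed([3, 5, 7]))
```

Two vectors are equivalent exactly when their normalized representatives coincide, so `embed` can be used as a canonical form for points of the quotient.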

Comparing Trees via Metrics
Metrics are a natural tool for comparing trees and providing a quantitative measure of similarity between trees. Various metrics have been proposed over the past several decades during which quantitative and computational tree studies have been an active research interest.
Some of the most popular metrics between trees are those that maintain characteristics of Euclidean distance in the inherently non-Euclidean setting of tree space, such as an inner product structure. Some examples of well-known inner product distances between trees are the path difference, quartet distance, and Robinson-Foulds distance (Robinson and Foulds, 1981). However, these are known to suffer from structural errors, since many pairs of trees measure the same distance apart under these metrics, as well as interpretive errors, since the existence of large distances between trees does not necessarily mean that there are large differences between shared ancestry of leaves (Steel and Penny, 1993). Other metrics use the cophenetic vector representation of tree metrics, treating them simply as points in Euclidean space, and then using the $\ell_\infty$ distance in $\mathbb{R}^n$ (Cardona et al., 2013).
Remark 8. Notice that the image of the linear mapping from $\mathbb{R}^N$ to $\mathbb{R}^n$ given by
$$(x_1, \ldots, x_N) \mapsto (x_i - x_j)_{i<j} \tag{5}$$
for all pairs $i < j$, is isomorphic to the tropical projective torus. Under such a mapping, it is then possible to work in $\mathbb{R}^n$ equipped with the $\ell_\infty$ metric as in Ardila (2005); Bernstein and Long (2017); Bernstein (2020).
Other popular and well-known metrics include the nearest neighbor interchange metric (Waterman et al., 1976), subtree transfer distance (Allen and Steel, 2001), and variational distance (Steel and Székely, 2006). There also exist metrics that connect the subfield of mathematical phylogenetics to various other subfields of pure mathematics, such as the algebraic metric (Alberich et al., 2009), which is based on a group structure; Munch and Stefanou (2019) also show that the $\ell_\infty$ metric applied to trees (Cardona et al., 2013) is in fact an interleaving distance from applied and computational topology.
BHV Space and the Geodesic Metric. We focus and provide more detail on the geometric interpretation of tree space proposed by Billera et al. (2001) for two main reasons. First, the tropical geometric approach is the most comparable to the BHV setting since it is also geometric in nature and also defines a moduli space. Second, previous work which this paper builds upon is in direct comparison to BHV space; we follow suit for consistency.
In BHV space, $\mathcal{T}_N^{\mathrm{BHV}}$, phylogenetic trees are represented as Euclidean vectors; the entries in the vectors are given by the internal edge lengths of the trees, and external edge lengths are disregarded. For a phylogenetic tree with $N$ leaves, there are at most $2N - 2$ edges: $N$ terminal edges connecting to leaves, and at most $N - 2$ internal edges. Trees in BHV space are therefore represented as vectors in $\mathbb{R}^{N-2}$, for a given tree topology. The space of phylogenetic trees is then a collection of $(2N - 3)!!$-many Euclidean orthants; each orthant may be regarded as the polyhedral cone of $\mathbb{R}^{N-2}$ with all nonnegative coordinates, which correspond to the internal edge lengths in a tree.
On the orthant boundaries, there is at least one zero coordinate that occurs; orthant boundaries represent trees with collapsed internal edges. The orthants are joined along the orthant boundaries. In general, tree topologies of the orthants determine the adjacency: boundary trees from two different orthants (i.e., trees with two different topologies) may characterize the same polytomic topology (or split) so these two orthants are grafted together along this boundary when the trees from each topology with collapsed internal edges coincide. Two adjacent orthants, therefore, represent similar, yet distinct, tree topologies. Orthants are grafted at right-angles, resulting in a stratified space.
The right-angle grafting has a direct implication on the curvature of BHV space; BHV space satisfies the CAT(0) property of flag complexes, and geodesics are thus unique. The geodesic characterizes a metric on BHV space, where the length of the geodesic between any two trees is the distance. Geodesics are computable on BHV space; the geodesic between any two trees represented by their internal edge lengths is first computed, and external edges are then factored in to compute the overall distance. For two trees in the same orthant, the shortest path between them is simply the straight line measured by the Euclidean distance between them. For trees in different orthants, the difficulty arises in establishing the sequence of orthants to traverse to give the shortest distance between the trees. For trees that are not in neighboring orthants and especially when the tree topologies are very different, the geodesic often passes through the origin (or star tree, where all internal edges are collapsed); paths that traverse the origin are referred to as cone paths. For trees with four leaves, the sequence of orthants containing the shortest path between two trees can be systematically computed by a grid search, but for larger trees, such a search is intractable. Owen and Provan (2011) give the fastest available algorithm to date to find the geodesic path between any two trees in BHV space, which runs in quartic time in the number of leaves N.

Palm Tree Space
Monod et al. (2021) present an alternative geometric construction of phylogenetic tree space based on tropical geometry and study its analytic and topological properties with the aim of statistical inference and data analysis in mind. Trees are represented by cophenetic vectors (2); external edge lengths are thus included. Endowed with the tropical metric, the resulting metric moduli space is referred to as palm tree space (tropical tree space).
The tropical metric is a generalized Hilbert projective metric function and arises in other tropical geometric settings (e.g., Akian et al., 2011; Cohen et al., 2004). It has also been used in previous work studying phylogenetic trees (Lin et al., 2017; Lin and Yoshida, 2018; Yoshida et al., 2019). The tropical metric is combinatorial in nature and is given by the difference between the maximum and minimum of the differences between the tree coordinates; it is a proper metric. This metric also enjoys other interpretations following the relationship of isomorphism discussed in Remark 8; in particular, restricting to the image of the map (5), the tropical metric is precisely the $\ell_\infty$ metric.
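Following the verbal description above, the tropical metric between two vectors can be sketched as the maximum coordinate difference minus the minimum coordinate difference. The function name below is illustrative.

```python
def tropical_distance(x, y):
    """Tropical metric: max_i (x_i - y_i) - min_i (x_i - y_i)."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    return max(diffs) - min(diffs)


x = [0, 0, 0]
y = [1, 2, 3]
print(tropical_distance(x, y))
# Adding a constant to every coordinate of x does not change the distance.
print(tropical_distance([c + 5 for c in x], y))
```

Because the distance only depends on the spread of the coordinate differences, it is well defined on equivalence classes in the tropical projective torus: vectors that differ by a constant shift are at distance zero.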
Under the tropical metric, properties that are desirable for statistical inference (such as hypothesis testing) and exact probabilistic studies (such as concentration inequalities and convergence studies) are satisfied and well-defined. Geodesics, however, are not unique; there are infinitely many geodesics between any two points. This indicates a more complex geometry than the CAT(0) structure of BHV space. Lee et al. (2021) use optimal transport theory to define and study the Wasserstein distances on the tropical projective torus and thus give an algorithm to compute the set of all infinitely-many geodesics on the tropical projective torus.
3 Structure and Geometry of Tropical Geometric Tree Space
BHV space and palm tree space are both moduli spaces and are both inherently geometric constructions. In this section, for comparative purposes and to present the framework of our main results, we study the structure and geometry of the tropical geometric interpretation of phylogenetic tree space.

Polyhedral Structure
In the same way that T BHV N is constructed as the union of polyhedra, where each polyhedron corresponds to one distinct tree topology, the structure of T N interpreted tropically is also given by such a union.
As in BHV space, in tropical geometric tree space, each polyhedron may be considered as a polyhedral cone of $\mathbb{R}^{N-3}$ with all nonnegative coordinates corresponding to the cophenetic vector representation of trees. Here, the cone in consideration is the usual convex cone as a subset of a vector space on an ordered field, closed under linear combinations with positive coefficients (or equivalently, the set spanned by conical combinations of vectors).
Example 10. When $N = 5$, $\mathcal{T}_5$ has $5!! = 15$ cones, which correspond to the edges of the Petersen graph, depicted in Figure 1. Here, the root is considered as a leaf (see Remark 3 above), and the other leaf labels are numbered 1 through 4. The Petersen graph here is illustrative and represents the configuration of the polyhedra according to tree topologies; the edges represent the cones corresponding to a tree topology, and the nodes of the graph represent the commonality between the coinciding edges (i.e., between two tree topologies).
To interpret the correspondence to trees in this Petersen graph, consider the leftmost upper graph vertex on the outer hexagon and notice that there are three edges (cones) that meet at this vertex. The figures of the trees associated with these cones (illustrated in the circles) share the property that the edge to leaf 1 is the longest edge, and the remaining three leaves are permuted. Now, consider the graph vertex at the very center of the graph, inside the hexagon: the property common to the trees associated with the three cones meeting at this vertex is that the pair of leaves labeled 1 and 3 are coupled, symmetric, and are joined at the same internal node of the tree that is not the root. The remaining graph vertices may be interpreted in a similar manner, in the sense that they all share one of these two commonalities, under symmetry of and up to leaf labeling scheme. Thus, the 10 vertices of the Petersen graph here are given by the number of ways to choose a label for the longest branch length in a tree that connects directly to the root, among 4 choices of leaf labels (i.e., $\binom{4}{1}$); and the number of ways to choose pairs of leaf labels that are coupled, symmetric, and correspond to the same internal node of the tree that is not directly linked to the root (i.e., $\binom{4}{2}$), so $\binom{4}{1} + \binom{4}{2} = 4 + 6 = 10$. The intuition of the graph edges as cones lies in the property of closedness under scaling of branch lengths, for each tree type associated with each graph edge in the figure (illustrated in the circles).
The Petersen graph also coincides in the context of BHV space when considering N = 4 leaves, as the so-called link of the origin; see Billera et al. (2001) for further details.

Geometry of the Space of Tree Ultrametrics
We now focus on the case of rooted equidistant trees as in Billera et al. (2001). We begin by providing definitions of tropical geometric line and set objects and outlining important properties of these objects in the space of tree ultrametrics.
Notation. For a positive integer N , let p N be the set of all pairs in [N ]. For convenience, we denote a tree ultrametric with N leaves by (w p ) p∈p N , where for p = {i, j}, w p = w(min(i, j), max(i, j)).
Definition 11. Consider the subspace $L_N \subseteq \mathbb{R}^n$ defined by the linear equations
$$w_{ij} - w_{ik} + w_{jk} = 0 \tag{6}$$
for $1 \leq i < j < k \leq N$ in tree metrics $w$. For the linear equations (6) cutting out $L_N$, their (max-plus) tropicalization is $w_{ij} \boxplus w_{ik} \boxplus w_{jk}$: recall that under the trivial valuation, all coefficients are disregarded when tropicalizing. This tropicalization of $L_N$ is denoted by $\mathrm{Trop}(L_N) \subseteq \mathbb{R}^n/\mathbb{R}\mathbf{1}$ and is referred to as the tropical linear space, with points $(w_p)_{p \in p_N}$ where $\max\{w_{ij}, w_{ik}, w_{jk}\}$ is attained at least twice for all triples $i, j, k \in [N]$.
This is equivalent to the three-point condition for ultrametrics given in Definition 6. An observation from tropical geometry gives a correspondence between the tropical linear space $\mathrm{Trop}(L_N)$ and the graphic matroid of a complete graph with $N$ vertices.
We have the following geometric characterization of the space U N and corresponding characterizations of tropical line segments between ultrametrics.
Theorem 12 (Ardila and Klivans (2006)). The image of U N in the tropical projective torus R n /R1 coincides with Trop(L N ). That is, Trop(L N ) = U N .
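Definition 13. For $x, y \in \mathbb{R}^n/\mathbb{R}\mathbf{1}$, the tropical line segment with endpoints $x$ and $y$ is the set
$$\{ a \odot x \boxplus b \odot y \in \mathbb{R}^n/\mathbb{R}\mathbf{1} \mid a, b \in \mathbb{R} \}.$$
Here, max-plus addition for two vectors is performed coordinate-wise.
The tropical line segment between any two points in $\mathbb{R}^n/\mathbb{R}\mathbf{1}$ is unique and it is a geodesic.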
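Since points of the tropical projective torus are only defined up to a constant shift, every point of the form max(a + x, b + y) (taken coordinate-wise) depends only on the difference λ = b − a and can be written as max(x, λ + y). A standard description is that the segment is piecewise linear, with bends occurring only at the values λ = x_i − y_i. The Python sketch below uses this description to list the bend points; it is an illustration under these assumptions, not the algorithm developed later in the paper, and the function names are hypothetical.

```python
def tropical_point(x, y, lam):
    """Point max(x, lam + y) of the segment, normalized to first coordinate 0."""
    z = [max(xi, lam + yi) for xi, yi in zip(x, y)]
    return [zi - z[0] for zi in z]


def tropical_segment_vertices(x, y):
    """Bend points of the tropical line segment from x to y, in order.

    The first entry is the class of x and the last entry is the class of y.
    """
    breakpoints = sorted(set(xi - yi for xi, yi in zip(x, y)))
    return [tropical_point(x, y, lam) for lam in breakpoints]


vertices = tropical_segment_vertices([0, 0, 0], [0, 3, 1])
print(vertices)
```

For the smallest breakpoint every coordinate of λ + y is at most the corresponding coordinate of x, so the point is x itself; for the largest breakpoint the point is a shifted copy of y, which represents the same class as y.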
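Definition 14. Let $S \subset \mathbb{R}^n$. If $a \odot x \boxplus b \odot y \in S$ for all $x, y \in S$ and all $a, b \in \mathbb{R}$, then $S$ is said to be tropically convex.
The tropical convex hull or tropical polytope of a given subset $V \subset \mathbb{R}^n$ is the smallest tropically convex subset containing $V$; it is denoted by $\mathrm{tconv}(V)$. The tropical convex hull of $V$ may also be written as the set of all tropical linear combinations:
$$\mathrm{tconv}(V) = \{ a_1 \odot v_1 \boxplus a_2 \odot v_2 \boxplus \cdots \boxplus a_r \odot v_r \mid v_1, \ldots, v_r \in V \text{ and } a_1, \ldots, a_r \in \mathbb{R} \}.$$
Proposition 15. For two tree ultrametrics $T_1, T_2 \in \mathcal{U}_N$, the tropical line segment generated by $T_1$ and $T_2$, $\{ a \odot T_1 \boxplus b \odot T_2 \mid a, b \in \mathbb{R} \}$, is contained in $\mathcal{U}_N$. In other words, $\mathcal{U}_N$ is tropically convex.
Proof. Since $T_1$ is a tree ultrametric, $a \odot T_1$ remains a tree ultrametric; this is also true for $b \odot T_2$. Thus, we may assume $a = b = 0$. Suppose $T_1 = (x_p)_{p \in p_N}$, $T_2 = (y_p)_{p \in p_N}$, and $T_1 \boxplus T_2 = (z_p)_{p \in p_N}$, so that $z_p = \max(x_p, y_p)$ for all $p \in p_N$. It suffices to show that for any $1 \leq i < j < k \leq N$, the maximum among $z_{\{i,j\}}, z_{\{i,k\}}, z_{\{j,k\}}$ is attained at least twice in order for $T_1 \boxplus T_2$ to be a tree ultrametric. Let $M$ be this maximum, and set $M' := \max\{x_{\{i,j\}}, x_{\{i,k\}}, x_{\{j,k\}}\}$ and $M'' := \max\{y_{\{i,j\}}, y_{\{i,k\}}, y_{\{j,k\}}\}$, so that $M = \max(M', M'')$. Without loss of generality, $M = M'$. Since $T_1$ is a tree ultrametric, $M'$ is attained by at least two of $x_{\{i,j\}}, x_{\{i,k\}}, x_{\{j,k\}}$, say at the pairs $p$ and $q$. Then $z_p \geq x_p = M$ and $z_q \geq x_q = M$, and since $M$ is the maximum, $z_p = z_q = M$. Hence the maximum is attained at least twice.
This result generalizes outside the context of trees; in general, tropical linear spaces in the tropical projective torus are tropically convex (see Proposition 5.2.8 of Maclagan and Sturmfels (2015)).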
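The tropical convexity of the space of tree ultrametrics can be spot-checked numerically: take two ultrametrics with different topologies, form coordinate-wise maxima of shifted copies, and verify the three-point condition each time. The following Python sketch does this for two hypothetical ultrametrics on four leaves; the helper names and the example vectors are assumptions for illustration, and such a check is of course no substitute for the proof.

```python
from itertools import combinations


def is_ultrametric(vec, n_leaves, tol=1e-9):
    """Three-point condition for a cophenetic vector in lexicographic order."""
    w = dict(zip(combinations(range(n_leaves), 2), vec))
    for i, j, k in combinations(range(n_leaves), 3):
        values = sorted([w[(i, j)], w[(i, k)], w[(j, k)]])
        if values[2] - values[1] > tol:
            return False
    return True


def tropical_combination(a, x, b, y):
    """Coordinate-wise max(a + x, b + y)."""
    return [max(a + xi, b + yi) for xi, yi in zip(x, y)]


# Pair order: (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
t1 = [2, 4, 4, 4, 4, 2]  # leaves 0,1 paired and leaves 2,3 paired
t2 = [4, 2, 4, 4, 2, 4]  # leaves 0,2 paired and leaves 1,3 paired

all_ultrametric = all(
    is_ultrametric(tropical_combination(0, t1, b, t2), 4)
    for b in range(-5, 6)
)
print(all_ultrametric)
```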

Tree Topologies on Tropical Line Segments
In this section, we present our main results, which include a combinatorial study of the variation of tree topologies as well as a notion of invariance within the framework of the tropical line segment. We also give an algorithm for computing tropical line segments.
Note that our study is conducted in the setting of rooted equidistant trees (ultrametrics). The relevance for a study dedicated to ultrametrics specifically in this paper is twofold. First, it allows for a parallel geometric comparison between BHV space and the tropical interpretation of phylogenetic tree space: As mentioned above in Section 2.4, the geometric significance of BHV space lies in its construction when only internal edges are considered. Its structure is based on the union of orthants, where each orthant corresponds to a specific rooted tree topology. In other words, the definition of BHV space inherently relies on tree topologies. The tropical construction of tree space, while also polyhedral (see Proposition 9), has a more complex algebraic structure, which we explore here by studying rooted equidistant tree topologies and their occurrence within the tropical construction of tree space.
The second motivation for studying ultrametric tree topologies lies in the context of applications: ultrametrics correspond to coalescent processes, which model important biological phenomena, such as cancer evolution (Kingman, 2000). In phylogenomics, the coalescent model is often used to model gene trees given a species tree (see e.g., Knowles (2009); Rosenberg (2003); Tian and Kubatko (2014) for further details). The coalescent model takes two parameters, the population size and species depth (i.e., the number of generations from the most recent common ancestor (MRCA) of all individuals at present). The species depth coincides with the height of each gene tree from the root (that is, the MRCA) to each leaf representing an individual in the present time. The output of the coalescent model is a set of equidistant gene trees, since the number of generations from their MRCA to each individual in the present time is the same by model construction. Understanding the structure of ultrametrics is an important step towards the modeling and analysis of coalescent biological processes.

Tree Variation on Tropical Line Segments
By Theorem 12, the tropical linear space coincides with the space of tree ultrametrics, and hence, by Proposition 15, tropical convex hulls of tree ultrametrics (and therefore tropical line segments between them) are also fully contained in the space of tree ultrametrics. This endows the space of tree ultrametrics with a tropical structure, and now allows us to study the behavior of points (trees) along tropical line segments. In particular, this allows us to characterize ultrametric tree topologies geometrically, thereby providing a description of the tropically constructed tree space that is comparable to the geometry of BHV space for rooted trees and zero-length external edges (as originally described by Billera et al. (2001)).
The strategy that we implement is largely combinatorial. We first formalize the definition of a tree topology as a collection of subsets of leaves, and use these subsets to define notions of size, and in particular, largest and smallest subsets. Given these upper and lower bounds, we then define an equivalence relation and a partial order that allow us to iteratively and combinatorially partition and compare leaf subsets and tree topologies. This gives us a framework to study shapes of trees: specifically, Theorem 30 is a combinatorial theorem that describes the possible tree topologies that exist along a tropical line segment.
Definition 16. A tree topology $F$ on $[N]$ is a collection of clades $S \subseteq [N]$, where $2 \leq |S| \leq N - 1$ and for any two distinct clades $S_1, S_2 \in F$, exactly one of the following nested set conditions holds:
$$S_1 \subsetneq S_2, \qquad S_2 \subsetneq S_1, \qquad S_1 \cap S_2 = \emptyset. \tag{7}$$
Clades always belong to $[N]$ rather than $[N] \cup \{0\}$, since we may always choose the clade excluding the root (i.e., the leaf with label 0). In this manner, they allow for an alternative representation of trees over tree metric vectors $w$ or matrices $W$.
Example 17. For the tree in Figure 2, there are two ways to express this tree:
1. As a tree ultrametric in $\mathcal{U}_5$: (16, 40, 40, 40, 40, 40, 40, 20, 20, 10).
2. As a vector in an ambient space, in terms of the lengths of its internal edges.
In general, using the internal edges to represent trees allows for an iterative construction of a family of clades satisfying one of the nested set conditions (7). In the case of Figure 2, this family is {A, B}, {C, D, E}, and {D, E}.
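A family of clades can be read off a tree ultrametric by collecting, for each distance threshold, the groups of leaves that lie within that threshold of one another. The Python sketch below is one illustrative way to do this (not necessarily the construction the authors have in mind), applied to the ultrametric of Example 17; the function names and the use of frozensets are assumptions, and the input is assumed to satisfy the three-point condition.

```python
from itertools import combinations


def clades_from_ultrametric(vec, leaves):
    """Clades (as frozensets of leaf labels) of a tree ultrametric.

    vec lists w(i, j) over pairs of leaves in lexicographic order.
    """
    w = dict(zip(combinations(leaves, 2), vec))

    def dist(i, j):
        if i == j:
            return 0
        return w[(i, j)] if (i, j) in w else w[(j, i)]

    clades = set()
    for threshold in set(vec):
        for leaf in leaves:
            # For an ultrametric, "within the threshold" is an equivalence
            # relation, so this ball is a whole group of related leaves.
            ball = frozenset(o for o in leaves if dist(leaf, o) <= threshold)
            if 2 <= len(ball) <= len(leaves) - 1:
                clades.add(ball)
    return clades


def is_nested(family):
    """Every two distinct sets are either nested or disjoint."""
    return all(s < t or t < s or not (s & t) for s, t in combinations(family, 2))


example = [16, 40, 40, 40, 40, 40, 40, 20, 20, 10]
family = clades_from_ultrametric(example, ["A", "B", "C", "D", "E"])
print(sorted(sorted(clade) for clade in family))
```

The resulting family consists of the three clades {A, B}, {C, D, E}, and {D, E}, matching the family stated for Figure 2.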
The following lemma provides intuition on the definition of full dimensionality given above in Definition 16.
Lemma 18. Let N ≥ 2 be an integer. For any tree topology F on a ground set of N elements, we have |F | ≤ N − 2.
Proof. We proceed by induction on N . When N = 2, since 2 > N − 1, F is necessarily empty because clades S cannot satisfy 2 ≤ |S| ≤ 1. When N = 3, all clades of F must have cardinality 2, because the cardinality is at least 2 and at most 2, and thus are among {1, 2}, {1, 3}, {2, 3}. But any two of these clades do not satisfy one of the nested set conditions (7). Hence |F | ≤ 1, and the base case for N = 2, 3 holds.
Next, suppose Lemma 18 holds for 3 ≤ N ≤ m where m ≥ 3. Consider the case when N = m + 1: We take all clades S belonging to F that are maximal in terms of inclusion. There are two cases: (i) There is a unique maximal clade S max in F : |S max | ≤ m and all clades of F are subsets of S max . If |S max | ≤ 2, then F has a unique clade, which is S max and therefore |F | = 1 ≤ m − 1 = N − 2; otherwise |S max | ≥ 3. Consider the family F \{S max } on the ground set S max . For any S ∈ F \{S max }, since S is a proper subset of S max , we have 2 ≤ |S| ≤ |S max | − 1. Since F \{S max } is still a nested set, it is a tree topology on S max . By the induction hypothesis, |F \{S max }| ≤ |S max | − 2 ≤ m − 2. Thus, |F | = |F \{S max }| + 1 ≤ m − 1 = N − 2 and Lemma 18 holds for N = m + 1.
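(ii) There are at least two maximal clades in $F$: Let these clades be $S_1, S_2, \ldots, S_k$ with $k \ge 2$. By the nested set conditions (7), distinct maximal clades are pairwise disjoint, so $\sum_{i=1}^{k} |S_i| \le N$. Let $c_i$ be the number of proper subsets of $S_i$ that belong to $F$. Then $|F| = k + \sum_{i=1}^{k} c_i$. Notice that $c_i$ is also the cardinality of a tree topology on $S_i$, so by the induction hypothesis, $c_i \le |S_i| - 2$. Hence

$$|F| = k + \sum_{i=1}^{k} c_i \le k + \sum_{i=1}^{k} (|S_i| - 2) = \sum_{i=1}^{k} |S_i| - k \le N - k \le N - 2. \tag{8}$$

Lemma 18 thus holds for $N = m + 1$.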
This concludes the transition step, and the proof.
Conversely, we now consider the minimal clades of a tree topology F .

Proof. Suppose p ∈ p N and F (p) ≠ ∅. For any two clades S 1 , S 2 ∈ F (p), since p ⊆ S 1 and p ⊆ S 2 , the clades S 1 and S 2 cannot be disjoint. Thus, by the nested set condition in Definition 16, either S 1 contains S 2 or vice versa. This means that the clades in F (p) form a totally ordered set with respect to set inclusion. Since F (p) is finite, it has a minimal element that must be contained in all other elements. This minimal element is the intersection of all clades in F (p).
Definition 20. The minimal element that is the intersection of all clades S ∈ F (p) given in Lemma 19 is called the closure of p in F . We denote this closure by cl F (p). If F (p) = ∅, we set cl F (p) to be [N ]. Given the definition of a tree topology F , and notions of maximal and minimal clades of F , we now proceed to study the behavior of varying tree topologies. We shall construct the setting for such a study via the definitions of an equivalence relation and a partial order on a tree topology F in terms of pairs p ∈ p N as follows.
Definition 22. Let $F$ be a tree topology on $[N]$. We define an equivalence relation $=_F$ on $p_N$ by

$$p_1 =_F p_2 \iff \mathrm{cl}_F(p_1) = \mathrm{cl}_F(p_2),$$

and a partial order $<_F$ on $p_N$ by

$$p_1 <_F p_2 \iff \mathrm{cl}_F(p_1) \subsetneq \mathrm{cl}_F(p_2),$$

for all pairs $p_1, p_2 \in p_N$.

Proof. Suppose for contradiction that Lemma 23 does not hold for some distinct $i, j, k$. Then $F(\{i, j\})$ and $F(\{i, k\})$ do not contain each other and there exist $S_1, S_2 \in F$ such that $S_1 \in F(\{i, j\}) \setminus F(\{i, k\})$ and $S_2 \in F(\{i, k\}) \setminus F(\{i, j\})$. Thus $i \in S_1 \cap S_2$, $j \in S_1 \setminus S_2$ and $k \in S_2 \setminus S_1$, contradicting that $F$ is a nested set. Hence Lemma 23 holds.
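To make the closure of Definition 20 and the comparison of pairs concrete, here is a small Python sketch. It assumes the closure-based reading of Definition 22 (two pairs are equivalent when their closures coincide, and one pair is smaller when its closure is strictly contained in the other); the function names `closure` and `compare` are ours.

```python
def closure(family, pair, n):
    """cl_F(p): smallest clade containing `pair`, or [n] if there is none."""
    pair = frozenset(pair)
    containing = [frozenset(s) for s in family if pair <= frozenset(s)]
    if not containing:
        return frozenset(range(1, n + 1))
    # In a nested set the clades containing a pair form a chain,
    # so the smallest one is also their intersection.
    return min(containing, key=len)

def compare(family, p, q, n):
    """Return '=', '<', '>' or None (incomparable) for pairs p, q under F."""
    cp, cq = closure(family, p, n), closure(family, q, n)
    if cp == cq:
        return '='
    if cp < cq:
        return '<'
    if cp > cq:
        return '>'
    return None

F = [{1, 2}, {3, 4, 5}, {4, 5}]
print(compare(F, (4, 5), (3, 4), 5))  # '<'
print(compare(F, (1, 3), (2, 3), 5))  # '='
print(compare(F, (1, 2), (3, 4), 5))  # None: the pairs share no leaf
```

The last call illustrates why comparability is only claimed for pairs that share an element: disjoint pairs may have disjoint closures.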
Given this framework to compare pairs of elements, we now give two notions of bipartitioning of trees that are closely related. It turns out, as we will see in Theorem 26, in the context of tree topologies, they coincide.
Definition 24. A rooted phylogenetic tree T is said to be binary if every vertex of T is either a leaf or trivalent.
Definition 25. Let $F$ be a tree topology on $[N]$ and $\bar{F} = \{S \in F \mid |S| \ge 3\} \cup \{[N]\}$. $F$ is said to be bifurcated if for every $S \in \bar{F}$, exactly one of the following holds: (a) there exists a proper subset $S' \subset S$ such that $S' \in F$ and $|S'| = |S| - 1$; or (b) there exist two proper subsets $S', S'' \subset S$ such that $S', S'' \in F$ and $S' \cup S'' = S$.
Note that in (b), we must have that $S' \cap S'' = \emptyset$.
A binary tree is a bifurcating tree that has exactly two descendants stemming from each interior node.
Theorem 26. Let F be a tree topology on [N ]. The following are equivalent: (1) F is full dimensional; (2) for 1 ≤ i < j < k ≤ N , two of the pairs {i, j}, {i, k}, {j, k} are = F , and the third pair is < F than the other two F -equivalent pairs; (3) F is bifurcated; (4) every phylogenetic tree with tree topology F is binary. Proof.
(3) ⇒ (2): Suppose $F$ is bifurcated and consider distinct elements $i, j, k \in [N]$. By Lemma 23, any two of $\{i, j\}, \{i, k\}, \{j, k\}$ are comparable with respect to $=_F$ or $<_F$. If the three pairs are all $=_F$, then by Definition 22, their closures in $F$ are equal. Let this closure be $S \in F \cup \{[N]\}$; then $i, j, k \in S$ and $|S| \ge 3$, thus $S \in \bar{F}$. Since $F$ is bifurcated, condition (a) or (b) in Definition 25 holds for $S$. If (a) holds, then there exists $S' \in F$ such that $S'$ is a proper subset of $S$ with $|S'| = |S| - 1$. In this case, at least two of $i, j, k$ belong to $S'$, and the closure of the pair formed by these elements is contained in $S'$, a contradiction. If (b) holds, then two of $i, j, k$ belong to the same subset $S'$ or $S''$, and the closure of the pair they form is contained in that subset, again a contradiction. Hence the three pairs are not all $F$-equivalent, and we may assume $\{i, k\} <_F \{i, j\}$. We now need to show $F$-equivalence between $\{i, j\}$ and $\{j, k\}$: by Definition 22, there exists $S_1 \in F$ such that $\{i, k\} \subseteq S_1$ but $\{i, j\} \not\subseteq S_1$. Then $i, k \in S_1$ and $j \notin S_1$. Now for any $S_2 \in F$, if $\{j, k\} \subseteq S_2$, then $S_2$ and $S_1$ are not disjoint. Since $j \in S_2 \setminus S_1$, we must have $S_1 \subseteq S_2$. Then $i \in S_2$, and $\{i, k\} \subseteq S_2$. Since $\{j, k\} \not\subseteq S_1$, by definition, $\{i, k\} <_F \{j, k\}$. In addition, if an element of $F$ is a superset of $\{i, j\}$, then it also contains $k$ and thus is also a superset of $\{j, k\}$. Conversely, being a superset of $\{j, k\}$ implies that it also contains $i$, and thus is also a superset of $\{i, j\}$. Therefore $\{i, j\} =_F \{j, k\}$, and (2) holds.
(2) ⇒ (3): Suppose (2) holds for $F$. We will show that $F$ is bifurcated: for any subset $S \in \bar{F}$, consider all maximal proper subsets $M_1, \ldots, M_m$ of $S$ that are clades in $F$. For any two such maximal subsets, since neither can be a subset of the other by definition, they must be disjoint. If $m \ge 3$, we can choose $i, j, k \in [N]$ from $M_1, M_2, M_3$ respectively. Then $\mathrm{cl}_F(\{i, j\}) = \mathrm{cl}_F(\{i, k\}) = \mathrm{cl}_F(\{j, k\}) = S$ and thus $\{i, j\} =_F \{i, k\} =_F \{j, k\}$, a contradiction. Therefore, $m$ must be either 1 or 2.
Suppose (1) ⇒ (3) holds for eligible integers less than N . Let F be a full-dimensional nested set on [N ], then |F | = N − 2. Consider the maximal elements S 1 , . . . , S k ∈ F , k ≥ 1, with respect to set inclusion. Then by the case (ii) in the proof of Lemma 18, |F | ≤ N − k. Hence k ≤ 2.
If $k = 2$, all equalities hold in (8), so there are two maximal elements $S_1, S_2 \in F$ with $S_1 \cap S_2 = \emptyset$ and $|S_1| + |S_2| = N$. So condition (b) holds for $[N]$. Let $F_i$ be the set of the proper subsets of $S_i$ that belong to $F$ for $i = 1, 2$. Then both $F_i$ are full-dimensional tree topologies on their respective ground sets $S_i$. By the induction hypothesis, both $F_i$ are bifurcated and all elements in $\bar{F}_i$ satisfy either condition (a) or (b). Note that every element of $\bar{F}$ other than $[N]$ belongs to $\bar{F}_1$ or $\bar{F}_2$; hence $F$ is also bifurcated, which completes the transition step.
(3) ⇒ (1): We proceed by induction on N . When N = 3 and F is bifurcated, then condition (a) holds for {1, 2, 3} and F only contains one 2-element subset, so F is full dimensional.
Suppose (3) ⇒ (1) holds for all eligible integers less than N . For any bifurcated nested set F on [N ], either condition (a) or (b) holds for the set [N ]. If (a) holds, then there exists S ∈ F such that |S| = N − 1. Then all elements in F \{S} are proper subsets of S and they thus form a nested set on the ground set S. This nested set is also bifurcated, so by the induction hypothesis it is full dimensional. So |F \{S}| = |S| − 2 = N − 3 and |F | = |F \{S}| + 1 = N − 2; hence F is full dimensional.
If condition (b) holds, then there exist disjoint S 1 , S 2 ∈ F such that S 1 ∪S 2 = [N ]. Let F i be the elements of F that are proper subsets of S i for i = 1, 2. Then F = F 1 ∪ F 2 ∪ {S 1 , S 2 }. Each F i is a nested set on the ground set S i (and may be empty when |S i | = 2); it is still bifurcated. By the induction hypothesis, F i is full dimensional and |F i | = |S i | − 2. Then |F | = (|S 1 | − 2) + (|S 2 | − 2) + 2 = N − 2, so F is still full dimensional. This completes the transition step.
(3) ⇒ (4): Suppose $F$ is bifurcated and a rooted phylogenetic tree $T$ has tree topology $F$. Let $v$ be a non-leaf node. It suffices to show that $v$ has degree 3. We consider two cases: (i) Suppose $v$ is not the root of $T$. Then there is a unique path from the root of $T$ to $v$. The last edge of this path is incident to $v$, and this edge corresponds to a clade $S$ in $F$. If $|S| = 2$, then the other edges incident to $v$ are the two external edges to the leaves in $S$. Otherwise, since $F$ is bifurcated, $S$ satisfies one of the conditions in Definition 25. If there exists a proper subset $S'$ such that $S'$ is also a clade of $F$ and $|S'| = |S| - 1$, then the other edges incident to $v$ are one edge connecting to the leaf in $S \setminus S'$ and one edge corresponding to $S'$. Otherwise there exist clades $S', S''$ of $F$ such that $S' \cup S'' = S$, and the other edges incident to $v$ are the two edges corresponding to $S'$ and $S''$. In either case, $v$ is trivalent.
(ii) Suppose $v$ is the root of $T$. Then $v$ is connected to the virtual leaf 0. Since $[N] \in \bar{F}$, $[N]$ satisfies one of the conditions in Definition 25 and $v$ is incident to two other edges in either case, so $v$ is also trivalent.
(4) ⇒ (3): Suppose a binary rooted tree $T$ has tree topology $F$. For any clade $S \in \bar{F}$, $S$ corresponds to an edge $e$ of $T$. Let $v$ be the vertex of $e$ with greater distance to the root of $T$. Since $T$ is binary, $v$ is trivalent and is incident to two other edges $e'$ and $e''$. Each leaf in $S$ has a unique path to $v$, which must contain $e'$ or $e''$. This yields a partition of $S$ into two nonempty subsets $S'$ and $S''$. If both subsets have cardinality at least 2, then they are both clades of $F$, and we have $S' \cup S'' = S$. Otherwise one of them is a singleton, say $S''$, and thus $S'$ is a clade of $F$ with $|S'| = |S| - 1$. Thus $S$ satisfies one of the conditions in Definition 25. For $[N] \in \bar{F}$, since $T$ is binary, the root of $T$ is also incident to two edges other than the edge to the virtual leaf. The reasoning above applies to $[N]$, and $F$ is bifurcated.
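The equivalence of (1) and (3) in Theorem 26 can be sanity-checked by brute force for small $N$. The sketch below is ours and is not part of the paper: it enumerates every nested family on $[N]$ for $N = 4, 5$, and compares the bifurcation test of Definition 25 with the cardinality condition $|F| = N - 2$ used for full dimensionality in the proof above. All function names are hypothetical.

```python
from itertools import combinations

def is_nested(family):
    """Any two distinct clades are strictly nested or disjoint."""
    return all(a < b or b < a or not (a & b)
               for a, b in combinations(family, 2))

def is_bifurcated(family, n):
    """Definition 25: each S in F-bar satisfies exactly one of (a), (b)."""
    fam = {frozenset(s) for s in family}
    ground = frozenset(range(1, n + 1))
    bar_f = {s for s in fam if len(s) >= 3} | {ground}
    for s in bar_f:
        inside = [t for t in fam if t < s]  # clades properly inside s
        cond_a = any(len(t) == len(s) - 1 for t in inside)
        cond_b = any(a | b == s for a, b in combinations(inside, 2))
        if cond_a == cond_b:  # neither holds, or both hold
            return False
    return True

def all_topologies(n):
    """Yield every nested family on [n]; sizes are at most n - 2 (Lemma 18)."""
    ground = range(1, n + 1)
    clades = [frozenset(c) for k in range(2, n)
              for c in combinations(ground, k)]
    for size in range(0, n - 1):
        for fam in combinations(clades, size):
            if is_nested(fam):
                yield fam

for n in (4, 5):
    agree = all(is_bifurcated(f, n) == (len(f) == n - 2)
                for f in all_topologies(n))
    print(n, agree)
```

Enumeration is only feasible for very small $N$, since the number of candidate families grows quickly; the check is meant as an illustration of the statement, not as a substitute for the proof.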
These concepts of bipartitioning in terms of tree topologies are important in understanding the combinatorial aspects of tree ultrametrics. In addition to the fact that tree ultrametrics are equidistant trees, it is also true that every point along any tropical line segment between two equidistant trees is also itself an equidistant tree. The equivalence relation = F and partial order < F completely define the set of all ultrametrics for a given tree topology F , which we present and formalize in the following results.
Proposition 27. Given a tree topology $F$ on $[N]$, let $ut(F)$ be the set of all ultrametrics in $U_N$ corresponding to a tree with tree topology $F$. Then

$$ut(F) = \{ w \in \mathbb{R}^n/\mathbb{R}\mathbf{1} \;:\; w_p = w_q \text{ if } p =_F q, \text{ and } w_p < w_q \text{ if } p <_F q \}. \tag{9}$$

Proof. Fix a tree topology $F$ on $[N]$ and a corresponding equidistant tree $T$. For each internal edge of $T$ indexed by a clade $S$, let $\ell(S) > 0$ be its length. For the external edge of $T$ incident to the $i$th leaf, let $\ell_i$ be its length. Let $h$ be the height of $T$. Then, by definition, the distance of each leaf to the root of $T$ is $h$. There exists a unique path from the $i$th leaf to the root, consisting of the external edge incident to the $i$th leaf and some internal edges: an internal edge indexed by $S$ appears on this path if and only if the $i$th leaf and the root are separated by the internal edge itself. This necessarily means that $i \in S$. Hence

$$h = \ell_i + \sum_{S \in F,\; i \in S} \ell(S). \tag{10}$$

Next we consider the path connecting the $i$th and $j$th leaves. This path consists of the two external edges and some internal edges. An internal edge indexed by $S$ appears on this path if and only if the $i$th leaf and the $j$th leaf are separated by the edge. Equivalently, this means that $|\{i, j\} \cap S| = 1$. Hence, using that $h = \ell_i + \sum_{S \in F,\, i \in S} \ell(S)$ for every leaf $i$,

$$w_{\{i,j\}} = \ell_i + \ell_j + \sum_{S \in F,\; |\{i,j\} \cap S| = 1} \ell(S) = 2h - 2 \sum_{S \in F,\; \{i,j\} \subseteq S} \ell(S).$$

Since every $\ell(S)$ is positive, $w_p$ depends only on the set of clades containing $p$ and strictly decreases as this set grows. Therefore $w$ satisfies the condition defining the set (9).

Conversely, suppose a vector $w \in \mathbb{R}^n/\mathbb{R}\mathbf{1}$ satisfies the conditions defining the set (9). Then the system of linear equations

$$w_p = 2h - 2 \sum_{S \in F,\; p \subseteq S} x_S, \qquad p \in p_N,$$

has a solution such that $h \in \mathbb{R}$ and $x_S > 0$ for all $S \in F$. By a translation in $\mathbb{R}$, we may choose a sufficiently large $h$ such that all $\ell_i$ in (10) are positive. Then $w$ is the ultrametric of an equidistant tree with external edge lengths $\ell_i$ and internal edge lengths $x_S$, whose tree topology is $F$.
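The path-length bookkeeping in this proof can be sketched numerically. In the Python sketch below, the pairwise distance is computed as twice the height minus twice the total length of the internal edges whose clades contain both leaves. The height 20 and the internal edge lengths 12, 10 and 5 are our own back-computation, chosen so that the result matches the ultrametric quoted in Example 17; they are not taken from the paper, and the function names are hypothetical.

```python
from itertools import combinations

def ultrametric_from_tree(n, height, internal_lengths):
    """Pairwise leaf distances of an equidistant tree.

    `internal_lengths` maps each clade (a frozenset) to its edge length.
    """
    w = {}
    for i, j in combinations(range(1, n + 1), 2):
        shared = sum(length for clade, length in internal_lengths.items()
                     if i in clade and j in clade)
        w[(i, j)] = 2 * height - 2 * shared
    return w

def external_lengths(n, height, internal_lengths):
    """External edge lengths, i.e. the height minus the internal edges above each leaf."""
    return {i: height - sum(length for clade, length in internal_lengths.items()
                            if i in clade)
            for i in range(1, n + 1)}

# Assumed values (ours), with leaves A..E relabelled as 1..5.
lengths = {frozenset({1, 2}): 12, frozenset({3, 4, 5}): 10, frozenset({4, 5}): 5}
w = ultrametric_from_tree(5, 20, lengths)
print(list(w.values()))  # [16, 40, 40, 40, 40, 40, 40, 20, 20, 10]
print(external_lengths(5, 20, lengths))  # {1: 8, 2: 8, 3: 10, 4: 5, 5: 5}
```

All external lengths come out positive, which is the role played by choosing the height large enough in the converse direction of the proof.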
Corollary 28. Let F be a tree topology on [N ] and w ∈ ut(F ). Then for any pairs p, q ∈ p N , if p ∩ q ≠ ∅, then w p = w q implies p = F q and w p < w q implies p < F q.
Proof. If p = q, then w p = w q holds and p = F q also holds. Otherwise we may assume p = {i, j} and q = {i, k}. By Lemma 23, one of the following relationships holds: p < F q, or p = F q, or q < F p. By Proposition 27, these three relationships imply w p < w q , or w p = w q , or w p > w q respectively. Hence the converse implications also hold.
So far, we have introduced means to study tree structures by studying subsets of leaves and iteratively dividing and comparing these subsets. We have also determined the set of tree ultrametrics defined by the comparison framework set up in Definition 22. Given this framework, we now determine when we have compatibility of sets in order to present our combinatorial result that gives all possible tree topologies along a tropical line segment.
We note that the geometric and combinatorial procedure we present is a natural approach that has been previously implemented in other tree settings and, more generally, in finite metric spaces, e.g., by Bandelt and Dress (1992); Dress (1984). Our approach differs in two important ways: firstly, we consider the infinite space of all sets of phylogenetic trees, and secondly, our study is fundamentally tropical, since we use the framework of tropical line segments.
Definition 29. Let F 1 and F 2 be tree topologies on [N ]. Define the set of compatible tree topologies of F 1 , F 2 to consist of tree topologies F where there exist tree ultrametrics w 1 ∈ ut(F 1 ) and w 2 ∈ ut(F 2 ), such that the tree ultrametric w 1 ⊕ w 2 ∈ ut(F ). We denote this set by C(F 1 , F 2 ).
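As an illustration of this definition, the following Python sketch takes two ultrametrics on four leaves, forms their tropical sum (the coordinate-wise maximum), and reads off the tree topology of the result. The two input vectors are our own hypothetical examples, and the helper names `topology` and `tropical_sum` are ours.

```python
def topology(w, n):
    """Clades of the equidistant tree whose ultrametric is `w`.

    `w` is a dict keyed by pairs (i, j) with i < j.
    """
    def dist(i, j):
        return 0 if i == j else w[(min(i, j), max(i, j))]

    top = max(w.values())
    clades = set()
    for (i, j), value in w.items():
        if value < top:
            # Leaves within distance w_ij of leaf i form the smallest
            # clade containing both i and j.
            clades.add(frozenset(k for k in range(1, n + 1)
                                 if dist(i, k) <= value))
    return clades

def tropical_sum(w1, w2):
    """Coordinate-wise maximum of two vectors indexed by the same pairs."""
    return {p: max(w1[p], w2[p]) for p in w1}

# Hypothetical ultrametrics with topologies {12, 123} and {12, 124}.
w1 = {(1, 2): 1, (1, 3): 2, (1, 4): 3, (2, 3): 2, (2, 4): 3, (3, 4): 3}
w2 = {(1, 2): 1, (1, 3): 3, (1, 4): 2, (2, 3): 3, (2, 4): 2, (3, 4): 3}

print(topology(w1, 4) == {frozenset({1, 2}), frozenset({1, 2, 3})})  # True
print(topology(tropical_sum(w1, w2), 4) == {frozenset({1, 2})})      # True
```

With this particular pair of witnesses the tropical sum has the single clade $\{1, 2\}$, so that (non-full-dimensional) topology is compatible with the two input topologies in the sense of the definition.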
Theorem 30. Let F 1 , F 2 , F be full-dimensional tree topologies on [N ]. If F ∈ C(F 1 , F 2 ), then any equivalence class C ⊆ p N with respect to = F is contained in an equivalence class with respect to either = F1 or = F2 .
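Put differently, for each $S \in \bar{F}$ (with $\bar{F}$ as in Definition 25), there exists either $S_1 \in F_1$ such that for $p \in p_N$, if $\mathrm{cl}_F(p) = S$, then $\mathrm{cl}_{F_1}(p) = S_1$; or $S_2 \in F_2$ such that for $p \in p_N$, if $\mathrm{cl}_F(p) = S$, then $\mathrm{cl}_{F_2}(p) = S_2$.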
Proof. Suppose $F \in C(F_1, F_2)$. There exist ultrametrics $w^1, w^2, w$ such that $w_p = \max(w^1_p, w^2_p)$ for $p \in p_N$, with $w^i \in ut(F_i)$ for $i = 1, 2$ and $w \in ut(F)$. Choose $S \in \bar{F} = \{S \in F \mid |S| \ge 3\} \cup \{[N]\}$. By Theorem 26, $F$ is bifurcated, so condition (a) or (b) of Definition 25 holds for $S$. If (a) holds, we set $X = S'$ and $Y = S \setminus S'$; if (b) holds, we set $X = S'$ and $Y = S''$. Then $\mathrm{cl}_F(\{i, j\}) = S$ for all $i \in X$ and $j \in Y$, so by Proposition 27 there is a constant $M$ such that

$$\max(w^1_{\{i,j\}}, w^2_{\{i,j\}}) = w_{\{i,j\}} = M \tag{11}$$

for all $i \in X$ and $j \in Y$. Consider a complete bipartite graph $G := K_{|X|,|Y|}$ with vertices $X \cup Y$. Recall that the vertices of a bipartite graph can be partitioned into two disjoint, independent sets where every graph edge connects a vertex in one set to one in the other. For $(i, j) \in X \times Y$, if $w^1_{\{i,j\}} = M$, then we call the edge $(i, j)$ of $G$ pink; if $w^2_{\{i,j\}} = M$, then we call the edge $(i, j)$ of $G$ purple. Then, each edge in $G$ is pink, purple, or both pink and purple. We claim that in fact, either all edges of $G$ are pink, or all edges of $G$ are purple.
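Suppose there exists a non-pink edge $(i, j) \in X \times Y$ in $G$. Then this edge is purple: $M = w^2_{\{i,j\}} > w^1_{\{i,j\}}$. Let $j' \in Y$ with $j' \ne j$, and suppose that $(i, j')$ is not purple. By (11), $w^1_{\{i,j'\}} = M > w^2_{\{i,j'\}}$. Since $w^1$ and $w^2$ are ultrametrics, the maximum over the three pairs formed by $i, j, j'$ is attained at least twice in each of them, so $w^1_{\{j,j'\}} = w^2_{\{j,j'\}} = M$ and $w_{\{j,j'\}} = M$. By Corollary 28, this means that $\{i, j\} =_F \{i, j'\} =_F \{j, j'\}$, which contradicts that $F$ is full dimensional. Therefore $(i, j')$ must be purple. Symmetrically, for any $i' \in X$ with $i' \ne i$, $(i', j)$ must be purple. Then for such $i'$ and $j'$, we have $w^2_{\{j,j'\}} \le w_{\{j,j'\}} < M = w^2_{\{i',j\}}$, because $j, j' \in Y$ gives $\{j, j'\} <_F \{i', j\}$; the ultrametric condition on $w^2$ for $i', j, j'$ then forces $w^2_{\{i',j'\}} = M$, so $(i', j')$ is purple as well. Hence all edges of $G$ are purple.

Finally, if all edges in $G$ are pink, then all $w^1_{\{i,j\}}$ are equal for $(i, j) \in X \times Y$. By Corollary 28, all these $\{i, j\}$ belong to the same equivalence class with respect to $=_{F_1}$; symmetrically, if all edges in $G$ are purple, then all $w^2_{\{i,j\}}$ are equal for $(i, j) \in X \times Y$ and all these $\{i, j\}$ belong to the same equivalence class with respect to $=_{F_2}$.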
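The necessary condition of Theorem 30 is easy to test on small examples. The Python sketch below groups pairs by their closure to obtain equivalence classes, then checks whether each class of the candidate topology sits inside a class of one of the two given topologies. The three topologies on four leaves are our own hypothetical choices, and the function names are ours; passing the test does not by itself show compatibility, since the theorem only gives a necessary condition.

```python
from itertools import combinations

def closure(family, pair, n):
    """Smallest clade containing `pair`, or [n] if no clade contains it."""
    pair = frozenset(pair)
    containing = [frozenset(s) for s in family if pair <= frozenset(s)]
    if not containing:
        return frozenset(range(1, n + 1))
    return min(containing, key=len)

def classes(family, n):
    """Equivalence classes of pairs, grouped by their closure."""
    groups = {}
    for pair in combinations(range(1, n + 1), 2):
        groups.setdefault(closure(family, pair, n), set()).add(pair)
    return list(groups.values())

def passes_theorem_30(f, f1, f2, n):
    """Each class of f lies inside some class of f1 or some class of f2."""
    c1, c2 = classes(f1, n), classes(f2, n)
    return all(any(c <= d for d in c1) or any(c <= d for d in c2)
               for c in classes(f, n))

# Hypothetical full-dimensional topologies on [4].
F1 = [{1, 2}, {1, 2, 3}]
F2 = [{1, 2}, {1, 2, 4}]
F = [{1, 2}, {3, 4}]

print(passes_theorem_30(F1, F1, F2, 4))  # True
print(passes_theorem_30(F, F1, F2, 4))   # False
```

Here the class of pairs with closure $[4]$ under the third topology is $\{13, 14, 23, 24\}$, which is contained in no single class of either given topology, so that topology is excluded.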
Example 33. Let N = 5 and choose the following two full-dimensional tree topologies on [5]: Then the full-dimensional tree topologies in C(F 1 , F 2 ) are F 1 , F 2 themselves, and three others: does not belong to C(F 1 , F 2 ), because {1, 2} and {1, 3} form an equivalence class with respect to = F , but {1, 3} < Fi {1, 2} for i = 1, 2. This gives an example of the non-existence of certain types of trees.

Symmetry on Tropical Line Segments
In contrast to the previous result, which dealt with how tree topologies vary along tropical line segments, we now turn our focus to understanding when and how invariance arises in the space of ultrametrics. To do this, we define the notion of symmetry on ultrametrics in terms of leaf relabeling. The natural setting for such a study is the action of the symmetric group Sym(N ) on the N leaf labels in [N ], given by permuting the coordinates (positions) of the labels of the leaves. In our study, we consider the map Σ : U N × Sym(N ) → U N , (w, σ) ↦ Σ(w, σ), where w is an ultrametric and σ ∈ Sym(N ) is a permutation of the N leaf labels.
Definition 35. Let $T$ be an equidistant tree with $N$ leaves and let $w_T \in U_N$ be a tree metric associated with $T$. Define the equivalence relation $\sim_\sigma$ between equidistant trees $T$ and $T'$ with $N$ leaves by $T \sim_\sigma T'$ if and only if $T$ and $T'$ have the same tree topology and branch lengths, but the leaf labels of $T'$ are those of $T$ permuted by $\sigma \in \mathrm{Sym}(N)$.

Example 36. Let $T_1$ and $T_2$ be the equidistant trees shown in Figure 4. Here, $T_1 \sim_\sigma T_2$, where $\sigma = (2, 3, 1, 4)$.
Theorem 40. Suppose $T_0$ and $T_0'$ are equidistant trees with $N$ leaves such that $T_0 \sim_\sigma T_0'$. Also, suppose $T$ and $T'$ are equidistant trees with $N$ leaves such that $T \sim_\sigma T'$. Then

$$\Sigma(\Gamma(w_T, w_{T_0}), \sigma) = \Gamma(w_{T'}, w_{T_0'}).$$

Proof. Since $T_0 \sim_\sigma T_0'$ and $T \sim_\sigma T'$, we have that the differences $w_T - w_{T_0}$ and $w_{T'} - w_{T_0'}$ are equal after ordering the coordinates of $w_T - w_{T_0}$ and $w_{T'} - w_{T_0'}$ from smallest to largest. The remainder of the proof follows the proof of Proposition 39.
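The relabeling action and its compatibility with tropical combinations can be illustrated with a short Python sketch. The convention used here, that the entry for the pair $\{i, j\}$ is moved to the pair $\{\sigma(i), \sigma(j)\}$ with $\sigma$ given in one-line notation, is an assumption on our part; the function names and the two sample ultrametrics are likewise ours.

```python
def relabel(w, sigma):
    """Permute leaf labels: entry {i, j} moves to {sigma(i), sigma(j)}.

    `sigma` is a tuple in one-line notation, so sigma[i - 1] is the image of i.
    """
    out = {}
    for (i, j), value in w.items():
        a, b = sigma[i - 1], sigma[j - 1]
        out[(min(a, b), max(a, b))] = value
    return out

def tropical_combination(a, u, b, v):
    """The point (a ⊙ u) ⊕ (b ⊙ v), computed coordinate-wise."""
    return {p: max(a + u[p], b + v[p]) for p in u}

sigma = (2, 3, 1, 4)  # the permutation quoted in Example 36
u = {(1, 2): 1, (1, 3): 2, (1, 4): 3, (2, 3): 2, (2, 4): 3, (3, 4): 3}
v = {(1, 2): 3, (1, 3): 3, (1, 4): 3, (2, 3): 2, (2, 4): 2, (3, 4): 1}

lhs = relabel(tropical_combination(0, u, 1, v), sigma)
rhs = tropical_combination(0, relabel(u, sigma), 1, relabel(v, sigma))
print(lhs == rhs)  # True
```

Because relabeling only permutes coordinates, it commutes with coordinate-wise maxima and scalar shifts, which is the mechanism behind the invariance of tropical line segments under a common relabeling of both endpoints.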
Remark 41. Haar measures are a natural generalization of Lebesgue measures on spaces with a specified group structure; they are relevant to fundamental studies in probability theory. By our specification in Definition 35 and Theorem 40, and the existence of probability measures on the tropical geometric interpretation of tree space, there exist Haar measures: the group structure is given by the symmetric group Sym(N ) and the invariance is on tropical line segments between ultrametrics.
Corollary 42. For $T_0$ invariant under $\sigma$, let $T$ and $T'$ be equidistant trees with $N$ leaves such that $T \sim_\sigma T'$. Then $\Sigma(\Gamma(w_T, w_{T_0}), \sigma) = \Gamma(w_{T'}, w_{T_0})$ and $\Sigma(\Gamma(w_{T'}, w_{T_0}), \sigma^{-1}) = \Gamma(w_T, w_{T_0})$.

Example 43. Let $T_1, T_2$ be equidistant trees in Figure 5. Here, $\sigma = \sigma^{-1} = (4, 3, 2, 1)$. Notice that $\Sigma(\Gamma(w^1, w^0), \sigma) = \Gamma(w^2, w^0)$ and $\Sigma(\Gamma(w^2, w^0), \sigma^{-1}) = \Gamma(w^1, w^0)$.

In this figure, the black points represent the trees $T_1$ and $T_2$. These trees have the same tree topology and the same branch lengths, but their leaves are labeled differently. In BHV space, $T_1$ and $T_2$ are distinct trees, and thus belong to different orthants in this example, since there are only two internal edges in these trees: recall that in BHV space, within each orthant, trees are stored by their internal edge lengths, which represent coordinates. The BHV geodesic between $T_1$ and $T_2$ is then the cone path, traversing the origin, illustrated by the red line. (See Figure 1 for the configuration of BHV$_5$ and to see that $T_1$ and $T_2$ are not in neighboring orthants.) We have $T_1 \sim_\sigma T_2$, and the tropical line segment is illustrated by the blue line. See the details in Example 45 for an explicit computation of the tropical line segment between these trees. The orange points represent the three remaining trees in the figure: notice that in these trees, there is only one internal edge length, and thus the orange points lie on the 1-dimensional strata partitioning the quadrants. The line segments connecting the points traverse the quadrants, and at every point along the blue lines, there are two internal edge lengths until the subsequent orange point is reached, where one internal edge contracts completely into one internal node.
Notice in this example that the bending points of the tropical line segment (the orange points) exhibit interesting tree topologies, including one, (2, 2, 2, 0.8, 2, 2), that exhibits sister taxa. Recall from Section 2.4 above that orthant boundaries in BHV space correspond to trees with a collapsed internal edge between two orthants with similar tree topologies; this tree then lies on a BHV orthant boundary. Depending on the conventions adopted, however, it may not always be a tree topology; if tree topologies are required to be full dimensional, then this tree is not a valid tree topology in BHV space.

Computing Tropical Line Segments
We now focus our study on the tropical line segment and give an algorithm for its computation. We begin with a remark from Example 43, where we see that the tropical line segment is not a cone path; it does not traverse the origin (or star tree), whereas BHV geodesics are often cone paths when orthants are far apart. This observation leads to the following proposition, which uncovers a more subtle behavior of tropical line segments compared to BHV geodesics.
Proposition 44. For two points u and v in general position in the tropical projective torus R n /R1, the tropical line segment between u and v does not contain the origin.
Proof due to Carlos Améndola. Let u = (u 1 , . . . , u n ) and v = (v 1 , . . . , v n ) be two points in general position in R n /R1. The tropical line segment consists of the concatenation of m ordinary line segments, each one having a direction of a zero-one vector (Maclagan and Sturmfels, 2015, Proposition 5.2.5). The bending points (i.e., the points at which the segments are concatenated) can be computed explicitly via the entries of the difference vector λ = v − u. Up to reordering, they are given by (λ i ⊙ u) ⊕ v for i = 1, . . . , n (which include the endpoints u and v). From these expressions we see that in order for 0 to be contained in one of the ordinary line segments that comprise the tropical segment from u to v, the bending points would need to be in a particular arrangement (one must be a scalar multiple of the other), which does not happen because u and v are in general position.
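The bending-point formula in this proof translates directly into code. The Python sketch below computes the points $(\lambda_i \odot u) \oplus v$ for the sorted entries of $\lambda = v - u$, normalises each representative so that its first coordinate is 0, and tests whether the origin lies on any of the ordinary segments between consecutive bending points. The input vectors are our own hypothetical example and the function names are ours.

```python
def bending_points(u, v):
    """Bending points of the tropical segment, normalised to first coordinate 0."""
    lambdas = sorted(set(vi - ui for ui, vi in zip(u, v)))
    points = []
    for lam in lambdas:
        p = [max(lam + ui, vi) for ui, vi in zip(u, v)]
        points.append(tuple(x - p[0] for x in p))
    return points

def origin_on_segment(p, q, tol=1e-12):
    """True if 0 = p + t (q - p) for some t in [0, 1]."""
    d = [qi - pi for pi, qi in zip(p, q)]
    dd = sum(x * x for x in d)
    if dd == 0:
        return all(abs(x) <= tol for x in p)
    t = -sum(pi * di for pi, di in zip(p, d)) / dd
    if not 0 <= t <= 1:
        return False
    return all(abs(pi + t * di) <= tol for pi, di in zip(p, d))

def origin_on_path(points):
    """Check every ordinary segment between consecutive bending points."""
    return any(origin_on_segment(p, q) for p, q in zip(points, points[1:]))

u, v = (0, 2, 5), (0, 4, 1)
pts = bending_points(u, v)
print(pts)                  # [(0, 4, 1), (0, 4, 5), (0, 2, 5)]
print(origin_on_path(pts))  # False
```

The smallest entry of $\lambda$ returns $v$ and the largest returns a representative of $u$, so the list runs from $v$ to $u$; each consecutive difference is a multiple of a zero-one vector after normalisation.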
Notice that Proposition 44 also follows from the fact that there is a unique tropical line segment passing through two points in general position. We may assume without loss of generality that one of the points is 0 and set the other to be u; let L be the unique tropical line segment passing through 0 and u. Then for any v ∈ L, there is no tropical line segment containing u, v, and 0. Hence, the tropical line segment between u and v cannot contain 0.
We now give an algorithm for computing tropical line segments between equidistant trees with $N$ leaves. The algorithm takes as input two ultrametrics: $w^1 = (w^1_{\{1,2\}}, \ldots, w^1_{\{N-1,N\}})$ associated with an equidistant tree $T_1$ with $N$ leaves; and $w^2 = (w^2_{\{1,2\}}, \ldots, w^2_{\{N-1,N\}})$ associated with an equidistant tree $T_2$ with $N$ leaves. It returns the tropical line segment between $T_1$ and $T_2$ in $U_N$.
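For orientation, the input and output of such a computation can be sketched with the generic bending-point formula for tropical line segments (Maclagan and Sturmfels, 2015, Proposition 5.2.5). The Python sketch below is not the authors' algorithm: it simply evaluates that formula on two hypothetical ultrametrics on four leaves, listed in lexicographic order of pairs, and checks that every bending point satisfies the three-point condition and is not a star tree. All names are ours.

```python
from itertools import combinations

def is_ultrametric(w, n):
    """Three-point condition: the maximum over each triple is attained twice."""
    idx = {p: k for k, p in enumerate(combinations(range(1, n + 1), 2))}
    for i, j, k in combinations(range(1, n + 1), 3):
        _, b, c = sorted((w[idx[(i, j)]], w[idx[(i, k)]], w[idx[(j, k)]]))
        if b != c:
            return False
    return True

def segment_vertices(w1, w2):
    """Bending points (lambda ⊙ w1) ⊕ w2, ordered from w1 to w2."""
    lambdas = sorted({b - a for a, b in zip(w1, w2)}, reverse=True)
    return [tuple(max(lam + a, b) for a, b in zip(w1, w2)) for lam in lambdas]

# Hypothetical inputs: coordinates ordered as 12, 13, 14, 23, 24, 34.
w1 = (1, 2, 3, 2, 3, 3)  # topology {12, 123}
w2 = (3, 3, 3, 2, 2, 1)  # topology {34, 234}

vertices = segment_vertices(w1, w2)
for point in vertices:
    print(point, is_ultrametric(point, 4), len(set(point)) > 1)
```

The first vertex is a translate of the first input (the same point of the quotient by the all-ones vector) and the last vertex is the second input. In this example every intermediate vertex is an ultrametric with at least two distinct coordinate values, so none of them is the star tree.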
We remark here that there is a similarity in approach between our algorithm and various results by Bernstein (2020) that combinatorially compute ultrametrics. Theorem 3.6 in Bernstein (2020) gives a combinatorial description of a finite set of ultrametrics with a tropical convex hull that itself is a set of ultrametrics; it begins with an ultrametric that is coordinate-wise larger than a vertex of a particular tropical polytope and also "slides" internal nodes down until another candidate vertex is attained, which, however, may not necessarily be a tropical vertex (Yu, 2020). In our algorithm, various tree topologies are obtained at intermediate steps also by "sliding" internal nodes.

Discussion
In this work, we considered the space of phylogenetic trees in the context of tropical geometry as an alternative framework to BHV space. The construction of BHV space is based on tree topologies, where a Euclidean orthant is assigned to each tree topology. In our paper, we study tree topologies and their occurrence in the tropical geometric phylogenetic tree space. In particular, we use the tropical line segment as a framework for our study. For any two given trees, we compute the tropical line segment between them and prove a combinatorial theorem that describes all tree topologies that occur on this tropical line segment between the two trees. We also provide a notion of invariance on a tropical line segment, based on a permutation of leaf labels on trees; this construction has implications in tropical geometric applications to probability theory, since it provides the existence of Haar measures on phylogenetic tree space. We also give an algorithm to compute the tropical line segment between any two trees, which has a lower computational complexity than the current state-of-the-art for computing geodesics in BHV space. We also show that the tropical line segment does not pass through the origin, whereas in BHV space, for orthants that are far apart from one another, the geodesic is often a cone path. This implies a more subtle and intricate geometry than that of BHV space, which is an interesting avenue for further study.
Our work lays foundations for future studies in both theoretical and applied directions. The behavior of the tropical line segment inspires further questions concerning the geometry of tropical geometric phylogenetic tree space. One example is the study of the curvature of the space; given that general geodesics are not unique in palm tree space (i.e., the tropical geometric tree space endowed with the tropical metric), it differs from the CAT(0) geometry of BHV space. The variation of tree topologies in tropical geometric phylogenetic tree space may also be used for statistical studies. For example, when different tree topologies arise via random data-generating processes, an interesting question is whether the difference in topologies is due to a difference in distributions. Palm tree space is a well-defined probability space, so this question may be posed in terms of a statistical hypothesis test. A natural way to measure differences between two objects is via a metric; Colijn and Plazzotta (2017) propose a metric on phylogenetic tree shapes, which may be used to define a test statistic.