Ranked Subtree Prune and Regraft

Phylogenetic trees are a mathematical formalisation of evolutionary histories between organisms, species, genes, cancer cells, etc. For many applications, e.g. when analysing virus transmission trees or cancer evolution, (phylogenetic) time trees are of interest, where branch lengths represent times. Computational methods for reconstructing time trees from (typically molecular) sequence data, for example Bayesian phylogenetic inference using Markov Chain Monte Carlo (MCMC) methods, rely on algorithms that sample the treespace. They employ tree rearrangement operations such as SPR (Subtree Prune and Regraft) and NNI (Nearest Neighbour Interchange) or, in the case of time tree inference, versions of these that take times of internal nodes into account. While the classic SPR tree rearrangement is well-studied, its variants for time trees are less understood, limiting comparative analysis for time tree methods.
In this paper we consider a modification of the classical SPR rearrangement on the space of ranked phylogenetic trees, which are trees equipped with a ranking of all internal nodes. This modification results in two novel treespaces, which we propose to study. We begin this study by discussing algorithmic properties of these treespaces, focusing on those relating to the complexity of computing distances under the ranked SPR operations as well as similarities and differences to known tree rearrangement based treespaces. Surprisingly, we show the counterintuitive result that adding leaves to trees can actually decrease their ranked SPR distance, which may have an impact on the results of time tree sampling algorithms given uncertain “rogue taxa”.


Introduction
Phylogenetic trees are used to display evolutionary relationships, for example between organisms, species, or genes, and are usually inferred from DNA, RNA, amino acid, or other types of sequence data. Typical applications include reconstructing the evolutionary history of a set of species, e.g. to construct the tree of life (Hug 2016), analysing transmission patterns of viruses (Rambaut et al. 2001; Turakhia 2022), and investigating cancer evolution (Alves et al. 2019). The goal is the same in all applications: finding a phylogenetic tree that best explains the evolutionary relationships between the given (sequence) data, represented by the leaves of the tree.
Many methods for inferring phylogenetic trees from sequence data use tree sampling or treespace search algorithms. These include Bayesian methods, for example the software packages BEAST (Drummond and Rambaut 2007), BEAST2 (Bouckaert et al. 2014), MrBayes (Huelsenbeck and Ronquist 2001), and RevBayes (Höhna et al. 2016), and maximum likelihood methods like IQ-TREE (Nguyen et al. 2015), PhyML (Guindon et al. 2010), or RAxML (Stamatakis 2014). All these methods rely on tree proposals: for a given tree, a new tree similar to the current tree is proposed and is accepted if it fulfils certain conditions. If accepted, the current tree is updated to be this new tree and the procedure is repeated. Tree rearrangement operations, which apply local changes to a tree, are commonly used for these tree proposals. Most popular are the Subtree Prune and Regraft (SPR) and Nearest Neighbour Interchange (NNI) tree rearrangements, the latter being a version of SPR restricted to being more local.
An SPR operation (or move) cuts an edge of a phylogenetic tree and reattaches the thereby detached subtree at a different position in the tree. This tree rearrangement (or a version of it) is implemented in many tree inference software packages, including maximum likelihood methods (Guindon et al. 2010; Stamatakis 2014; Price et al. 2010), Bayesian inference methods (Drummond and Rambaut 2007; Höhna et al. 2016; Ogilvie et al. 2017) and also recent parsimony-based methods that are able to infer large-scale phylogenies (Ye et al. 2022). One reason for the popularity of SPR moves in tree search algorithms is that, unlike very local NNI moves, SPR moves can be used to jump across wider regions of treespace, preventing tree search algorithms from getting stuck in local optima (Guindon et al. 2010; Nguyen et al. 2015). Similarly, it has been shown for Bayesian inference methods that SPR moves, or modifications thereof, can speed up convergence of Markov chains and improve their mixing when used in combination with other operators (Höhna et al. 2008). There is, however, a major obstacle to interpreting how well treespace is traversed when using SPR moves for tree search and tree sampling algorithms: computing the SPR distance, i.e. the minimum number of SPR moves required to transform one tree into another, is NP-hard (Hickey et al. 2008; Bordewich and Semple 2005). Despite this NP-hardness result, fixed-parameter tractable algorithms for computing SPR distances exist (Whidden et al. 2013), which can be used for analysing modes of posterior distributions (Whidden and Matsen 2015) or for computing supertrees (Whidden et al. 2013; Whidden and Matsen 2019). Understanding properties of the SPR treespace, which can be viewed as a graph where vertices represent trees that are connected by edges if the trees are connected by an SPR operation, has proven useful for getting a better understanding of phylogenetic inference methods (Whidden and Matsen 2019).
Analysing the geometry of the classic SPR treespace has led to numerous results for SPR on rooted and unrooted trees (Whidden and Matsen 2017, 2019). Although SPR operations are defined in the same way for both types of trees and the two resulting treespaces share most of their properties, the techniques to prove those properties differ significantly. NP-hardness of the problem of computing distances, for example, was first shown for the rooted SPR (Bordewich and Semple 2005) and only three years later for the unrooted case (Hickey et al. 2008). An important mathematical structure for both of these proofs is the Maximum Agreement Forest (MAF). A MAF for two trees results from deleting a minimum number of edges in each tree so that the two resulting graphs (forests) are isomorphic. The number of connected components of a MAF for two rooted trees coincides with their rooted SPR distance (Bordewich and Semple 2005). For unrooted trees, however, this relationship between the SPR distance and the number of connected components of the MAF does not hold (Allen and Steel 2001; Whidden and Matsen 2019), which was the reason some erroneous proofs of NP-hardness made it into the literature. It has also been found that the maximum distance under both rooted and unrooted SPR is linear in the number of leaves (Ding et al. 2011; Atkins and McDiarmid 2019), while the number of trees in the 1-neighbourhood, i.e. the number of trees resulting from one SPR move, is quadratic in the number of leaves (Song 2003; Allen and Steel 2001). The latter is an important property that has been used to determine the curvature of the treespace by Whidden and Matsen (2017) in order to analyse mixing properties of Markov Chain Monte Carlo methods, which are used for Bayesian tree inference.
The majority of these results have been derived for trees that do not contain timing information of evolutionary events. We refer to trees where branch lengths represent times, meaning that the evolutionary events represented by internal nodes are dated, as time trees. These time trees are of interest in many applications. Software packages like BEAST (Drummond and Rambaut 2007) and BEAST2 (Bouckaert et al. 2014), for example, infer time trees and rely on tree proposals that incorporate timing information of evolutionary events. A version of SPR moves for time trees has been introduced by Höhna et al. (2008). The authors analysed the suitability of this move as a tree proposal operator for Bayesian inference using MCMC algorithms and showed that it performed better than some of the previous operators in isolation, but that using a combination of operators was still better. As of May 2023, this move is the default in BEAST. A guided version of this proposal was introduced in Höhna and Drummond (2012), where guiding the choice of destination improved the acceptance ratio. Another version of SPR moves, which works simultaneously on a species tree and multiple gene trees, can be found in the Stacey package for BEAST2 (Jones 2017). Even though SPR moves are widely used as tree proposals and have been adapted to work for time trees, little research has gone into SPR treespaces that take times of evolutionary events into account.
A version of SPR moves for time trees that has been studied mathematically is the one introduced by Song (2006), where ranked trees are considered. Ranked trees (which are called ordered trees in Song (2006)) are rooted phylogenetic trees with internal nodes ordered according to times of corresponding evolutionary events and leaves are assumed to be sampled at the same time (ultrametric). The SPR moves defined by Song (2006) can move a subtree to a different place in the tree under the condition that the rank of the reattachment node is greater than the rank of the root of the moved subtree. Bounds for neighbourhood sizes and diameters are provided (Song 2006), but no further results are known. Despite these efforts to investigate SPR moves for ranked trees, there is still a gap for analysing tree inference methods using SPR for time trees, as the version of SPR moves as defined by Song (2006) is currently not used in phylogenetic inference.
In this paper we fill this gap by considering an alternative definition of SPR moves for ranked trees, where we require the height (i.e. rank) at which a subtree is cut and reattached to be the same. This restriction on the rank reattachment is inspired by ranked SPR moves in phylogenetic inference software: The Fixed Node Prune and Regraft move introduced in Höhna et al. (2008) as tree proposal for Bayesian inference methods is the extension of our ranked SPR moves to time trees. We call these tree rearrangements Horizontal SPR (HSPR) moves and the corresponding treespace HSPR. Motivated by the importance of this move in computational phylogenetics, we study mathematical properties of this treespace focusing on the metric space of ranked trees that is given by the HSPR move. Studying this treespace helps us understand its fundamental properties, which is important to analyse tree inference algorithms using HSPR for tree proposals as well as posterior distributions of time trees output by BEAST or BEAST2, similar to how it has been done by Whidden and Matsen (2015). By additionally allowing rank moves, which swap the order of two nodes in a ranked tree, we define a further metric space called RSPR (Ranked SPR). We include rank moves in the same way as it has been done in a variation of NNI introduced in Gavryushkin et al. (2018) for ranked trees (RNNI) that allows distances to be computed in polynomial time (Collienne and Gavryushkin 2021), unlike in the classical NNI treespace, where this problem is NP-hard. This suggests that the complexity of computing distances can be different in classical SPR and its ranked version.
This paper is structured as follows. We introduce notations and define the new treespaces in Sect. 2 before discussing some of their fundamental properties in Sect. 3. This especially includes the cluster property, which a treespace possesses if shortest paths between trees preserve shared information in the form of common clusters. This property has proven to be important for the polynomial time algorithm in RNNI (Collienne and Gavryushkin 2021) and for fixed-parameter tractable algorithms in SPR (Whidden and Matsen 2019; Linz and Semple 2011). The discussion of the cluster property is followed by some observations on the shape of shortest paths and the relationship of RSPR and HSPR shortest paths (Sect. 4). Finally, we establish a surprising result on how the distance between two trees can change after adding a new leaf to them (Sect. 5). We provide an open source implementation of horizontal SPR moves (Collienne 2023), which we use to show some properties of ranked SPR spaces computationally.

Preliminaries
A rooted binary phylogenetic tree is a pair $T = (t, \varphi)$ where $t$ is a rooted binary tree and $\varphi : L(t) \to X$ is a bijective map from the set of leaves $L(t)$ of $t$ to a set of labels $X = \{l_1, l_2, \ldots, l_n\}$. Throughout this paper, we refer to rooted binary phylogenetic trees simply as rooted trees, and we assume that all trees have $n$ leaves, unless stated otherwise. Let $T$ be a rooted tree that contains an edge $e = (u, v)$ and let $T|_v$ be the subtree of $T$ rooted in $v$. A subtree prune and regraft (SPR) move on $e$ in $T$ transforms this tree into a new rooted binary phylogenetic tree by the following three steps: (i) Prune the subtree $T|_v$: delete the edge $e$ from $T$, resulting in two connected components $T|_v$ and $T_\rho$, where $T_\rho$ contains the root $\rho$ of $T$. (ii) Suppress the resulting node of degree two in $T_\rho$ (node $u$), so that $T_\rho$ is a rooted tree. (iii) Reattach $T|_v$: either by introducing a new node $w$ on an edge $f$ in $T_\rho$ and adding an edge $(w, v)$, or by adding a new root $\rho'$ and adding edges $(\rho', \rho)$ and $(\rho', v)$ (Fig. 1).
Fig. 1 Two rooted SPR moves. The cross marks the edge that is cut to prune the subtree $T|_v$ of $T$ with leaf set $\{l_1, l_2\}$. In the tree on the left this subtree is reattached on the edge highlighted by a circle, connecting $l_5$ with its parent, and in the tree on the right the pruned subtree is reattached as child of a newly introduced root $\rho'$

Theorem 1 The decision problem SPR:
Instance: Two rooted trees $T$ and $R$ and an integer $k$.
Question: Is $d_{\mathrm{SPR}}(T, R) \le k$, where $d_{\mathrm{SPR}}(T, R)$ is the minimum number of SPR moves required to transform $T$ into $R$?
is NP-hard.
A proof for this theorem can be found in Bordewich and Semple (2005). This proof relies on the equivalence of the SPR distance between two trees and the size of a maximum agreement forest. A maximum agreement forest (MAF) of two rooted trees $T$ and $R$ can be interpreted as a forest that results from cutting the minimum number of edges from both $T$ and $R$ to result in the same forest, and its size $|\mathrm{MAF}(T, R)|$ is defined as the number of its connected components. For a formal definition, see Bordewich and Semple (2005). The equality $|\mathrm{MAF}(T, R)| = d_{\mathrm{SPR}}(T, R)$ implies that the same edge is only pruned once on a shortest SPR path. Furthermore, distance computation can be broken down into smaller problems if two trees share some information in the form of common clusters (Linz and Semple 2011), which we will formally define later.

Ranked Trees
A ranked binary phylogenetic tree is a pair $T = (T_u, \mathrm{rank})$ consisting of a rooted tree $T_u$ and a function $\mathrm{rank} : V \to \{0, 1, \ldots, n - 1\}$, where $V$ is the set of nodes of $T_u$, such that: (i) $\mathrm{rank}(v) = 0$ if and only if $v$ is a leaf, (ii) $\mathrm{rank}(v) \neq \mathrm{rank}(w)$ for all internal nodes $v \neq w$, and (iii) $\mathrm{rank}(v) < \mathrm{rank}(w)$ if $w$ is on the path from the root of $T_u$ to $v$. We refer to $\mathrm{rank}(v)$ as the rank of $v$, and we denote the node of rank $i$ in a ranked binary phylogenetic tree $T$ by $(T)_i$. For simplicity of notation, we say tree or ranked tree to refer to ranked binary phylogenetic trees, unless stated otherwise. The assumption that all leaves have rank 0 can be interpreted as requiring ranked trees to be ultrametric.
If $T = (T_u, \mathrm{rank})$ is a tree, we call the rooted tree $T_u$ the unranked version of $T$, as it can be interpreted as $T$ without ranks, but with leaf labels. Two trees $T$ and $R$ are identical if there is a graph isomorphism between them that preserves leaf labels and ranks. We then write $T = R$, and if the trees are not identical we write $T \neq R$. An example of a tree $T$ with annotated ranks and its unranked version $T_u$ is given in Fig. 2.
A subtree $T|_v$ of a tree $T$ is a tree rooted in a node $v$ of $T$ that contains exactly the nodes that descend from $v$ in $T$, annotated by the same ranks as in $T$. We then say that $T|_v$ is induced by the node $v$. Note that by this definition of subtrees, the subtree of a ranked tree is not necessarily a ranked tree itself. We denote the parent of a node $v$ by $\mathrm{parent}(v)$ and if a subtree $T|_v$ is induced by $v$, we call $\mathrm{parent}(v)$ the parent of the subtree $T|_v$. To emphasise that we are considering the parent of a subtree $T|_v$ in a tree $T$ we might write $\mathrm{parent}_T(T|_v)$.
Since we consider trees where internal nodes are assigned unique ranks and leaves are assigned unique leaf labels, we can uniquely identify nodes in a tree by their labels, using the label function
$$l(v) = \begin{cases} \mathrm{rank}(v) & \text{if } v \text{ is an internal node}, \\ \varphi(v) & \text{if } v \text{ is a leaf}. \end{cases} \qquad (1)$$
Throughout this paper we assume $\{1, \ldots, n - 1\} \cap \{l_1, \ldots, l_n\} = \emptyset$, which results in $l$ being a bijective function. We can therefore uniquely identify a node $v$ by its label $l(v)$, so we will refer to a node $v$ simply by $l(v)$. For example, we might refer to the internal node with rank $i$ as node $i$ and to the leaf with label $l_j$ as node $l_j$ or leaf $l_j$. We define a strict partial order $\prec_T$ on the co-domain of $l$ so that $l(u) \prec_T l(v)$ if $\mathrm{rank}_T(u) < \mathrm{rank}_T(v)$. If it is clear that we consider the tree $T$ we might simply write $\prec$ for $\prec_T$.
Using the label function (1) allows us to represent edges $(u, v)$ in a tree as $(l(u), l(v))$. We call the set $E(T) = \{(l(v), l(w)) \mid (v, w) \text{ is an edge in the tree } T\}$ the edge set of $T$. $E(T)$ uniquely defines $T$. We say that an edge $(l(u), l(v))$ covers rank $r$ if $\mathrm{rank}(v) < r < \mathrm{rank}(u)$.
A cluster $C$ of a tree $T$ is the set of leaves descending from an internal node $v$ in $T$. We then say that the node $v$ induces the cluster $C$. A cherry is a subtree consisting of one internal node with two leaves as children, and we refer to a cherry by its cluster containing these two leaves, e.g. $\{c_1, c_2\}$. If the internal node of such a cherry has rank $i$, we say that the cherry $\{c_1, c_2\}$ has rank $i$. Given a set of leaves $S$ and a tree $T$, the subtree $T|_S$ of $T$ induced by $S$ is the subtree of $T$ with minimum number of leaves that contains all leaves of $S$. If $S$ is a cluster of $T$, $T|_S$ contains exactly the leaves of $S$. We can uniquely represent a tree $T$ by a list of its clusters sorted according to increasing rank. This representation is called cluster representation and has been introduced by Collienne and Gavryushkin (2021). The leftmost tree in Fig. 3 for example has cluster representation $[\{l_1, l_2\}, \{l_3, l_4\}, \{l_1, l_2, l_5\}, \{l_1, l_2, l_3, l_4, l_5\}]$. We include the cluster induced by the root in the cluster representation of a tree, even though this cluster is simply the set of all leaf labels $\{l_1, l_2, \ldots, l_n\}$ for all trees on $n$ leaves.
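The correspondence between the cluster representation and the edge set can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than part of the published implementation: the function name `edge_set` is hypothetical, leaves are labelled by strings such as `"l1"`, and internal nodes are labelled by their integer ranks, following the label function above.

```python
def edge_set(clusters):
    """Recover the edge set of a ranked tree from its cluster representation.

    clusters[k - 1] is the cluster induced by the internal node of rank k.
    Edges are returned as (parent label, child label) pairs.
    """
    # Every leaf starts as the root of its own one-node subtree.
    roots = {leaf: frozenset([leaf]) for leaf in clusters[-1]}
    edges = set()
    for rank, cluster in enumerate(clusters, start=1):
        cluster = frozenset(cluster)
        # The children of node `rank` are the current subtree roots
        # whose leaf sets lie inside its cluster.
        children = [node for node, leaves in roots.items() if leaves <= cluster]
        if len(children) != 2 or roots[children[0]] | roots[children[1]] != cluster:
            raise ValueError(f"cluster at rank {rank} does not merge two subtrees")
        for child in children:
            edges.add((rank, child))
            del roots[child]
        roots[rank] = cluster
    return edges


# Cluster representation quoted in the text for the leftmost tree of Fig. 3.
tree = [{"l1", "l2"}, {"l3", "l4"}, {"l1", "l2", "l5"},
        {"l1", "l2", "l3", "l4", "l5"}]
edges = edge_set(tree)
```

For this tree the sketch yields the eight edges $(1, l_1)$, $(1, l_2)$, $(2, l_3)$, $(2, l_4)$, $(3, 1)$, $(3, l_5)$, $(4, 2)$ and $(4, 3)$.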

Subtree Prune and Regraft for Ranked Trees
A horizontal SPR move (HSPR move) on an edge $e$ of a tree $T$, which cannot be incident to the root, transforms this tree into a tree $T'$ by shifting the top node of $e$ horizontally to another branch. Formally, the move is performed in the following three steps: (i) prune the subtree $T|_v$: delete the edge $e$ to obtain two subtrees $T|_v$ and $T_\rho$, where $T_\rho$ has the same root $\rho$ as $T$, (ii) suppress the resulting node of degree 2 in $T_\rho$ (top node of edge $e$), whose rank we denote by $i$, (iii) reattach $T|_v$ on an edge $f$: reattach the root of $T|_v$ to a newly introduced node of rank $i$ on an edge $f$ that covers rank $i$ in $T_\rho$.
Since the changes made to the tree $T$ move node $i$, we call this an HSPR move at rank $i$. A rank move on a tree $T$ can be applied to two nodes with consecutive ranks if they are not connected by an edge, and swaps the ranks of these two nodes. An example of an HSPR move and a rank move can be found in Fig. 3. Note that HSPR moves and rank moves are reversible and, in contrast to SPR moves for rooted trees, HSPR moves do not allow subtrees to be reattached above the root of a tree.
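On the cluster representation, a rank move amounts to swapping two adjacent list entries. The following Python sketch illustrates this under the assumption that trees are given as lists of sets sorted by increasing rank; the function name `rank_move` is hypothetical. It uses the observation that two nodes of consecutive ranks are connected by an edge exactly when the lower-ranked cluster is contained in the higher-ranked one.

```python
def rank_move(clusters, r):
    """Swap the ranks of the nodes of rank r and r + 1 (ranks are 1-based)."""
    if not 1 <= r < len(clusters):
        raise ValueError("r and r + 1 must both be ranks of internal nodes")
    lower, upper = clusters[r - 1], clusters[r]
    # If the cluster of node r is contained in the cluster of node r + 1,
    # node r is a child of node r + 1 and the ranks cannot be swapped.
    if lower <= upper:
        raise ValueError(f"nodes {r} and {r + 1} are connected by an edge")
    swapped = list(clusters)
    swapped[r - 1], swapped[r] = upper, lower
    return swapped


# Cluster representation quoted in the text for the leftmost tree of Fig. 3.
tree = [{"l1", "l2"}, {"l3", "l4"}, {"l1", "l2", "l5"},
        {"l1", "l2", "l3", "l4", "l5"}]
neighbour = rank_move(tree, 1)
```

Applying the same rank move twice returns the original tree, which reflects the reversibility noted above.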
We are now ready to introduce our main objects of study in this paper, two treespaces extending the classical SPR treespace to ranked trees.
The Ranked SPR (RSPR) space is a graph where vertices represent trees on $n$ leaves that are connected by an edge if one tree can be transformed into the other by an HSPR move or a rank move. The Horizontal SPR (HSPR) space is a graph where vertices represent trees on $n$ leaves that are connected by an edge if one can be transformed into the other by an HSPR move.
Fig. 3 Rank move on the left, swapping the ranks of the nodes highlighted in the leftmost tree. On the right an HSPR move at rank 2 is illustrated, moving the subtree induced by $\{l_1, l_2\}$ by cutting the edge highlighted by a cross and re-attaching it at the edge highlighted by a circle
We refer to the moves allowed in RSPR space, i.e. rank moves and HSPR moves, as RSPR moves.
A path between two trees in HSPR (RSPR) is a sequence of trees $p = [T_0, T_1, \ldots, T_d]$ such that $T_i$ and $T_{i+1}$ are connected by an HSPR (RSPR) move for all $i$. If a path $p$ contains $d + 1$ trees, we say it has length $|p| = d$. A shortest path between trees $T$ and $R$ is a path of minimal length connecting $T$ and $R$. The length of such a path is called the distance between trees $T$ and $R$, and we refer to this distance as $d_{\mathrm{HSPR}}(T, R)$ in HSPR space and $d_{\mathrm{RSPR}}(T, R)$ in RSPR space.

HSPR moves on edge sets
Let us consider how an HSPR move changes the set of edges of a tree. By the definition of HSPR moves, there are four edges involved in an HSPR move at rank $i$ on a tree $T$: the three edges incident to the node of rank $i$, and the edge $f$ on which the pruned subtree gets reattached. All other edges stay the same in the tree $T$ and its HSPR neighbour $T'$. Let $e = (i, l)$ be the edge that is cut by the HSPR move between $T$ and $T'$, let $(i^+, i)$ and $(i, i^-)$ be the other two edges incident to $i$, and let $f = (j^+, j^-)$ be the reattachment edge in $T$. Then the HSPR move between $T$ and $T'$ changes these edges as follows: (i) prune the subtree $T|_v$: delete the edge $e = (i, l)$, (ii) suppress the resulting node of degree 2: replace edges $(i^+, i)$ and $(i, i^-)$ by an edge $(i^+, i^-)$, (iii) reattach $T|_v$: replace edge $f = (j^+, j^-)$ by $(j^+, i)$ and $(i, j^-)$ and add edge $(i, l)$.
An illustration of the trees $T$ and $T'$ with these node labels is provided in Fig. 4. The difference between the edge sets of $T$ and $T'$ can be summarised to
$$E(T') = \big(E(T) \setminus \{(i^+, i), (i, i^-), (j^+, j^-)\}\big) \cup \{(i^+, i^-), (j^+, i), (i, j^-)\}.$$
Conversely, if an edge set $E(T')$ can be described in this way, then $T$ and $T'$ are connected by an HSPR move:
Theorem 2 Let $E(T)$ be the set of edges of a tree $T$. A tree $T'$ with
$$E(T') = \big(E(T) \setminus \{(i^+, i), (i, i^-), (j^+, j^-)\}\big) \cup \{(i^+, i^-), (j^+, i), (i, j^-)\}$$
is an HSPR neighbour of $T$ if and only if $(i^+, i), (i, i^-), (j^+, j^-) \in E(T)$ and $j^- \prec i \prec j^+$.
Note that $j^- \prec i \prec j^+$ is required in Theorem 2, as the reattachment edge needs to cover the rank $i$ of the HSPR move.
Fig. 4 HSPR move pruning the subtree $T|_v$, moving it to the edge $(i^+, i^-)$. Dotted edges represent parts of the tree that potentially contain further nodes
Using Theorem 2, we can define an HSPR move between trees $T$ and $T'$ by providing the difference in edge sets $E(T)$ and $E(T')$. We say that a change in edge sets describes a valid HSPR move, if the edges that are removed from $E(T)$ fulfil the conditions of Theorem 2.
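The edge-set view of an HSPR move translates directly into code. The sketch below is illustrative only and rests on a few assumptions: edges are stored as (parent label, child label) pairs, internal nodes are labelled by their integer rank, leaves are labelled by strings, and the names `rank_of` and `hspr_move` are hypothetical. It removes the edges $(i^+, i)$, $(i, i^-)$ and $(j^+, j^-)$, adds $(i^+, i^-)$, $(j^+, i)$ and $(i, j^-)$, and rejects reattachment edges that do not cover rank $i$.

```python
def rank_of(label):
    """Internal nodes are labelled by their rank; leaves have rank 0."""
    return label if isinstance(label, int) else 0


def hspr_move(edges, i, pruned, target):
    """Apply an HSPR move at rank i to an edge set.

    `pruned` is the child of node i whose subtree is moved and
    `target` is the reattachment edge (j_plus, j_minus).
    """
    children = [c for (p, c) in edges if p == i]
    parents = [p for (p, c) in edges if c == i]
    if pruned not in children:
        raise ValueError("the pruned subtree must hang below node i")
    if not parents:
        raise ValueError("the cut edge cannot be incident to the root")
    i_plus = parents[0]
    i_minus = next(c for c in children if c != pruned)
    j_plus, j_minus = target
    # The reattachment edge has to exist and to cover rank i.
    if target not in edges or not rank_of(j_minus) < i < rank_of(j_plus):
        raise ValueError("the reattachment edge does not cover rank i")
    removed = {(i_plus, i), (i, i_minus), (j_plus, j_minus)}
    added = {(i_plus, i_minus), (j_plus, i), (i, j_minus)}
    return (set(edges) - removed) | added


# Edge set of the tree [{l1,l2}, {l3,l4}, {l1,l2,l5}, {l1,...,l5}].
edges = {(1, "l1"), (1, "l2"), (2, "l3"), (2, "l4"),
         (3, 1), (3, "l5"), (4, 2), (4, 3)}
# Move leaf l5 (together with node 3) onto the edge (4, 2).
moved = hspr_move(edges, 3, "l5", (4, 2))
```

In this example $i^+ = j^+ = 4$, so the edge $(4, 3)$ is removed and added again; the resulting tree has cluster representation $[\{l_1, l_2\}, \{l_3, l_4\}, \{l_3, l_4, l_5\}, \{l_1, \ldots, l_5\}]$.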

HSPR moves on cluster representation
Similar to using edge sets to describe HSPR moves, we can also use the cluster representation to describe an HSPR move between two trees. Let $T$ and $T'$ be trees connected by an HSPR move at rank $i$ as described in Theorem 2 and let the clusters induced by $i^-$, $T|_v$ (the subtree that is moved), and $j^-$ be $A$, $B$, and $C$, respectively. Any of $A$, $B$, and $C$ could be just a leaf, e.g. $A = \{l_k\}$ for some $k$, but for simplicity we refer to those sets as clusters, too. Let furthermore $T = [C_1, C_2, \ldots, C_{i-1}, A \cup B, C_{i+1}, \ldots, C_{n-1}]$ be the cluster representation of $T$. The HSPR move described in Theorem 2 then creates a new cluster $B \cup C$ at rank $i$, as in $T'$ the node $i$ has $T|_v$ and $j^-$ as children. All clusters induced by nodes with rank less than $i$ remain unchanged between $T$ and $T'$. Since the subtree induced by $B$ becomes sibling of the subtree induced by $C$ in $T'$, the move between $T$ and $T'$ removes $B$ from every cluster of $T$ that contains $B$ but not $C$ and is induced by a node with rank greater than $i$. On the other hand, $B$ is added to every cluster induced by a node with rank greater than $i$ that contains $C$ in $T$. All remaining clusters induced by nodes with rank greater than $i$ that do not contain $B$ or $C$ remain unchanged between $T$ and $T'$.
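We can summarise this to describe $T'$ by its cluster representation with
$$T' = [C_1, \ldots, C_{i-1}, B \cup C, C'_{i+1}, \ldots, C'_{n-1}], \qquad C'_k = \begin{cases} C_k \setminus B & \text{if } B \subseteq C_k \text{ and } C \not\subseteq C_k, \\ C_k \cup B & \text{if } C \subseteq C_k, \\ C_k & \text{otherwise}, \end{cases}$$
for $k = i + 1, \ldots, n - 1$. Conversely, if the difference between the cluster representations of $T$ and $T'$ can be described in this way, $T$ and $T'$ are connected by an HSPR move: is connected to $T$ by an HSPR move if and only if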

Basic Properties of HSPR and RSPR
The first question we want to answer is whether our newly defined treespaces are connected (Theorem 4), that is, whether a tree can be transformed into any other tree by a sequence of moves in HSPR or RSPR. Connectedness is essential for these treespaces, as tree rearrangements are used for tree proposals in (MCMC) random walks, which should be able to reach any tree in treespace from any starting tree. For developing and interpreting such random walks it is furthermore important to know how many trees have distance one from a given tree, as well as the maximum distance between any two trees. We establish neighbourhood size (Theorem 5) and maximum distance (diameter) in RSPR and HSPR in Sect. 3.1. We then investigate the cluster property for RSPR and HSPR and explain the significance of neither of the two spaces having this property (Sect. 3.2).

Theorem 4
The treespaces HSPR and RSPR are connected.
Proof We show that any pair of trees $T$ and $R$ are connected by a path of only HSPR moves. As all HSPR moves are RSPR moves, it then follows that both spaces are connected.
We construct an HSPR path from $T$ to $R$ by the following bottom-up approach, iterating through ranks $k = 1, \ldots, n - 2$ of $R$. In every iteration $k$, we perform HSPR moves so that all nodes with ranks less than or equal to $k$ induce the same clusters in $R$ and in the tree after iteration $k$. Let $T'$ be the tree before iteration $k$, let $R|_i$ and $R|_j$ be the subtrees that are children of the node of rank $k$ in $R$, and let $T'|_l$ and $T'|_m$ be the subtrees that are children of the node of rank $k$ in $T'$. We can assume that $R|_i$ and $R|_j$ are subtrees in $T'$, because we use a bottom-up approach that results in all nodes of rank less than $k$ inducing the same cluster in $R$ and the tree $T'$ before iteration $k$. Note that in the first iteration $k = 1$, $R|_i$ and $R|_j$ will contain a single leaf only. Consider the following path transforming $T'$ into a tree $T_2$ with $\mathrm{parent}_{T_2}(R|_i) = \mathrm{parent}_{T_2}(R|_j) = k$. First, perform an HSPR move at rank $k$ that prunes $T'|_l$ from $T'$ and reattaches it on the edge between $R|_i$ and $\mathrm{parent}_{T'}(R|_i)$. In the resulting tree $T_1$, $\mathrm{parent}_{T_1}(R|_i) = k$. In a second step, we perform an HSPR move pruning the subtree $R|_i$ from $T_1$ and reattaching it on the edge between $R|_j$ and $\mathrm{parent}_{T_1}(R|_j)$, resulting in a tree $T_2$ with $\mathrm{parent}_{T_2}(R|_i) = \mathrm{parent}_{T_2}(R|_j) = k$. Since all moves between $T'$ and $T_2$ are HSPR moves at rank $k$, no clusters induced by nodes of rank less than $k$ change between $T'$ and $T_2$, while the cluster induced by the node of rank $k$ changes so that it coincides between $T_2$ and $R$. Hence, all clusters induced by nodes with rank less than or equal to $k$ are identical in $T_2$ and $R$. Note that if $\{R|_i, R|_j\}$ and $\{T'|_l, T'|_m\}$ intersect, the trees $T_1$ or $T_2$ might be equal to $T'$, and fewer than two steps are required to reach $T_2$. At the end of iteration $k$, we update $T' := T_2$ and continue with the next iteration $k + 1$. After iteration $n - 2$, we reach $R$, as after each iteration $k$ all nodes with rank less than or equal to $k$ induce the same clusters in $T'$ and $R$.
This procedure can be applied to any two trees to compute a path connecting them by a sequence of HSPR moves, which proves the theorem.
The algorithm used to prove Theorem 4 produces a path between any two trees in HSPR, and can hence be used to approximate HSPR distances. An implementation can be found on GitHub (Collienne 2023).
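The bottom-up construction from the proof can be sketched in a few lines of Python on top of the cluster representation. This is an independent illustration and not the implementation of Collienne (2023): the function names `children`, `hspr_move` and `hspr_path` are hypothetical, trees are assumed to be lists of sets sorted by increasing rank, and the HSPR move is applied through the cluster updates described in the section on cluster representations (the pruned cluster is removed from higher-ranked clusters that contain it but not the target, and added to those that contain the target).

```python
def children(clusters, k):
    """Clusters of the two children of the node of rank k."""
    remaining = set(clusters[k - 1])
    kids = []
    # Scanning the lower ranks from the top finds internal children first.
    for cluster in reversed(clusters[:k - 1]):
        if cluster <= remaining:
            kids.append(cluster)
            remaining -= cluster
    # Whatever is left over consists of leaf children.
    kids += [frozenset([leaf]) for leaf in sorted(remaining)]
    return kids


def hspr_move(clusters, k, pruned, target):
    """HSPR move at rank k: regraft the child `pruned` next to `target`."""
    moved = list(clusters[:k - 1]) + [pruned | target]
    for cluster in clusters[k:]:
        if target <= cluster:
            moved.append(cluster | pruned)
        elif pruned <= cluster:
            moved.append(cluster - pruned)
        else:
            moved.append(cluster)
    return moved


def hspr_path(start, goal):
    """Path of HSPR moves from `start` to `goal`, built rank by rank."""
    current = [frozenset(c) for c in start]
    goal = [frozenset(c) for c in goal]
    path = [current]
    n = len(goal) + 1
    for k in range(1, n - 1):  # ranks 1, ..., n - 2
        x, y = children(goal, k)
        kids = children(current, k)
        if x not in kids and y in kids:
            x, y = y, x  # treat the child that is already in place as x
        if x not in kids:
            # Step 1: prune one child of node k and regraft it next to x.
            current = hspr_move(current, k, kids[0], x)
            path.append(current)
            kids = children(current, k)
        if y not in kids:
            # Step 2: prune x (now a child of node k) and regraft it next to y.
            current = hspr_move(current, k, x, y)
            path.append(current)
    return path


# Leaves are written as integers 1, ..., 5 for brevity.
start = [{1, 2}, {3, 4}, {1, 2, 5}, {1, 2, 3, 4, 5}]
goal = [{4, 5}, {3, 4, 5}, {2, 3, 4, 5}, {1, 2, 3, 4, 5}]
path = hspr_path(start, goal)
```

Each iteration adds at most two trees to the path, so the path returned for trees on $n$ leaves has length at most $2(n - 2)$; it is not guaranteed to be a shortest path.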
Since HSPR and RSPR are connected undirected graphs, we obtain the following corollary.
Corollary 1 $d_{\mathrm{HSPR}}$ and $d_{\mathrm{RSPR}}$ are metrics.
The (1-)neighbourhood of a tree $T$ in a treespace with distance measure $d$ is defined as $NH(T) := \{T' \mid d(T, T') = 1\}$ and a tree $T' \in NH(T)$ is called a neighbour of $T$. We use $NH_{\mathrm{HSPR}}(T)$ and $NH_{\mathrm{RSPR}}(T)$ to refer to the neighbourhood of $T$ in HSPR and RSPR, respectively. Because tree inference algorithms often require sampling tree neighbourhoods, it is important to know the number of neighbours of a tree $T$ under a tree rearrangement, i.e. $|NH(T)|$. In rooted and unrooted SPR the number of neighbours of a tree is quadratic in the number of leaves $n$ (Song 2003; Allen and Steel 2001).
For counting the number of neighbours of a tree in RSPR, we need the following notion: If a tree $T$ has two nodes $r$ and $r + 1$ with rank difference one that are not connected by an edge, we say that $[r + 1, r]$ is a rank interval. The leftmost tree $T$ in Fig. 3 for example has two rank intervals: $[3, 2]$ and $[2, 1]$. We now show that the number of neighbours in RSPR and HSPR is quadratic in the number of leaves, with the number of neighbours in RSPR naturally depending on the shape of the tree. We derive both of these numbers explicitly.

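Theorem 5 The number of neighbours of a tree $T$ with $k$ rank intervals is $|NH_{\mathrm{HSPR}}(T)| = (n - 1)(n - 2)$ in HSPR and $|NH_{\mathrm{RSPR}}(T)| = (n - 1)(n - 2) + k$ in RSPR.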
Proof The number of neighbouring trees resulting from a rank move on a tree with k rank intervals is k, as there is one unique rank move for every such interval.
We now count the number of HSPR moves at rank $i$ for $i \in \{1, \ldots, n - 1\}$. Since every node has two children, two different subtrees can be pruned by an HSPR move at rank $i$. There are $n - 1 - i$ edges that cover rank $i$, excluding the one on which the node of rank $i$ is placed in $T$, so there are $n - 1 - i$ potential reattachment edges for the pruned subtree. This gives $2(n - 1 - i)$ RSPR neighbours of $T$ resulting from an HSPR move at rank $i$. And because HSPR moves can be performed at any rank between 1 and $n - 1$, the number of HSPR moves possible on $T$ is:
$$\sum_{i=1}^{n-1} 2(n - 1 - i) = (n - 1)(n - 2).$$
Since all rank moves and HSPR moves result in different trees, $T$ has $(n - 1)(n - 2)$ neighbours in HSPR and $(n - 1)(n - 2) + k$ neighbours in RSPR.
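The two neighbourhood sizes can be evaluated directly from a cluster representation. The following Python sketch is an illustration under stated assumptions: the function names `rank_intervals` and `neighbourhood_sizes` are hypothetical, and a rank interval $[r + 1, r]$ is detected by checking that the cluster of node $r$ is not contained in the cluster of node $r + 1$, which is equivalent to the two nodes not being connected by an edge.

```python
def rank_intervals(clusters):
    """All rank intervals [r + 1, r] of a tree in cluster representation."""
    n = len(clusters) + 1
    return [(r + 1, r) for r in range(1, n - 1)
            if not clusters[r - 1] <= clusters[r]]


def neighbourhood_sizes(clusters):
    """Number of neighbours in HSPR and in RSPR, following the counting argument."""
    n = len(clusters) + 1
    # Two prunable children and n - 1 - i reattachment edges at every rank i.
    hspr = sum(2 * (n - 1 - i) for i in range(1, n))
    return hspr, hspr + len(rank_intervals(clusters))


# Cluster representation quoted in the text for the leftmost tree of Fig. 3.
tree = [{"l1", "l2"}, {"l3", "l4"}, {"l1", "l2", "l5"},
        {"l1", "l2", "l3", "l4", "l5"}]
intervals = rank_intervals(tree)
sizes = neighbourhood_sizes(tree)
```

For this tree on $n = 5$ leaves the sketch finds the rank intervals $[2, 1]$ and $[3, 2]$, and the sum $6 + 4 + 2 + 0 = 12 = (n - 1)(n - 2)$ gives 12 neighbours in HSPR and 14 in RSPR.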

Diameter
The maximum distance between any two trees in a treespace with distance measure $d$, $\max_{T, R} d(T, R)$, is called the diameter of the treespace. When measuring the similarity of trees using a distance metric, knowing the maximum possible distance is essential for interpreting distances. The diameter of both unrooted and rooted SPR space is $n - \Theta(\sqrt{n})$ (Ding et al. 2011; Atkins and McDiarmid 2019) and hence linear in $n$. We show in this section that the diameters of HSPR space and RSPR space are linear in the number of leaves $n$, too, by establishing lower and upper bounds.

Corollary 2
The diameters of HSPR space and RSPR space have an upper bound of 2(n − 2).
Proof We can use the algorithm introduced in the proof of Theorem 4 to compute a path between any two trees T and R in HSPR. Since every HSPR move is an RSPR move, the length of this path is an upper bound of the diameter of both HSPR and RSPR space. The algorithm uses a bottom-up approach that, starting at tree T, constructs in iteration k = 1, . . ., n − 2 the cluster induced by the node of rank k in R, using at most two HSPR moves. After iteration n − 2, we obtain the destination tree R after at most 2(n − 2) HSPR moves. Because the path p constructed by this algorithm has length at most 2(n − 2), this provides an upper bound to the diameter of HSPR and RSPR.
It is important to note that the algorithm described in the proof of Theorem 4 approximates HSPR distances and does not compute the exact distance for all pairs of trees, which we can show using our implementations (Collienne 2023).
We can also prove a lower bound for the distance between any two trees in HSPR, but first we need the following lemma.

Lemma 1 Let T and R be trees containing x leaves whose parents have different ranks in T and R. Then d_HSPR(T, R) ≥ x/2.
Proof As described in the technical introduction, an HSPR move at rank i between trees T and T′ leads to the following difference in edge sets for some edges (i⁺, i), (i, i⁻), (j⁺, j⁻) ∈ E(T) where (j⁺, j⁻) covers rank i:
$$E(T') = \big(E(T) \setminus \{(i^+, i), (i, i^-), (j^+, j^-)\}\big) \cup \{(i^+, i^-), (j^+, i), (i, j^-)\}.$$
Therefore, the only nodes whose parents change by this HSPR move are the nodes i, i⁻, and j⁻. Not all of them need to have a different parent after the move, since for example i⁺ = j⁺ results in the parent of i having the same rank in T and T′. With (i⁺, i) and (i, i⁻) being edges in T, it follows that i is an internal node, so only i⁻ and j⁻ can be leaves. Therefore, an HSPR move can change the parents of at most two leaves.
Since the ranks of x parents of leaves differ between T and R and any HSPR move can fix at most two of those, at least x/2 HSPR moves are needed to connect T and R.
The lower bound given in Lemma 1 can be tight. For example, the two leftmost trees in Fig. 6 are connected by one HSPR move and the parents of x = 2 leaves (l_2 and l_5) have different ranks in the two trees.
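The bound of Lemma 1 is simple to evaluate. The sketch below is an illustration under stated assumptions rather than the authors' code: trees are again assumed to be given as cluster representations (a list with the cluster of the node of rank r at position r − 1), both trees are assumed to have the same leaf set, and the function names are hypothetical. The rank of a leaf's parent is the lowest rank whose cluster contains the leaf. Since the number of moves is an integer, x/2 is rounded up.

```python
from typing import Dict, List, Set


def parent_ranks(clusters: List[Set[str]]) -> Dict[str, int]:
    """Map every leaf to the rank of its parent.

    The parent of a leaf is the lowest-ranked internal node whose cluster
    contains that leaf, so the first cluster in which a leaf appears wins.
    """
    ranks: Dict[str, int] = {}
    for rank, cluster in enumerate(clusters, start=1):
        for leaf in cluster:
            ranks.setdefault(leaf, rank)
    return ranks


def lemma1_lower_bound(t: List[Set[str]], r: List[Set[str]]) -> int:
    """Lower bound on the HSPR distance: ceil(x / 2), where x is the number
    of leaves whose parents have different ranks in the two trees."""
    ranks_t, ranks_r = parent_ranks(t), parent_ranks(r)
    x = sum(1 for leaf in ranks_t if ranks_t[leaf] != ranks_r[leaf])
    return (x + 1) // 2


# Two caterpillar trees on four leaves (hypothetical example data).
tree_t = [{"l1", "l2"}, {"l1", "l2", "l3"}, {"l1", "l2", "l3", "l4"}]
tree_r = [{"l3", "l4"}, {"l2", "l3", "l4"}, {"l1", "l2", "l3", "l4"}]
print(lemma1_lower_bound(tree_t, tree_r))
```

In this example the parents of all four leaves have different ranks in the two trees, so at least two HSPR moves are needed.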
Theorem 6 There are trees T and R with distance d_HSPR(T, R) ≥ n/2.
Proof Let T and R be the following caterpillar trees: The parent of l_1 and l_2 has rank one in T, but not in R, where l_3 and l_4 are children of the node of rank one. For all other leaves l_i with i ∈ {5, . . ., n}, the rank of the parent is also different in T and R. Therefore, the parents of all n leaves have different ranks in T and R, and by Lemma 1 it follows that d_HSPR(T, R) ≥ n/2.
By Corollary 2 and Theorem 6, the diameter of the HSPR space is linear in n. We will see later (Corollary 6) that Theorem 6 also applies to RSPR.

No Cluster property
Two trees T and R share a cluster C if both of them contain C as cluster. If on a path p every tree contains the cluster C, we say that p preserves the cluster C. We moreover say that a treespace has the strong cluster property (simply called cluster property by Collienne et al. (2021)) if for two trees sharing a cluster C, every shortest path between them preserves C. In other words, if two trees share some evolutionary information in form of a cluster, this information is preserved along every shortest path between them if the treespace has the cluster property. If for any two trees sharing a cluster C there exists a shortest path that preserves C in a treespace, we say that this treespace has the weak cluster property. Note that the difference between the weak and the strong cluster property is that for the strong cluster property we require all shortest paths to preserve clusters, while for the weak cluster property only one shortest path needs to preserve a shared cluster.
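Whether a given path preserves a cluster can be tested directly from the definition. The snippet below is a small sketch with hypothetical names: it assumes every tree on the path is given as a list of clusters (sets of leaf labels) and does not check that consecutive trees are actually neighbours in the treespace.

```python
from typing import Iterable, List, Set


def path_preserves_cluster(path: List[List[Set[str]]], cluster: Iterable[str]) -> bool:
    """Return True if every tree on the path contains the given cluster."""
    target = frozenset(cluster)
    return all(
        any(frozenset(c) == target for c in tree)
        for tree in path
    )


# Hypothetical two-tree path on four leaves.
example_path = [
    [{"a", "b"}, {"a", "b", "c"}, {"a", "b", "c", "d"}],
    [{"a", "b"}, {"c", "d"}, {"a", "b", "c", "d"}],
]
print(path_preserves_cluster(example_path, {"a", "b"}))       # shared by both trees
print(path_preserves_cluster(example_path, {"a", "b", "c"}))  # missing in the second tree
```

A treespace has the weak cluster property if, for every pair of trees sharing a cluster, at least one shortest path passes this test, and the strong cluster property if all shortest paths do.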
The classic rooted (unranked) version of SPR space has the weak cluster property, which is shown as part of the proof of Theorem 2.2 in Linz and Semple (2011), where the problem of computing the SPR distance is split into the problem of computing distances for subtrees induced by shared clusters. The weak cluster property of SPR is essential for fixed-parameter tractable algorithms (Linz and Semple 2011; Whidden et al. 2013) and facilitates proofs for NP-hardness of computing SPR distances, as it is related to the formulation of this problem as an agreement forest problem.
Here we show that RSPR and HSPR space have neither the weak nor the strong cluster property. This observation is important as it suggests that the proving technique for NP-hardness for rooted (unranked) SPR, which uses maximum agreement forests, cannot be used for ranked trees.
Theorem 7 HSPR space and RSPR space do not have the weak cluster property.
Proof We prove this theorem by considering trees T and R that share a cluster but have no shortest path between them that preserves this cluster. Let T and R be the following two trees (see Fig. 6): On any path between T and R preserving the shared cluster {l_4, l_5}, the rank of the node inducing this cluster needs to decrease from 3 in T to 1 in R. Note that, starting from T, no HSPR move can decrease the rank of this cluster; the only possible moves preserving {l_4, l_5} are rank moves. There is hence no shortest path between T and R in HSPR that preserves {l_4, l_5}, i.e. HSPR does not have the weak cluster property.
In RSPR, two rank moves are necessary to decrease the rank of the node inducing {l_4, l_5} from three to one. Since the tree R′ resulting from these two rank moves is not identical to R, further RSPR moves are needed to reach R, resulting in a path of length greater than two.
There is however a path from T to R with only two HSPR moves, where first the leaf l_1 is pruned and moved to the edge (parent(l_5), l_5) and then l_5 is pruned and moved to the edge (parent(l_4), l_4) (see Fig. 6). Since this path is shorter than any path in RSPR preserving the cluster {l_4, l_5}, RSPR space does not have the weak cluster property.

HSPR Shortest Paths
Shortest paths between trees in HSPR are generally not unique, and we show here that they can be arranged so that the rank at which subtrees are cut does not decrease on a shortest path (Theorem 8). We then use this result to study scenarios in which clusters are preserved along shortest paths: First, we show that if the cherry at rank one is identical in two trees, it is preserved on every shortest path (Theorem 9). Then, by further generalising this result, we show in Corollary 4 that for two trees with identical clusters up to a certain rank, all shortest paths between them preserve these clusters. These results provide insights into the shape of shortest paths in HSPR, and in future research we hope to leverage these results to prove the complexity of computing distances in this treespace.
To prove that there is a shortest path in HSPR on which ranks of HSPR moves do not decrease, we need the following lemma.
Lemma 2 Let p = [T, T′, R] be a path in HSPR such that T and T′ are connected by an HSPR move at rank k and T′ and R are connected by an HSPR move at rank i with k > i. Then there is a path p′ = [T, T″, R] where an HSPR move at rank i connects T and T″ and an HSPR move at rank k connects T″ and R.
Proof As described in Theorem 2, we can describe the HSPR move between trees T and T′ by the change in the set of edges E(T′), compared to E(T):
$$E(T') = \big(E(T) \setminus \{(k^+, k), (k, k^-), (l^+, l^-)\}\big) \cup \{(k^+, k^-), (l^+, k), (k, l^-)\},$$
where (k⁺, k), (k, k⁻), (l⁺, l⁻) ∈ E(T) and (l⁺, l⁻) covers rank k. Similarly, we assume that E(T′) and E(R) are related in the following way:
$$E(R) = \big(E(T') \setminus \{(i^+, i), (i, i^-), (j^+, j^-)\}\big) \cup \{(i^+, i^-), (j^+, i), (i, j^-)\},$$
where (i⁺, i), (i, i⁻), (j⁺, j⁻) ∈ E(T′) and (j⁺, j⁻) covers rank i. In total, we get
$$E(R) = \Big(\big(\big(E(T) \setminus \{(k^+, k), (k, k^-), (l^+, l^-)\}\big) \cup \{(k^+, k^-), (l^+, k), (k, l^-)\}\big) \setminus \{(i^+, i), (i, i^-), (j^+, j^-)\}\Big) \cup \{(i^+, i^-), (j^+, i), (i, j^-)\}.$$
Note that there might be edges that are added to T′ and then deleted in R. Therefore, when summarising multiple HSPR moves as set operations, we assume that we consider multisets, even though the edge set of a tree does not contain an edge multiple times. Let
$$E_1 = \{(k^+, k), (k, k^-), (l^+, l^-)\}, \quad E_2 = \{(k^+, k^-), (l^+, k), (k, l^-)\}, \quad E_3 = \{(i^+, i), (i, i^-), (j^+, j^-)\}, \quad E_4 = \{(i^+, i^-), (j^+, i), (i, j^-)\}.$$
Then E_3 ∩ E_1 = ∅, as otherwise the HSPR move between T′ and R would not be possible.
If E_2 ∩ E_3 = ∅, then E_3 ⊂ E(T), and we can create a tree T″ with E(T″) = (E(T) \ E_3) ∪ E_4, i.e. T and T″ are connected by an HSPR move at rank i. With E_1 ∩ E_3 = ∅ we get E_1 ⊂ E(T″) and we can perform an HSPR move at rank k on T″ that results in a tree R* with E(R*) = (E(T″) \ E_1) ∪ E_2. Therefore, R* and R are identical and the path p′ = [T, T″, R] contains an HSPR move at rank i followed by an HSPR move at rank k.
We now consider the cases in which E_2 ∩ E_3 ≠ ∅. We distinguish six different cases, based on this intersection. In every case, we construct an alternative path p′ = [T, T″, R], where T and T″ are connected by an HSPR move at rank i and T″ and R are connected by an HSPR move at rank k. To show that the moves that we describe by edge set changes are valid HSPR moves, we need to show that the edge set changes we provide fulfil the criteria listed in Theorem 2. By our assumption of the moves on p, we know that E_1 ⊂ E(T) and if e ∈ E_3 \ E_2, then e ∈ E(T).

(i
We perform an HSPR move on T to obtain the tree T″ with This is a valid move because: ∈ E_2
• (j⁺, j⁻) covers rank i by the definition of the moves on p.
We can then obtain a tree R* by an HSPR move on T″ with This is a valid move because: between T and T″.
• (k, i⁻) ∈ E(T″), as it has been added between T and T″.
between T and T″.
• (l⁺, l⁻) covers rank k by the definition of the moves on p.
Let T″ result from T by an HSPR move at rank i that changes E(T) as follows: This is a valid move because:
• (j⁺, j⁻) covers rank i by the definition of the moves on p.
We can then obtain a tree R* by an HSPR move on T″ with This is a valid move because: between T and T″.
• (k, i⁻) ∈ E(T″), as it has been added between T and T″.
• (l⁺, i) ∈ E(T″), as it has been added between T and T″.
• (l⁺, i) covers rank k by the definition of the moves on p.
We can perform an HSPR move at rank i on T that gives us the following tree T″: This is a valid move because: ∈ E_2.
• (j⁺, j⁻) covers rank i by the definition of the moves on p.
We can then perform an HSPR move at rank k on T″ to get a tree R* with This is a valid move because: between T and T″.
• (l⁺, i⁻) ∈ E(T″), as it has been added between T and T″.
• (j⁺, j⁻) covers rank k by the definition of the moves on p.
We can perform an HSPR move at rank i on T to get a tree T″ with This is a valid move because:
• (k, j⁻) covers rank i by the definition of the moves on p.
An HSPR move at rank k can transform T″ into a tree R* with This is a valid move because: between T and T″.
• (k, i) ∈ E(T″), as it has been added between T and T″.
• (l⁺, i⁻) ∈ E(T″), as it has been added between T and T″.
• (l⁺, i⁻) covers rank k by the definition of the moves on p.
Performing an HSPR move at rank i on T can give us a tree T″ with This is a valid move because:
• (k, j⁻) covers rank i by the definition of the moves on p.
An HSPR move at rank k can therefore convert T″ into a tree R* with This is a valid move because: between T and T″.
• (k, i) ∈ E(T″), as it has been added between T and T″.
• (l⁺, l⁻) ∈ E(T″), because (l⁺, l⁻) ∈ E_1 and (l⁺, l⁻) has not been removed from T to get T″.
• (l⁺, l⁻) covers rank k by the definition of the moves on p.
We can perform an HSPR move at rank i on T to get a tree T″ with edge set This is a valid move because: Then an HSPR move at rank k is possible on T″ and transforms this tree into R* with This is a valid move because: because it has been added by the move between T and T″.
• (l⁺, i) covers rank k by the definition of the moves on p.
Theorem 8 There is always a shortest HSPR path between any two trees T and R on which the rank of HSPR moves increases monotonically.
Proof Let p be a shortest path between trees T and R. By Lemma 2 we can take any pair of consecutive HSPR moves at ranks k and i and, if k > i, replace them by two HSPR moves so that first a rank i and then a rank k HSPR move is performed. By replacing all such pairs of HSPR moves iteratively, we obtain a path on which the ranks of HSPR moves increase monotonically.
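The iterative replacement in this proof behaves like a bubble sort on the sequence of ranks at which the moves on the path take place. The sketch below only tracks these ranks, not the trees themselves, and the function name is hypothetical: each swap of an adjacent pair (k, i) with k > i stands for one application of Lemma 2, which keeps the start and end tree as well as the path length fixed.

```python
from typing import List, Sequence, Tuple


def sort_move_ranks(ranks: Sequence[int]) -> Tuple[List[int], int]:
    """Reorder the ranks of consecutive HSPR moves by adjacent swaps.

    Returns the reordered rank sequence and the number of swaps, i.e. the
    number of times Lemma 2 would be applied. The sequence length, and hence
    the path length, never changes.
    """
    result = list(ranks)
    swaps = 0
    changed = True
    while changed:
        changed = False
        for pos in range(len(result) - 1):
            if result[pos] > result[pos + 1]:
                # A rank-k move followed by a rank-i move with k > i:
                # replace by a rank-i move followed by a rank-k move.
                result[pos], result[pos + 1] = result[pos + 1], result[pos]
                swaps += 1
                changed = True
    return result, swaps


print(sort_move_ranks([3, 1, 2]))
```

Every swap removes one inversion from the rank sequence, so the procedure terminates, and the final sequence is non-decreasing.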
To show that all shortest paths preserve a shared cherry at rank one (Theorem 9), we need the following two lemmas, which describe HSPR moves at a fixed rank k along a shortest path. These lemmas are interesting on their own as they are informative about the local geometry of the HSPR treespace. Recall that (T)_i is the node of rank i in T whereas T|_i is the subtree induced by that node.

Lemma 3 Let T and R be trees containing subtrees T|_1, . . ., T|_d with T|_j ≠ T|_m for all j ≠ m, so that (T)_{i_j} and (R)_{i_j} induce the subtree T|_j for j = 1, . . ., d in T and R, respectively. Let (i_j⁺, i_j) be the edges between (T)_{i_j} and its parent in T for all j = 1, . . ., d. If (i) i_j ≺ min_{k=1,...,d}(i_k⁺) for all j = 1, . . ., d and (ii)
Informally, the difference between T and R is the positioning of the subtrees T|_1, T|_2, . . ., T|_d, which we can interpret as a permutation of these subtrees. By changing every edge We now create a path p = [T_0 = T, T_1, . . ., T_d = R] of length d. We describe the moves on p iteratively for every pair T_{j−1}, T_j by using the edge set notation. For every move we make, we prove that it is a valid move by showing that the conditions of Theorem 2 are fulfilled. Note that we can infer from T|_j ≠ T|_m that i_j ≠ i_m for all j ≠ m.
We define the first move on p by the following change of the edge set of T : This is a valid move, because: by the assumption of the lemma.
2 , i 2 ) covers rank k by the assumptions of the lemma. The next moves on p between T_{j−1} and T_j for j = 2, . . ., d − 1 are described by the following changes in edge sets: These are valid moves, because: , because this edge is added to E(T_{j−1}) by the previous move between T_{j−2} and T_{j−1}.
• (k, i j−1 ) ∈ E(T j−1 ), because this edge is added in the move from T j−3 to T j−2 , and it is not changed when moving from T j−2 to T j−1 .
) ∈ E(T ) by the assumptions of the lemma, and with i j = i m for all j = m and i + j+1 = k for all j = 1, this edge is not changed on any previous move on p.
• (i + j+1 , i j+1 ) covers rank k by the assumption of the lemma.The last step, transforming T d−1 to T d , we apply an HSPR move to T d−1 that changes the edge set as follows: This is a valid move, because: and it is not changed when moving from ), because it is added to E(T 1 ) in the first move on p and with i j = i m for all j = m and i ++ 1 = k, we can infer that it is not removed from the edge set of any tree on any previous move on p.
We can summarise all changes to the edge sets along p using multisets, which can be simplified using k = i_1⁺: It follows E(T_d) = E(R) and therefore T_d = R. Hence, p is a path from T to R of length d, proving d_HSPR(T, R) ≤ d.
Fig. 7 Shortest HSPR path between trees T and R. The subtree consisting of only the leaf a_5 is moved by the first and the last HSPR move. The path displayed here does not preserve the shared cluster {a_1, a_2, a_3} at rank three. The nodes inducing this cluster are highlighted in T and R. On this path only subtrees consisting of single leaves move
Another interesting property of shortest paths is that of Lemma 4: No subtree can move twice by HSPR moves at the same rank. It is however possible for one subtree to move twice at different ranks on a shortest path. For example in Fig. 7, the subtree only consisting of the leaf a_5 is moved by the first HSPR move at rank one and by the last HSPR move at rank two.

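Lemma 4 Let p be a shortest path between trees T and R that consists only of HSPR moves at rank k, moving the subtrees T|_{i_0}, T|_{i_1}, . . ., T|_{i_{d−1}} in this order, i.e. p has length d. Then every T|_{i_j} is a subtree of both T and R and T|_{i_j} ≠ T|_{i_l} for all j ≠ l.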
Proof Let p be a shortest path only containing HSPR moves at rank k as described in the lemma. Every subtree T|_{i_j} with j ∈ {0, 1, . . ., d − 1} is induced by a node i_j with rank less than k in T. Therefore, no edges inside any of these subtrees can change on p, which implies that T|_{i_0}, . . ., T|_{i_{d−1}} are subtrees in all trees on p, including T and R.
To show T|_{i_j} ≠ T|_{i_l} for all j ≠ l, we assume to the contrary that a subtree T′ ∈ {T|_{i_0}, . . ., T|_{i_{d−1}}} moves twice on p. To simplify notation, we assume without loss of generality that T′ = T|_{i_0} = T|_{i_{d−1}} and that no other subtree moves twice on p. Note that it could be d − 1 = 1, in which case T′ moves twice in two consecutive moves.
Let i + j = parent T (i j ) in T .Since the first move on p moves the subtree T | i 0 and all moves on p are HSPR moves at rank k, it is i 0 = k.Furthermore, the subtree T | i j needs to have node k as its parent before it gets pruned from a tree on p, as all moves are HSPR moves at rank k.This implied that T | i j gets reattached on the edge (i + j+1 , i j+1 ) for every j = 0, . . ., d − 2 to ensure that the next move can prune the subtree T | i j+1 at rank k.Let i 0 be the sibling of i 1 in T , i.e. parent T (i the HSPR moves along p can formally be described by the following changes in edge sets (see Theorem 3): and for j = 2, . . ., d − 2: By our assumptions on p, T | i j T | i m for all j = m in {1, . . ., d − 1}, we know that (i + j+1 , i j+1 ) ∈ E(T j−1 ) for all j = 1, . . ., d − 1.We can use similar arguments to those used in the proof of Lemma 3 to show that the changes of edge sets presented here describe valid HSPR moves up to T d−2 .
For the last move on p, moving the subtree T | i 0 , and its parent is (i + 2 , i 1 ), which is created by the move between T 1 and T 2 and with i + 2 = k and i j = i k for all j = k in {1, . . ., d − 1}, this edge is not changed again on p until reaching T d−2 .The move between T d−2 and T d−1 can therefore be described by the following change in the edge set of T d−2 : Let (i + d+1 , i d+1 ) be the edge on which T T d s gets reattached by the last move on p.We can then write this last move as It is again not hard to see with Theorem 2 that this describes a valid HSPR move.
Using multisets, we can describe the difference between E(R) and E(T) as follows:
Fig. 8 Trees T and R with cherry {x, y} at rank one and all moves on a path p from T to R that does not preserve the cherry as described in the proof of Theorem 9. The labels of the arrows indicate the leaves that are pruned and reattached by the corresponding HSPR moves
that T and R have minimum distance among all tree pairs connected by such a shortest path. It then follows that p contains HSPR moves at rank one only: Otherwise, we could change the order of ranks of HSPR moves along the path to have all rank one HSPR moves first (Lemma 2). Since the cherry at rank one cannot change by HSPR moves at rank greater than one, T and R would then not be a minimum distance counterexample. Therefore, all HSPR moves on p are at rank one and hence move subtrees that consist of one leaf only.
By the assumption on T and R, either x or y is pruned by the first move on p. We assume that x is moved first, otherwise we swap notations for x and y. Since the last move on p reconstructs the cherry {x, y} at rank one, and the subtree containing only the leaf x can only move once at rank one (Lemma 4), the last leaf that moves on p is y. Let a_0 = x, a_1, a_2, . . ., a_{d−1}, a_d = y be the sequence of leaves moved on p. By Lemma 4, all these leaves are distinct, i.e. a_i ≠ a_j for all i ≠ j. Note that d − 1 ≥ 1, as after the first move on p the parent of y has rank greater than one, but the last move on p moves y by an HSPR move at rank one, for which the parent of y needs to have rank one. Moving a_{d−1} ≠ x to become sibling of y in the second to last move is hence necessary to get to R, so d − 1 ≥ 1. The path p as described above is depicted in Fig. 8.
Let a + i = parent T (a i ) for all i = 1, . . ., d and p xy = parent T (1).Note that this implies a + i = 1 for all i.Assuming p = [T 0 T , T 1 , . . ., T d , T d+1 R], we can then describe the moves on p similar to how it has been done in the proof of Lemma 4: This describes a valid move, since: This describes a valid move, because: • (a + j−1 , 1) ∈ E(T j−1 ), because it has been added to this set by the previous move between T j−2 and T j−1 .
• (1, a j−2 ) ∈ E(T j−1 ), because it has been added to E(T j−2 ) by the move between T j−3 and T j−2 and with a i = a j for all i = j, this edge has not been removed from E(T j−2 ) to get E(T j−1 ).• (a + j , a j ) ∈ E(T j−1 ), because (a + j , a j ) ∈ E(T ) and with a i = a j for all i = j, this edge is not removed from the edge set in any previous move on p.
• (a + j , a j ) covers rank 1, because a j is a leaf and a + j = 1.
This describes a valid move, because: because it has been added to this set by the previous move between because it has been added to E(T 1 ) by the first move on p and with y = a i , 1 for all i, this edge is not removed from the edge set in any previous move on p. • ( p xy , y) covers rank 1, because ( p xy , 1) and (1, y) are edges in T .
This describes a valid move, because: because it has been added to this set by the previous move between T d−1 and T d .

• (1, a d−1 ) ∈ E(T d ), because it has been added to E(T d−2 ) by the move between
T d−2 and T d−1 and with a i = a j for all i = j, this edge has not been removed from , because it has been added to E(T 2 ) by the second move on p (x = a 0 ) and with x = a i , 1 for all i, this edge is not removed from the edge set in any previous move on p.
• (a_1⁺, x) covers rank 1, because x is a leaf and 1 ≺ a_i⁺ for all i.
Since some edges that are added along p are later deleted, we can describe the changes between E(T) and E(R) by using multisets: which we can summarise to: T and R fulfil the requirements of the trees of Lemma 3, where i_j⁺ = a_j⁺ and i_j = a_j for all j = 1, . . ., d − 1, i_d⁺ = p_xy, and i_d = 1. Applying this lemma gives us d_HSPR(T, R) ≤ d.
Therefore, p, which has length d + 1, is not a shortest path, contradicting our assumption that there is a shortest path from T to R that does not preserve the shared cherry.
Theorem 9 implies that the distance of two trees with identical cherry at rank one does not change when one of the leaves in this cherry is deleted (Corollary 3). We will see in Theorem 11 that we can generally not assume that the distance stays the same or decreases when deleting a leaf from two trees.
Corollary 3 Let T and R be trees on n leaves sharing their cherry {c_1, c_2} at rank one and let T′ and R′ result from deleting c_2 from T and R, respectively, suppressing the internal node of rank one and subtracting one from the rank of all remaining internal nodes so that T′ and R′ are trees on n − 1 leaves. Then d_HSPR(T′, R′) = d_HSPR(T, R).
Proof By Theorem 9 all shortest HSPR paths from T to R preserve the cherry {c_1, c_2}. Therefore, deleting c_2 from every tree on a shortest T-R-path, suppressing the resulting degree-two nodes, and subtracting one from the ranks of all remaining nodes, gives a path between T′ and R′, i.e. d_HSPR(T′, R′) ≤ d_HSPR(T, R). If there was a path between T′ and R′ that was shorter than d_HSPR(T, R), then adding a leaf c_2 as sibling of c_1 with parent of rank one and adding one to the ranks of all other internal nodes in every tree on this path results in a path between T and R of length less than d_HSPR(T, R), which is a contradiction. Hence, d_HSPR(T, R) = d_HSPR(T′, R′).
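The leaf deletion used in Corollary 3 is easy to express on cluster representations. The sketch below is an illustration with a hypothetical function name, assuming a tree is a list of clusters ordered by rank: removing the leaf from all clusters and dropping the cluster of rank one corresponds to suppressing the node of rank one and lowering all remaining ranks by one.

```python
from typing import FrozenSet, List, Set


def delete_cherry_leaf(clusters: List[Set[str]], leaf: str) -> List[FrozenSet[str]]:
    """Delete one leaf of the cherry at rank one from a ranked tree.

    Assumption: clusters[r - 1] is the cluster of the node of rank r, so
    clusters[0] is the cherry at rank one. Dropping clusters[0] suppresses
    the node of rank one; the remaining clusters shift down by one rank.
    """
    if leaf not in clusters[0]:
        raise ValueError("leaf is not part of the cherry at rank one")
    return [frozenset(c) - {leaf} for c in clusters[1:]]


tree = [{"c1", "c2"}, {"c1", "c2", "a"}, {"c1", "c2", "a", "b"}]
print(delete_cherry_leaf(tree, "c2"))
```

Applying this function to every tree on a shortest path between T and R yields the path between T′ and R′ used in the proof, provided the cherry is present in every tree on that path, which Theorem 9 guarantees.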
Another observation that follows from Theorem 9 is that if the lowest part of two trees, i.e. all clusters up to a certain rank, is identical, then no shortest path in HSPR changes this part of the tree.
Corollary 4 Let T and R be trees so that for some i ∈ {2, . . ., n} the cluster induced by (T)_j is identical to the cluster induced by (R)_j for all j < i. Then every node (T′)_j in every tree T′ on every shortest path between T and R induces the same cluster as (T)_j and (R)_j for all j < i.

Proof To prove this corollary, we assume to the contrary that there is a node with rank less than i that induces the same cluster in T and R, but this cluster is not present on a shortest path p between T and R. Let j be the rank of such a node so that there is no other node with this property in T and R with rank less than j. Then all clusters induced by nodes of rank less than j are present in all trees on p. We can iteratively apply Corollary 3 to the trees T and R, where in the first iteration deleting a cherry leaf and updating the rest of the tree as described in the corollary results in trees T_1 and R_1 with d_HSPR(T, R) = d_HSPR(T_1, R_1). This can be repeated until trees T_{j−1}, R_{j−1} are obtained with d_HSPR(T_{j−1}, R_{j−1}) = d_HSPR(T, R), where T_{j−1} and R_{j−1} are trees on n − (j − 1) leaves. As all clusters induced by nodes of rank less than j are present in all trees on p, the leaves that are deleted from T and R to obtain T_{j−1} and R_{j−1}, respectively, can be deleted in the same order in all trees on p in the same way, and we obtain a path p′ from T_{j−1} to R_{j−1} with |p′| = |p|. Since the cluster induced by the nodes of rank j is the same in T and R, T_{j−1} and R_{j−1} have the same cherry at rank one. Because p does not preserve the cluster at rank j, there must be a tree on p′ not containing the cherry of rank one in T_{j−1} and R_{j−1}, which implies by Theorem 9 that p′ is not a shortest path, i.e. d_HSPR(T_{j−1}, R_{j−1}) < |p′| = |p|. This however is a contradiction to d_HSPR(T_{j−1}, R_{j−1}) = d_HSPR(T, R) = |p|. Therefore, there cannot be a shortest path p between T and R that does not preserve the cluster of rank j < i that is present in T and R, which concludes the proof of this corollary.
It is in general not true that a cluster C that is induced by nodes of the same rank r in two trees T and R is present in all trees on all shortest paths. A counterexample can be found in Fig. 7, where the shared cluster {a_1, a_2, a_3} at rank three is not present in every tree on any shortest path from T to R, which we found through exhaustive search using our implementation (Collienne 2023).

RSPR Shortest Paths
The RSPR space can be interpreted as an extension of HSPR in which rank moves are added to provide shortcuts between some trees. Here, we investigate how the addition of rank moves changes shortest paths. Our analysis of these paths will provide insights into the relationship of the complexity of computing distances in HSPR and RSPR. We show in Theorem 10 that we can change the order of moves on paths in RSPR, while not changing their length, so that all rank moves are grouped at the beginning followed by a sequence of only HSPR moves. This indicates that shortest paths in HSPR and RSPR can be very similar, sometimes identical (e.g. in the case of caterpillar trees, Corollary 6), and suggests that the complexity of computing shortest paths (and distances) might be the same in both treespaces. We first need the following lemma:
Lemma 5 Let T and R be trees and p = [T, T′, R] a path with an HSPR move between T and T′ and a rank move between T′ and R. Then there is a path p′ = [T, T″, R] consisting of either two HSPR moves or a rank move followed by an HSPR move.
Whether the path p′ in Lemma 5 contains two HSPR moves or a rank move followed by an HSPR move depends on the specific moves on p, as we will see in the proof of this lemma. For this proof we use the cluster representation of trees and describe HSPR moves by the corresponding changes in cluster representation as described in Theorem 3.
Proof Let k and k + 1 for some k ∈ {1, . . ., n − 3} be the ranks of the nodes of the rank move between T and R.
] be the cluster representation of T , which is illustrated in the top middle of Fig. 9. Since T and R are connected by a rank move of nodes k and k + 1, the cluster representation of 9).
In the following, we distinguish different HSPR moves possible between T and T′ and show how to replace T′ by a tree T″ to get a path p′ = [T, T″, R] with either two HSPR moves or a rank move followed by an HSPR move.
1. The HSPR move between T and T′ is neither a rank k nor a rank k + 1 HSPR move.
An HSPR move at rank greater than k + 1 does not change the clusters induced by nodes k and k + 1, so we can in this case first perform a rank move of nodes k and k + 1 on T to get a tree T″. Since all clusters of rank greater than k + 1 are identical in T and T″, we can perform an HSPR move on T″ that changes the clusters of this tree in exactly the same way as they change between T and T′. This HSPR move on T″ then results in R, giving us a path p′ = [T, T″, R] with the desired properties.
If the HSPR move between T and T′ is an HSPR move at rank r < k, it might change the clusters induced by the nodes of rank k and k + 1. It does however not matter in which order these two clusters appear in the tree; they would change in the exact same way if they were swapped. Therefore, we can first swap the ranks of the nodes with rank k and k + 1 in T, resulting in a tree T″, and then perform the same HSPR move on T″ as the one between T and T′. This results in a path p′ = [T, T″, R] with a rank move between T and T″ and an HSPR move between T″ and R.
2. T and T′ are connected by an HSPR move at rank k. Without loss of generality we can assume that T|_A is moved between T and T′, otherwise we swap notations for A and B. We now further distinguish whether there is an edge connecting the nodes of rank k and k + 1 in T.
2.1.Let there be an edge connecting the nodes with ranks k and k + 1 in T .Note that by our assumptions on T , parent Therefore, an HSPR move at rank k on T that moves the subtree T | A and creates a tree T containing an edge We assume without loss of generality that T | A and T | D are siblings in T , as depicted in Fig. 9, otherwise we change notation for C and D. Then the cluster representation of T is: where for all m = k + 1, . . ., n − 2: To get a path p with the desired properties, we perform an HSPR move at rank k on T that moves T | D to become sibling of T | C , resulting in a tree T in which the parent of T | A has rank k + 1.This tree T has cluster representation: where for all m = k + 1, . . ., n − 2: Remember that every cluster is the union of two clusters at lower rank and/or leaves of a tree.Since the node (T ) k+1 induced the cluster where for all m = k + 1, . . ., n − 2: Using the fact that B ⊂ C m if and only if B ⊂ C m , we can summarise the change of a cluster at rank m > k + 1 between T and R as follows: otherwise.
Since the cluster at rank Therefore, Ĉm = C m for all m > k + 1.It follows that the cluster representation of R and R coincides, which gives us R R, so p is a path from T to R with two HSPR moves (see Fig. 9).2.2.If there is no edge between the nodes with rank k and k + 1 in T , then there is a cluster E in T so that T | A is sibling of T | E in T with E = A, B, C, D (see Fig. 10).Then the cluster representation of T is: where for all m = k + 1, . . ., n − 2: To create an alternative path p , we first perform a rank move swapping the ranks of the nodes k and k + 1 in T to receive a tree T with cluster representation: As second move on p we perform an HSPR move at rank k + 1, moving the subtree T | A to become sibling of T | B , giving us the following tree R: where clusters Ĉm with m ≥ k + 2 are defined as follows: To show that Ĉm = C m for all m = k + 2, . . ., n − 1, we distinguish whether  So for every m = k + 2, . . ., n − 1, it is Ĉm = C m , which implies R R, so p is a path from T to R with first a rank move and then an HSPR move.
3. T and T are connected by an HSPR move at rank k + 1.We again further distinguish whether the node of rank k + 1 is parent of the node of rank k in T or not.Then T has the following cluster representation: where A path p can now be constructed by first performing an HSPR move at rank k on T moving T | A to become sibling of T | C in the resulting tree T : with as D is not removed from any clusters between trees T and T .And since D ⊂ C m if and only if C ⊂ C m , we can see that Ĉm = C m for both cases (i) and (ii).
We can conclude that R R, so p is a path from T to R consisting of two HSPR moves.3.2.There is no edge between the nodes of rank k and k + 1 in T .
Let E be a cluster in T so that the subtree T | E is sibling of T | C in T with E = A, B, C, D, i.e.T | C is moved to become sibling of T | E by the HSPR move between T and T (see Fig. 12).The cluster notation for T is: We construct a path p by first performing a rank move swapping the nodes k and k + 1 of T , resulting in the tree T : We can then preform an HSPR move on T , moving the subtree T | C to become sibling of T | D , which gives us the tree R: In any case, it is Ĉm = C m for all m = k + 2, . . ., n − 1 and therefore R R, which implies that p is a path from T to R that consists of a rank move followed by an HSPR move.
In all cases above, we found an alternative path p′ to the path p so that p′ consists of either two HSPR moves or a rank move followed by an HSPR move, which concludes the proof of this lemma.
Using Lemma 5, we can now prove that the order of moves on an RSPR path can be changed so that a sequence of rank moves comes first, followed by a sequence of HSPR moves.

Theorem 10 In RSPR there is always a shortest path between two trees that has a sequence of rank moves (if there are any) at the beginning followed by only HSPR moves.
Proof To prove the theorem we assume that there is a shortest path $p = [T_0, T_1, \ldots, T_d]$ between trees $T_0$ and $T_d$ that has at least one rank move preceded by an HSPR move. Let $T_{i-1}$ and $T_i$ for some $1 < i < d - 1$ be connected by the first HSPR move on $p$ that has a rank move following it, i.e. $T_i$ and $T_{i+1}$ are connected by a rank move. By Lemma 5, we can replace $T_i$ by a tree $T'_i$ so that there is a rank move or HSPR move between $T_{i-1}$ and $T'_i$ and an HSPR move between $T'_i$ and $T_{i+1}$. Iteratively applying this procedure to all rank moves that are preceded by an HSPR move on $p$ yields a path from $T_0$ to $T_d$ of the same length that has all rank moves at the beginning, followed by a sequence of only HSPR moves.
Theorem 10 implies that any (not necessarily shortest) path between two trees T and R can be converted into a T-R-path of the same length that has all rank moves bundled at the beginning of the path, followed by HSPR moves. We can also shuffle moves on an RSPR path in the opposite way so that HSPR moves happen first and rank moves last. Additionally, we can infer the following from Theorem 10.
Corollary 5 Let $T^u$ and $R^u$ be rooted (unranked) trees. Then there are ranked trees T

Proof Let $\mathrm{rank}_T$ be an arbitrary rank function of $T^u$, i.e. $T = (T^u, \mathrm{rank}_T)$ is a ranked tree, and let $\mathrm{rank}_R$ be a rank function on $R^u$ so that $R = (R^u, \mathrm{rank}_R)$ is a ranked tree. By Theorem 10, we can find a shortest path $p$ from $T$ to $R$ in RSPR that has all rank moves bundled in the beginning of the path. Let $T'$ be the ranked tree on $p$ after this sequence of rank moves, i.e. the remainder of $p$ between $T'$ and $R$ consists of HSPR moves only. Since there is a path from $T$ to $T'$ consisting of rank moves only, the unranked versions of $T$ and $T'$ are identical, i.e. $T' = (T^u, \mathrm{rank}_{T'})$ for some rank function $\mathrm{rank}_{T'}$ on $T^u$. Furthermore, as $p$ is a shortest path in RSPR, the part of $p$ between $T'$ and $R$ is a shortest path between those trees in RSPR. And because this part of $p$ consists of HSPR moves only, it must also be a shortest path in HSPR. We conclude that there are ranked trees $T' = (T^u, \mathrm{rank}_{T'})$ and R

Corollary 6 If there is a shortest path between trees $T$ and $R$ in RSPR that contains a caterpillar tree, then there is a shortest path from $T$ to $R$ that consists of HSPR moves only and $d_{\mathrm{HSPR}}(T, R) = d_{\mathrm{RSPR}}(T, R)$.
Proof Let $T$ and $R$ be trees connected by a shortest path $p$ that contains a caterpillar tree $T_c$. The sequence of trees on $p$ between $T$ and $T_c$ is a shortest path between these trees, and in reversed order it is a shortest path from $T_c$ to $T$. Applying Theorem 10 to this path from $T_c$ to $T$ gives a shortest path where all rank moves are at the beginning of the path. Since $T_c$ is a caterpillar tree, no rank move is possible on $T_c$, which implies that there is a shortest path from $T$ to $T_c$ that consists of HSPR moves only.
By the same argument, the restriction of p from T c to R can be transformed into a path of the same length that consists of only HSPR moves.
We can concatenate the shortest path from $T$ to $T_c$ with only HSPR moves and the one from $T_c$ to $R$ with only HSPR moves and receive a path from $T$ to $R$ that has the same length as $p$. This path is hence a shortest path and consists of HSPR moves only.
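The step "there cannot be rank moves on a caterpillar tree" can be checked on small examples. The following is a minimal sketch, assuming a ranked tree is stored as its cluster representation (a list whose entry at position i − 1 is the cluster of the node of rank i) and assuming a rank move on ranks k and k + 1 is possible exactly when the two nodes are not joined by an edge, which for consecutive ranks means the lower cluster is not contained in the upper one. Leaves are encoded as integers, and the function names `rank_move_positions`, `rank_move` and `caterpillar` are hypothetical, chosen for illustration only.

```python
def rank_move_positions(clusters):
    """Return all ranks k (1-based) at which ranks k and k+1 can be swapped."""
    positions = []
    for k in range(1, len(clusters)):
        lower, upper = clusters[k - 1], clusters[k]
        # Nodes of consecutive ranks are joined by an edge exactly when
        # the lower cluster is contained in the upper one.
        if not lower <= upper:
            positions.append(k)
    return positions


def rank_move(clusters, k):
    """Swap the clusters of ranks k and k+1 if that rank move is allowed."""
    if k not in rank_move_positions(clusters):
        raise ValueError(f"no rank move possible at rank {k}")
    swapped = list(clusters)
    swapped[k - 1], swapped[k] = swapped[k], swapped[k - 1]
    return swapped


def caterpillar(n):
    """Caterpillar tree [{1,2}, {1,2,3}, ..., {1,...,n}] on leaves 1..n."""
    return [frozenset(range(1, m + 1)) for m in range(2, n + 1)]


# A caterpillar tree has nested clusters, so no rank move is available.
print(rank_move_positions(caterpillar(6)))   # []

# A tree with two cherries allows swapping the ranks of the cherries.
two_cherries = [frozenset({1, 2}), frozenset({3, 4}), frozenset({1, 2, 3, 4})]
print(rank_move_positions(two_cherries))     # [1]

# A rank move reorders clusters but leaves the set of clusters,
# i.e. the unranked tree, unchanged.
print(set(rank_move(two_cherries, 1)) == set(two_cherries))  # True
```

The last check mirrors the argument used for Corollary 5: trees connected by rank moves only share the same unranked version.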

Adding leaves
In this section we consider how adding a leaf to two trees can change their distance. Especially when considering data sets of ongoing evolutionary processes, like virus transmissions, it would be ideal to re-use the already inferred tree, and some methods to do this already exist (Gill et al. 2020; Dinh et al. 2018; Fourment et al. 2018; Bouckaert et al. 2022). For analyses where new leaves are added to an already existing tree, it is important to understand how adding a leaf can change a tree. It is also of interest to know how the distance between two trees changes when a new leaf is added. One would naturally assume that the addition of a leaf to two trees increases their distance. For HSPR and RSPR, however, we find that adding one leaf can decrease the distance between two trees linearly in n (Theorem 11). We will see that this is the case when two trees are similar and the new leaf is added at different heights in the tree. Such leaves with varying positions in a tree are often referred to as "rogue taxa" (Aberer et al. 2013), and our results here show that the existence of these can have a big impact on the distance between two trees under RSPR and HSPR.
Before we describe how adding a leaf results in a decrease in distance, we need the following observation.

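Lemma 6 The caterpillar trees
$$T = [\{l_1, l_2\}, \{l_1, l_2, l_3\}, \ldots, \{l_1, l_2, \ldots, l_n\}] \quad \text{and} \quad R = [\{l_2, l_3\}, \{l_2, l_3, l_4\}, \ldots, \{l_2, l_3, \ldots, l_n\}, \{l_1, l_2, \ldots, l_n\}]$$
have HSPR and RSPR distance greater than or equal to $\frac{n-1}{2}$.

The trees $T$ and $R$ of Lemma 6 are displayed in the top row of Fig. 13. Whether the bound for the distance of $T$ and $R$ in Lemma 6 is sharp remains an open question.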
Proof The parents of $n - 1$ leaves $l_1, l_3, l_4, \ldots, l_n$, i.e. all leaves except $l_2$, have different ranks in $T$ and $R$. By Lemma 1 it follows that $d_{\mathrm{HSPR}}(T, R) \geq \frac{n-1}{2}$. And as the HSPR and RSPR distances between caterpillar trees coincide (Corollary 6), $d_{\mathrm{RSPR}}(T, R) \geq \frac{n-1}{2}$.

Note that Lemma 6 implies that the diameter of RSPR is greater than or equal to $\frac{n-1}{2}$, which is the same bound we already found for HSPR in Theorem 6. We are now ready to prove the main theorem of this section.
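The counting argument in the proof of Lemma 6 can be replayed numerically. The sketch below rests on two assumptions: the caterpillar trees are taken to be $T = [\{l_1, l_2\}, \ldots, \{l_1, \ldots, l_n\}]$ and $R = [\{l_2, l_3\}, \ldots, \{l_2, \ldots, l_n\}, \{l_1, \ldots, l_n\}]$, as obtained by deleting $l_{n+1}$ from the cluster representations given in the proof of Theorem 11, and Lemma 1 is taken to give the bound "number of leaves whose parent rank differs, divided by two", which is how the proof uses it. Leaf $l_i$ is encoded as the integer i, and all function names are hypothetical.

```python
def parent_ranks(clusters):
    """Map each leaf to the rank of its parent, i.e. the lowest rank
    whose cluster contains the leaf."""
    ranks = {}
    for rank, cluster in enumerate(clusters, start=1):
        for leaf in cluster:
            ranks.setdefault(leaf, rank)  # keep the first (lowest) rank only
    return ranks


def lemma6_trees(n):
    """Assumed cluster representations of the caterpillar trees T and R."""
    T = [frozenset(range(1, m + 1)) for m in range(2, n + 1)]
    R = [frozenset(range(2, m + 1)) for m in range(3, n + 1)]
    R.append(frozenset(range(1, n + 1)))  # root cluster of R
    return T, R


def leaves_with_different_parent_rank(T, R):
    """Leaves whose parents have different ranks in the two trees."""
    in_T, in_R = parent_ranks(T), parent_ranks(R)
    return sorted(leaf for leaf in in_T if in_T[leaf] != in_R[leaf])


n = 8
T, R = lemma6_trees(n)
moved = leaves_with_different_parent_rank(T, R)
print(moved)           # [1, 3, 4, 5, 6, 7, 8]  (every leaf except 2)
print(len(moved) / 2)  # 3.5, which equals (n - 1) / 2 for n = 8
```

The check only confirms the count of n − 1 leaves; the bound itself still relies on Lemma 1.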
Theorem 11 Adding a leaf to two trees can decrease their HSPR distance by $\frac{n-1}{2} - 1$.

Proof Let $T$ and $R$ be the trees of Lemma 6. We add a leaf $l_{n+1}$ to these trees $T$ and $R$ as follows (see Fig. 13): A new root is added to $T$ so that the resulting tree $T'$ has $l_{n+1}$ and the old root of $T$ as children. In $R$ we attach $l_{n+1}$ as sibling of $l_1$ so that their parent has rank one, and the ranks of all other internal nodes of $R$ are increased by one, giving us a tree $R'$ on $n + 1$ leaves. The cluster representations of $T'$ and $R'$ are:
$$T' = [\{l_1, l_2\}, \{l_1, l_2, l_3\}, \ldots, \{l_1, l_2, \ldots, l_n\}, \{l_1, l_2, \ldots, l_n, l_{n+1}\}]$$
and
$$R' = [\{l_1, l_{n+1}\}, \{l_2, l_3\}, \{l_2, l_3, l_4\}, \ldots, \{l_2, l_3, l_4, \ldots, l_n\}, \{l_1, l_2, \ldots, l_n, l_{n+1}\}].$$
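Let $C_i$ and $C'_i$ be the clusters induced by the node of rank $i$ in $T'$ and $R'$, respectively, for all $i = 1, \ldots, n - 1$. We can then describe the difference between the cluster representations of $T'$ and $R'$ as follows: $C'_1 = \{l_1, l_{n+1}\}$ and for all $m = 2, \ldots, n$:
$$C'_m = \begin{cases} C_m \setminus \{l_1\} & \text{if } m < n \\ C_m & \text{if } m = n \end{cases}$$
By Theorem 3, $T'$ and $R'$ are connected by an HSPR move. Therefore, adding one leaf to $T$ and $R$ changes the distance from $d_{\mathrm{HSPR}}(T, R) \geq \frac{n-1}{2}$ to $d_{\mathrm{HSPR}}(T', R') = 1$, which means that their distance decreases by at least $\frac{n-1}{2} - 1$.

In Theorem 12 we will see that adding a leaf can only decrease the HSPR distance to one if the unranked versions of the given trees are connected by a rooted (unranked) SPR move.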
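The construction in the proof of Theorem 11 can be sanity-checked from the explicit cluster representations. The sketch below is not an implementation of Theorem 3. It assumes that an HSPR move prunes a subtree and regrafts it while keeping the rank of its parent, so that two ranked trees which become identical after pruning the leaf $l_1$, and in which the parent of $l_1$ has the same rank, are one such move apart. Leaf $l_i$ is encoded as the integer i, and the function names are hypothetical.

```python
def added_leaf_trees(n):
    """Cluster representations of the trees with the added leaf n+1."""
    all_leaves = frozenset(range(1, n + 2))
    t_prime = [frozenset(range(1, m + 1)) for m in range(2, n + 1)]
    t_prime.append(all_leaves)
    r_prime = [frozenset({1, n + 1})]
    r_prime += [frozenset(range(2, m + 1)) for m in range(3, n + 1)]
    r_prime.append(all_leaves)
    return t_prime, r_prime


def prune_leaf(clusters, leaf):
    """Prune a leaf: drop its parent node and remove the leaf from all
    remaining clusters. Returns (remaining clusters, rank of the parent)."""
    parent_rank = next(r for r, c in enumerate(clusters, start=1) if leaf in c)
    remaining = [c - {leaf}
                 for r, c in enumerate(clusters, start=1)
                 if r != parent_rank]
    return remaining, parent_rank


n = 6
t_prime, r_prime = added_leaf_trees(n)
t_rest, t_rank = prune_leaf(t_prime, 1)
r_rest, r_rank = prune_leaf(r_prime, 1)

# Both trees agree once leaf 1 is pruned, and its parent has rank 1 in both.
print(t_rest == r_rest)  # True
print(t_rank, r_rank)    # 1 1
```

For n = 6 the lower bound of Lemma 6 for the trees without the added leaf is (n − 1)/2 = 2.5, while the trees with the added leaf differ only in where $l_1$ is attached.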
Theorem 12 Let $T = (T^u, \mathrm{rank}_T)$ and $R = (R^u, \mathrm{rank}_R)$ be trees with HSPR distance $d_{\mathrm{HSPR}}(T, R) = d > 1$, and let $T'$ and $R'$ be trees resulting from adding a leaf $x$ to $T$ and $R$, respectively, so that $d_{\mathrm{HSPR}}(T', R') = 1$.
If $T^u$ and $R^u$ are the unranked versions of $T$ and $R$, respectively, then the rooted (unranked) SPR distance of $T^u$ and $R^u$ is one: $d_{\mathrm{SPR}}(T^u, R^u) = 1$.

Fig. 13 Top: trees $T$ and $R$ with HSPR distance greater than or equal to $\frac{n-1}{2}$ (Lemma 6). Bottom: Adding leaf $l_{n+1}$ to both $T$ and $R$ results in trees $T'$ and $R'$ that are connected by one HSPR move that moves $l_1$

Proof Since $T'$ and $R'$ have HSPR distance one, there is a subtree $T'|_i$ present in $T'$ and $R'$ that is moved between them. It follows directly from the definitions of SPR and HSPR moves that if $T'$ and $R'$ are connected by an HSPR move, their unranked versions $T'^u$ and $R'^u$ are connected by an SPR move, too, and the subtree moving between $T'^u$ and $R'^u$ is the unranked version $T'|^u_i$ of $T'|_i$. To show $d_{\mathrm{SPR}}(T^u, R^u) = 1$, we compare the distance of $T^u$ and $R^u$ with that of $T'^u$ and $R'^u$ and analyse how removing leaf $x$ from $T'^u$ and $R'^u$ to obtain $T^u$ and $R^u$ affects their distance. We know that $T'^u$ and $R'^u$ differ only in the position of subtree $T'|^u_i$. We consider the following cases, depending on the position of the leaf $x$ relative to $T'|^u_i$:

(i) If $T'|^u_i$ has $x$ as its only leaf, then $T^u = R^u$, contradicting the assumptions of the theorem.
(ii) If $x$ is not in $T'|^u_i$, then $T^u$ and $R^u$ differ only by the position of $T'|^u_i$.
(iii) If $x$ is a leaf of $T'|^u_i$ but not its only leaf, then $T^u$ and $R^u$ differ only by the position of $T'|^u_i - x$.

While case (i) is not possible under our assumptions, cases (ii) and (iii) both imply that $T^u$ and $R^u$ are connected by one SPR move. Therefore, we conclude that $d_{\mathrm{SPR}}(T^u, R^u) = 1$.
There have however not been any comparable studies for phylogenetic time trees, even though time trees are inferred from sequence data in many applications and software packages for time tree inference like BEAST2 (Bouckaert et al. 2014) are extremely popular. One approach to considering SPR moves for time trees is that by Song (2006), where SPR moves are allowed to re-attach subtrees at the same height as or closer to the root than their previous attachment. With this paper, we introduce two ranked SPR treespaces, RSPR and HSPR, that are motivated by the variations of SPR moves that are actually used as tree proposals in practice and have some properties in common with rooted (unranked) SPR. We show that all of these treespaces are connected, have neighbourhood sizes quadratic in the number of leaves n, and have a diameter linear in n. These properties have already been proven to be useful for SPR, which provides a good range of different trees for tree proposals (Whidden and Matsen 2015). Adding ranks to SPR provides an even more biologically relevant distance measure, as it ensures that the times of nodes in subtrees cannot change by one HSPR move, which particularly makes sense when using the number of SPR moves as a proxy for the number of reticulation events such as hybridisation, recombination, or horizontal gene transfer. Furthermore, HSPR moves provide a wider range of trees at close distance than, for example, RNNI moves, which is especially useful for tree proposals. These observations demonstrate that studying ranked SPR treespaces may provide insights for better understanding phylogenetic methods for time trees.
We also find some interesting differences between ranked and unranked SPR spaces. One of those is the absence of the (weak and strong) cluster property in ranked SPR. This suggests that the work on Maximum Agreement Forests (MAFs) cannot be transferred to ranked SPR spaces (see Whidden and Matsen (2019) for a discussion on MAF-like problems). Since MAFs are essential in the proofs of NP-hardness for classic SPR, a different strategy is needed to prove the complexity of computing distances in RSPR and HSPR. Note that it is currently not known whether this problem is NP-hard in the ranked SPR treespaces.
We did however obtain some results on properties of shortest paths in HSPR and RSPR. The order of moves on shortest paths in RSPR can be changed so that first rank moves and then HSPR moves are performed, which suggests that the complexity of computing distances in the two spaces is the same. Furthermore, we found that there is a shortest path on which the ranks of HSPR moves increase monotonically. Leveraging these results in ranked SPR spaces might lead to developing an algorithm for computing or approximating distances in these treespaces. Another characteristic of ranked SPR spaces that makes them stand out from known tree rearrangement based treespaces is that adding one leaf to two trees can decrease their distance linearly in n. This is very interesting behaviour, as it has not been observed before in any of the known tree rearrangement based treespaces, and seems to be related to the presence of rogue leaves. It is worthwhile investigating the influence of this effect on tree inference algorithms, especially when considering algorithms that aim to add new sequence data to existing phylogenies (online algorithms).
This paper provides the definition and first analysis of spaces of ranked trees to enable studying time tree inference methods in the same way as untimed tree inference methods. One important open question is that of the complexity of computing distances in RSPR and HSPR treespace. A first step could be to establish that the complexity of computing distances is the same for these two ranked treespaces. Furthermore, the exact diameter of HSPR and RSPR space is still to be determined. We can use our computations (Collienne 2023) to show that the diameter of HSPR follows the formula $\frac{3}{2}(n - 2)$ for $n \leq 7$, but whether this is true for any n remains an open question. Another very important next step for research on ranked SPR is developing algorithms, ideally fixed-parameter tractable ones, to calculate or approximate distances. This would facilitate leveraging our newly defined treespaces to allow analysing BEAST2 (Bouckaert et al. 2014) output as has been done for MrBayes output with unranked SPR (Whidden and Matsen 2015, 2017).

Fig. 2 A rooted tree $T^u$ on the left and a ranked tree $T = (T^u, \mathrm{rank})$ on the right

Fig. 5 Path between T and T 2 as described in the proof of Theorem 4. Only subtrees involved in the moves given in the theorem are shown in this illustration

Fig. 6 Trees T and R of the counterexample to the weak cluster property in RSPR and HSPR in Theorem 7. The labels of the arrows indicate leaves that are pruned in the corresponding HSPR moves

by an HSPR move at rank k, the subtree T | i d−1 needs to become sibling of T | i d by an HSPR move at rank k between T d−2 and T d−1. The edge connecting i 1 , which is the root of T | i d and only if (C ∪ D) ⊂ C m for all m ≥ k + 1. Therefore, all clusters induced by nodes with rank greater than k + 1 are the same in T and T : C m = C m for all m = k + 2, . . ., n − 1. We can now perform an HSPR move at rank k + 1 on T that moves the subtree T | A to become sibling of T | B, which results in a tree R with cluster representation: R = [C 1 , . . ., C k−1 , C ∪ D, A ∪ B, Ĉk+2 , . . ., Ĉn−2 ]

Fig. 9 Trees T, T′, and R on p, and alternative path p′ from T to R via T″ at the bottom if T and T′ are connected by an HSPR move at rank k and the nodes of rank k and k + 1 in T are connected by an edge, as explained in case 2.1. The dotted parts of the trees might contain further nodes and leaves

Fig. 11 Trees T, T′, and R on p, and alternative path p′ from T to R via T″ at the bottom if T and T′ are connected by an HSPR move at rank k + 1 and the nodes of rank k and k + 1 in T are connected by an edge. The dotted parts of the trees might contain further nodes and leaves

3.1. Let there be an edge connecting nodes k + 1 and k in T. Remember that by our assumptions on T, parent T (T | A ) = parent T (T | B ) = k. Therefore, an HSPR move at rank k + 1 on T that moves the subtree T | C and creates a tree T containing an edge (k + 1, k) must move T | C to become sibling of T | A∪B (see Fig. 11).
m otherwise for all m = k + 2, . . ., n − 1. Because the node of rank k + 1 in T induced the cluster A ∪ B ∪ C, A ⊂ C m if and only if C ⊂ C m for all m = k + 2, . . ., n − 1. Therefore, C m = C m for all m = k + 2, . . ., n − 1. We then perform an HSPR move at rank k on T that moves the subtree T | C to become sibling of T | D, resulting in the following tree R: R = [C 1 , . . ., C k−1 , C ∪ D, A ∪ B, Ĉk+2 , . . ., Ĉn−1 ]

we use that C ⊂ C m if and only if D ⊂ C m, and A ⊂ C m if and only if B ⊂ C m for all m = k + 2, . . ., n − 1, because the clusters induced by nodes of rank k and k + 1 in T are A ∪ B and C ∪ D, respectively. Furthermore, we distinguish whether (A ∪ B) ⊂ C m, (C ∪ D) ⊂ C m, and otherwise for all m = k + 2, . . ., n − 1.

To show that Ĉm = C m for all m = k + 2, . . ., n − 1, we distinguish whether C ⊂ C m or C ⊂ C m. Note that since C ∪ D is the cluster induced by node k in T, it is C ⊂ C m if and only if D ⊂ C m and since D is not removed from any clusters between T and T, also D ⊂ C m if and only if D ⊂ C m for all m = k + 2, . . ., n − 1. If C ⊂ C m and E ⊂ C m, then C m = C m \ C and since D ⊂ C m, Ĉm = C m ∪ C, resulting in Ĉm = C m. If on the other hand C ⊂ C m and E ⊂ C m, then C m = C m ∪ C and with D ⊂ C m, Ĉm = C m ∪ C, resulting in Ĉm = C m. If C ⊂ C m and E ⊂ C m, then C m = C m ∪ C and since D ⊂ C m, Ĉm = C m \ C, resulting in Ĉm = C m. If on the other side C ⊂ C m and E ⊂ C m, then C m = C m and Ĉm = C m, resulting in Ĉm = C m.

Fig. 12 Trees T, T′, and R on p at the top, and alternative path p′ from T to R via T″ at the bottom if T and T′ are connected by an HSPR move at rank k and the nodes of rank k and k + 1 in T are not connected by an edge. The dotted parts of the trees might contain further nodes and leaves

to d HSPR (T , R ) = 1, which means that their distance decreases by at least n

d − 1, and with (i + 1 , i 1 ) being replaced by (i + 1 , i d ), T | d in R is attached where T | 1 is in T. We could describe this permutation of the subtrees by (T | 1 , T | 2 , . . ., T | d ), using the cycle notation for permutations.

j for m = 1, we rename the subtrees T | (( j+d−m+1) mod d)+1 to T | j for all j = 1, . . ., d. By the assumptions of the lemma, all edges (i j and only if B ⊂ C m and also B ⊂ C m if and only if B ⊂ C m.