1 Introduction

The following optimization problem can be studied in any metric space. Given a finite number of points, sometimes called sites, find a point which minimizes the sum of the distances to the sites. Such a point is called a Fermat–Weber point, and this is some version of a geometric median of the sites, which is known to be robust in a certain sense [7, §21]. Computing Fermat–Weber points is a rich topic with a remarkable history; see [7, Chapter II]. Here we consider a specific distance function, \(d_\triangle \), which occurs in tropical geometry [2, 32]. This function is asymmetric, i.e., \(d_\triangle (a,b)\) may differ from \(d_\triangle (b,a)\). So we call \(d_\triangle \) the asymmetric tropical distance, to differentiate from the symmetric tropical distance, which is more common [20, §5.3]. The symmetric tropical Fermat–Weber problem was investigated by Lin et al. [28] and Lin and Yoshida [29].

As our main result we prove that one (asymmetric tropical) Fermat–Weber point can be computed by solving a transportation problem. The transportation problem is an optimization classic, with numerous applications, both in theory and practice. For an overview we refer to Schrijver’s monograph [40, §21.6]; see also the survey by De Loera and Kim for the polyhedral geometry point of view [14]. Efficient methods for solving transportation problems include algorithms by Tokuyama and Nakano [42], Kleinschmidt and Schannath [22], and Brenner [8]. In general, Fermat–Weber points are not unique, so one part of the present work is devoted to understanding the entire (asymmetric tropical) Fermat–Weber set for a given set of sites. This is tightly related to the study of tropical hyperplane arrangements and tropical convex hulls, which are at the core of tropical combinatorics [20]. The latter subject is concerned with the rich interplay between tropical geometry and optimization.

One motivation for studying the Fermat–Weber problem in the setting of tropical geometry comes from phylogenetics [28, 29]. In that field, a part of computational biology, the goal is to associate trees to input data. Typical examples are trees encoding ancestral relations among species, where the data originates from strands of DNA. In tropical geometry, spaces of metric trees with n labeled leaves occur naturally as the tropical Grassmannians \({{\,\textrm{TGr}\,}}(2,n)\); see [31, §4.3] and [20, §10.6]. In phylogenetics many different methods are known to construct trees from a fixed data set. Since those methods usually do not come up with the same tree, there is a need to find the common ground. This gives rise to some consensus tree, which describes where the several methods agree. Finding consensus trees is a topic of its own [9], and the authors of [28] argue that “tropical convexity and tropical linear algebra \(\ldots \) behave better” than other methods. Here we show that passing from the symmetric tropical distance function to its asymmetric sibling leads to a new method for computing metric consensus trees which is even better behaved. This is because the asymmetric tropical Fermat–Weber sets are nicer geometrically. In particular, we show that our approach leads to a consensus method which is regular in the sense of [10]. Such a procedure is not known for symmetric tropical distances. For the purpose of finding a consensus method in tree space, there is no immediate disadvantage of employing an asymmetric distance rather than a symmetric one. In fact, asymmetric distances are common in location theory [34].

Our paper is organized as follows. We start out with a brief summary of facts from tropical combinatorics which are relevant for the Fermat–Weber problem. Then we prove that the Fermat–Weber set arises as a cell in the covector decomposition of the tropical torus induced by the sites. That covector cell is then characterized in several ways. The first approach employs regular subdivisions of products of simplices; see [15, §6.2]. That leads to a linear programming formulation of the Fermat–Weber problem, and the dual linear program is a transportation problem. The latter then provides efficient algorithms. The final section is devoted to computing Fermat–Weber sets in spaces of equidistant trees. In addition to theoretical results we report on computational experiments with polymake [19] and mcf [30].

Related work. As an early contribution of tropical geometry to data science, Gärtner and Jaggi [18] developed a concept for “tropical support vector machines”, with applications to classification in mind. A different train of thought was developed by Pachter and Sturmfels [35, §2.4], who related tropical geometry to phylogenetic trees. Later, Lin et al. [28] connected these ideas to the geometry of tree spaces studied by Billera et al. [5]. Yoshida et al. [43] proposed a method to analyze data, which they call “tropical principal component analysis”. In a way, the latter may be viewed as a synthesis of the above. A key contribution here is work of Ardila and Klivans, who saw that spaces of equidistant trees form the Bergman fans of the graphic matroids of complete graphs [3]. The term “tropical convexity” was coined by Develin and Sturmfels [16] to connect tropical geometry with the older topic of \((\max ,+)\)-linear algebra [11].

2 Tropical convexity

The purpose of this section is to set our notation and to collect a few facts which are key to our methods; for the details we refer to [20]. We consider the tropical semiring \({\mathbb T}^{\max }=({\mathbb R}\cup \{-\infty \},\oplus ,\odot )\) with \(\oplus =\max \) as the tropical addition and \(\odot =+\) as the tropical multiplication. The additive neutral element is \(-\infty \), and 0 is neutral with respect to the multiplication. Usually, we abbreviate \({\mathbb T}={\mathbb T}^{\max }\). The set \({\mathbb T}^n\) inherits the structure of a semimodule by componentwise tropical addition and tropical scalar multiplication.

Fig. 1

Five sites in \({\mathbb R}^{3}/{\mathbb R}\mathbbm {1}\), their \(\max \)-tropical convex hull, the induced \(\min \)-tropical hyperplane arrangement, and the unique Fermat–Weber point (marked white)

A tropical cone in \({\mathbb T}^n\) is a nonempty subset C which contains each tropical linear combination \(\lambda {\odot } x \oplus \mu {\odot } y\) for \(\lambda ,\mu \in {\mathbb T}\) and \(x,y\in C\). Each tropical cone contains the point \(-\infty \mathbbm {1}\) and the entire set \({\mathbb R}\mathbbm {1}\), where \(\mathbbm {1}\) is the all-ones vector. Therefore it is convenient to pass to the quotient \({{\mathbb T}{\mathbb P}}^{n-1}:=({\mathbb T}^n\setminus \{-\infty \mathbbm {1}\})/{\mathbb R}\mathbbm {1}\), which is called the tropical projective space. A subset of \({{\mathbb T}{\mathbb P}}^{n-1}\) is tropically convex if it arises as the image of a tropical cone under the canonical projection. A tropical polytope is a finitely generated tropically convex set. The tropical projective torus \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\) is the subset of points in \({{\mathbb T}{\mathbb P}}^{n-1}\) with finite coordinates. We say that a tropical polytope is bounded if it lies in \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\).

Tropical convexity is intimately related to ordinary convexity, polyhedral geometry and (linear) optimization. For instance, tropical polytopes arise as the images of ordinary convex polytopes over ordered fields of real Puiseux series under the valuation map; see [20, Observation 5.10]. Yet, here the following less algebraic description is more relevant.

We consider an arbitrary \(m{\times }n\)-matrix \(V=(v_{ij})\) with real coefficients. Each row \(v_i=(v_{i1},\dots ,v_{in})\) is a point in \({\mathbb R}^n\) (or \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\), if we ignore the tropical scaling). So V may be viewed as a configuration of m labeled points in \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\). Technically, it is convenient to assume that each ordinary row sum equals zero, i.e., each row lies in the set

$$\begin{aligned} {\mathcal H}\ = \ \left\{ x\in {\mathbb R}^n \mid x_1+x_2+\dots +x_n=0\right\} \hspace{5.0pt}. \end{aligned}$$

Observe that each point in \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\) has a unique representative in \({\mathbb R}^n\) which lies in \({\mathcal H}\). So we can identify the tropical projective torus \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\) with the ordinary linear hyperplane \({\mathcal H}\) in \({\mathbb R}^n\). This also works topologically, since the quotient vector space \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\) is homeomorphic to \({\mathbb R}^{n-1}\) (and thus to \({\mathcal H}\), considered as a subset of \({\mathbb R}^n\)). The tropical projective space \({{\mathbb T}{\mathbb P}}^{n-1}\) is a compactification of \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\), where the boundary comprises those points which have at least one infinite coordinate.

Adding vectors tropically works coefficient-wise, and there is also tropical multiplication by scalars. With these notions, the \(\max \)-tropical convex hull of (the rows of) V is

$$\begin{aligned} {{\,\textrm{tconv}\,}}^{\max }(V) \,= \ \bigl \{ \lambda _1{\odot } v_1 \oplus \dots \oplus \lambda _m{\odot }v_m \bigm | \lambda _i\in {\mathbb R}\bigr \} + {\mathbb R}\mathbbm {1} \hspace{5.0pt}, \end{aligned}$$

which is a subset of \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\). The rows of the matrix V also define an arrangement of m tropical hyperplanes in \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\), with respect to \(\min \) as the tropical addition, and we denote this \({\mathcal T}(V)\). In this context each row arises as the apex of a \(\min \)-tropical hyperplane. Everything that we explained above also works for the \(\min \)-tropical semiring \({\mathbb T}^{\min }=({\mathbb R}\cup \{\infty \},\min ,+)\), which is isomorphic to \({\mathbb T}^{\max }\) as a semiring via \(x\mapsto -x\). Observe that this involution leaves the hyperplane \({\mathcal H}\) invariant.

The \(\min \)-tropical hyperplane arrangement with the rows of V as their apices induces an ordinary polyhedral subdivision, \({{\,\textrm{CovDec}\,}}(V)\), of \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\) (or, equivalently, \({\mathcal H}\)), called the covector subdivision induced by (the rows of) V. Its cells, which are called covector cells, are convex in three different senses: they are \(\max \)-tropically convex, \(\min \)-tropically convex and convex in the ordinary sense (as subsets of \({\mathcal H}\)). Such polyhedra are known as polytropes, and they may be bounded or unbounded. As ordinary polyhedra, the polytropes are characterized by the property that their facet normal directions are \(e_i-e_j\) for \(i,j\in [n]\) distinct. The union of the bounded covector cells equals the tropical convex hull \({{\,\textrm{tconv}\,}}^{\max }(V)\).

Fig. 2

Mixed subdivision \({\mathcal S}(V)\) of \(5\cdot \varDelta _2\) for V as in Example 1. The 21 lattice points of \(5\cdot \varDelta _2\) are marked with their coordinates. The \(\min \)-tropical hyperplane arrangement \({\mathcal T}(V)\) admits a piecewise-linear embedding

Example 1

We illustrate the various concepts from tropical convexity for the matrix \(V\in {\mathbb R}^{5\times 3}\) whose transpose reads

$$\begin{aligned} V^\top \ = \ \begin{pmatrix} 14 &{} 13 &{} 11 &{} 10 &{} 3 \\ -7 &{} -14 &{} -13 &{} 1 &{} -3 \\ -7 &{} 1 &{} 2 &{} -11 &{} 0 \\ \end{pmatrix} \hspace{5.0pt}. \end{aligned}$$

The rows of V (equivalently, the columns of \(V^\top \)) encode five points in \({\mathbb R}^{3}/{\mathbb R}\mathbbm {1}\); see Fig. 1. The covector decomposition has 15 regions of maximal dimension 2, and six of them are bounded. The \(\max \)-tropical convex hull \({{\,\textrm{tconv}\,}}^{\max }(V)\) consists of the union of bounded cells, which are shaded gray; it also contains the green line segment extending from \((3,-3,0)\) to the lower left.

We consider the envelope

$$\begin{aligned} {\mathcal E}(V) \,= \ \bigl \{(t,x)\in {\mathbb R}^m\times {\mathbb R}^n \bigm | t_i+x_j \ge v_{ij}\bigr \} \hspace{5.0pt}, \end{aligned}$$
(1)

which is an unbounded ordinary polyhedron. By [20, Theorem 6.14] the cells of \({{\,\textrm{CovDec}\,}}(V)\) arise as the images of faces of \({\mathcal E}(V)\) under the coordinate projection \((t,x) \mapsto x\). Moreover, \({{\,\textrm{CovDec}\,}}(V)\) is dual to the regular subdivision, \(\varSigma (V)\), of the product of simplices \(\varDelta _{m-1}\times \varDelta _{n-1}={{\,\textrm{conv}\,}}\left\{ (e_i,e_j) \mid i\in [m],\, j\in [n]\right\} \) obtained from lifting \((e_i,e_j)\) to the height \(v_{ij}\). Here we take regular subdivisions induced by upper convex hulls since \(\max \) is our tropical addition. For the same reason the inequality sign “\(\ge \)” is reversed in comparison with the \(\min \)-tropical version in [20, (6.1)]. Via the Cayley trick the subdivision \(\varSigma (V)\) of \(\varDelta _{m-1}\times \varDelta _{n-1}\) corresponds to a mixed subdivision, \({\mathcal S}(V)\), of the dilated simplex \(m\cdot \varDelta _{n-1}\); see [15, §9.2] and [20, §4.5]. For instance, this is convenient for properly visualizing \(\varSigma (V)\), which is a polyhedral complex of dimension \((m-1)(n-1)\). Figure 2 shows the mixed subdivision \({\mathcal S}(V)\) of \(5\cdot \varDelta _2\) for the matrix V from Example 1.

3 Fermat–Weber sets

We examine the Fermat–Weber problem through tropical combinatorics and polyhedral geometry. As our first key observation we show that asymmetric tropical Fermat–Weber sets arise as specific cells in the covector subdivisions induced by the sites.

The asymmetric tropical distance in \({\mathbb R}^n\) is given by

$$\begin{aligned} d_\triangle (a,b) \ = \ \sum _{i\in [n]} (b_i-a_i) - n \min _{i\in [n]}(b_i-a_i) \ = \ \sum _{i\in [n]} (b_i-a_i) + n \max _{i\in [n]}(a_i-b_i) ,\nonumber \\ \end{aligned}$$
(2)

where \(a, b\in {\mathbb R}^n\). Since \(d_\triangle (a',b')=d_\triangle (a,b)\) for \(a-a'\in {\mathbb R}\mathbbm {1}\) and \(b-b'\in {\mathbb R}\mathbbm {1}\), this induces a directed distance function in the \((n{-}1)\)-dimensional quotient vector space \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\). We do not distinguish between the function \(d_\triangle :{\mathbb R}^n\times {\mathbb R}^n\rightarrow {\mathbb R}_ {\ge 0}\) and the induced function on \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\). The asymmetric tropical distance is a “polyhedral norm” with respect to the standard simplex \(\varDelta _{n-1}:={{\,\textrm{conv}\,}}\{e_1,\dots ,e_n\}+{\mathbb R}\mathbbm {1}\) in \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\); see [7, §20]. This may also be seen as a rescaled version of the “tropical Funk metric” discussed in [1, §3.3]. More common in tropical geometry is the symmetric tropical distance between \(a,b\in {\mathbb R}^n\) (or \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\)). It is defined as

$$\begin{aligned} {{\,\textrm{dist}\,}}(a,b) \ = \ \max _{i\in [n]} \left( a_i-b_i\right) - \min _{j\in [n]} \left( a_j-b_j\right) \ = \ \max _{i,j\in [n]}(a_i-b_i-a_j+b_j) ; \end{aligned}$$

see [20, §5.3]. We have \({{\,\textrm{dist}\,}}(a,b) = \tfrac{1}{n}(d_{\triangle }(a,b) + d_\triangle (b,a))\). Throughout this section we fix a finite set of sites \(V=\{v_1,v_2,\dots ,v_m\}\) in \({\mathcal H}\), which we identify with \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\).
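The two distance functions and the identity relating them can be checked numerically. The following is a minimal sketch in plain Python; the function names `d_asym` and `dist_sym` are illustrative choices and do not come from the paper, and the two test points are sites from Example 1.

```python
def d_asym(a, b):
    """Asymmetric tropical distance (2): sum(b - a) - n * min(b - a)."""
    diff = [bi - ai for ai, bi in zip(a, b)]
    return sum(diff) - len(diff) * min(diff)


def dist_sym(a, b):
    """Symmetric tropical distance: max(a - b) - min(a - b)."""
    diff = [ai - bi for ai, bi in zip(a, b)]
    return max(diff) - min(diff)


# Two of the sites from Example 1 (both have coordinate sum zero).
a = (14, -7, -7)
b = (3, -3, 0)
n = len(a)

forward = d_asym(a, b)    # distance from a to b
backward = d_asym(b, a)   # distance from b to a; differs in general
symmetric = dist_sym(a, b)

# Shifting either argument by a multiple of the all-ones vector
# should not change the distance.
a_shifted = tuple(ai + 5 for ai in a)
b_shifted = tuple(bi - 2 for bi in b)
shifted = d_asym(a_shifted, b_shifted)

print(forward, backward, symmetric, shifted)
```

In this instance the two directed distances differ, while their sum divided by n reproduces the symmetric distance.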

Definition 2

An (asymmetric tropical) Fermat–Weber point with respect to V is a point in \({\mathcal H}\) which minimizes the sum of the asymmetric tropical distances from these sites.

In general such a point is not unique. Hence, we let \({{\,\textrm{FW}\,}}(V)\) denote the set of all asymmetric tropical Fermat–Weber points and call it the (asymmetric tropical) Fermat–Weber set with respect to V. This is the asymmetric analog of the symmetric tropical Fermat–Weber set studied in [29].

Fixing the site \(v_i\in V\), the distance function from \(v_i\), which reads

$$\begin{aligned} d_\triangle (v_i,x) \ = \ n \max _{j\in [n]} (v_{ij}-x_j) \quad \text {for } x\in {\mathcal H}\,, \end{aligned}$$
(3)

is convex in the ordinary sense and piecewise linear. Its regions of linearity are precisely the n closed sectors of the \(\min \)-tropical hyperplane with apex \(v_i\); see [20, §5.5]. Consequently, the common subdivision of the regions of linearity of the sites is exactly the covector decomposition \({{\,\textrm{CovDec}\,}}(V)\). Our first main theorem shows that the Fermat–Weber set \({{\,\textrm{FW}\,}}(V)\) is a bounded cell of that subdivision.
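For completeness, here is a short derivation of (3) from (2); it only uses that both \(v_i\) and \(x\) lie in \({\mathcal H}\), so that their coordinate sums vanish:

$$\begin{aligned} d_\triangle (v_i,x) \ &= \ \sum _{j\in [n]} (x_j-v_{ij}) + n \max _{j\in [n]}(v_{ij}-x_j) \\ &= \ \Bigl (\sum _{j\in [n]} x_j\Bigr ) - \Bigl (\sum _{j\in [n]} v_{ij}\Bigr ) + n \max _{j\in [n]}(v_{ij}-x_j) \ = \ n \max _{j\in [n]}(v_{ij}-x_j) . \end{aligned}$$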

For the sake of a precise formulation of that result, we pass to the regular subdivision, \(\varSigma (V)\), of \(\varDelta _{m-1}\times \varDelta _{n-1}\) which is dual to the covector subdivision \({{\,\textrm{CovDec}\,}}(V)\). The relatively open cells of \(\varSigma (V)\) partition the ordinary polytope \(\varDelta _{m-1}\times \varDelta _{n-1}\). The point \(\left( \tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1}\right) \) is the vertex barycenter of \(\varDelta _{m-1}\times \varDelta _{n-1}\). So there is a unique cell, \(C_V\), which contains the point \(\left( \tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1}\right) \) in its relative interior. This will play an important role in our study of Fermat–Weber points.

Definition 3

We call the unique cell of \(\varSigma (V)\) which contains the point \(\left( \tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1}\right) \) in its relative interior the central cell of \(\varSigma (V)\), and denote it by \(C_V\). Its dual in \({{\,\textrm{CovDec}\,}}(V)\) will be called the central covector cell.

Theorem 4

The Fermat–Weber set \({{\,\textrm{FW}\,}}(V)\) agrees with the central covector cell in \({{\,\textrm{CovDec}\,}}(V)\). In particular, \({{\,\textrm{FW}\,}}(V)\) is a bounded polytrope in \({\mathcal H}\), and it is contained in the tropical polytope \({{\,\textrm{tconv}\,}}^{\max }(V)\).

Proof

Consider the linear program

$$\begin{aligned} \begin{array}{ll@{}ll} \text {minimize} &{} \displaystyle n \cdot (t_{1} + \dots + t_{m}) &{} \\ \text {subject to}&{} \displaystyle v_{ij} - x_j \ \le \ t_i , &{} \quad \text {for } i\in [m] \text { and } j\in [n] \\ &{} \displaystyle x_{1} + \dots + x_{n} \ = \ 0 &{} \end{array} \end{aligned}$$
(4)

with \(mn+1\) constraints in the \(m+n\) variables \(t_1,t_2,\dots ,t_m,x_1,x_2,\dots , x_n\). The coefficients \(v_{ij}\) are the coordinates of the sites. The constant factor n in the objective function is not relevant here, but it does make the dual linear program (5) studied below look more natural. If \((t^*,x^*)\) is an optimal solution of (4), then \(x^*\in {{\,\textrm{FW}\,}}(V)\), and \(t_i^*=\tfrac{1}{n}d_\triangle (v_i,x^*)\). Conversely, each Fermat–Weber point arises in this way. The constraints \(v_{ij} \le t_i + x_j\) describe the \(\max \)-tropical version of the envelope \({\mathcal E}(V)\); see [20, §6.1]. In that reference that inequality would look like “\(v_{ij} \le -t_i + x_j\)”. Yet the linear substitution \(t_i\mapsto -t_i\) is natural here, because (4) is a minimization problem; see also [20, Remark 6.28]. The cells of \({{\,\textrm{CovDec}\,}}(V)\) are precisely the projections of the faces of the unbounded ordinary polyhedron \({\mathcal E}(V)\subset {\mathbb R}^m\times {\mathbb R}^n\) onto the x-coordinates; see [20, Proposition 6.11]. Let F be the optimal face of the linear program (4). The set \({{\,\textrm{FW}\,}}(V)\) is the projection of F.

The face F is obtained by minimizing \(t_{1} + \dots + t_{m}\), or equivalently \(t_{1} + \dots + t_{m} + x_1 + \dots + x_n\), as the points x are restricted to the hyperplane \({\mathcal H}\). Let D be the cell in the regular subdivision \(\varSigma (V)\) of \(\varDelta _{m-1}\times \varDelta _{n-1}\) which is dual to F. Then D contains the vector \((\tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1})\in {\mathbb R}^m\times {\mathbb R}^n\) in its relative interior. In other words, \({{\,\textrm{FW}\,}}(V)\) is dual to the central cell \(C_V\).

A sublevel set of a function \(f:{\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\rightarrow {\mathbb R}\) is a set of the form \(\left\{ x\in {\mathbb R}^{n}/{\mathbb R}\mathbbm {1} \mid f(x)\le \alpha \right\} \) for some \(\alpha \in {\mathbb R}\). If all of its sublevel sets are bounded, then the set of minima is bounded. We use this property to show that \({{\,\textrm{FW}\,}}(V)\) is bounded.

The sublevel sets of \(d_{\triangle }(v_i,\cdot )\) are simplices if non-empty; in particular, they are bounded. Consequently, the function \(x\mapsto \sum _{i\in [m]}d_{\triangle }(v_i,x)\) has bounded sublevel sets. The latter implies that \({{\,\textrm{FW}\,}}(V)\) is bounded, as it is the set of minimizers of the aforementioned function. The \(\max \)-tropical convex hull of the rows of V equals the union of the bounded covector cells in \({{\,\textrm{CovDec}\,}}(V)\); see [20, Corollary 6.17]. Hence \({{\,\textrm{FW}\,}}(V)\), being a bounded covector cell, is contained in \({{\,\textrm{tconv}\,}}^{\max }(V)\).

Example 5

For the matrix V from Example 1 the unique optimal solution of the primal linear program (4) reads

$$\begin{aligned} t^*=(5,4,5,7,3) \quad \text {and} \quad x^*=(9,-6,-3) , \end{aligned}$$

with optimal value \(3 \cdot 24=72\). We have \({{\,\textrm{FW}\,}}(V)=\{x^*\}\); see Fig. 1. As \(\gcd (5,3)=1\), the uniqueness is implied by Theorem 7 below. The point \(x^*\), which is a pseudovertex of \({{\,\textrm{CovDec}\,}}(V)\), is dual to the central cell \(C_V={{\,\textrm{conv}\,}}\{113,122,212,221\}\); see Fig. 2.
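The values in this example can be reproduced without an LP solver. The following sketch, in plain Python, evaluates the objective of (4) via formula (3) and scans an integer grid in \({\mathcal H}\); the grid radius `R = 30` and all identifiers are ad hoc assumptions for illustration. A grid scan cannot certify optimality on its own, so this is only a plausibility check.

```python
# Sites from Example 1, one row per site.
V = [(14, -7, -7), (13, -14, 1), (11, -13, 2), (10, 1, -11), (3, -3, 0)]
n = 3


def objective(x):
    """Sum of asymmetric tropical distances from the sites to x, using (3)."""
    return n * sum(max(v[j] - x[j] for j in range(n)) for v in V)


x_star = (9, -6, -3)
# t_i = max_j (v_ij - x_j) is the smallest t_i feasible for (4) at x_star.
t_star = [max(v[j] - x_star[j] for j in range(n)) for v in V]

# Scan integer points (x1, x2, -x1 - x2) of the hyperplane H.
R = 30  # assumed search radius, chosen generously around the sites
grid_values = {
    (x1, x2, -x1 - x2): objective((x1, x2, -x1 - x2))
    for x1 in range(-R, R + 1)
    for x2 in range(-R, R + 1)
}
best = min(grid_values.values())
minimizers = [x for x, value in grid_values.items() if value == best]

print(t_star, objective(x_star), best, minimizers)
```

On this grid the minimum is attained only at \(x^*\), in line with the uniqueness asserted in the example.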

Remark 6

Theorem 4 reveals a similarity to the Euclidean case: by [7, Proposition 19.1] any Fermat–Weber point with respect to the Euclidean distance is contained in the convex hull of the sites. The analogous result to Theorem 4 for the symmetric tropical distance is [28, Proposition 26]. As shown in [28, Example 27], in general, the symmetric tropical Fermat–Weber set is not contained in the tropical convex hull.

Via the Cayley trick, the central cell \(C_V\) in \(\varSigma (V)\) corresponds to the central covector cell in \({{\,\textrm{CovDec}\,}}(V)\), which is \({{\,\textrm{FW}\,}}(V)\). The dimension of the latter is the codimension of the former.

Theorem 7

The dimension of \({{\,\textrm{FW}\,}}(V)\) is at most \(\gcd (m,n)-1\). In particular, if m and n are relatively prime, the Fermat–Weber point is unique.

Proof

We consider the regular subdivision, \(\varSigma (V)\), of \(\varDelta _{m-1}\times \varDelta _{n-1}\) induced by the matrix V. Let \(C_V\) be the central cell. By [15, §6.2] the vertices of \(C_V\) are in bijection with the edges in a subgraph \(G(C_V)\) of the complete bipartite graph \(K_{m,n}\) with \(c:={{\,\textrm{codim}\,}}(C_V)+1\) connected components; see also [20, §4.7]. We will show that \(c\le \gcd (m,n)\). Here we may assume that V is generic, whence \(C_V\) is a simplex; note that any refinement of \(\varSigma (V)\) can only increase the codimension of the cell containing a specific point in its relative interior. If \(c=1\), there is nothing to prove. Hence, we assume that \(c\ge 2\).

We consider a maximal simplex, U, of \(\varSigma (V)\) which contains \(C_V\), and let T be the subtree of \(K_{m,n}\) corresponding to U. A facet, F, of U corresponds to the removal of some edge e of T, yielding disjoint unions \([m]=I'\sqcup I''\) and \([n]=J'\sqcup J''\) such that the connected components of \(T{\setminus } e\) are the restrictions on \(I'\sqcup J'\) and \(I''\sqcup J''\) and e is incident to a vertex in \(I''\) and one in \(J'\). The hyperplane \(\sum _{i'\in I'}x_{i'}+\sum _{j''\in J''}y_{j''}=1\) contains the facet F. If \(I'\) were empty, then \(\sum _{j'\in J'}y_{j'}=0\) for every \((x,y)\in C_V\). But \(C_V\) contains the point \(\left( \tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1}\right) \), so \(J'\) must be empty as well. This contradicts the fact that \(J'\) contains a vertex incident to e. Hence, \(I'\ne \emptyset \) and, similarly, \(J''\ne \emptyset \).

Consequently, there exist two proper partitions \([m]=I_1\sqcup I_2\sqcup \dots \sqcup I_c\) and \([n]=J_1\sqcup J_2\sqcup \dots \sqcup J_c\) such that the restriction of \(G(C_V)\) on \(I_i\sqcup J_i\) is a tree, and there exist edges \(e_2,\dots ,e_c\) in \(K_{m,n}\) between a vertex of \(I_{i}\) and \(J_{i-1}\) such that adding these edges we obtain the tree T.

A supporting hyperplane for the facet corresponding to \(T\setminus \{e_i\}\) is given by the equation

$$\begin{aligned}\sum _{k\in I_1\sqcup \dots \sqcup I_{i-1}}x_k+\sum _{\ell \in J_i\sqcup \dots \sqcup J_c}y_\ell \ = \ 1 ,\end{aligned}$$

where \(2\le i\le c\). Now the point \((\tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1})\) is contained in these hyperplanes, yielding the \(c-1\) equalities

$$\begin{aligned}\tfrac{1}{m}\sum _{k<i}|I_k|+\tfrac{1}{n}\sum _{\ell \ge i}|J_\ell | \ = \ 1 , \qquad \text {for } 2\le i\le c \hspace{5.0pt}.\end{aligned}$$

Multiplying with the least common multiple of m and n we obtain

$$\begin{aligned}\tfrac{n}{\gcd (m,n)}\sum _{k<i}|I_k|+\tfrac{m}{\gcd (m,n)}\sum _{\ell \ge i}|J_\ell | \ = \ {{\,\textrm{lcm}\,}}(m,n) , \qquad \text {for } 2\le i\le c .\end{aligned}$$

Putting those relations together with \(\sum _{k\in [c]}|I_k|=m\) and \(\sum _{\ell \in [c]}|J_\ell |=n\), we conclude that \(|I_k|\) is a multiple of \({m}/{\gcd (m,n)}\) for every \(k\in [c]\) and \(|J_\ell |\) is a multiple of \({n}/{\gcd (m,n)}\) for all \(\ell \in [c]\). Furthermore, \(I_k\ne \emptyset \) for every \(k\in [c]\), and so \(|I_k|\ge {m}/{\gcd (m,n)}\) for all \(k\in [c]\). Hence

$$\begin{aligned} m \ = \ \sum _{k\in [c]}|I_k| \ \ge \ c\cdot \tfrac{m}{\gcd (m,n)} ,\end{aligned}$$

which implies \(c\le \gcd (m,n)\) as claimed.
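The smallest case \(n=2\) of Theorem 7 can be explored by hand. Writing a site as \((a,-a)\) and a candidate point as \((s,-s)\), formula (3) gives \(d_\triangle = 2\max (a-s,\,s-a) = 2|a-s|\), so the problem becomes the classical median problem on the line. The sketch below, in plain Python with made-up sample data, scans integer candidates only.

```python
def fw_objective_n2(points, s):
    """Objective for n = 2: sites (a, -a), candidate (s, -s), using (3)."""
    return sum(2 * max(a - s, s - a) for a in points)


def integer_minimizers(points, candidates):
    """Return the minimal objective value and all candidates attaining it."""
    values = {s: fw_objective_n2(points, s) for s in candidates}
    best = min(values.values())
    return best, [s for s in candidates if values[s] == best]


candidates = range(0, 14)

# m = 3 sites: gcd(3, 2) = 1, so the bound of Theorem 7 is 0.
odd_best, odd_min = integer_minimizers([1, 4, 10], candidates)

# m = 4 sites: gcd(4, 2) = 2, so the bound of Theorem 7 is 1.
even_best, even_min = integer_minimizers([1, 4, 10, 12], candidates)

print(odd_best, odd_min)
print(even_best, even_min)
```

For three sites the scan returns a single minimizer, while for four sites every integer between the two middle sites is optimal.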

Restricting to the one-dimensional case (i.e., \(n=2\)), we recover the known fact that an odd number of points have a unique median, while the median can be selected from an interval for an even number of points. The following example shows that the inequality in Theorem 7 is tight for all m and n; see also Example 1.

Example 8

Our example employs the matrix \(V=(v_{ij})\in {\mathbb R}^{m\times n}\) with \(v_{ij}=(i-1)(j-1)\). The rows are points on the tropical moment curve, and their (\(\max \)-)tropical convex hull is a tropical cyclic polytope; see [6, §4] and [20, Example 5.18]. The dual triangulation \(\varSigma (V)\) of \(\varDelta _{m-1}\times \varDelta _{n-1}\) is the staircase triangulation; see [15, §6.2.3]. We explain the construction.

Fig. 3

Staircase in a \(6\times 9\) grid

In this case, we represent the simplices as sets of points in an \(m{\times }n\) grid instead of subgraphs of \(K_{m,n}\). The maximal simplices of the staircase triangulation correspond to the paths in an \(m{\times }n\) grid starting at (1, 1) and ending at \((m,n)\) obtained by going right or down. Figure 3 portrays such a path. Due to its shape, it is called a staircase. We will show that in the staircase triangulation the point \(\left( \tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1}\right) \) lies in the relative interior of a cell of codimension \(\gcd (m,n)-1\). For improved readability, we abbreviate \(d=\gcd (m,n)\), \(a={m}/{d}\), and \(b={n}/{d}\).

If \(d=1\), then there is a unique maximal simplex in the staircase triangulation containing \((\tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1})\) in its interior. To find the precise staircase, we use the Northwest Corner Rule [12, §8.3.1] in the standard transportation array with marginal column \(\tfrac{1}{m}\mathbbm {1}\) and marginal row \(\tfrac{1}{n}\mathbbm {1}\). The visited cells form the staircase, which we call the central staircase. Note that this method provides also barycentric coordinates for \(\left( \tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1}\right) \) in this simplex.

When \(d\ge 2\), consider the partitions into d subsets \([m]=I_1\sqcup \dots \sqcup I_d\) and \([n]=J_1\sqcup \dots \sqcup J_d\) where \(I_i=\{(i-1)a+1,(i-1)a+2,\dots ,ia\}\) and \(J_i=\{(i-1)b+1,(i-1)b+2,\dots ,ib\}\). Consider on \(I_i\times J_i\) the central staircase on an \(a\times b\) grid and add the points \((ia,ib+1)\) for \(i=1,\dots ,d-1\): in Fig. 3 the blocks on \(I_i\times J_i\) correspond to the gray areas whereas the added points are those in the white area. This staircase corresponds to a maximal simplex U in the staircase triangulation, which contains \((\tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1})\).

The removal of the grid point \((ia,ib+1)\) yields the facet defined by

$$\begin{aligned} \sum _{k\le ia}x_k+\sum _{\ell >ib}y_\ell \ = \ 1 . \end{aligned}$$

Using \(d={m}/{a}={n}/{b}\), we see that \(\left( \tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1}\right) \) is contained in this facet. In total, there are \(d-1\) facets of this form.

Each remaining facet of U corresponds to the removal of a grid point from some block \(I_k\times J_k\). In particular, each facet induces a partition \([m]=I'\sqcup I''\) such that \(I_1\sqcup \dots \sqcup I_{k-1}\subset I'\), \(I_{k+1}\sqcup \dots \sqcup I_{d}\subset I''\) for some \(k\in [d]\); similarly, there is a partition \(J'\sqcup J''\) on [n]. Moreover, at least one of the intersections \(I_k\cap I'\cap I''\) and \(J_k\cap J'\cap J''\) is nonempty, which implies that at least one of \(|I'|\) and \(|J''|\) is not a multiple of d. As in the proof of Theorem 7, the facet defining equation

$$\begin{aligned} \sum _{i'\in I'}x_{i'}+\sum _{j''\in J''}y_{j''} \ = \ 1 \end{aligned}$$

is not satisfied by \((\tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1})\). Since \(\left( \tfrac{1}{m}\mathbbm {1},\tfrac{1}{n}\mathbbm {1}\right) \) is contained in precisely \(d-1\) facets of U, the dimension of \({{\,\textrm{FW}\,}}(V)\) equals \(d-1\).

The staircase for U is obtained by using the Northwest Corner Rule, breaking ties by going east. If we break the ties randomly, then \(2^{d-1}\) staircases appear with nonzero probability. These staircases are in bijection with the ordinary vertices of \({{\,\textrm{FW}\,}}(V)\), which is a \((d{-}1)\)-dimensional cube, seen as an ordinary polytope.
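The construction of the staircase can be sketched in a few lines of Python. To stay within integer arithmetic, the marginals \(\tfrac{1}{m}\mathbbm {1}\) and \(\tfrac{1}{n}\mathbbm {1}\) are scaled by mn, so each row supplies n units and each column demands m units; this scaling and the function name `northwest_corner` are choices made for this illustration. Ties, where a row and a column are exhausted at the same time, are broken by moving east.

```python
def northwest_corner(m, n):
    """Northwest Corner Rule with row supplies n and column demands m.

    Returns the visited grid cells (1-based) and the amount shipped
    through each of them; ties are broken by moving east.
    """
    supply = [n] * m
    demand = [m] * n
    i = j = 0
    cells, amounts = [], []
    while True:
        q = min(supply[i], demand[j])
        supply[i] -= q
        demand[j] -= q
        cells.append((i + 1, j + 1))
        amounts.append(q)
        if i == m - 1 and j == n - 1:
            break
        if demand[j] == 0 and j < n - 1:
            j += 1  # column satisfied (this includes ties): go east
        else:
            i += 1  # row exhausted: go down
    return cells, amounts


# The 6 x 9 grid of Fig. 3, where d = 3, a = 2 and b = 3.
cells, amounts = northwest_corner(6, 9)
zero_cells = [c for c, q in zip(cells, amounts) if q == 0]
print(cells)
print(zero_cells)

# A case with gcd(m, n) = 1, where no ties occur.
cells_53, amounts_53 = northwest_corner(5, 3)
print(cells_53, amounts_53)
```

For the \(6\times 9\) grid the cells with shipped amount zero are exactly the added points \((ia,ib+1)\), while in the coprime case every visited cell carries a positive amount.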

Remark 9

In the special case \(m=n-1\) computing tropical Fermat–Weber points reduces to the tropical Cramer rule [20, §4.9]. Adding one more site, p, the new Fermat–Weber set, F, can have higher dimension, but contains the tropical Cramer point, c. Perturbing c in the direction of p, we obtain a point in the relative interior of F. This also agrees with results of Gärtner and Jaggi [18, §4.1] in the context of “tropical support vector machines” on computing a “separating hyperplane for n points”.

Remark 10

As a consequence of [20, Theorem 6.14] the Fermat–Weber set \({{\,\textrm{FW}\,}}(V^\top )\) in \({\mathbb R}^{m}/{\mathbb R}\mathbbm {1}\) of the n columns of V is affinely isomorphic to the Fermat–Weber set \({{\,\textrm{FW}\,}}(V)\) in \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\) of the m rows.

Remark 11

Our analysis rests on the decision, in (3), to look at the distances from the sites to the Fermat–Weber points. This leads to \(\min \)-tropical hyperplane arrangements and \(\max \)-tropical convex hulls. Reversing the direction, i.e., considering distances from the Fermat–Weber points to the sites, amounts to exchanging \(\min \) and \(\max \) throughout. Conceptually, the results remain the same; cf. [20, §1.3].

4 Transportation

This section comprises the algorithmic core of this paper. The basic ingredient is the transportation problem which already occurred in Example 8.

Again let us fix a matrix \(V=(v_{ij})\in {\mathbb R}^{m\times n}\). Whenever it suits us, we may also view the rows of V as m labeled points \(v_1,\dots ,v_m\) in \({\mathcal H}\) or \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\). The following is the dual of the linear program (4) with variables \(\lambda \) and \(y_{ij}\) for \(i\in [m]\) and \(j\in [n]\):

$$\begin{aligned} \begin{array}{ll@{}ll} \text {maximize} &{} \displaystyle \sum _{i\in [m]}\sum _{j\in [n]}v_{ij}\cdot y_{ij} &{} \\ \text {subject to}&{} \displaystyle \sum _{j\in [n]} y_{ij} \ = \ n \hspace{5.0pt}, &{} \quad \text {for } i\in [m] \\ &{} \displaystyle \lambda + \sum _{i\in [m]} y_{ij} \ = \ 0 \hspace{5.0pt}, &{} \quad \text {for } j\in [n] \\ &{} \displaystyle y_{ij} \ \ge \ 0 \hspace{5.0pt}, &{} \quad \text {for } i\in [m] \text { and } j\in [n] \hspace{5.0pt}. \end{array} \end{aligned}$$
(5)

From \(0 = n\cdot \lambda + \sum _{i,j}y_{ij} = n\cdot (\lambda + m)\), we get \(\lambda =-m\) for every feasible point. By substituting that value in (5) we obtain the standard linear programming formulation of a transportation problem; see [40, §21.6]. The primal linear program (4) and the envelope \({\mathcal E}(V)\) are related to transportation via Hitchcock’s theorem [40, Theorem 21.13].
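Spelled out, substituting \(\lambda =-m\) into (5) gives the following balanced transportation problem, in which each of the m rows supplies n units and each of the n columns demands m units. This display merely restates (5) and introduces no new notation:

$$\begin{aligned} \begin{array}{ll@{}ll} \text {maximize} &{} \displaystyle \sum _{i\in [m]}\sum _{j\in [n]}v_{ij}\cdot y_{ij} &{} \\ \text {subject to}&{} \displaystyle \sum _{j\in [n]} y_{ij} \ = \ n \hspace{5.0pt}, &{} \quad \text {for } i\in [m] \\ &{} \displaystyle \sum _{i\in [m]} y_{ij} \ = \ m \hspace{5.0pt}, &{} \quad \text {for } j\in [n] \\ &{} \displaystyle y_{ij} \ \ge \ 0 \hspace{5.0pt}, &{} \quad \text {for } i\in [m] \text { and } j\in [n] \hspace{5.0pt}. \end{array} \end{aligned}$$

The total supply \(m\cdot n\) equals the total demand \(n\cdot m\), so the problem is balanced.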

As the right hand sides are the integral constants m and n, the feasible region of (5) is a central transportation polytope, and we denote it T(m, n). The polytope T(m, n) is known to be integral [14, Lemma 2.13]. The nonzero entries of any vertex, which is an \(m{\times }n\)-matrix, define a subgraph of \(K_{m,n}\) by picking edges [14, Lemma 2.9]. This is the support graph of that vertex, and it is a forest; see [40, Theorem 21.15].

Example 12

For the \(5{\times }3\)-matrix V from Example 1 the transpose of the unique optimal solution of (5) is

$$\begin{aligned} (y_{ij}^*)^\top \ = \ \begin{pmatrix} 3 &{} 2 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 3 &{} 2 \\ 0 &{} 1 &{} 3 &{} 0 &{} 1 \end{pmatrix} . \end{aligned}$$
(6)

The support graph is a spanning tree; see Fig. 4. By Theorem 15 below that tree encodes the covector of the unique Fermat–Weber point \(x^*\). The degree sequence for the column nodes is (2, 2, 3). This is the coarse type of \(x^*\) and also the componentwise maximum of the four vertices of the central cell in the mixed subdivision; cf. Example 5 and [20, §4.5].
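The combinatorial claims about the matrix in (6) can be checked mechanically. The following Python sketch is not part of the original computations; it only re-reads the entries of (6), and all function and variable names are our own. It verifies the row and column sums required by (5), builds the support graph, and tests that this graph is a spanning tree of \(K_{5,3}\).

```python
# Transposed optimal solution from (6): one row per column node (n = 3),
# one column per row node, i.e., per site (m = 5).
y_transposed = [
    [3, 2, 0, 0, 0],
    [0, 0, 0, 3, 2],
    [0, 1, 3, 0, 1],
]
n, m = len(y_transposed), len(y_transposed[0])
y = [[y_transposed[j][i] for j in range(n)] for i in range(m)]

# Feasibility for (5) with lambda = -m: row sums n, column sums m.
row_sums = [sum(row) for row in y]
col_sums = [sum(y[i][j] for i in range(m)) for j in range(n)]

# Support graph: one edge (i, j) for every nonzero entry y_ij.
edges = [(i, j) for i in range(m) for j in range(n) if y[i][j] != 0]
col_degrees = [sum(1 for (_, j) in edges if j == k) for k in range(n)]


def is_connected(edges, m, n):
    """Depth-first search on the bipartite graph with m row and n column nodes."""
    adjacency = {}
    for i, j in edges:
        adjacency.setdefault(("row", i), []).append(("col", j))
        adjacency.setdefault(("col", j), []).append(("row", i))
    if len(adjacency) < m + n:      # some node has no incident edge
        return False
    start = ("col", 0)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == m + n


# A connected graph on m + n nodes with m + n - 1 edges is a spanning tree.
is_spanning_tree = is_connected(edges, m, n) and len(edges) == m + n - 1

print(row_sums, col_sums, col_degrees, is_spanning_tree)
```

The column degrees computed here are exactly the degree sequence quoted in the example.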

Fig. 4 Spanning subtree of \(K_{5,3}\) encoding the covector of the unique Fermat–Weber point from Example 1. Column nodes at the bottom

Fig. 5 Five sites in \({\mathbb R}^{3}/{\mathbb R}\mathbbm {1}\) and the unique point that evenly splits them. Each closed sector of the max-tropical halfspace with apex \((9,-6,-3)\) contains at least two sites

Assuming \(m\ge n\), Tokuyama and Nakano gave an algorithm (which they called “splitter finding”) to solve a transportation problem like (5) in \(O(n^2m\log ^2 m)\) time [42, Theorem 3.1]. We borrow some of their ideas. For \(J\subseteq [n]\) and \(u\in {\mathcal H}\) consider the set

$$\begin{aligned} S_J(u) \,= \ \bigl \{x\in {\mathcal H} \bigm | \max _{j\in J}(x_j-u_j)\ge \max _{i\notin J}(x_i-u_i)\bigr \} . \end{aligned}$$

We have \(S_\emptyset (u)=\emptyset \), and \(S_{[n]}(u)={\mathcal H}\), where we use the convention that the maximum of the empty set is \(-\infty \), the neutral element of the tropical addition. If J is a nonempty proper subset of [n], then \(S_J(u)\) is a max-tropical halfspace with apex u; see [20, §7.1]. In the special case where \(J=\{j\}\) is a singleton that tropical halfspace is a closed sector; in general, \(S_J(u)=\bigcup _{j\in J}S_{\{j\}}(u)\).

Definition 13

We say that \(u\in {\mathcal H}\) evenly splits V if for every subset J of [n] we have \(n\cdot |V\cap S_J(u)|\ge m\cdot |J|\).

Tokuyama and Nakano [42] call the point u a “\(\mathbbm {1}\)-splitter”, and the sectors are the “regions split by u”. Theorem 15 below may be seen as our interpretation of their results in the setting of tropical convexity.

Example 14

Consider the points V from Example 1. The Fermat–Weber point \(u=(9,-6,-3)\) evenly splits them. This is illustrated in Fig. 5, where we have drawn the \(\max \)-tropical hyperplane based at u with dotted lines. In particular, we see the subdivision of \({\mathbb R}^{3}/{\mathbb R}\mathbbm {1}\) into three convex regions.
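For small n, Definition 13 can be tested by brute force over all subsets J of [n]. The sketch below does this in Python. The three sites are hypothetical illustration data (they are not the sites of Example 1), and the function names are our own. Indices are 0-based.

```python
from itertools import combinations


def in_halfspace(x, u, J):
    """Test x in S_J(u): max over J of (x_j - u_j) >= max over the complement."""
    n = len(x)
    inside = [x[j] - u[j] for j in J]
    outside = [x[i] - u[i] for i in range(n) if i not in J]
    if not inside:       # J empty: S_J(u) is the empty set
        return False
    if not outside:      # J = [n]: S_J(u) is the whole space
        return True
    return max(inside) >= max(outside)


def evenly_splits(u, V):
    """Check n * |V ∩ S_J(u)| >= m * |J| for every subset J of [n]."""
    m, n = len(V), len(V[0])
    for size in range(n + 1):
        for J in combinations(range(n), size):
            count = sum(1 for v in V if in_halfspace(v, u, J))
            if n * count < m * size:
                return False
    return True


# Hypothetical sites in the hyperplane H (coordinates sum to zero).
V = [(2, -1, -1), (-1, 2, -1), (-1, -1, 2)]

print(evenly_splits((0, 0, 0), V))     # True: each closed sector holds one site
print(evenly_splits((4, -2, -2), V))   # False: the first sector contains no site
```

The enumeration is exponential in n, so this is only meant to make the definition concrete; it is not an algorithm one would use in practice.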

Theorem 15

A point \(u\in {\mathcal H}\) evenly splits V if and only if \(u\in {{\,\textrm{FW}\,}}(V)\). The support graph of an optimal dual solution \(y^*\) is the covector of the Fermat–Weber point \(u(y^*)\).

Proof

Assume that u evenly splits V and consider \(t_i=\max _{j\in [n]}(v_{ij}-u_j)\) for \(i\in [m]\). Thus (u, t) is a feasible solution of the primal linear program (4). Moreover, \(t_i+u_j=v_{ij}\) if and only if \(v_i\in S_{\{j\}}(u)\).

Now [42, Theorem 2.2] says that there exists a solution \((y_{ij})\) of the transportation problem (5), which is dual to (4), such that (u, t) and \((y_{ij})\) satisfy the complementary slackness condition. Indeed, if \(t_i+u_j\ne v_{ij}\), then \(v_i\notin S_{\{j\}}(u)\), so the aforementioned result gives \(y_{ij}=0\). So it follows from linear programming duality that (u, t) is an optimal solution of (4).

In particular, u is a Fermat–Weber point for V.

For the converse, we denote by \(\phi \) the convex function \(\tfrac{1}{n}\sum _{s\in V}d_\triangle (s,\cdot )\). Also, for every subset J of [n], denote by \(\sigma _J\) the cardinality of \(V\cap S_J(u)\). Abbreviating \(f_J:=(n-|J|)\sum _{j\in J}e_j-|J|\sum _{i\in [n]{\setminus } J}e_i\), which is a point in \({\mathcal H}\), we obtain:

  • if \(s\in S_J(u)\), then \(s\in S_J(u-\delta f_J)\) for any \(\delta \ge 0\);

  • if \(s\notin S_J(u)\), then \(s\notin S_J(u-\delta f_J)\) for any \(\delta \ge 0\) sufficiently small.

The condition \(s\in S_J(u)\) implies \(d_\triangle (s,u)=n(s_j-u_j)\) for some \(j\in J\). Therefore, for \(\delta >0\) sufficiently small, we obtain

  • \(d_\triangle (s,u-\delta f_J)=d_\triangle (s,u)+n\cdot \delta (n-|J|)\), when \(s\in S_J(u)\);

  • \(d_\triangle (s,u-\delta f_J)=d_\triangle (s,u)-n\cdot \delta |J|\), when \(s\notin S_J(u)\).

Summing up and dividing by n yields

$$\begin{aligned} \phi (u-\delta f_J) \ = \ \phi (u)+\delta (n-|J|)\sigma _J-\delta |J|(m-\sigma _J) \ = \ \phi (u)+\delta \left( n\sigma _J-m|J|\right) \end{aligned}$$
(7)

for any \(\delta > 0\) sufficiently small. If \(u\in {{\,\textrm{FW}\,}}(V)\), then u is a minimizer, so \(\phi (u-\delta f_J)\ge \phi (u)\) for any \(\delta >0\) and \(J\subseteq [n]\). Equation (7) implies \(n\sigma _J\ge m|J|\) for every \(J\subseteq [n]\), under this assumption. Consequently, u evenly splits V.
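Equation (7) is easy to test numerically. The following Python sketch uses exact rational arithmetic and three hypothetical sites in \({\mathcal H}\subset {\mathbb R}^2\); it relies on the identity \(d_\triangle (s,u)=n\cdot \max _j(s_j-u_j)\) for \(s,u\in {\mathcal H}\), which is the form used in the proof above. The names `phi`, `direction` and `sigma` are our own. Indices are 0-based.

```python
from fractions import Fraction


def phi(u, V):
    """(1/n) * sum of d(s, u) over the sites, for s and u in H."""
    return sum(max(sj - uj for sj, uj in zip(s, u)) for s in V)


def direction(J, n):
    """The vector f_J = (n - |J|) * sum_{j in J} e_j - |J| * sum_{i not in J} e_i."""
    return [n - len(J) if j in J else -len(J) for j in range(n)]


def sigma(J, u, V):
    """Number of sites in the tropical halfspace S_J(u), for nonempty J."""
    count = 0
    for s in V:
        diffs = [sj - uj for sj, uj in zip(s, u)]
        inside = max(diffs[j] for j in J)
        outside = [diffs[i] for i in range(len(u)) if i not in J]
        if not outside or inside >= max(outside):
            count += 1
    return count


# Hypothetical data: three sites on a line in H, and a candidate point u.
V = [(3, -3), (-1, 1), (1, -1)]
u = (1, -1)
m, n = len(V), len(u)
delta = Fraction(1, 10)          # small enough for this data

changes = {}
for J in [(0,), (1,)]:
    f = direction(J, n)
    moved = [uj - delta * fj for uj, fj in zip(u, f)]
    lhs = phi(moved, V) - phi(u, V)                      # left side of (7)
    rhs = delta * (n * sigma(J, u, V) - m * len(J))      # right side of (7)
    changes[J] = (lhs, rhs)

print(changes)
```

In this example both directions increase \(\phi \), in line with the fact that u is the middle one of three collinear sites.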

The argument in the proof of Theorem 15 leads to the following algorithm for obtaining a tropical Fermat–Weber point from an optimal solution \(y^*\) of (5). By complementary slackness each edge (i, j) of the support graph, \(B(y^*)\), imposes the equality

$$\begin{aligned} t_i+x_j \ = \ v_{ij} \end{aligned}$$
(8)

for any optimal dual solution. If we assume \(x_j=0\) for some column node j, then we can perform a depth-first search from j and recover all the other values of (t, x) using the equalities in (8). There may be more connected components, in which case we start a depth-first search from every unvisited column node. In this way all the values are recovered, as \(B(y^*)\) is spanning: no row or column of \(y^*\) can be zero, as its elements must sum up to a positive number. Note that the point x obtained this way may not lie in \({\mathcal H}\). Yet, by subtracting \((x_1+\dots +x_n)/n\) from every entry of x and adding the same value to every entry of t, we obtain a feasible solution \((t^*,x^*)\) of (4) that satisfies the equations (8). In particular, that solution is optimal, whence \(x^*\in {{\,\textrm{FW}\,}}(V)\). So the method of [42] gives the following complexity result.
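The recovery procedure just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used for the experiments: the matrix `V` and the solution `y` are hypothetical toy data with \(m=3\) and \(n=2\), the function name `recover_primal` is our own, and the final shift is chosen so that the coordinates of x sum to zero while every sum \(t_i+x_j\) is preserved.

```python
from fractions import Fraction


def recover_primal(V, y):
    """Recover (t, x) from the support graph of y via the equalities (8)."""
    m, n = len(V), len(V[0])
    t = [None] * m
    x = [None] * n
    for start in range(n):
        if x[start] is not None:
            continue                      # column node already visited
        x[start] = Fraction(0)            # fix one value per component
        stack = [("col", start)]
        while stack:                      # iterative depth-first search
            kind, k = stack.pop()
            if kind == "col":
                for i in range(m):
                    if y[i][k] != 0 and t[i] is None:
                        t[i] = V[i][k] - x[k]      # t_i + x_k = v_ik
                        stack.append(("row", i))
            else:
                for j in range(n):
                    if y[k][j] != 0 and x[j] is None:
                        x[j] = V[k][j] - t[k]      # t_k + x_j = v_kj
                        stack.append(("col", j))
    shift = sum(x) / n                    # move x into the hyperplane H
    x_star = [xj - shift for xj in x]
    t_star = [ti + shift for ti in t]
    return t_star, x_star


# Hypothetical sites (rows) and a feasible y with row sums n and column sums m.
V = [[3, -3], [-1, 1], [1, -1]]
y = [[2, 0], [0, 2], [1, 1]]

t_star, x_star = recover_primal(V, y)

# Feasibility for the constraints v_ij - x_j <= t_i; together with the
# equalities (8) on the support of y this certifies optimality.
feasible = all(t_star[i] + x_star[j] >= V[i][j]
               for i in range(len(V)) for j in range(len(V[0])))
print(t_star, x_star, feasible)
```

For this toy input the recovered point coincides with the third site, which is the middle one of three collinear points.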

Corollary 16

Assuming \(m\ge n\), one tropical Fermat–Weber point can be computed in \(O(n^2m\log ^2 m)\) time.

The complexity bounds of the algorithms by Kleinschmidt and Schannath [22] and Brenner [8] are similar, but slightly different. Naturally, they carry over as well.

It is also natural to ask for explicit representations of the entire Fermat–Weber set. A first idea could be to list the vertices of \({{\,\textrm{FW}\,}}(V)\), seen as an ordinary polytope. Example 8 shows that cubes occur as Fermat–Weber sets, and their number of vertices is exponential in the dimension. Yet, the polytropal structure allows for more efficient choices. Namely, a polytrope in \({\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\) has at most n tropical vertices and at most \(n^2-n\) ordinary facets.

Let us start with the latter. Since we know the possible directions of the (outer) facet normal vectors, \(e_k-e_\ell \) for \(k\ne \ell \), we can find the ordinary facets by solving \(n^2-n\) linear programs like:

$$\begin{aligned} \begin{array}{ll@{}ll} \text {maximize} &{} \displaystyle x_k - x_\ell &{} \\ \text {subject to}&{} \displaystyle v_{ij} - x_j \le t_i , &{} \quad \text {for } i\in [m] \text { and } j\in [n] \\ &{} \displaystyle x_{1} + \dots + x_{n} = 0 &{} \\ &{} \displaystyle t_{1} + \dots + t_{m} = p^*/n &{} \end{array} \end{aligned}$$
(9)

where \(p^*\) is the optimal value of (4). The constraint matrix has only 0 and \(\pm 1\) entries, and so a linear program of the form (9) can be solved in strongly polynomial time [39, Corollary 15.3a].

Each such linear program yields one tight inequality \(x_k-x_\ell \le a_{k,\ell }\) of \({{\,\textrm{FW}\,}}(V)\), where \(a_{k,\ell }\) is the optimal value. Then, the tropical vertices can be found as \(A_k=(-a_{k,1},\dots ,-a_{k,n})\in {\mathbb R}^{n}/{\mathbb R}\mathbbm {1}\), where \(a_{k,k}=0\). From Corollary 16 we thus infer the following result.

Corollary 17

The ordinary facet description and the tropical vertices of \({{\,\textrm{FW}\,}}(V)\) can be found in strongly polynomial time.

If the tropical vertices are known, then we can check if a given point lies in \({{\,\textrm{FW}\,}}(V)\) in \(O(n^2)\) time by checking the criterion [20, Proposition 5.37].
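To illustrate the two descriptions, the sketch below takes a hypothetical matrix of tight bounds \(a_{k,\ell }\), forms the points \(A_k=(-a_{k,1},\dots ,-a_{k,n})\) as described above, and tests membership through the \(n^2-n\) ordinary inequalities \(x_k-x_\ell \le a_{k,\ell }\). Note that this checks the facet description directly; it is not the criterion of [20, Proposition 5.37], although it also takes \(O(n^2)\) time. The bounds and the function names are assumptions made for the example.

```python
def tropical_vertices(a):
    """Rows of -a: the points A_k = (-a_{k,1}, ..., -a_{k,n})."""
    n = len(a)
    return [[-a[k][l] for l in range(n)] for k in range(n)]


def satisfies_facets(x, a):
    """Check x_k - x_l <= a_{k,l} for all k != l, in O(n^2) time."""
    n = len(a)
    return all(x[k] - x[l] <= a[k][l]
               for k in range(n) for l in range(n) if k != l)


# Hypothetical tight bounds describing a hexagon in R^3 / R1.
a = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]

vertices = tropical_vertices(a)
print(vertices)
print([satisfies_facets(v, a) for v in vertices])
print(satisfies_facets((0, 0, 0), a), satisfies_facets((2, 0, 0), a))
```

Since all inequalities only involve differences of coordinates, the test does not depend on the representative chosen modulo \({\mathbb R}\mathbbm {1}\).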

5 Tropical median consensus trees

Phylogenetics is a branch of (computational) biology that seeks to associate trees to mark ancestral relations among given taxa, e.g., species [41]. The taxa correspond to the leaves. Here we show how asymmetric tropical Fermat–Weber sets give rise to a new algorithm for finding consensus trees. Since there is no particular shortage of consensus tree methods, we compare its features to other approaches, and we discuss the practical applicability.

5.1 Equidistant trees

We recall known facts about ultrametrics and equidistant trees to fix our notation; see [41, §7.2] and [20, §10.9]. A dissimilarity map is a symmetric \(n{\times }n\)-matrix \(D=(d_{ij})\) with zero diagonal. It is called an ultrametric if D is nonnegative, and the ultrametric inequality

$$\begin{aligned} d_{ik} \ \le \ \max (d_{ij}, d_{jk}) \end{aligned}$$
(10)

holds for all \(i,j,k\in [n]\). Since the (zero) diagonal is implicit, and the matrix is required to be symmetric, we may view a dissimilarity map as an element of \({\mathbb R}^{\genfrac(){0.0pt}1{n}{2}}\).

A rooted metric tree with n labeled leaves is equidistant if the distance from any leaf to the root is the same. It is known that the ultrametrics are precisely the distance functions among the leaves in an equidistant tree; see [41, Theorem 7.2.5]. Note that the all-ones vector \(\mathbbm {1}\) of length \(\genfrac(){0.0pt}1{n}{2}\) is an ultrametric. The corresponding equidistant tree has interior edges of length zero, while each leaf edge has length \(\tfrac{1}{2}\).

Ardila and Klivans [3, Theorem 3] showed that a dissimilarity map is an ultrametric if and only if it corresponds to a point on the Bergman fan \({{\widetilde{B}}}(K_n)\) of the complete graph \(K_n\). The Bergman fan of a matroid is a special case of a tropical linear space, i.e., with constant coefficients. The ultrametric inequality (10) is a tropical convexity condition with respect to \(\max \). From a dissimilarity map D and any real constant c we can construct the dissimilarity map \(D+c\mathbbm {1}\). Moreover, if D is an ultrametric, and \(D+c\mathbbm {1}\) is nonnegative, then it is an ultrametric, too. In this way, we may view \({{\widetilde{B}}}(K_n)\) as a \(\max \)-tropical linear space in the tropical projective torus \({\mathbb R}^{\genfrac(){0.0pt}1{n}{2}}/{\mathbb R}\mathbbm {1}\); see [28, Proposition 16] and [43, Theorem 3]. We abbreviate \({\mathcal T}_n:={{\widetilde{B}}}(K_n)/{\mathbb R}\mathbbm {1}\) and call it the space of equidistant trees on n labeled leaves. Now we can apply our results from Sects. 3 and 4 to points in \({\mathcal T}_n\). Our first observation says that Fermat–Weber points of equidistant trees are again equidistant trees.

Theorem 18

Let \(V\subset {\mathcal T}_n\) be a finite set of equidistant trees on n leaves. Then the tropical polytope \({{\,\textrm{FW}\,}}(V)\) is contained in \({\mathcal T}_n\). Moreover, any two trees in \({{\,\textrm{FW}\,}}(V)\) share the same tree topology.

Proof

According to Theorem 4 the set \({{\,\textrm{FW}\,}}(V)\) is contained in the \(\max \)-tropical convex hull of V. The space of equidistant trees \({\mathcal T}_n\) is a \(\max \)-tropical linear space and thus \(\max \)-tropically convex; see [20, Proposition 10.33]. This is the first claim. Page, Yoshida and Zhang showed [36, Theorem 3.2] that the trees in any relatively open cell of the covector decomposition of V in \({\mathcal T}_n\) share the same tree topology. With this observation the second claim follows also from Theorem 4.

As we know, Fermat–Weber points are not unique, in general. Here is a more precise statement.

Corollary 19

Let \(V\subset {\mathcal T}_n\) be a set of m equidistant trees on n leaves. Then

$$\begin{aligned} \dim {{\,\textrm{FW}\,}}(V) \ \le \ \min \bigl (\, n-1 ,\; \gcd (m,\genfrac(){0.0pt}1{n}{2}) \,\bigr )-1 \hspace{5.0pt}. \end{aligned}$$

Proof

This follows from Theorem 7 and the fact that the graphic matroid of \(K_n\) has rank \(n-1\), so \(\dim {\mathcal T}_n= n-2\).
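The bound of Corollary 19 is a one-line computation. The following Python sketch evaluates it for the parameters of several data sets discussed in this article; the function name is our own.

```python
from math import comb, gcd


def fw_dimension_bound(m, n):
    """Upper bound of Corollary 19 for m equidistant trees on n leaves."""
    return min(n - 1, gcd(m, comb(n, 2))) - 1


print(fw_dimension_bound(2, 4))     # two trees on four leaves
print(fw_dimension_bound(5, 4))     # five trees on four leaves
print(fw_dimension_bound(3, 9))     # three trees on nine leaves
print(fw_dimension_bound(268, 8))   # 268 trees on eight leaves
```

Whenever m and \(\genfrac(){0.0pt}1{n}{2}\) are coprime, the bound is zero, which forces the Fermat–Weber point to be unique; this is the situation of Example 25. The bound is not always attained: in Example 21 it equals 2, yet the Fermat–Weber point is unique.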

Fig. 6 Three trees (a,b,c) and their tropical median consensus tree (d)

5.2 Consensus trees

Our goal now is to employ the results obtained so far to study the consensus problem for metric trees. Formally, a consensus method on equidistant trees, with n taxa, is a function \(c:{\mathcal T}_n^*\rightarrow {\mathcal T}_n\) where \({\mathcal T}_n^*=\bigcup _{m\ge 1}{\mathcal T}_n^m\). For surveys on the subject see [9] and, for a more recent account, [10]. We say that a consensus method c is tropically convex if \(c(D_1,\dots ,D_m)\in {{\,\textrm{tconv}\,}}^{\max }(D_1,\dots ,D_m)\) for every \(m\ge 1\) and every \(D_1,\dots ,D_m\in {\mathcal T}_n\). Theorem 18 says that selecting an arbitrary tree in \({{\,\textrm{FW}\,}}(D_1,\dots ,D_m)\) yields a consensus method which is tropically convex. Recall that \({{\,\textrm{FW}\,}}(D_1,\dots ,D_m)\) is a polytrope in \({\mathbb R}^{\genfrac(){0.0pt}1{n}{2}}/{\mathbb R}\mathbbm {1}\), which thus has at most \(\genfrac(){0.0pt}1{n}{2}\) tropical vertices.

Definition 20

We define the tropical median consensus tree of \(D_1,\dots ,D_m\in {\mathcal T}_n\) as the ordinary average of the tropical vertices of \({{\,\textrm{FW}\,}}(D_1,\dots ,D_m)\).

That ordinary average is the barycenter of the ordinary simplex spanned by the tropical vertices. As polytropes are convex in the ordinary sense, the tropical median consensus tree method is tropically convex. By Corollary 17, the tropical vertices of the Fermat–Weber set can be found in strongly polynomial time and hence also their ordinary average.

Example 21

The first three equidistant trees on \(n=9\) taxa in Fig. 6, called (a), (b), (c), are taken from [21, Chapter 7]. Since the trees in that reference are not metric, here we choose weights that are compatible with the graphical representation in [21, Fig. 7.1]. The tropical median consensus tree is depicted in Fig. 6 as (d). This is the unique Fermat–Weber point of the three input trees in \({\mathcal T}_9\). This can be verified using version 4.9 of polymake [19].

The Newick format is a standard for encoding phylogenetic trees [17, p. 590]; it is supported by polymake. A tree is represented as a string, where leaves are given by their (text) labels, and internal nodes correspond to matching pairs of parentheses. Recursively, such a pair of parentheses encloses the Newick representation of the subtree rooted at that internal node. Further, each node is followed by a numerical value, after a colon, and this denotes the length of the edge between the node and its parent. For example, the Newick representation of (a) is (A:8,((B:2,(C:1,D:1):1):5,((E:1,F:1):3,(G:2,(H:1,I:1):1):2):3):1).
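To make the encoding concrete, here is a small self-contained Python sketch that parses the Newick string of tree (a), checks that the tree is equidistant, and computes the pairwise leaf distances. It is independent of polymake, handles only the restricted Newick syntax used in this article (labels, lengths after colons, nested parentheses), and all function names are our own.

```python
from itertools import permutations


def parse_newick(text):
    """Parse a Newick string into nested (label, length, children) tuples."""
    s = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if s[pos] == "(":
            pos += 1
            children.append(node())
            while s[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1                          # skip the closing parenthesis
        start = pos
        while pos < len(s) and s[pos] not in ":,()":
            pos += 1
        label = s[start:pos]
        length = 0.0                          # the root carries no length
        if pos < len(s) and s[pos] == ":":
            start = pos + 1
            pos = start
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            length = float(s[start:pos])
        return (label, length, children)

    return node()


def leaf_distances(tree):
    """Return root-to-leaf depths and the path length between any two leaves."""
    dist = {}

    def walk(current, depth):
        label, length, children = current
        depth += length
        if not children:
            return {label: depth}
        below = {}
        for child in children:
            leaves = walk(child, depth)
            # Leaves in different child subtrees meet at the current node.
            for a, da in below.items():
                for b, db in leaves.items():
                    dist[frozenset((a, b))] = (da - depth) + (db - depth)
            below.update(leaves)
        return below

    return walk(tree, 0.0), dist


def is_ultrametric(dist, labels):
    """Check the ultrametric inequality (10) for all triples of leaves."""
    def d(a, b):
        return dist[frozenset((a, b))]
    return all(d(i, k) <= max(d(i, j), d(j, k))
               for i, j, k in permutations(labels, 3))


tree_a = "(A:8,((B:2,(C:1,D:1):1):5,((E:1,F:1):3,(G:2,(H:1,I:1):1):2):3):1)"
depths, dist = leaf_distances(parse_newick(tree_a))

print(sorted(depths.items()))
print(dist[frozenset("CD")], dist[frozenset("BE")], dist[frozenset("AB")])
print(is_ultrametric(dist, sorted(depths)))
```

All nine leaves sit at the same distance from the root, so the tree is equidistant and its leaf distances form an ultrametric.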

Bryant et al. [10] impose three conditions for a consensus method to be regular:

  1. (U)

    The consensus of any number of copies of the same tree, T, is T;

  2. (A)

    the consensus does not depend on the ordering of the trees;

  3. (N)

    permuting the taxa in the input trees results in the same permutation of the taxa in the consensus.

These properties are called unanimity, anonymity, and neutrality, respectively. For the tropical median consensus all of them are immediate: Unanimity is due to the fact that \(d_\triangle \) is definite; anonymity follows from the commutativity of the addition; neutrality is a consequence of the invariance of \(d_\triangle \) under the action of the symmetric group. It then follows from [10, Theorem 3] that the tropical median consensus is not “extension stable”.

Our next step is to investigate properties of arbitrary tropically convex consensus methods. To this end, let \(i,j,k\in [n]\) be pairwise distinct taxa in some equidistant tree such that the lowest common ancestor of i and j is a proper descendant of the lowest common ancestor of i, j, and k. Then we say that these taxa form a rooted triplet, and we denote it by ij|k. If \(D=(d_{ij})\) is its ultrametric distance, then ij|k is a rooted triplet if and only if \(d_{ij}<d_{ik}\). Note that the ultrametric property implies \(d_{jk}=d_{ik}\), so we also have \(d_{ij}<d_{jk}\). We denote by r(D) the set of rooted triplets of the tree. A consensus method is called Pareto on rooted triplets if \(\bigcap _{\ell \in [m]}r(D_\ell )\subseteq r(D)\); it is called co-Pareto on rooted triplets if \(r(D)\subseteq \bigcup _{\ell \in [m]}r(D_\ell )\); here \(D_1,\dots ,D_m\) correspond to the input trees, and D represents the consensus tree. These are desirable properties for consensus methods; see [9, §3].

Proposition 22

Any tropically convex consensus method is Pareto and co-Pareto on rooted triplets.

Proof

We consider the set of equidistant trees containing the rooted triplet ij|k, which we denote

$$\begin{aligned} {\mathcal T}_n(ij|k) \,= \ \left\{ D\in {\mathcal T}_n \mid d_{ij}<d_{ik}\right\} \hspace{5.0pt}. \end{aligned}$$

The key observation is that this set is tropically convex: it arises as the intersection of \({\mathcal T}_n\) with an open tropical halfspace. Note that, therefore, the complement in \({\mathcal T}_n\) is also tropically convex.

Now let \(D_1,\dots ,D_m\) be ultrametrics and D any point in their max-tropical convex hull. We need to verify the Pareto and co-Pareto properties. If the rooted triplet ij|k belongs to \(\bigcap _{\ell }r(D_\ell )\), then \(D_\ell \in {\mathcal T}_n(ij|k)\) for every \(\ell \in [m]\). As \({\mathcal T}_n(ij|k)\) is tropically convex, we have \(D\in {\mathcal T}_n(ij|k)\). Thus, ij|k also belongs to r(D), showing that a tropically convex consensus method is Pareto on rooted triplets.

Conversely, if ij|k does not belong to \(\bigcup _{\ell \in [m]}r(D_\ell )\), then the input ultrametrics \(D_1,\dots ,D_m\) lie in the complement \({\mathcal T}_n\setminus {\mathcal T}_n(ij|k)\). Again, the latter set is tropically convex, and thus \(D\notin {\mathcal T}_n(ij|k)\). We infer that \(ij|k\notin r(D)\), and we conclude that a tropically convex consensus method is co-Pareto on rooted triplets.
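Proposition 22 can be observed on a small instance. The Python sketch below computes the rooted triplets of two hypothetical ultrametrics on four taxa and of their pointwise maximum, which is one point of their \(\max \)-tropical convex hull. The matrices and the encoding of a triplet ij|k as a pair `(frozenset({i, j}), k)` are choices made for this illustration only.

```python
from itertools import combinations


def rooted_triplets(d):
    """All rooted triplets ij|k of an ultrametric d, i.e., d_ij < d_ik."""
    n = len(d)
    triplets = set()
    for i, j in combinations(range(n), 2):
        for k in range(n):
            if k not in (i, j) and d[i][j] < d[i][k]:
                triplets.add((frozenset((i, j)), k))
    return triplets


def pointwise_max(d1, d2):
    """Entrywise maximum of two matrices of the same size."""
    return [[max(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(d1, d2)]


# Two hypothetical ultrametrics on the taxa 0, 1, 2, 3.
D1 = [[0, 4, 8, 20], [4, 0, 8, 20], [8, 8, 0, 20], [20, 20, 20, 0]]
D2 = [[0, 6, 20, 20], [6, 0, 20, 20], [20, 20, 0, 4], [20, 20, 4, 0]]
D = pointwise_max(D1, D2)

r1, r2, r = rooted_triplets(D1), rooted_triplets(D2), rooted_triplets(D)

pareto = (r1 & r2) <= r        # common triplets survive in the consensus
co_pareto = r <= (r1 | r2)     # the consensus invents no new triplet
print(sorted((sorted(pair), k) for pair, k in r), pareto, co_pareto)
```

Here the two inputs share the triplets 01|2 and 01|3, and these are exactly the triplets of the pointwise maximum.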

Before we continue to study our tropical median consensus method, we look at other ideas.

Fig. 7 Pointwise maximum of the three trees from Example 23

Example 23

A particularly simple way to produce a tropically convex consensus tree method is the following. For given ultrametrics \(D_1,\dots , D_m\in {\mathcal T}_n\), we can consider the pointwise maximum \(c(D_1,\dots ,D_m)=D_1\oplus \dots \oplus D_m\). See Fig. 7 for the pointwise maximum of the three trees from Example 21 and Fig. 6. The tropical median consensus tree (d) in Fig. 6 partially resolves the maximum consensus tree from Fig. 7.

Note that the above definition depends on the representatives of \(D_1,\dots ,D_m\) modulo \({\mathbb R}\mathbbm {1}\). For the output in Fig. 7, we used the representatives displayed in Fig. 6. However, we could have chosen the representatives lying on the hyperplane \({\mathcal H}\) in \({\mathbb R}^{\genfrac(){0.0pt}1{n}{2}}/{\mathbb R}\mathbbm {1}\); the corresponding pointwise maximum represents the center of the smallest ball with respect to \(d_\triangle \) that contains the points \(D_1,\dots ,D_m\). Alternatively, considering representatives with a fixed entry equal to 0, we obtain the tropical barycenter from [1, §3.2].

Whatever convention we may fix for the representatives, the pointwise maximum exhibits a drawback. To exemplify it, consider the trees (a), (b), and one million copies of the tree (c) from Fig. 6. Then, the pointwise maximum consensus is still the one from Fig. 7, whereas the tropical median consensus tree looks like (c). So the pointwise maximum consensus is highly sensitive to outliers. In contrast, the tropical median consensus is robust.
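The effect of multiplicities can be seen on a toy instance with three taxa. The sketch below is written under an explicit assumption: it takes \(d_\triangle (a,b)=\sum _i(b_i-a_i)+N\cdot \max _i(a_i-b_i)\) for vectors of length N, which reduces to \(N\cdot \max _i(a_i-b_i)\) on \({\mathcal H}\), the form used in the proof of Theorem 15. The two ultrametrics are hypothetical, with coordinates ordered \((d_{12},d_{13},d_{23})\), and the function names are our own.

```python
def pointwise_max(points):
    """Coordinatewise maximum of a list of vectors."""
    return tuple(max(column) for column in zip(*points))


def asym_dist(a, b):
    """Asymmetric tropical distance, in the form assumed in the lead-in."""
    N = len(a)
    return (sum(bi - ai for ai, bi in zip(a, b))
            + N * max(ai - bi for ai, bi in zip(a, b)))


def total_distance(sites, x):
    """Sum of the distances from all sites to x."""
    return sum(asym_dist(s, x) for s in sites)


# Hypothetical ultrametrics on three taxa.
a = (2, 4, 4)
b = (4, 4, 2)
few = [a, b]
many = [a] + [b] * 1000       # one copy of a, one thousand copies of b

top = pointwise_max(few)
print(top == pointwise_max(many))       # the maximum ignores multiplicities
print(total_distance(many, top))        # cost of the pointwise maximum
print(total_distance(many, b))          # cost of the repeated tree b
```

The pointwise maximum is the same for both inputs, while the sum of distances is far smaller at the heavily repeated tree b, so a minimizer of that sum has to move away from the pointwise maximum.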

Most known consensus tree methods deal with unweighted phylogenetic trees, so they may be seen as discrete analogues of our approach. For instance, unweighted “median consensus trees” were defined by Barthélemy and McMorris [4], and the asymmetric case was analyzed by Phillips and Warnow [38]. Bryant [9] presents only one consensus method for rooted trees that takes the edge lengths into consideration: the “average consensus tree” of Lapointe and Cucumel [24]. In [9, §2.4.2] two drawbacks of the average consensus tree method are explicitly mentioned: no efficient algorithm is known, and the (co-)Pareto properties are unclear. The average consensus method involves the Euclidean distance, and the unconstrained optima might lie outside the tree space \({\mathcal T}_n\). Therefore, ultrametric conditions must be imposed on the solution, making it difficult to obtain a regular consensus method; see [24] for details. Further, Lapointe and Cucumel [25] show that the procedure proposed is \(\textsf{NP}\)-hard, and the solution may not be unique. A similar complexity result exists for the median consensus method developed by Levasseur and Lapointe [26]; see [13].

In the remainder of this section we compare the tropical median consensus method to algorithms proposed by Lin and Yoshida [29]. In fact, our approach is very similar and was inspired by that article. Lin et al. [28] studied the Fermat–Weber problem for the symmetric tropical distance function, with a focus on the tree space. Crucially, the symmetric tropical Fermat–Weber points may lie outside the \(\max \)-tropical convex hull; see [28, Example 27].

Table 1 Timings (in s) for computing symmetric tropical Fermat–Weber sets of equidistant trees. The computations up to 8 leaves are averages over 10 iterations, whereas those for 9 and 10 leaves are results after one iteration

Example 24

Symmetric tropical Fermat–Weber sets may be surprisingly complicated. Consider the trees

$$\begin{aligned} T_1&= \mathtt {(D:10,(C:4,(B:2,A:2):2):6)} \quad \text {and} \\ T_2&= \mathtt {(A:10,(B:4,(C:2,D:2):2):6)} \end{aligned}$$
(11)

from [27, Fig. 5]; here and below we employ the Newick format discussed in Example 21. The symmetric Fermat–Weber set of \(T_1\) and \(T_2\), denoted \({{\,\textrm{FW}\,}}_{\text {sym}}(T_1,T_2)\), contains the tropical segment between them, which exhibits seven distinct tree topologies, four of which are binary. In contrast, the asymmetric setting is trivial: the tropical median is \(\mathtt {(A:10,D:10,(B:4,C:4):6)}\), and this is the unique asymmetric tropical Fermat–Weber point.

In view of [36, Lemma 3.5] the symmetric tropical distance function might lead to a robust tropical consensus method. However, examples like the above form a challenge to defining a method which is globally consistent. The next case shows more differences between the symmetric and the asymmetric distances.

Example 25

With \(T_1\) and \(T_2\) defined as in (11), we consider two copies of \(T_1\), two copies of \(T_2\), and the tree \(T_3=\mathtt {(A:10,((B:4,C:4):3,D:7):3)}\). That is to say, with multiplicities, we have five trees altogether. The unique asymmetric tropical Fermat–Weber point of these five trees is the tree \(\mathtt {(A:10,D:10,(B:4,C:4):6)}\). On the other hand, \(T_3\) is the unique symmetric tropical Fermat–Weber point, by [29, Lemma 8]. Both Fermat–Weber points are unique, but they differ.
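By Theorem 15, membership in the asymmetric Fermat–Weber set can be certified by the even splitting condition of Definition 13, which only involves coordinate differences and is therefore independent of the representatives modulo \({\mathbb R}\mathbbm {1}\). The Python sketch below applies this to Examples 24 and 25. The ultrametrics are read off the Newick strings by hand, with \(d_{ij}\) the path length between leaves i and j and coordinates ordered (AB, AC, AD, BC, BD, CD); the function names are our own. The check confirms membership only, not uniqueness.

```python
from itertools import combinations


def in_halfspace(x, u, J):
    """Test x in S_J(u): max over J of (x_j - u_j) >= max over the complement."""
    inside = [x[j] - u[j] for j in J]
    outside = [x[i] - u[i] for i in range(len(x)) if i not in J]
    if not inside:
        return False
    if not outside:
        return True
    return max(inside) >= max(outside)


def evenly_splits(u, sites):
    """Definition 13, with sites listed according to their multiplicities."""
    m, n = len(sites), len(u)
    for size in range(n + 1):
        for J in combinations(range(n), size):
            count = sum(1 for s in sites if in_halfspace(s, u, J))
            if n * count < m * size:
                return False
    return True


# Coordinates: (AB, AC, AD, BC, BD, CD).
T1 = (4, 8, 20, 8, 20, 20)          # (D:10,(C:4,(B:2,A:2):2):6)
T2 = (20, 20, 20, 8, 8, 4)          # (A:10,(B:4,(C:2,D:2):2):6)
T3 = (20, 20, 20, 8, 14, 14)        # (A:10,((B:4,C:4):3,D:7):3)
median = (20, 20, 20, 8, 20, 20)    # (A:10,D:10,(B:4,C:4):6)

two_trees = [T1, T2]
five_trees = [T1, T1, T2, T2, T3]

print(evenly_splits(median, two_trees))    # Example 24
print(evenly_splits(median, five_trees))   # Example 25
print(evenly_splits(T3, five_trees))       # T3 is not an asymmetric median
```

The last line illustrates the difference stated in Example 25: the symmetric Fermat–Weber point \(T_3\) fails the even splitting condition for the five trees.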

Remark 26

The \(\max \)-tropical convexity of the tropical median consensus method ultimately rests on the specific formulation of the Fermat–Weber problem in (3). Exchanging the arguments in the asymmetric tropical distance function gives the \(\min \)-tropical analog.

5.3 Computational experiments

We compare running times for experiments concerning Fermat–Weber sets in tree space with respect to the symmetric and the asymmetric tropical distance functions. As input data we take random trees which were produced using the function rmtree from the R library ape [37]. This is similar to the experiment reported in [43, Example 8]. Most of the trees generated are not equidistant, so we adjust the lengths of the leaf edges to make them equidistant. In this way we get any number of trees in \({\mathcal T}_n\), for various values of n, the number of taxa. Recall that, by [28, Proposition 26], the symmetric tropical Fermat–Weber set \({{\,\textrm{FW}\,}}_{\text {sym}}(T_1,\dots ,T_m)\) of trees \(T_i\in {\mathcal T}_n\) is a convex polytope in \({\mathbb R}^{\genfrac(){0.0pt}1{n}{2}}/{\mathbb R}\mathbbm {1}\). The asymmetric tropical Fermat–Weber set \({{\,\textrm{FW}\,}}(T_1,\dots ,T_m)\) is a polytrope, and thus a polytope, too; cf. Theorem 4. All timings are obtained with polymake, version 4.9, running on a quad core Intel Core i5-4590 processor (6599.89 bogomips), openSUSE Leap 15.3 (Linux 6.1.0). For details see our data repository at https://github.com/micjoswig/TropicalDataAnalysis/tree/main/Tropical_medians_by_transportation.

Table 2 Timings (in s) for computing asymmetric tropical Fermat–Weber sets of equidistant trees. Each entry is the average running time from 100 individual experiments, which show only very small variance

Entire Fermat–Weber sets. First, we compute exact facet descriptions of the polytopes \({{\,\textrm{FW}\,}}_{\text {sym}}(T_1,\dots ,T_m)\) and \({{\,\textrm{FW}\,}}(T_1,\dots ,T_m)\), where \(T_i\in {\mathcal T}_n\), for various values of m and n. While the trees are generated with edge lengths given by floating point numbers, we convert them to exact rationals. We are not aware of a published implementation for computing symmetric tropical Fermat–Weber sets, so we implemented it in polymake. The algorithm suggested by [28, Proposition 26] allows for an improvement by exploiting properties of polyhedral L-convex functions in the sense of Murota [33, §7.8]. In this way, a facet description of \({{\,\textrm{FW}\,}}_{\text {sym}}(T_1,\dots ,T_m)\) can be obtained from solving \(n(n-1)\) linear programs similar to those in (9). The timings for up to ten trees on up to ten leaves are given in Table 1. Lin and Yoshida report about computations of symmetric tropical Fermat–Weber sets in [29, §4]; however, they do not give timings. The parameters leading to [29, Table 1] are much smaller than the parameters in our Table 1. The combinatorial description of the asymmetric tropical Fermat–Weber set \({{\,\textrm{FW}\,}}(T_1,\dots ,T_m)\) via optimal dual variables of a transportation problem allows us to exploit complementary slackness. So we can compute the description of \({{\,\textrm{FW}\,}}(T_1,\dots ,T_m)\) in terms of its ordinary facets much faster than its symmetric counterpart; see Table 2.

Tropical median consensus trees. Our most interesting experiment is concerned with computing tropical median consensus trees of m trees in \({\mathcal T}_n\), again for various values of m and n. This time we use mcf [30] (through our polymake interface), which is a standard implementation of the network simplex algorithm, using floating point computations. Table 3 has the timings for up to 25 trees on up to 300 leaves.

The last row of Table 3, for \(m=25\) trees, is particularly interesting. Observe that the timings in that row, with an increasing number of leaves, are not monotone. The most likely explanation comes from Corollary 19, which gives an upper bound for the dimension of \({{\,\textrm{FW}\,}}(T_1,\dots ,T_m)\); the critical contribution is \(\gcd (m,\genfrac(){0.0pt}1{n}{2})\). We have \(\genfrac(){0.0pt}1{25}{2}=300\), and the running times are almost proportional to \(\gcd (m,300)\).

In order to see what is going on, we conducted a more refined experiment for \(n=25\) taxa, trying all values of m in the set \(\{1,2,\dots ,300\}\). The results are shown in Fig. 8. The timing for \(m=300\), being more than 380 s, has been omitted for better readability of the other results. There are only ten computations which last more than four seconds. These are the values \(m\in \{50,60,75,100,120,150,200,225,240,300\}\); all are integers with \(\gcd (m,300)\ge 50\). A more detailed analysis is beyond the scope of the present article.
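The arithmetic behind this observation is quickly reproduced. The following Python sketch computes \(\gcd (m,300)\) for the ten values listed above; the variable names are our own, and no timing data is involved.

```python
from math import comb, gcd

pairs = comb(25, 2)        # number of coordinates for n = 25 taxa
slow = [50, 60, 75, 100, 120, 150, 200, 225, 240, 300]

# gcd(m, 300) for each of the ten slow instances.
gcds = {m: gcd(m, pairs) for m in slow}

# All m in {1, ..., 300} whose gcd with 300 is at least 50.
large_gcd = [m for m in range(1, pairs + 1) if gcd(m, pairs) >= 50]

print(pairs)
print(gcds)
print(sorted(set(large_gcd) - set(slow)))
```

Each of the ten values has \(\gcd (m,300)\ge 50\), as stated. The converse is not claimed above, and indeed the last line prints two further values of m with the same property that are not among the ten.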

To the best of our knowledge, no regular consensus method arising from the symmetric tropical distance function on tree space has been proposed. So there is no direct way to make a comparison.

Table 3 Timings (in s) for computing tropical median consensus trees using mcf [30]

Fig. 8 Number of trees versus time in seconds for up to 299 trees on 25 leaves

Apicomplexa gene trees. For our final experiment we consider an existing dataset of \(m=268\) trees with \(n=8\) leaves, which was already studied by Page, Yoshida and Zhang [36, §6.2]. Via simulated annealing, the latter authors generate three trees whose tropical convex hull fits best the input data; then they project the input trees on this tropical triangle and display the result in [36, Fig. 6]. So this line of research is more about dimension reduction techniques for analyzing data rather than obtaining location statistics; see also [43]. In particular, this does not seem to lead to a regular consensus method in the sense of [10]. Nonetheless, our method applies to their data.

The trees discussed in [36, §6.2] have been reconstructed by Kuo, Wares and Kissinger [23] from 268 orthologous sequences with eight species of protozoa. Seven species among the taxa are Babesia bovis (Bb), Cryptosporidium parvum (Cp), Eimeria tenella (Et), Plasmodium falciparum (Pf), Plasmodium vivax (Pv), Theileria annulata (Ta) and Toxoplasma gondii (Tg). The eighth taxon is Tetrahymena thermophila (Tt), which forms the outgroup. For this data we obtain the tropical median consensus tree

$$\begin{aligned} \begin{aligned}&\texttt {(Cp:0.570333,}{} \texttt {Et:0.570333,Tg:0.570333,Tt:0.570333,}\\&\texttt {(Pf:0.43862,Pv:0.43862):0.131713,(Bb:0.57033,Ta:0.57033):}\\&\qquad \texttt {0.000003)}, \end{aligned} \end{aligned}$$

which is displayed in Fig. 9. Only two cherries are resolved; the outgroup was not detected. So our method is quite conservative, making an effort to avoid false positive results. This may be an advantage, in particular since the 268 input trees were generated by a diverse range of methods.

6 Conclusion

There is a considerable amount of work on the tropical metric geometry of the space of equidistant trees [27,28,29, 36, 43]. This is quite natural, because the symmetric tropical distance between two equidistant trees can be interpreted naturally, as the cost of changing one tree into the other along a tropical line segment, where the cost is measured in the \(\ell ^\infty \) norm; see [27] for details. Here we study the asymmetric tropical distance, which has a similar interpretation, but for the \(\ell ^1\) norm. What makes the asymmetric tropical distance attractive is the fact that it leads to a regular consensus method, while this is not known to exist in the symmetric case. Further, the tropical median consensus method is robust and fast in practice; it can be applied to hundreds of trees with dozens of taxa. The robustness is computationally valuable as it makes floating-point computations reliable.

The tropical median consensus method seems to be rather conservative, in the sense that our consensus trees sometimes show little resolution. This is not necessarily bad. For instance, the apicomplexa gene trees studied in Sect. 5 come from a diverse mix of tree building methods, which should probably make it difficult to find a fully resolved consensus; cf. Fig. 9.

Fig. 9 The tropical median consensus of the apicomplexa data