Abstract
The Robinson-Foulds (RF) metric is arguably the most widely used measure of phylogenetic tree similarity, despite its well-known shortcomings. For example, moving a single taxon in a tree can yield a tree at maximum distance from the original, even though the two trees become identical once that taxon is removed. To address this, we propose a natural extension of the RF metric that does not simply count identical clades but also takes similar clades into consideration. In contrast to previous approaches, our model requires the matching between clades to respect the structure of the two trees, a property that the classical RF metric also exhibits. We show that computing this generalized RF metric is, unfortunately, NP-hard. We then present a simple Integer Linear Program for its computation and evaluate it in an all-against-all comparison of 100 trees from a benchmark data set. We find that matchings that respect the tree structure differ significantly from those that do not, underlining the importance of this natural condition.
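The single-taxon shortcoming described above can be reproduced with a minimal sketch of the classical RF metric on rooted trees. It is an illustration under stated assumptions, not the paper's implementation: trees are encoded as nested Python tuples with string leaf labels, the distance is taken as the unnormalized size of the symmetric difference of the non-trivial clade sets, and the names `clades`, `rf_distance`, and `prune` are hypothetical helpers introduced here. It does not implement the generalized metric.

```python
def clades(tree):
    """Collect the non-trivial clades of a rooted tree given as nested tuples.

    Each clade is a frozenset of the leaf labels below an internal node.
    Assumption: leaves are plain (non-tuple) labels such as strings.
    """
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    # The root clade contains every taxon and is shared by all trees
    # on the same taxon set, so it carries no information.
    found.discard(all_leaves)
    return found


def rf_distance(t1, t2):
    """Classical RF distance: number of clades found in exactly one tree."""
    return len(clades(t1) ^ clades(t2))


def prune(tree, taxon):
    """Remove one leaf and suppress internal nodes left with a single child."""
    if not isinstance(tree, tuple):
        return None if tree == taxon else tree
    kept = [c for c in (prune(child, taxon) for child in tree) if c is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)


# A caterpillar tree on five taxa, and the same tree with taxon "a"
# moved from the innermost cherry to the root.
t1 = (((("a", "b"), "c"), "d"), "e")
t2 = ("a", ((("b", "c"), "d"), "e"))

print(rf_distance(t1, t2))                          # 6: no non-trivial clade is shared
print(rf_distance(prune(t1, "a"), prune(t2, "a")))  # 0: identical without "a"
```

With five taxa, each binary rooted tree has three non-trivial clades, so a distance of 6 is the maximum possible, although the trees differ only in the placement of one taxon.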
This work is supported in part by the National Institutes of Health under grant R01 HG006677.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Böcker, S., Canzar, S., Klau, G.W. (2013). The Generalized Robinson-Foulds Metric. In: Darling, A., Stoye, J. (eds) Algorithms in Bioinformatics. WABI 2013. Lecture Notes in Computer Science(), vol 8126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40453-5_13
DOI: https://doi.org/10.1007/978-3-642-40453-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40452-8
Online ISBN: 978-3-642-40453-5
eBook Packages: Computer Science, Computer Science (R0)