The Generalized Robinson-Foulds Metric
Abstract
The Robinson-Foulds (RF) metric is arguably the most widely used measure of phylogenetic tree similarity, despite its well-known shortcomings. For example, moving a single taxon in a tree can result in a tree that has maximum distance to the original one, even though the two trees become identical once that taxon is removed. To address this, we propose a natural extension of the RF metric that does not simply count identical clades but also takes similar clades into consideration. In contrast to previous approaches, our model requires the matching between clades to respect the structure of the two trees, a property that the classical RF metric also exhibits. We show that computing this generalized RF metric is, unfortunately, NP-hard. We then present a simple Integer Linear Program for its computation and evaluate it by an all-against-all comparison of 100 trees from a benchmark data set. We find that matchings that respect the tree structure differ significantly from those that do not, underlining the importance of this natural condition.
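The shortcoming described in the abstract can be illustrated with a small sketch of the classical RF distance on rooted trees. This is not the authors' code: the nested-tuple tree encoding, the helper names `clades`, `rf_distance`, and `prune`, and the six-taxon example are all assumptions made for illustration, and the distance is taken here as the unnormalized size of the symmetric difference of the non-trivial clade sets (some definitions divide this by two).

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree.

    A tree is a leaf label (str) or a tuple of subtrees (hypothetical
    encoding). A clade is the frozenset of leaf labels below an inner
    node; the root clade is excluded because every tree shares it.
    """
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        found.add(below)
        return below

    root_clade = visit(tree)
    found.discard(root_clade)
    return found


def rf_distance(tree_a, tree_b):
    """Count clades that occur in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


def prune(tree, taxon):
    """Remove one taxon and suppress inner nodes left with a single child."""
    if isinstance(tree, str):
        return None if tree == taxon else tree
    kept = [p for p in (prune(child, taxon) for child in tree) if p is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


# A caterpillar tree, and the same tree with taxon "a" moved next to the root.
t1 = ((((("a", "b"), "c"), "d"), "e"), "f")
t2 = ("a", (((("b", "c"), "d"), "e"), "f"))

print(rf_distance(t1, t2))                          # no non-trivial clade is shared
print(rf_distance(prune(t1, "a"), prune(t2, "a")))  # identical after removing "a"
```

In this example each tree has four non-trivial clades and none is shared, so the distance reaches its maximum of 8 for rooted binary trees on six taxa, while pruning "a" from both trees leaves two identical trees at distance 0. Counting only identical clades is what makes the classical metric this sensitive to a single misplaced taxon.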
Keywords
Maximum Match, Optimal Match, Satisfying Assignment, Complete Binary Tree, Variable Gadget