The Generalized Robinson-Foulds Metric

  • Sebastian Böcker
  • Stefan Canzar
  • Gunnar W. Klau
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8126)

Abstract

The Robinson-Foulds (RF) metric is arguably the most widely used measure of phylogenetic tree similarity, despite its well-known shortcomings. For example, moving a single taxon in a tree can result in a tree that has maximum distance to the original one, even though the two trees become identical once that taxon is removed. To address this, we propose a natural extension of the RF metric that does not simply count identical clades but also takes similar clades into consideration. In contrast to previous approaches, our model requires the matching between clades to respect the structure of the two trees, a property that the classical RF metric also exhibits. We show that computing this generalized RF metric is, unfortunately, NP-hard. We then present a simple Integer Linear Program for its computation and evaluate it by an all-against-all comparison of 100 trees from a benchmark data set. We find that matchings that respect the tree structure differ significantly from those that do not, underlining the importance of this natural condition.
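
The shortcoming of the classical RF metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not compute the generalized metric. It assumes rooted trees written as nested Python tuples with string leaf labels, and it takes the RF distance to be the number of non-trivial clades found in exactly one of the two trees (some conventions halve or normalize this count). The names `clades` and `rf_distance` and the example trees are hypothetical.

```python
def clades(tree):
    """Collect the non-trivial clades of a rooted tree given as nested tuples.

    Assumption: inner nodes are tuples, leaves are hashable labels.
    Each clade is the frozenset of taxa below one inner node.
    """
    found = set()

    def taxa_below(node):
        if not isinstance(node, tuple):
            return frozenset([node])  # a leaf: trivial clade, not recorded
        taxa = frozenset().union(*(taxa_below(child) for child in node))
        found.add(taxa)
        return taxa

    root_taxa = taxa_below(tree)
    found.discard(root_taxa)  # the root clade is shared by all trees on these taxa
    return found


def rf_distance(tree_a, tree_b):
    """Classical RF distance: clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# A caterpillar tree on six taxa, and the same tree with taxon 'a' moved to the root.
original = ((((("a", "b"), "c"), "d"), "e"), "f")
moved = ("a", (((("b", "c"), "d"), "e"), "f"))

# The same two trees after removing taxon 'a' from each.
original_pruned = (((("b", "c"), "d"), "e"), "f")
moved_pruned = (((("b", "c"), "d"), "e"), "f")

print(rf_distance(original, moved))                # → 8
print(rf_distance(original_pruned, moved_pruned))  # → 0
```

Each of the two six-taxon trees has four non-trivial clades and they share none, so the distance of 8 is the maximum possible for this pair, although the trees differ only in the placement of one taxon. Every clade of `moved` differs from a clade of `original` by a single taxon, which is the kind of similarity that exact clade counting ignores.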

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sebastian Böcker (1)
  • Stefan Canzar (2)
  • Gunnar W. Klau (3)
  1. Bioinformatics, Friedrich Schiller University Jena, Germany
  2. Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, School of Medicine, Baltimore, USA
  3. Life Sciences Group, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
