Cluster Matching Distance for Rooted Phylogenetic Trees

  • Conference paper
Bioinformatics Research and Applications (ISBRA 2018)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10847)

Abstract

Phylogenetic trees are fundamental to biology and benefit several other research areas. Various methods have been developed for inferring such trees, and comparing the resulting trees is an important problem in computational phylogenetics. Addressing this problem requires tree measures, but all of them suffer from problems that can severely limit their applicability in practice. This also holds true for one of the oldest and most widely used tree measures, the Robinson-Foulds distance. Although this measure satisfies the properties of a metric and is efficiently computable, it has a negatively skewed distribution, a poor range of discrimination and diameter, and may not be robust when comparing erroneous trees. The cluster distance is a measure for comparing rooted trees that can be interpreted as a weighted version of the Robinson-Foulds distance. We show that, compared with the Robinson-Foulds distance, the cluster distance is much more robust to small errors in the compared trees and has a significantly improved distribution and range.
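
The robustness issue mentioned above can be illustrated with a minimal sketch that computes the Robinson-Foulds distance between rooted trees as the size of the symmetric difference of their cluster sets (unnormalized). The nested-tuple tree encoding, the function names `clusters` and `rf_distance`, and the example trees are assumptions made for illustration and do not come from the paper; the cluster distance itself is not defined in this preview and is not implemented here.

```python
def clusters(tree):
    """Collect the cluster (set of leaf labels) below every internal node.

    Hypothetical encoding: a leaf is a string, an internal node is a tuple
    of child subtrees.
    """
    found = set()

    def leaf_set(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaf_set(child) for child in node))
        found.add(below)  # record the cluster of this internal node
        return below

    leaf_set(tree)
    return found


def rf_distance(t1, t2):
    """Unnormalized Robinson-Foulds distance: clusters present in exactly one tree."""
    return len(clusters(t1) ^ clusters(t2))


# A caterpillar tree on five leaves.
t1 = (((("a", "b"), "c"), "d"), "e")
# The same tree with the single leaf "a" moved next to the root.
t2 = (((("b", "c"), "d"), "e"), "a")
# The same tree with the two outermost leaves "d" and "e" swapped.
t3 = (((("a", "b"), "c"), "e"), "d")

print(rf_distance(t1, t2))  # 6
print(rf_distance(t1, t3))  # 2
```

In this example, relocating one leaf changes every non-root cluster, so the two trees are at distance 6, the largest value possible for binary rooted trees on five leaves, even though they differ only in the placement of a single taxon.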

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No. 1617626.

Author information

Corresponding author

Correspondence to Jucheol Moon.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Moon, J., Eulenstein, O. (2018). Cluster Matching Distance for Rooted Phylogenetic Trees. In: Zhang, F., Cai, Z., Skums, P., Zhang, S. (eds) Bioinformatics Research and Applications. ISBRA 2018. Lecture Notes in Computer Science, vol 10847. Springer, Cham. https://doi.org/10.1007/978-3-319-94968-0_31

  • DOI: https://doi.org/10.1007/978-3-319-94968-0_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94967-3

  • Online ISBN: 978-3-319-94968-0

  • eBook Packages: Computer Science, Computer Science (R0)
