Cophenetic Distances: A Near-Linear Time Algorithmic Framework

  • Conference paper
Computing and Combinatorics (COCOON 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10976)

Abstract

Tree metrics that compare pairs of trees are an elementary tool for analyzing phylogenetic trees. The cophenetic distance is a classic vector-based tree metric introduced by Cardona et al. that originates from the pioneering work of Sokal and Rohlf more than 50 years ago. However, in phylogenetic analyses that compare sets of large-scale trees, the quadratic runtime of the best-known (naïve) algorithm for computing the cophenetic distance becomes prohibitive. Here we describe an algorithmic framework that computes the cophenetic distance under the \(L_1\)-norm in \(O(n \log ^2 n)\) time, where \(n\) is the size of the compared pair of trees. Building on the work of Sokal and Rohlf, we introduce a natural class of cophenetic distances and show that our algorithmic framework can compute each member of this class in \(O(n \log ^2 n)\) time. In addition, we present a modification of this framework for computing these distances under the \(L_2\)-norm in \(O(n \log n)\) time. Finally, we demonstrate the scalability of our algorithm.
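
For orientation, the quadratic baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the near-linear framework of the paper. It assumes the usual unweighted definition in which the cophenetic value of a pair of taxa is the depth of their lowest common ancestor and the value of a single taxon is its own depth. The nested-tuple tree encoding and the function names `cophenetic_vector` and `cophenetic_distance` are hypothetical choices made for this example.

```python
def cophenetic_vector(tree):
    """Map each unordered pair of leaf labels (including x == y) to the
    depth of their lowest common ancestor. Assumed encoding: a leaf is a
    string, an internal node is a tuple of children, the root has depth 0."""
    values = {}

    def visit(node, depth):
        # Returns the list of leaf labels found below `node`.
        if isinstance(node, str):
            values[(node, node)] = depth  # a taxon paired with itself: its depth
            return [node]
        child_leaves = [visit(child, depth + 1) for child in node]
        # Leaves that sit under different children have `node` as their LCA.
        for a in range(len(child_leaves)):
            for b in range(a + 1, len(child_leaves)):
                for x in child_leaves[a]:
                    for y in child_leaves[b]:
                        values[tuple(sorted((x, y)))] = depth
        return [leaf for leaves in child_leaves for leaf in leaves]

    visit(tree, 0)
    return values


def cophenetic_distance(t1, t2, p=1):
    """L_p norm of the difference of the two cophenetic vectors (naive,
    at least quadratic in the number of leaves)."""
    v1, v2 = cophenetic_vector(t1), cophenetic_vector(t2)
    if v1.keys() != v2.keys():
        raise ValueError("trees must share the same leaf set")
    total = sum(abs(v1[key] - v2[key]) ** p for key in v1)
    return total ** (1.0 / p)
```

For the trees `(("a", "b"), "c")` and `(("a", "c"), "b")`, the two vectors differ by 1 in four coordinates, so the sketch returns 4 under the \(L_1\)-norm and 2 under the \(L_2\)-norm. Every pair of taxa is visited explicitly, which is the quadratic cost that the framework described above is designed to avoid.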

References

  1. Allen, B.L., Steel, M.: Subtree transfer operations and their induced metrics on evolutionary trees. Ann. Comb. 5(1), 1–15 (2001)

  2. Bordewich, M., Semple, C.: On the computational complexity of the rooted subtree prune and regraft distance. Ann. Comb. 8(4), 409–423 (2005)

  3. Bourque, M.: Arbres de Steiner et réseaux dont varie l’emplacement de certains sommets. Ph.D. thesis, University of Montréal, Montréal, Canada (1978)

  4. Bryant, D.: Hunting for trees, building trees and comparing trees: theory and method in phylogenetic analysis. Ph.D. thesis, University of Canterbury, New Zealand (1997)

  5. Cardona, G., Mir, A., Rosselló, F., Rotger, L.: The expected value of the squared cophenetic metric under the Yule and the uniform models. Math. Biosci. 295, 73–85 (2018)

  6. Cardona, G., Mir, A., Rosselló, F., Rotger, L., Sánchez, D.: Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf. BMC Bioinform. 14(1), 3 (2013)

  7. Critchlow, D., Pearl, D., Qian, C.: The triples distance for rooted bifurcating phylogenetic trees. Syst. Biol. 45, 323–334 (1996)

  8. DasGupta, B., et al.: On distances between phylogenetic trees. In: SODA, vol. 97, pp. 427–436 (1997)

  9. Estabrook, G., McMorris, F., Meacham, C.: Comparison of undirected phylogenetic trees based on subtrees of four evolutionary units. Syst. Zool. 34, 193–200 (1985)

  10. Eulenstein, O., Huzurbazar, S., Liberles, D.: Reconciling phylogenetic trees. In: Evolution After Gene Duplication. Wiley, Hoboken (2010)

  11. Felsenstein, J.: Inferring Phylogenies. Sinauer Associates, Inc., Sunderland (2004)

  12. Forster, P., Renfrew, C.: Phylogenetic Methods and the Prehistory of Languages. McDonald Institute for Archaeological Research, Cambridge (2006)

  13. Górecki, P., Eulenstein, O., Tiuryn, J.: Unrooted tree reconciliation: a unified approach. IEEE/ACM Trans. Comput. Biol. Bioinform. 10(2), 522–536 (2013)

  14. Harris, S., et al.: Whole-genome sequencing for analysis of an outbreak of meticillin-resistant Staphylococcus aureus: a descriptive study. Lancet Infect. Dis. 13(2), 130–136 (2013)

  15. Hein, J.: Reconstructing evolution of sequences subject to recombination using parsimony. Math. Biosci. 98(2), 185–200 (1990)

  16. Hein, J., et al.: On the complexity of comparing evolutionary trees. Discrete Appl. Math. 71(1–3), 153–169 (1996)

  17. Hickey, G., et al.: SPR distance computation for unrooted trees. Evol. Bioinform. Online 4, 17–27 (2008)

  18. Hoef-Emden, K.: Molecular phylogenetic analyses and real-life data. Comput. Sci. Eng. 7(3), 86–91 (2005)

  19. St. John, K.: Review paper: the shape of phylogenetic treespace. Syst. Biol. 66(1), e83–e94 (2017)

  20. Kendall, M., Colijn, C.: Mapping phylogenetic trees to reveal distinct patterns of evolution. Mol. Biol. Evol. 33(10), 2735–2743 (2016)

  21. Kuhner, M.K., Yamato, J.: Practical performance of tree comparison metrics. Syst. Biol. 64(2), 205–214 (2015)

  22. Li, M., Tromp, J., Zhang, L.: On the nearest neighbour interchange distance between evolutionary trees. J. Theor. Biol. 182(4), 463–467 (1996)

  23. Markin, A., Eulenstein, O.: Cophenetic median trees under the Manhattan distance. In: ACM-BCB 2017, pp. 194–202. ACM, New York (2017)

  24. Robinson, D.F., Foulds, L.R.: Comparison of phylogenetic trees. Math. Biosci. 53(1–2), 131–147 (1981)

  25. Roux, J., et al.: Resolving the native provenance of invasive fireweed (Senecio madagascariensis Poir.) in the Hawaiian Islands as inferred from phylogenetic analysis. Div. Distr. 12, 694–702 (2006)

  26. Sand, A., et al.: Algorithms for computing the triplet and quartet distances for binary and general trees. Biology 2(4), 1189–1209 (2013)

  27. Semple, C., Steel, M.A.: Phylogenetics. Oxford University Press, Oxford (2003)

  28. Sokal, R.R., Rohlf, F.J.: The comparison of dendrograms by objective methods. Taxon 11(2), 33–40 (1962)

  29. Steel, M.A., Penny, D.: Distributions of tree comparison metrics. Syst. Biol. 42(2), 126–141 (1993)

  30. Williams, W., Clifford, H.: On the comparison of two classifications of the same set of elements. Taxon 20(4), 519–522 (1971)

Author information

Corresponding author

Correspondence to Paweł Górecki.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Górecki, P., Markin, A., Eulenstein, O. (2018). Cophenetic Distances: A Near-Linear Time Algorithmic Framework. In: Wang, L., Zhu, D. (eds) Computing and Combinatorics. COCOON 2018. Lecture Notes in Computer Science, vol 10976. Springer, Cham. https://doi.org/10.1007/978-3-319-94776-1_15

  • DOI: https://doi.org/10.1007/978-3-319-94776-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94775-4

  • Online ISBN: 978-3-319-94776-1

  • eBook Packages: Computer Science, Computer Science (R0)
