A Fast Algorithm for Computing the Quartet Distance for Large Sets of Evolutionary Trees

  • Ralph W. Crosby
  • Tiffani L. Williams
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7292)

Abstract

We present QuickQuartet, an algorithm for computing the all-to-all quartet distance for large collections of evolutionary trees. By leveraging the relationship between bipartitions and quartets, our approach significantly improves upon the performance of existing quartet distance algorithms. To explore QuickQuartet’s performance, we analyze two sets of biological data containing 20,000 trees over 150 taxa and 33,306 trees over 567 taxa, respectively. Experimental results show that QuickQuartet is up to 100 times faster than existing methods. With QuickQuartet available, the quartet distance becomes a practical tool that biologists can use to gain new insights into their large tree collections.
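
To make the relationship between bipartitions and quartets concrete, here is a minimal brute-force sketch. It is not the QuickQuartet algorithm and does not reflect its data structures or running time. It assumes each tree is given as a set of non-trivial bipartitions, each stored as the set of taxa on one side. A bipartition resolves a four-taxon subset when exactly two of the four taxa fall on each side, and the quartet distance counts the four-taxon subsets whose induced topology differs between two trees. The names `quartet_topology` and `quartet_distance` and the five-taxon example trees are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def quartet_topology(four, splits):
    """Return the pairing of `four` taxa induced by a bipartition in `splits`,
    or None if no bipartition separates them two-and-two (unresolved quartet)."""
    four = frozenset(four)
    for side in splits:
        inside = four & side
        if len(inside) == 2:
            # Two taxa on this side, so the other two are on the opposite side.
            return frozenset({inside, four - inside})
    return None


def quartet_distance(taxa, splits_a, splits_b):
    """Count four-taxon subsets whose induced topology differs between two trees.

    Brute force over all C(n, 4) subsets; only practical for small n."""
    return sum(
        1
        for four in combinations(sorted(taxa), 4)
        if quartet_topology(four, splits_a) != quartet_topology(four, splits_b)
    )


# Hypothetical example: two trees over the taxa A..E.
taxa = set("ABCDE")
tree_a = {frozenset("AB"), frozenset("DE")}  # ((A,B),C,(D,E))
tree_b = {frozenset("AC"), frozenset("DE")}  # ((A,C),B,(D,E))

print(quartet_distance(taxa, tree_a, tree_b))
```

In this example the two trees disagree only on the subsets {A,B,C,D} and {A,B,C,E}. An all-to-all computation over a collection would repeat this comparison for every pair of trees, which is the cost a specialized algorithm aims to avoid.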

Keywords

Directed Acyclic Graph; Target Tree; Hash Table; Internal Edge; Source Tree

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ralph W. Crosby (1)
  • Tiffani L. Williams (1)
  1. Department of Computer Science and Engineering, Texas A&M University, USA
