Sequence-Length Requirements for Phylogenetic Methods
We study the sequence lengths required by neighbor-joining, greedy parsimony, and a phylogenetic reconstruction method (DCM-NJ+MP) based on disk-covering and the maximum parsimony criterion. We use extensive simulations based on random birth-death trees, with controlled deviations from ultrametricity, to collect data on the scaling of sequence-length requirements for each of the three methods as a function of the number of taxa, the rate of evolution on the tree, and the deviation from ultrametricity. Our experiments show that DCM-NJ+MP has consistently lower sequence-length requirements than the other two methods when trees of high topological accuracy are desired, although all methods require much longer sequences as the deviation from ultrametricity or the height of the tree grows. Our study has significant implications for large-scale phylogenetic reconstruction (where sequence-length requirements are a crucial factor), but also for future performance analyses in phylogenetics (since deviations from ultrametricity are proving pivotal).
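To make the notion of a sequence-length requirement concrete, the sketch below reads it off a table of simulation results as the shortest tested sequence length at which a method reaches a target topological accuracy. The abstract does not give the study's exact definition, so this definition, the function name `sequence_length_requirement`, and the numbers in `toy_results` are all assumptions chosen for illustration, not data or code from the study.

```python
from typing import Optional, Sequence, Tuple


def sequence_length_requirement(
    results: Sequence[Tuple[int, float]],
    target_accuracy: float,
) -> Optional[int]:
    """Return the smallest tested sequence length whose mean topological
    accuracy is at least target_accuracy, or None if no tested length
    reaches it.

    `results` holds (sequence_length, mean_accuracy) pairs for one method
    at one fixed setting (number of taxa, evolutionary rate, deviation
    from ultrametricity). This definition is an assumption.
    """
    # Scan lengths from shortest to longest; stop at the first that qualifies.
    for length, accuracy in sorted(results):
        if accuracy >= target_accuracy:
            return length
    return None


# Hypothetical, made-up accuracies (not results from the study).
toy_results = [(100, 0.62), (200, 0.78), (400, 0.91), (800, 0.97)]

requirement = sequence_length_requirement(toy_results, target_accuracy=0.90)
print(requirement)
```

Repeating this lookup across numbers of taxa, evolutionary rates, and deviations from ultrametricity would yield the kind of scaling curves the abstract describes, under the assumed definition.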
Keywords: Model Tree, Sequence Length, Phylogenetic Method, Expected Deviation, True Tree