Maximal Accurate Forests from Distance Matrices

  • Constantinos Daskalakis
  • Cameron Hill
  • Alexander Jaffe
  • Radu Mihaescu
  • Elchanan Mossel
  • Satish Rao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3909)

Abstract

We present a fast-converging method for distance-based phylogenetic inference, which is novel in two respects. First, it is the only method (to our knowledge) to guarantee accuracy when knowledge about the model tree, i.e., bounds on the edge lengths, is not assumed. Second, our algorithm guarantees that, with high probability, no false assertions are made. The algorithm produces a maximal forest of the model tree, in time Õ(n³) in the typical case. Empirical testing has been promising: the method compares favorably to Neighbor Joining, with the advantage of making few or no false assertions about the topology of the model tree. The strength of the guarantee against false positives is a parameter that the user can control.
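The idea of asserting only what the data supports can be illustrated on the smallest case, a single quartet of taxa. The sketch below is not the algorithm of this paper; it is a generic four-point test on a distance matrix, written under the assumption that a topology is reported only when it wins by more than a user-chosen margin. The names quartet_split and margin are hypothetical and chosen for illustration only.

```python
from typing import Optional, Sequence, Tuple

Split = Tuple[Tuple[int, int], Tuple[int, int]]


def quartet_split(
    d: Sequence[Sequence[float]],
    a: int, b: int, c: int, e: int,
    margin: float,
) -> Optional[Split]:
    """Infer the split of taxa {a, b, c, e} from pairwise distances d.

    Hypothetical helper, not taken from the paper. For an exact tree
    metric the true split has the smallest pair sum and the other two
    sums are equal (four-point condition). With noisy estimates we
    abstain (return None) unless the best sum beats the runner-up by
    more than `margin`, so a larger margin means fewer, safer assertions.
    """
    candidates = [
        (d[a][b] + d[c][e], ((a, b), (c, e))),
        (d[a][c] + d[b][e], ((a, c), (b, e))),
        (d[a][e] + d[b][c], ((a, e), (b, c))),
    ]
    candidates.sort(key=lambda t: t[0])
    best, runner_up = candidates[0], candidates[1]
    if runner_up[0] - best[0] > margin:
        return best[1]
    return None  # not enough evidence: make no assertion


# Tree metric for the quartet 01|23: leaf edges of length 1, internal edge 2.
D = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]
confident = quartet_split(D, 0, 1, 2, 3, margin=1.0)
cautious = quartet_split(D, 0, 1, 2, 3, margin=5.0)
```

With the small margin the call reports the split 01|23; with the large margin it abstains, which mirrors, in miniature, the trade-off between resolution and false assertions that the abstract describes.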

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Constantinos Daskalakis
    • 1
  • Cameron Hill
    • 1
  • Alexander Jaffe
    • 1
  • Radu Mihaescu
    • 1
  • Elchanan Mossel
    • 1
  • Satish Rao
    • 1
  1. University of California, Berkeley
