Alignment-Free Phylogenetic Reconstruction

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2010)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 6044)

Abstract

We introduce the first polynomial-time phylogenetic reconstruction algorithm under a model of sequence evolution allowing insertions and deletions (or indels). Given appropriate assumptions, our algorithm requires sequence lengths growing polynomially in the number of leaf taxa. Our techniques are distance-based and largely bypass the problem of multiple alignment.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Daskalakis, C., Roch, S. (2010). Alignment-Free Phylogenetic Reconstruction. In: Berger, B. (ed.) Research in Computational Molecular Biology. RECOMB 2010. Lecture Notes in Computer Science, vol 6044. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12683-3_9

  • DOI: https://doi.org/10.1007/978-3-642-12683-3_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12682-6

  • Online ISBN: 978-3-642-12683-3

  • eBook Packages: Computer Science, Computer Science (R0)
