Quartet-Based Phylogeny Reconstruction from Gene Orders

  • Tao Liu
  • Jijun Tang
  • Bernard M. E. Moret
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3595)

Abstract

Phylogenetic reconstruction from gene-rearrangement data is attracting increasing attention from biologists and computer scientists. Methods used in reconstruction include distance-based methods, parsimony methods using sequence encodings, and direct optimization. The latter, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate approach; however, its exhaustive approach means that it can be applied only to small datasets of fewer than 15 taxa. While we have successfully scaled it up to 1,000 genomes by integrating it with a disk-covering method (DCM-GRAPPA), the recursive decomposition may need many levels of recursion to handle datasets with 1,000 or more genomes. We thus investigated quartet-based approaches, which directly decompose the datasets into subsets of four taxa each; such approaches have been well studied for sequence data, but not for gene-rearrangement data. We give an optimization algorithm for the NP-hard problem of computing optimal trees for each quartet, present a variation of the dyadic method (using heuristics to choose suitable short quartets), and use both in simulation studies. We find that our quartet-based method can handle more genomes than the base version of GRAPPA, thus enabling us to reduce the number of levels of recursion in DCM-GRAPPA, but is more sensitive to the rate of evolution, with error rates rapidly increasing when saturation is approached.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Berry, V., Jiang, T., Kearney, P., Li, M., Wareham, T.: Quartet cleaning: improved algorithms and simulations. In: Nešetřil, J. (ed.) ESA 1999. LNCS, vol. 1643, pp. 313–324. Springer, Heidelberg (1999)Google Scholar
  2. 2.
    Blanchette, M., Bourque, G., Sankoff, D.: Breakpoint phylogenies. In: Miyano, S., Takagi, T. (eds.) Genome Informatics 1997, pp. 25–34. Univ. Academy Press, Tokyo (1997)Google Scholar
  3. 3.
    Bryant, D., Berry, V., Jiang, T., Kearney, P., Li, M., Wareham, T., Zhang, H.: A practical algorithm for recovering the best supported edges of an evolutionary tree. In: Proc. 11th Ann. ACM/SIAM Symp. Discrete Algs (SODA 2000), pp. 287–296. ACM Press, New York (2000)Google Scholar
  4. 4.
    Buneman, P.: The recovery of trees from measures of dissimilarity. Edinburgh University Press (1971)Google Scholar
  5. 5.
    Caprara, A.: Formulations and hardness of multiple sorting by reversals. In: Proc. 3rd Ann. Int’l Conf. Comput. Mol. Biol (RECOMB 1999), pp. 84–93. ACM Press, New York (1999)Google Scholar
  6. 6.
    Cosner, M.E., Jansen, R.K., Moret, B.M.E., Raubeson, L.A., Wang, L., Warnow, T., Wyman, S.K.: An empirical comparison of phylogenetic methods on chloroplast gene order data in Campanulaceae. In: Sankoff, D., Nadeau, J.H. (eds.) Comparative Genomics, pp. 99–122. Kluwer Academic Publishers, Dordrecht (2000)CrossRefGoogle Scholar
  7. 7.
    Downie, S.R., Palmer, J.D.: Use of chloroplast DNA rearrangements in reconstructing plant phylogeny. In: Soltis, P., Soltis, D., Doyle, J.J. (eds.) Plant Molecular Systematics, pp. 14–35. Chapman and Hall, Boca Raton (1992)CrossRefGoogle Scholar
  8. 8.
    Erdös, P., Steel, M.A., Székely, L.A., Warnow, T.: A few logs suffice to build (almost) all trees I. Random Structs. and Algs. 14, 153–184 (1997)CrossRefGoogle Scholar
  9. 9.
    Erdös, P.L., Steel, M.A., Székely, L.A., Warnow, T.: Constructing big trees from short sequences. In: Degano, P., Gorrieri, R., Marchetti-Spaccamela, A. (eds.) ICALP 1997. LNCS, vol. 1256, Springer, Heidelberg (1997)Google Scholar
  10. 10.
    Erdös, P.L., Steel, M.A., Székely, L.A., Warnow, T.: Local quartet splits of a binary tree infer all quartet splits via one dyadic inference rule. Computers and Artif. Intell. 16(2), 217–227 (1997)MATHGoogle Scholar
  11. 11.
    Huson, D., Nettles, S., Warnow, T.: Disk-covering, a fast converging method for phylogenetic tree reconstruction. J. Comput. Biol. 6(3), 369–386 (1999)CrossRefGoogle Scholar
  12. 12.
    Jiang, T., Kearney, P.E., Li, M.: A polynomial-time approximation scheme for inferring evolutionary trees from quartet topologies and its application. SIAM J. Computing 30(6), 1942–1961 (2001)MathSciNetCrossRefMATHGoogle Scholar
  13. 13.
    Kearney, P.E.: The ordinal quartet method. In: Proc. 2nd Ann. Int’l Conf. Comput. Mol. Biol (RECOMB 1998), pp. 125–134. ACM Press, New York (1998)Google Scholar
  14. 14.
    Moret, B.M.E., Siepel, A.C., Tang, J., Liu, T.: Inversion medians outperform breakpoint medians in phylogeny reconstruction from gene-order data. In: Guigó, R., Gusfield, D. (eds.) WABI 2002. LNCS, vol. 2452, pp. 521–536. Springer, Heidelberg (2002)CrossRefGoogle Scholar
  15. 15.
    Moret, B.M.E., Tang, J., Wang, L.-S., Warnow, T.: Steps toward accurate reconstructions of phylogenies from gene-order data. J. Comput. Syst. Sci. 65(3), 508–525 (2002)MathSciNetCrossRefMATHGoogle Scholar
  16. 16.
    Moret, B.M.E., Tang, J., Warnow, T.: Reconstructing phylogenies from gene-content and gene-order data. In: Gascuel, O. (ed.) Mathematics of Evolution and Phylogeny, pp. 321–352. Oxford University Press, Oxford (2005)Google Scholar
  17. 17.
    Palmer, J.D.: Chloroplast and mitochondrial genome evolution in land plants. In: Herrmann, R. (ed.) Cell Organelles, pp. 99–133. Springer, Heidelberg (1992)CrossRefGoogle Scholar
  18. 18.
    Raubeson, L.A., Jansen, R.K.: Chloroplast DNA evidence on the ancient evolutionary split in vascular land plants. Science 255, 1697–1699 (1992)CrossRefGoogle Scholar
  19. 19.
    Rokas, A., Holland, P.W.H.: Rare genomic changes as a tool for phylogenetics. Trends in Ecol. and Evol. 15, 454–459 (2000)CrossRefGoogle Scholar
  20. 20.
    Roshan, U., Moret, B.M.E., Warnow, T., Williams, T.L.: Performance of supertree methods on various dataset decompositions. In: Bininda-Edmonds, O.R.P. (ed.) Phylogenetic Supertrees: Combining information to reveal the Tree of Life, pp. 301–328. Kluwer Academic Publishers, Dordrecht (2004)CrossRefGoogle Scholar
  21. 21.
    Saitou, N., Nei, M.: The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987)Google Scholar
  22. 22.
    Sankoff, D., Blanchette, M.: The median problem for breakpoints in comparative genomics. In: Jiang, T., Lee, D.T. (eds.) COCOON 1997. LNCS, vol. 1276, pp. 251–264. Springer, Heidelberg (1997)CrossRefGoogle Scholar
  23. 23.
    Sankoff, D., Blanchette, M.: Multiple genome rearrangement and breakpoint phylogeny. J. Comput. Biol. 5, 555–570 (1998)CrossRefGoogle Scholar
  24. 24.
    John, K.S., Warnow, T., Moret, B.M.E., Vawter, L.: Performance study of phylogenetic methods (unweighted) quartet methods and neighbor-joining. J. Algorithms 48(1), 173–193 (2003)MathSciNetCrossRefMATHGoogle Scholar
  25. 25.
    Strimmer, K., von Haeseler, A.: Quartet puzzling: A quartet maximum likelihood method for reconstructing tree topologies. Mol. Biol. Evol. 13, 964–969 (1996)CrossRefGoogle Scholar
  26. 26.
    Swenson, K.M., Marron, M., Earnest-DeYoung, J.V., Moret, B.M.E.: Approximating the true evolutionary distance between two genomes. In: Proc. 7th SIAM Workshop on Algorithm Engineering & Experiments (ALENEX 2005), SIAM Press, Philadelphia (2005)Google Scholar
  27. 27.
    Swofford, D.L., Olsen, G.J., Waddell, P.J., Hillis, D.M.: Phylogenetic inference. In: Hillis, D.M., Mable, B.K., Moritz, C. (eds.) Molecular Systematics, pp. 407–514. Sinauer Assoc., SunderlandGoogle Scholar
  28. 28.
    Tang, J., Moret, B.M.E.: Linear programming for phylogenetic reconstruction based on gene rearrangements. In: Apostolico, A., Crochemore, M., Park, K. (eds.) CPM 2005. LNCS, vol. 3537, pp. 406–416. Springer, Heidelberg (2005)CrossRefGoogle Scholar
  29. 29.
    Tang, J., Moret, B.M.E.: Scaling up accurate phylogenetic reconstruction from gene-order data. In: Proc. 11th Int’l Conf. on Intelligent Systems for Mol. Biol (ISMB 2003). Bioinformatics, vol. 19, pp. i305–i312. Oxford U. Press, Oxford (2003)Google Scholar
  30. 30.
    Tang, J., Moret, B.M.E., Cui, L., dePamphilis, C.W.: Phylogenetic reconstruction from arbitrary gene-order data. In: Proc. 4th IEEE Symp. on Bioinformatics and Bioengineering BIBE 2004, IEEE Press, Piscataway (2004)Google Scholar
  31. 31.
    Wang, L.-S., Jansen, R.K., Moret, B.M.E., Raubeson, L.A., Warnow, T.: Fast phylogenetic methods for genome rearrangement evolution: An empirical study. In: Proc. 7th Pacific Symp. on Biocomputing (PSB 2002), pp. 524–535. World Scientific Pub., Singapore (2002)Google Scholar
  32. 32.
    Warnow, T., Moret, B.M.E., John, K.S.: Absolute convergence: true trees from short sequences. In: Proc. 12th Ann. ACM/SIAM Symp. Discrete Algs (SODA 2001), pp. 186–195. SIAM Press, Philadelphia (2001)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Tao Liu
    • 1
  • Jijun Tang
    • 2
  • Bernard M. E. Moret
    • 1
  1. 1.Department of Computer ScienceU. of New MexicoAlbuquerqueUSA
  2. 2.Department of Computer Science & EngineeringU. of South CarolinaColumbiaUSA

Personalised recommendations