Phylogenetic Reconstruction from Gene-Rearrangement Data with Unequal Gene Content

  • Jijun Tang
  • Bernard M. E. Moret
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2748)

Abstract

Phylogenetic reconstruction from gene-rearrangement data has seen increased attention over the last five years. Existing methods are limited computationally and by the assumption (highly unrealistic in practice) that all genomes have the same gene content. We have recently shown that we can scale our reconstruction tool, GRAPPA, to instances with up to a thousand genomes with no loss of accuracy and at minimal computational cost. Computing genomic distances between two genomes with unequal gene contents has seen much progress recently, but that progress has not yet been reflected in phylogenetic reconstruction methods. In this paper, we present extensions to our GRAPPA approach that can handle limited numbers of duplications (one of the main requirements for analyzing genomic data from organelles) and a few deletions. Although GRAPPA is based on exhaustive search, we show that, in practice, our bounding functions suffice to prune away almost all of the search space (our pruning rates never fall below 99.995%), resulting in high accuracy and fast running times. The range of values within which we have tested our approach encompasses mitochondria and chloroplast organellar genomes, whose phylogenetic analysis is providing new insights on evolution.

Keywords

Computational biology, phylogenetic reconstruction, gene-order data, whole-genome data, signed permutations, lower bounds, Hannenhalli-Pevzner theory, inversion distance, reversal distance, edit distance, gene duplications, experimental assessment

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Jijun Tang (1)
  • Bernard M. E. Moret (1)
  1. Dept. of Computer Science, University of New Mexico, Albuquerque, USA
