Analyzing Patterns of Microbial Evolution Using the Mauve Genome Alignment System

  • Aaron E Darling
  • Todd J Treangen
  • Xavier Messeguer
  • Nicole T Perna
Part of the Methods In Molecular Biology™ book series (MIMB, volume 396)

Summary

During the course of evolution, genomes can undergo large-scale mutation events such as rearrangement and lateral transfer. Such mutations can result in significant variations in gene order and gene content among otherwise closely related organisms. The Mauve genome alignment system can successfully identify such rearrangement and lateral transfer events in comparisons of multiple microbial genomes even under high levels of recombination. This chapter outlines the main features of Mauve and provides examples that describe how to use Mauve to conduct a rigorous multiple genome comparison and study evolutionary patterns.

Key Words

Microbial evolution; sequence alignment; comparative genomics; genome alignment; genome rearrangement; lateral transfer; Yersinia pestis

Notes

Acknowledgments

This work was funded in part by National Institutes of Health grant GM62994-02. A.E.D. was supported by NLM grant 5T15LM007359-05. T.J.T. was supported by Spanish Ministry MECD research grant TIN2004-03382-2.

Copyright information

© Humana Press Inc. 2007

Authors and Affiliations

  • Aaron E Darling (1)
  • Todd J Treangen (2)
  • Xavier Messeguer (3)
  • Nicole T Perna (4)

  1. Department of Computer Science, University of Wisconsin-Madison, Madison
  2. Department of Software, Technical University of Catalonia-Barcelona, Barcelona
  3. Department of Software, Technical University of Catalonia-Barcelona, Barcelona Supercomputing Center (BSC), Barcelona
  4. Department of Animal Health and Biomedical Sciences, Genome Center, University of Wisconsin-Madison, Madison
