Abstract
We describe a new method for reliably identifying conserved segments among genome sequences that have undergone rearrangement, horizontal transfer, and substantial nucleotide-level divergence. A Gibbs-like sampler explores different combinations of sequence-based markers shared by the genomes under study. The sampler assigns each marker a posterior probability based on how frequently it participates in some collinear group of markers. Markers with high posterior probability are likely members of conserved segments. The method identifies both large-scale and local trends in segmental collinearity, providing suitable input for genome alignment and rearrangement history inference tools. Applying our method to the genomes of four Streptococci reveals that rearranged segments in these organisms fall into two size categories: large conserved segments, and small segments of single-gene or operon size that interrupt them in a staccato pattern. The rearrangement pattern of the large segments is best explained by symmetric inversions about the origin of replication, while the pattern of the small segments is not.
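The idea of scoring a marker by how often it takes part in a collinear group can be illustrated with a toy sampler. The sketch below is not the authors' model: it assumes two genomes, markers given as (position in genome A, position in genome B) pairs, forward collinearity only, and a simple logistic conditional. The names `collinear_neighbors` and `marker_posteriors` and the parameters `window`, `weight`, and `bias` are hypothetical choices made for illustration.

```python
import math
import random


def collinear_neighbors(i, markers, included, window):
    """Count included markers close to marker i in both genomes and in the
    same relative order (forward collinearity only; an assumption)."""
    xi, yi = markers[i]
    count = 0
    for j, (xj, yj) in enumerate(markers):
        if j == i or not included[j]:
            continue
        dx, dy = xj - xi, yj - yi
        if abs(dx) <= window and abs(dy) <= window and dx * dy > 0:
            count += 1
    return count


def marker_posteriors(markers, sweeps=200, burn_in=50,
                      window=5, weight=1.5, bias=2.0, seed=0):
    """Estimate, for each marker, the fraction of post-burn-in sweeps in
    which it is included alongside collinear neighbors."""
    rng = random.Random(seed)
    n = len(markers)
    # Start from a random inclusion state.
    included = [rng.random() < 0.5 for _ in range(n)]
    hits = [0] * n
    for sweep in range(sweeps):
        # Gibbs-style update: resample each marker given all the others.
        for i in range(n):
            k = collinear_neighbors(i, markers, included, window)
            p_include = 1.0 / (1.0 + math.exp(-(weight * k - bias)))
            included[i] = rng.random() < p_include
        if sweep >= burn_in:
            for i in range(n):
                hits[i] += included[i]
    kept = sweeps - burn_in
    return [h / kept for h in hits]


if __name__ == "__main__":
    # Ten markers on one diagonal plus one isolated marker.
    toy_markers = [(i, i) for i in range(10)] + [(100, 3)]
    for marker, p in zip(toy_markers, marker_posteriors(toy_markers)):
        print(marker, round(p, 2))
```

In this toy setting, markers on the shared diagonal reinforce one another and are included in most sweeps, while the isolated marker is included only at the low baseline rate set by `bias`. A realistic implementation would also need to handle more than two genomes, inverted segments, and a principled likelihood.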
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mau, B., Darling, A.E., Perna, N.T. (2005). Identifying Evolutionarily Conserved Segments Among Multiple Divergent and Rearranged Genomes. In: Lagergren, J. (eds) Comparative Genomics. RCG 2004. Lecture Notes in Computer Science, vol 3388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-32290-0_6
DOI: https://doi.org/10.1007/978-3-540-32290-0_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24455-4
Online ISBN: 978-3-540-32290-0
eBook Packages: Computer Science (R0)