Abstract
Comparison of closely related bacterial genomes has revealed highly conserved sequences that form a "backbone" interrupted by numerous, less conserved DNA fragments. Segmenting bacterial genomes into backbone and variable regions is particularly useful for investigating bacterial genome evolution. Several software tools have been designed to compare complete bacterial chromosomes, and a few online databases store pre-computed genome comparisons. However, very few statistical methods are available to evaluate the reliability of these tools or to compare the results they produce. To fill this gap, we have developed two local scores that measure the robustness of bacterial genome segmentations. Our method uses a simulation procedure based on random perturbations of the compared genomes. The scores presented in this paper are simple to implement, and our results show that they make it easy to discriminate between robust and non-robust bacterial genome segmentations obtained with aligners such as MAUVE and MGA.
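The abstract does not define the two scores, so the following Python sketch only illustrates the general idea of a perturbation-based local robustness measure. Everything in it is an assumption made for illustration: the function names (`toy_segment`, `perturb`, `local_robustness`), the point-substitution perturbation model, the parameters `rate`, `n_rep` and `min_len`, and above all the toy segmenter, which stands in for a real aligner such as MAUVE or MGA and handles neither indels nor rearrangements. The score computed here is simply the fraction of perturbed replicates in which a position keeps its original backbone or variable label; it is not claimed to match either score from the paper.

```python
import random


def toy_segment(a, b, min_len=5):
    """Label each position of `a` as backbone (True) or variable (False).

    Hypothetical stand-in for a real aligner: a position is backbone if it
    lies in a run of at least `min_len` identical, co-located characters.
    """
    n = min(len(a), len(b))
    labels = [False] * len(a)
    start = None
    for i in range(n + 1):
        match = i < n and a[i] == b[i]
        if match and start is None:
            start = i                      # a matching run begins
        elif not match and start is not None:
            if i - start >= min_len:       # keep only long enough runs
                for j in range(start, i):
                    labels[j] = True
            start = None
    return labels


def perturb(seq, rate, rng):
    """Apply random point substitutions at a per-base probability `rate`."""
    alphabet = "ACGT"
    out = []
    for c in seq:
        if rng.random() < rate:
            out.append(rng.choice([x for x in alphabet if x != c]))
        else:
            out.append(c)
    return "".join(out)


def local_robustness(a, b, n_rep=100, rate=0.01, seed=0, min_len=5):
    """Per-position fraction of replicates that reproduce the original label."""
    rng = random.Random(seed)
    reference = toy_segment(a, b, min_len)
    agree = [0] * len(a)
    for _ in range(n_rep):
        labels = toy_segment(perturb(a, rate, rng), perturb(b, rate, rng), min_len)
        for i, (ref, lab) in enumerate(zip(reference, labels)):
            if ref == lab:
                agree[i] += 1
    return [count / n_rep for count in agree]
```

Calling `local_robustness(genome_a, genome_b)` returns one value in [0, 1] per position of the first sequence. Under these assumptions, values near 1 mark positions whose assignment survives perturbation, and lower values mark positions whose assignment is fragile.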
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Devillers, H., Chiapello, H., Schbath, S., El Karoui, M. (2010). Assessing the Robustness of Complete Bacterial Genome Segmentations. In: Tannier, E. (ed.) Comparative Genomics. RECOMB-CG 2010. Lecture Notes in Computer Science, vol 6398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16181-0_15
DOI: https://doi.org/10.1007/978-3-642-16181-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16180-3
Online ISBN: 978-3-642-16181-0
eBook Packages: Computer Science (R0)