Archives of Microbiology, Volume 193, Issue 4, pp 287–297

In silico prediction of horizontal gene transfer in Streptococcus thermophilus

  • Catherine Eng
  • Annabelle Thibessard (email author)
  • Morten Danielsen
  • Thomas Bovbjerg Rasmussen
  • Jean-François Mari
  • Pierre Leblond (email author)
Original Paper


Abstract

A combination of gene loss and gene acquisition through horizontal gene transfer (HGT) is thought to drive the adaptation of Streptococcus thermophilus to its niche, i.e. milk. In this study, we describe an in silico analysis that combines a stochastic data mining method, analysis of homologous gene distribution and identification of features frequently associated with horizontally transferred genes, in order to assess the proportion of the S. thermophilus genome that could originate from HGT. Our mining approach indicated that about 17.7% of S. thermophilus genes (362 CDSs of 1,915) showed a composition bias; these genes were called ‘atypical’. For 22% of them, functional annotation strongly supports acquisition through HGT; these consisted mainly of genes encoding mobile genetic recombinases, exopolysaccharide (EPS) biosynthesis enzymes or resistance mechanisms to bacteriophages. The distribution of the atypical genes in the Firmicutes phylum as well as within the S. thermophilus species was sporadic and supported the HGT prediction for more than half of them (52%, 189 genes). Among them, 46 were found to be specific to S. thermophilus. Finally, by combining our method, gene annotation and sequence-specific features, new genomic islands were suggested in the S. thermophilus genome.
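To make the notion of a ‘composition bias’ concrete, the following is a minimal sketch and not the stochastic data mining method used in this study. It swaps in a much simpler heuristic: each gene's dinucleotide frequencies are compared with the pooled frequencies of all genes using the Kullback-Leibler divergence, and genes with the highest scores are the candidates for ‘atypical’ composition. The function names, the choice of k = 2 and the pseudocount are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in one DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def to_frequencies(counts, k=2, pseudocount=1.0):
    """Turn counts into smoothed frequencies over all 4**k A/C/G/T k-mers."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers) + pseudocount * len(kmers)
    return {m: (counts[m] + pseudocount) / total for m in kmers}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(p[m] * math.log(p[m] / q[m]) for m in p)


def composition_scores(genes, k=2):
    """Score each gene by how far its k-mer usage is from the pooled usage.

    `genes` maps a gene name to its nucleotide sequence (hypothetical input
    format). Counts are pooled per gene so that no artificial k-mers are
    created at gene boundaries.
    """
    per_gene = {name: kmer_counts(seq, k) for name, seq in genes.items()}
    background_counts = Counter()
    for counts in per_gene.values():
        background_counts.update(counts)
    background = to_frequencies(background_counts, k)
    return {
        name: kl_divergence(to_frequencies(counts, k), background)
        for name, counts in per_gene.items()
    }


if __name__ == "__main__":
    # Toy sequences invented for illustration; they are not real CDSs.
    toy_genes = {
        "typical_1": "ATGAAATTTAAAGATATTAAATTAGAA" * 4,
        "typical_2": "ATGAAAGATTTATTAAAAGAATTATAA" * 4,
        "gc_rich": "ATGGCCGCGGGCCCGCGCGGCTGA" * 4,
    }
    scores = composition_scores(toy_genes)
    for name in sorted(scores, key=scores.get, reverse=True):
        print(name, round(scores[name], 3))
```

The sketch only ranks genes; where to place a cut-off between typical and atypical genes depends on the genome and is left open here. A composition score of this kind also cannot by itself distinguish HGT from other causes of unusual composition, which is why the study combines it with annotation and gene distribution data.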


Keywords: Gene transfer; Genome mining; Streptococcus thermophilus

Supplementary material

203_2010_671_MOESM1_ESM.doc (100 kb)
Supplementary material 1 (DOC 100 kb)
203_2010_671_MOESM2_ESM.doc (148 kb)
Supplementary material 2 (DOC 148 kb)
203_2010_671_MOESM3_ESM.doc (29 kb)
Supplementary material 3 (DOC 29 kb)
203_2010_671_MOESM4_ESM.pdf (432 kb)
Supplementary material 4 (PDF 431 kb)
203_2010_671_MOESM5_ESM.doc (68 kb)
Supplementary material 5 (DOC 67 kb)



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Catherine Eng (1, 2)
  • Annabelle Thibessard (1), email author
  • Morten Danielsen (3, 4)
  • Thomas Bovbjerg Rasmussen (3, 4)
  • Jean-François Mari (2)
  • Pierre Leblond (1), email author

  1. Génétique et Microbiologie, UMR UHP-INRA 1128, IFR 110 EFABA, Université de Lorraine, Faculté des Sciences et Technologies, Vandœuvre-lès-Nancy, France
  2. LORIA, UMR CNRS 7503 et INRIA Lorraine, Campus scientifique, Vandœuvre-lès-Nancy, France
  3. Department of Assays, Chr. Hansen A/S, Hørsholm, Denmark
  4. Department of Physiology, innovation, Chr. Hansen A/S, Hørsholm, Denmark
