Long Simple Sequence Repeats in Host-Adapted Pathogens Localize Near Genes Encoding Antigens, Housekeeping Genes, and Pseudogenes

Abstract

Simple sequence repeats (SSRs) in DNA sequences are tandem iterations of a single nucleotide or a short oligonucleotide. SSRs are subject to slipped-strand mutations and are a common source of phase variation in bacteria and antigenic variation in pathogens. Significantly long SSRs are generally rare in prokaryotic genomes, and long SSRs composed of iterations of mono-, di-, tri-, and tetranucleotides are mostly restricted to host-adapted pathogens. We present new results concerning associations between long SSRs and genes related to different cellular functions in genomes of host-adapted pathogens. We found that in the majority of the analyzed genomes, at least some of the genes associated with SSRs encode potential antigens, which is expected if the primary function of SSRs is their contribution to antigenic variation. However, we also found a number of long SSRs associated with housekeeping genes, including rRNA and tRNA genes and genes encoding ribosomal proteins, aminoacyl-tRNA synthetases, chaperones, and important metabolic enzymes. Many of these genes are probably essential, and it is unlikely that they are phase-variable. Few statistically significant associations between SSRs and gene functional classifications were detected, suggesting that most long SSRs are not related to a particular cellular function or process. Long SSRs in Mycobacterium leprae are mostly associated with pseudogenes and may be contributing to gene loss following the adaptation to an obligate pathogenic lifestyle. We speculate that long SSRs may have played a similar role in genome reduction of other host-adapted pathogens.
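To make the definition of an SSR concrete, the following minimal Python sketch scans a DNA string for perfect tandem repeats of mono- to tetranucleotide units. It is an illustration only and not the authors' method: the function name `find_ssrs`, the fixed `min_length` cutoff, and the restriction to perfect repeats are assumptions made here, and the fixed cutoff merely stands in for whatever criterion the paper uses to call an SSR "significantly long".

```python
def find_ssrs(seq, max_unit=4, min_length=8):
    """Return (start, end, unit, copies) for perfect tandem repeats.

    Hypothetical helper for illustration: 'min_length' is an arbitrary
    cutoff in bases, not a statistical significance threshold.
    Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for k in range(1, max_unit + 1):
        i = 0
        while i + k <= n:
            # Extend the run while the sequence stays periodic with period k.
            j = i + k
            while j < n and seq[j] == seq[j - k]:
                j += 1
            length = j - i
            unit = seq[i:i + k]
            # Skip units that are repeats of a shorter motif (e.g. "AA", "ATAT");
            # those runs are reported once, under the shorter unit length.
            primitive = not any(
                k % d == 0 and unit == unit[:d] * (k // d) for d in range(1, k)
            )
            if (length >= max(min_length, 2 * k) and primitive
                    and set(unit) <= set("ACGT")):
                hits.append((i, j, unit, length // k))
            # No new period-k run can begin before position j - k + 1.
            i = j - k + 1
    return sorted(hits)


example = "GGC" + "AT" * 5 + "CCG" + "A" * 9 + "T"
ssrs = find_ssrs(example)
```

For the example string, `ssrs` contains one dinucleotide repeat (five copies of AT) and one mononucleotide run (nine A residues). Relating such hits to nearby genes, as the study does, would additionally require genome annotation, which this sketch does not cover.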

Acknowledgments

We thank Dr. Anne Summers for critical reading of the manuscript and Drs. Mark Schell, Duncan Krause, and other colleagues at the UGA Department of Microbiology for stimulating discussions.

Author information

Corresponding author

Correspondence to Jan Mrázek.

About this article

Cite this article

Guo, X., Mrázek, J. Long Simple Sequence Repeats in Host-Adapted Pathogens Localize Near Genes Encoding Antigens, Housekeeping Genes, and Pseudogenes. J Mol Evol 67, 497 (2008). https://doi.org/10.1007/s00239-008-9166-5

  • DOI: https://doi.org/10.1007/s00239-008-9166-5

Keywords

  • Tandem repeats
  • Phase variation
  • Contingency loci
  • Antigenic variation
  • Genome reduction
  • Pathogen evolution