Antonie van Leeuwenhoek, Volume 92, Issue 4, pp 417–427

Sequences downstream of the start codon and their relations to G + C content and optimal growth temperature in prokaryotic genomes

Original Paper


The mechanism of translation initiation is responsible for shaping the mRNA sequences downstream of the start codon. However, this region has not been systematically analyzed in prokaryotes. We used sequence logos and statistical methods to analyze the patterns of overrepresented sequences in this region for 125 species of bacteria and 23 species of archaea. The specific positions are compared to the first 33 amino acids in the proteins. At the 2nd amino acid position, Lys, Ser or Thr is highly overrepresented in 68% to 84% of the genomes examined, and Ala is highly overrepresented in 57% of the genomes. Overrepresentation of Lys2 is negatively correlated with the G + C content of genomes, whereas overrepresentation of Ser2 or Thr2 is positively correlated with it. Ile at the 4th to the 8th positions was found to be overrepresented in 91% of the genomes analyzed, and this seemed to be conserved in both bacteria and archaea. Organisms growing at high temperatures show a relatively low extent of nucleotide bias at the 5′ termini of open reading frames (ORFs). The extent to which A is overrepresented and G is underrepresented at ORF 5′ termini is reduced in thermophiles and hyperthermophiles, in both archaea and bacteria.
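As a rough illustration of what "overrepresented at a position" can mean, the sketch below compares the frequency of each residue at one position with its frequency across all residues in the same set of proteins, and computes the G + C fraction of a DNA sequence. It is not the authors' procedure: the function names `overrepresentation` and `gc_content` are hypothetical, and the choice of whole-set composition as the background is an assumption made only for this example.

```python
from collections import Counter


def overrepresentation(proteins, position):
    """Ratio of each residue's frequency at `position` (1-based)
    to its frequency over all residues in `proteins`.

    A ratio above 1 means the residue is more common at that
    position than in the set overall (assumed background).
    """
    # Count residues observed at the requested position.
    at_pos = Counter(p[position - 1] for p in proteins if len(p) >= position)
    # Count residues over the whole set to form the background.
    overall = Counter("".join(proteins))
    n_pos = sum(at_pos.values())
    n_all = sum(overall.values())
    return {
        aa: (count / n_pos) / (overall[aa] / n_all)
        for aa, count in at_pos.items()
    }


def gc_content(dna):
    """Fraction of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    toy_proteins = ["MKA", "MKL", "MSA", "MKK"]
    print(overrepresentation(toy_proteins, 2))
    print(gc_content("ATGC"))
```

Applied per genome, ratios of this kind for Lys, Ser or Thr at position 2 could then be set against each genome's G + C fraction to look for the sort of correlation the abstract reports.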


G + C content; Optimal growth temperature; Sequence logos; Translation initiation



Ribosomal-binding site


Coding sequences


Open reading frames

Supplementary material

10482_2007_9170_MOESM2_ESM.doc (DOC 3140 kb)
10482_2007_9170_MOESM3_ESM.doc (DOC 3040 kb)

Copyright information

© Springer Science+Business Media B.V. 2007

Authors and Affiliations

  1. State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
  2. College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
