Antonie van Leeuwenhoek, Volume 92, Issue 4, pp 417–427

Sequences downstream of the start codon and their relations to G + C content and optimal growth temperature in prokaryotic genomes

Original Paper


The mechanism of translation initiation is responsible for shaping the mRNA sequences downstream of the start codon. However, this region has not been systematically analyzed in prokaryotes. We used sequence logos and statistical methods to analyze the patterns of overrepresented sequences in this region for 125 species of bacteria and 23 species of archaea. The specific positions are compared to the first 33 amino acids in the proteins. At the 2nd amino acid position, Lys, Ser or Thr is highly overrepresented in 68% to 84% of the genomes examined, and Ala is highly overrepresented in 57% of the genomes. Overrepresentation of Lys2 is negatively correlated with the G + C content of genomes, whereas overrepresentation of Ser2 or Thr2 is positively correlated with it. Ile at the 4th to the 8th positions was found to be overrepresented in 91% of the genomes analyzed, and this appears to be conserved in both bacteria and archaea. Organisms growing at high temperatures show a relatively low extent of nucleotide bias at the 5′ termini of open reading frames (ORFs). The overrepresentation of A and the underrepresentation of G at ORF 5′ termini are both reduced in thermophiles and hyperthermophiles, for archaea and bacteria alike.
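The abstract does not spell out the statistics used, so the following is only a minimal sketch of how positional overrepresentation and a sequence-logo-style information content could be computed for the N-terminal residues of a set of proteins. The function names, the toy `proteins` list, and the choice of whole-protein composition as the background are illustrative assumptions, not the authors' procedure, and no small-sample correction is applied.

```python
import math
from collections import Counter


def position_frequencies(seqs, pos):
    """Relative frequency of each residue at 1-based position `pos`."""
    residues = [s[pos - 1] for s in seqs if len(s) >= pos]
    if not residues:
        raise ValueError("no sequence is long enough to cover this position")
    counts = Counter(residues)
    return {aa: n / len(residues) for aa, n in counts.items()}


def background_frequencies(seqs):
    """Overall residue composition across all sequences (assumed background)."""
    counts = Counter("".join(seqs))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def overrepresentation(seqs, pos):
    """Ratio of positional frequency to background frequency; > 1 means enriched."""
    fg = position_frequencies(seqs, pos)
    bg = background_frequencies(seqs)
    return {aa: f / bg[aa] for aa, f in fg.items()}


def information_content(seqs, pos, alphabet_size=20):
    """Bits at a position: log2(alphabet size) minus the Shannon entropy."""
    fg = position_frequencies(seqs, pos)
    entropy = -sum(f * math.log2(f) for f in fg.values())
    return math.log2(alphabet_size) - entropy


# Hypothetical toy data: four made-up N-terminal fragments, not real proteins.
proteins = ["MKTAILV", "MKSIIAG", "MSKILLA", "MTAIIGK"]

lys2_ratio = overrepresentation(proteins, 2)["K"]  # enrichment of Lys at position 2
met1_bits = information_content(proteins, 1)       # fully conserved start residue
```

In this toy set Lys occupies position 2 in half of the sequences but makes up only one seventh of all residues, so its ratio is well above 1. A real analysis would run over every annotated ORF of a genome and would need a significance test on top of the raw ratio.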


G + C content · Optimal growth temperature · Sequence logos · Translation initiation



Ribosomal-binding site


Coding sequences


Open reading frames



This work was supported by a grant from the National Natural Science Foundation of China (NSFC No. 30200005).

Supplementary material

10482_2007_9170_MOESM1_ESM.xls (94 kb)
10482_2007_9170_MOESM2_ESM.doc (3.1 mb)
10482_2007_9170_MOESM3_ESM.doc (3 mb)
10482_2007_9170_MOESM4_ESM.ppt (89 kb)



Copyright information

© Springer Science+Business Media B.V. 2007

Authors and Affiliations

  1. State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China
  2. College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
