Sequences downstream of the start codon and their relations to G + C content and optimal growth temperature in prokaryotic genomes

  • Original Paper
  • Antonie van Leeuwenhoek

Abstract

The mechanism of translation initiation is responsible for shaping the mRNA sequences downstream of the start codon. However, this region has not been systematically analyzed in prokaryotes. We used sequence logos and statistical methods to analyze the patterns of overrepresented sequences in this region for 125 species of bacteria and 23 species of archaea. The specific positions are compared to the first 33 amino acids in the proteins. At the 2nd amino acid position, Lys, Ser or Thr is highly overrepresented in 68% to 84% of the genomes examined, and Ala is highly overrepresented in 57% of the genomes. Overrepresentation of Lys2 is negatively correlated with the G + C content of genomes, whereas overrepresentation of Ser2 or Thr2 is positively correlated with it. Ile at the 4th to the 8th positions was found to be overrepresented in 91% of the genomes analyzed, and this appears to be conserved in both bacteria and archaea. Organisms growing at high temperatures show a relatively low extent of nucleotide bias at the 5′ termini of open reading frames (ORFs). The extent to which A is overrepresented and G is underrepresented at ORF 5′ termini is reduced in thermophiles and hyperthermophiles, for both archaea and bacteria.
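To make the idea of position-specific overrepresentation concrete, the following is a minimal sketch and not the authors' pipeline. It assumes that overrepresentation can be approximated by a log2 ratio of a residue's frequency at one position to its frequency over all positions, and it computes the uncorrected information content of a column in the spirit of sequence logos. The function names `position_enrichment` and `column_information` and the toy sequences are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def position_enrichment(proteins, position):
    """Return {residue: log2(observed / background)} for a 1-based position.

    Assumption: the background is the residue frequency over all positions
    of all supplied proteins. The abstract does not state which background
    or statistic the paper used.
    """
    background = Counter("".join(proteins))
    total_bg = sum(background.values())
    column = Counter(p[position - 1] for p in proteins if len(p) >= position)
    total_col = sum(column.values())
    return {
        aa: math.log2((n / total_col) / (background[aa] / total_bg))
        for aa, n in column.items()
    }


def column_information(proteins, position, alphabet_size=20):
    """Information content (bits) of one column, without small-sample correction."""
    column = Counter(p[position - 1] for p in proteins if len(p) >= position)
    total = sum(column.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in column.values())
    return math.log2(alphabet_size) - entropy


# Made-up toy sequences for illustration only (not data from the paper).
toy = ["MKTIIAL", "MKSILGA", "MSKIIEL", "MATLIKG"]
scores = position_enrichment(toy, 2)
for residue, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{residue}2 log2 enrichment: {score:.2f}")
print(f"information at position 2: {column_information(toy, 2):.2f} bits")
```

A positive score marks a residue that is more frequent at the chosen position than in the proteins overall. On real data one would compute such a score per genome and then relate it to G + C content or optimal growth temperature, for example through a correlation coefficient.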

Abbreviations

RBS: Ribosomal-binding site

CDSs: Coding sequences

ORFs: Open reading frames

Acknowledgements

This work was supported by a grant from the National Natural Science Foundation of China (NSFC No. 30200005).

Author information

Corresponding author

Correspondence to Meifeng Tao.

About this article

Cite this article

Li, W., Zou, H. & Tao, M. Sequences downstream of the start codon and their relations to G + C content and optimal growth temperature in prokaryotic genomes. Antonie van Leeuwenhoek 92, 417–427 (2007). https://doi.org/10.1007/s10482-007-9170-6

  • DOI: https://doi.org/10.1007/s10482-007-9170-6
