Abstract
The mechanism of translation initiation is responsible for shaping the mRNA sequences downstream of the start codon. However, this region has not been systematically analyzed in prokaryotes. We used sequence logos and statistical methods to analyze the patterns of overrepresented sequences in this region for 125 species of bacteria and 23 species of archaea. The specific positions were compared to the first 33 amino acids in the proteins. At the 2nd amino acid position, Lys, Ser or Thr is highly overrepresented in 68% to 84% of the genomes examined, and Ala is highly overrepresented in 57% of the genomes. Overrepresentation of Lys2 is negatively correlated with the G + C content of genomes, whereas overrepresentation of Ser2 or Thr2 is positively correlated with it. Ile at the 4th to the 8th positions was found to be overrepresented in 91% of the genomes analyzed, and this appears to be conserved in both bacteria and archaea. Organisms growing at high temperatures show a relatively low extent of nucleotide bias at the 5′ termini of open reading frames (ORFs). The overrepresentation of A and the underrepresentation of G at ORF 5′ termini are both reduced in thermophiles and hyperthermophiles, in archaea as well as bacteria.
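The abstract does not state which statistic was used to call a residue overrepresented, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes that overrepresentation can be expressed as the ratio of a residue's frequency at a given position to its frequency over all residues, and that the relation to G + C content is measured with a Pearson correlation. The function names (`position_enrichment`, `gc_content`, `pearson`) and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def position_enrichment(proteins, position, residue):
    """Frequency of `residue` at a 1-based `position`, divided by its
    overall frequency across all residues of all proteins (assumed measure)."""
    at_pos = [p[position - 1] for p in proteins if len(p) >= position]
    if not at_pos:
        raise ValueError("no protein is long enough for this position")
    pos_freq = at_pos.count(residue) / len(at_pos)
    counts = Counter("".join(proteins))
    background = counts[residue] / sum(counts.values())
    if background == 0:
        raise ValueError("residue does not occur in the input")
    return pos_freq / background


def gc_content(dna):
    """Fraction of G and C bases in a nucleotide sequence."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length inputs with at least 2 values")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy proteome (hypothetical, not real data): Lys at position 2 in 2 of 4 proteins.
toy_proteins = ["MKAA", "MKLL", "MSAA", "MAKK"]
lys2 = position_enrichment(toy_proteins, position=2, residue="K")
```

With real data, one enrichment value and one G + C value would be computed per genome, and the two lists passed to `pearson` to test for a negative (Lys2) or positive (Ser2, Thr2) relationship.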
Abbreviations
- RBS: Ribosomal-binding site
- CDSs: Coding sequences
- ORFs: Open reading frames
Acknowledgements
This work was supported by a grant from the National Natural Science Foundation of China (NSFC No. 30200005).
Cite this article
Li, W., Zou, H. & Tao, M. Sequences downstream of the start codon and their relations to G + C content and optimal growth temperature in prokaryotic genomes. Antonie van Leeuwenhoek 92, 417–427 (2007). https://doi.org/10.1007/s10482-007-9170-6