Prokaryotic sequences carry more than the protein-coding message. Two periodic patterns, each with a period of 10 to 11 bases, are superimposed on the protein-coding message within the same sequence. Positional auto- and cross-correlation analysis of the sequences shows that these two patterns are a short-range counter-phase oscillation of AA and TT dinucleotides and a medium-range in-phase oscillation of the same dinucleotides, spanning distances of up to ∼30 and ∼100 bases, respectively. The short-range oscillation is encoded by the amino acid sequences themselves, apparently owing to the presence of amphipathic α-helices in the proteins. The medium-range oscillation, which is related to DNA folding in the cell, is created largely by a special choice of bases in the third positions of the codons. Interestingly, the amino acid sequences contribute to that signal as well. That is, the amino acid sequences themselves are, to some extent, degenerate, and this degeneracy serves the same oscillating pattern that is associated with the degenerate third codon positions.
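The positional auto- and cross-correlation named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and the use of raw pair counts without any normalization is an assumption made for brevity. For each distance d, it counts how often one dinucleotide starting at position i is followed by another starting at position i + d.

```python
def dinucleotide_positions(seq, dinuc):
    """Return the start positions of every (possibly overlapping) occurrence of dinuc."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]


def positional_correlation(seq, first, second, max_distance):
    """Count pairs (first at i, second at i + d) for each d in 1..max_distance.

    With first == second this is a positional autocorrelation; with
    different dinucleotides it is a cross-correlation. The result is a
    list indexed by distance (index 0 is unused and stays 0).
    Assumption: raw counts only, no correction for composition or length.
    """
    second_starts = set(dinucleotide_positions(seq, second))
    counts = [0] * (max_distance + 1)
    for i in dinucleotide_positions(seq, first):
        for d in range(1, max_distance + 1):
            if i + d in second_starts:
                counts[d] += 1
    return counts


# Synthetic toy sequence (not real genomic data): AA recurs every 10 bases,
# and TT sits 5 bases after each AA, i.e. in counter-phase.
toy = "AACGCTTGCG" * 20
aa_aa = positional_correlation(toy, "AA", "AA", 30)
aa_tt = positional_correlation(toy, "AA", "TT", 30)
print("AA-AA peak distances:", [d for d, c in enumerate(aa_aa) if c > 0])
print("AA-TT peak distances:", [d for d, c in enumerate(aa_tt) if c > 0])
```

In this toy case the AA-AA counts peak at multiples of the period, while the AA-TT counts peak half a period away, which is the signature of a counter-phase oscillation. An in-phase pattern would instead put the AA-TT peaks at the same distances as the AA-AA peaks.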
Keywords: Prokaryotic genomes, DNA periodicity, Dinucleotides, Codon bias, Codon usage, Third codon positions, Supercoiling
We are grateful to anonymous reviewers for raising several important issues.