Abstract
Three problems on mRNA information in protein-coding regions are discussed: first, how the mRNA sequence information (tRNA gene copy number) is related to protein secondary structure; second, how the mRNA structure information (stem/loop content) is related to protein secondary structure; third, how the specific selection for mRNA folding energy is made among genomes. From statistical analyses of protein sequences for humans and E. coli we have found that m-codon segments (m = 2 to 6) with a high average tRNA copy number (TCN) (larger than 10.5 for humans or 1.95 for E. coli) preferentially code for the alpha helix, while those with a low average TCN (smaller than 7.5 for humans or 1.7 for E. coli) preferentially code for the coil. Between these thresholds lies an intermediate region without structure preference. We have also demonstrated that helices and strands in proteins tend to be “coded” by mRNA stem regions, while coils tend to be “coded” by mRNA loop regions. The occurrence frequencies of stems in helix and strand fragments reach 6 standard deviations above the expected values. The relation between mRNA stem/loop content and protein structure can also be viewed in terms of mRNA folding energy. For both E. coli and humans, the folding energy of mRNA segments coding for regular protein structure is statistically lower than that of randomized sequences, whereas for irregular structure (coil) the Z scores are near their control values. Taking a broader view, we have also studied the folding energy of native mRNA sequences in 28 genomes. Using analysis of covariance with G+C content or base correlation as the covariate, we demonstrate that the intraspecific difference in mRNA folding free energy is much smaller than the interspecific difference. The distinction between intraspecific homogeneity and interspecific inhomogeneity is extremely significant (p < .0001).
This means that the selection for local mRNA structure is specific to each genome. The high intraspecific homogeneity of mRNA folding energy, as compared with its large interspecific inhomogeneity, can be explained by concerted evolution. The above result also holds for the folding energy of native mRNA relative to randomized sequences. This indicates that the distinction between intraspecific homogeneity and interspecific inhomogeneity of mRNA folding is robust under perturbations of sequence and structure.
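The two quantities used in the abstract, the average TCN of an m-codon segment and the Z score of a native folding energy against randomized sequences, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The thresholds are those quoted in the abstract; the function names, the sliding-window scheme, the use of the population standard deviation, and all numerical values in the usage lines are assumptions made for illustration only. Folding energies are assumed to have been computed beforehand by an RNA folding program.

```python
from statistics import mean, pstdev

# (low, high) cutoffs on the average TCN, as quoted in the abstract.
TCN_THRESHOLDS = {"human": (7.5, 10.5), "ecoli": (1.7, 1.95)}


def classify_segments(codons, tcn, m, species):
    """Label each window of m consecutive codons by its average TCN.

    `tcn` maps a codon to the copy number of its tRNA gene; the caller
    must supply real values (the ones used below are placeholders).
    """
    low, high = TCN_THRESHOLDS[species]
    labels = []
    for i in range(len(codons) - m + 1):
        avg = mean(tcn[c] for c in codons[i:i + m])
        if avg > high:
            labels.append("helix-preferring")
        elif avg < low:
            labels.append("coil-preferring")
        else:
            labels.append("no preference")  # intermediate region
    return labels


def z_score(native_energy, randomized_energies):
    """Z score of a native folding energy against randomized controls.

    A negative value means the native sequence folds with lower
    (more stable) energy than the randomized sequences on average.
    """
    mu = mean(randomized_energies)
    sigma = pstdev(randomized_energies)  # assumption: population std. dev.
    return (native_energy - mu) / sigma


# Usage with placeholder numbers (NOT measured data).
toy_tcn = {"GCU": 12.0, "GGA": 9.0, "CCC": 6.0}
toy_codons = ["GCU", "GCU", "GGA", "CCC", "CCC"]
print(classify_segments(toy_codons, toy_tcn, m=2, species="human"))
print(z_score(-12.0, [-10.0, -8.0, -9.0, -11.0, -7.0]))
```

Windows whose average TCN falls exactly on a threshold are assigned to the intermediate region here, because the abstract states the cutoffs with strict inequalities ("larger than", "smaller than").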
© 2007 Springer
About this chapter
Cite this chapter
Luo, L., Jia, M. (2007). Messenger RNA Information: Its Implication in Protein Structure Determination and Others. In: Feng, J., Jost, J., Qian, M. (eds) Networks: From Biology to Theory. Springer, London. https://doi.org/10.1007/978-1-84628-780-0_14
DOI: https://doi.org/10.1007/978-1-84628-780-0_14
Publisher Name: Springer, London
Print ISBN: 978-1-84628-485-4
Online ISBN: 978-1-84628-780-0
eBook Packages: Computer Science, Computer Science (R0)