Relationship between G + C in silent sites of codons and amino acid composition of human proteins

Journal of Molecular Evolution

Summary

We have investigated the relationship between the G + C content of silent (synonymous) sites in codons and the amino acid composition of encoded proteins for approximately 1,600 human genes. There are positive correlations between silent site G + C and the proportions of codons for Arg, Pro, Ala, Trp, His, Gln, and Leu and negative ones for Tyr, Phe, Asn, Ile, Lys, Asp, Thr, and Glu. The median proteins coded by groups of genes that differ in silent-site G + C content also differ in amino acid composition, as do some proteins coded by homologous genes. The pattern of compositional change can be largely explained by directional mutation pressure, the genetic code, and differences in the frequencies of accepted amino acid substitutions; the shifts in protein composition are likely to be selectively neutral.
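
The two quantities compared in the summary, silent-site G + C and amino acid composition, can be illustrated with a minimal Python sketch. It is not the authors' procedure: the summary does not give their exact definition of a silent site, so the sketch assumes the common simplification of counting G or C at the third position of every codon whose amino acid has synonymous alternatives (Met, Trp, and stop codons are excluded). The function names (`gc3s`, `aa_fractions`, `pearson`), the choice of Pearson's coefficient, and the toy sequences are all hypothetical and serve only as illustration.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame coding sequence into codons, dropping any partial tail."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def gc3s(seq):
    """G + C fraction at third positions of codons that have synonyms.

    Assumption: Met (M), Trp (W), stop codons, and unrecognized codons
    are skipped, since their third position cannot change silently.
    """
    thirds = [c[2] for c in codons(seq) if CODON_TABLE.get(c, "*") not in "MW*"]
    if not thirds:
        return float("nan")
    return sum(base in "GC" for base in thirds) / len(thirds)


def aa_fractions(seq):
    """Proportion of sense codons assigned to each amino acid (one-letter codes)."""
    aas = [CODON_TABLE[c] for c in codons(seq) if CODON_TABLE.get(c, "*") != "*"]
    total = len(aas)
    return {aa: n / total for aa, n in Counter(aas).items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy coding sequences, invented for illustration only.
    genes = [
        "ATGGCCCGCCCGGCGCTGTAA",
        "ATGAAATTTGATATTACATAA",
        "ATGGCTCGTAAACCGTTTTAA",
    ]
    silent_gc = [gc3s(g) for g in genes]
    ala = [aa_fractions(g).get("A", 0.0) for g in genes]
    print("silent-site G+C:", silent_gc)
    print("Ala fraction:   ", ala)
    print("correlation:    ", pearson(silent_gc, ala))
```

Applied to a real gene set, one would compute `gc3s` once per gene and correlate it with the per-gene fraction of each amino acid in turn; the sign of each coefficient corresponds to the positive and negative groups listed above.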

Additional information

Offprint requests to: D.W. Collins

Cite this article

Collins, D.W., Jukes, T.H. Relationship between G + C in silent sites of codons and amino acid composition of human proteins. J Mol Evol 36, 201–213 (1993). https://doi.org/10.1007/BF00160475

  • DOI: https://doi.org/10.1007/BF00160475
