Biased distribution of adenine and thymine in gene nucleotide sequences

Journal of Molecular Evolution

Abstract

We analyzed the occurrence of bases in 20,352 introns, in the exons of 25,574 protein-coding genes, and at the three codon positions of the protein-coding sequences. The nucleotide sequences originated from the whole spectrum of organisms, from bacteria to primates. The analysis revealed the following: (1) In most exons, adenine dominates over thymine. In other words, adenine and thymine are distributed asymmetrically between the exon and the complementary strand, and the coding sequence is mostly located in the adenine-rich strand. (2) Thymine dominates over adenine not only in the strand complementary to the exon but also in introns. (3) A general bias is further revealed in the distribution of adenine and thymine among the three codon positions in the exons: adenine dominates over thymine in the second and, mainly, the first codon position, while the reverse holds in the third codon position. The product (A1/T1) × (A2/T2) × (T3/A3) is smaller than one in only a few of the analyzed genes.
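The product in finding (3) can be illustrated with a minimal Python sketch. It assumes that A1, T1, and so on denote the counts of adenine and thymine at the first, second, and third codon positions of a single in-frame coding sequence; the abstract does not spell out the counting or normalization procedure, so this is one plausible reading rather than the authors' method. The function names `codon_position_counts` and `at_bias_product` and the toy sequence in the usage note are hypothetical.

```python
from collections import Counter


def codon_position_counts(cds):
    """Count bases at each of the three codon positions.

    Assumes `cds` is an in-frame coding sequence (DNA letters);
    a trailing partial codon, if any, is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = [Counter(), Counter(), Counter()]
    for i in range(usable):
        counts[i % 3][cds[i]] += 1
    return counts


def at_bias_product(cds):
    """Return (A1/T1) * (A2/T2) * (T3/A3) for one coding sequence.

    Returns None when any denominator count is zero, since the
    product is then undefined (an assumption of this sketch).
    """
    counts = codon_position_counts(cds)
    a1, t1 = counts[0]["A"], counts[0]["T"]
    a2, t2 = counts[1]["A"], counts[1]["T"]
    a3, t3 = counts[2]["A"], counts[2]["T"]
    if 0 in (t1, t2, a3):
        return None
    return (a1 / t1) * (a2 / t2) * (t3 / a3)
```

For the made-up sequence `"AATATTTATAAA"` (codons AAT, ATT, TAT, AAA), `at_bias_product` gives 27.0, a value above one, which is the direction the abstract reports for most genes. Very short sequences often return `None` because one of the denominators is zero.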

Author information

Correspondence to: J. Kypr

About this article

Cite this article

Mrázek, J., Kypr, J. Biased distribution of adenine and thymine in gene nucleotide sequences. J Mol Evol 39, 439–447 (1994). https://doi.org/10.1007/BF00173412

  • DOI: https://doi.org/10.1007/BF00173412
