An evolutionary analytical model of a complementary circular code simulating the protein coding genes, the 5′ and 3′ regions

Arqués, Didier G.; Fallot, Jean-Paul; Michel, Christian J.

doi:10.1006/bulm.1997.0033

Published: January 1998

Volume 60, pages 163–194 (1998)

Bulletin of Mathematical Biology

Didier G. Arqués¹,
Jean-Paul Fallot² &
Christian J. Michel²

Abstract

The self-complementary subset \(\mathcal{T}_0 = \mathcal{X}_0 \cup \{\mathrm{AAA}, \mathrm{TTT}\}\) of 22 trinucleotides, with \(\mathcal{X}_0\) = {AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC}, occurs preferentially in frame 0 (the reading frame established by the ATG start trinucleotide) of protein-coding genes of both prokaryotes and eukaryotes. The subsets \(\mathcal{T}_1 = \mathcal{X}_1 \cup \{\mathrm{CCC}\}\) and \(\mathcal{T}_2 = \mathcal{X}_2 \cup \{\mathrm{GGG}\}\), of 21 trinucleotides each, occur preferentially in the shifted frames 1 and 2 respectively (frame 0 shifted by one and two nucleotides respectively in the 5′-3′ direction). \(\mathcal{T}_1\) and \(\mathcal{T}_2\) are complementary to each other. The subset \(\mathcal{T}_0\) contains the subset \(\mathcal{X}_0\), which has the rare property (6 × 10⁻⁸) of being a complementary maximal circular code whose two permuted sets \(\mathcal{X}_1\) and \(\mathcal{X}_2\) are maximal circular codes in frames 1 and 2 respectively. \(\mathcal{X}_0\) is called a C³ code.
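The set relations stated in this paragraph can be checked mechanically. The sketch below is illustrative only and rests on two assumptions that the abstract does not spell out: "complementary" is taken to mean reverse complement (antiparallel strands), and \(\mathcal{X}_1\) and \(\mathcal{X}_2\) are taken to be the images of \(\mathcal{X}_0\) under one and two circular permutations l1l2l3 → l2l3l1. The function names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# X0 exactly as listed in the abstract (20 trinucleotides).
X0 = set(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def reverse_complement(word):
    """Assumption: 'complementary' means the reverse complement."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


def permute(word):
    """Assumption: circular permutation l1 l2 l3 -> l2 l3 l1."""
    return word[1:] + word[0]


def complement_set(words):
    """Reverse complement applied to every trinucleotide of a set."""
    return {reverse_complement(w) for w in words}


# Permuted sets (assumed construction of X1 and X2).
X1 = {permute(w) for w in X0}
X2 = {permute(permute(w)) for w in X0}

# The three subsets named in the abstract.
T0 = X0 | {"AAA", "TTT"}
T1 = X1 | {"CCC"}
T2 = X2 | {"GGG"}

ALL_TRINUCLEOTIDES = {"".join(t) for t in product("ACGT", repeat=3)}

if __name__ == "__main__":
    print("T0 self-complementary:", complement_set(T0) == T0)
    print("T1 and T2 complementary:", complement_set(T1) == T2)
    print("T0, T1, T2 cover all 64:", T0 | T1 | T2 == ALL_TRINUCLEOTIDES)
    print("sizes:", len(T0), len(T1), len(T2))
```

This sketch checks only complementarity and the partition of the 64 trinucleotides; it does not test the circular-code property itself.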

A quantitative study of these three subsets \(\mathcal{T}_0, \mathcal{T}_1, \mathcal{T}_2\) in the three frames 0, 1, 2 of protein genes, and in the 5′ and 3′ regions of eukaryotes, shows that their occurrence frequencies are constant functions of the trinucleotide positions in the sequences. The frequencies of \(\mathcal{T}_0, \mathcal{T}_1, \mathcal{T}_2\) in frame 0 of protein genes are 49, 28.5 and 22.5% respectively. In contrast, the frequencies of \(\mathcal{T}_0, \mathcal{T}_1, \mathcal{T}_2\) in the 5′ and 3′ regions of eukaryotes are independent of the frame. Indeed, the frequency of \(\mathcal{T}_0\) in the three frames of 5′ (respectively 3′) regions is equal to 35.5% (respectively 38%) and is greater than the frequencies of \(\mathcal{T}_1\) and \(\mathcal{T}_2\), both equal to 32.25% (respectively 31%) in the three frames.
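As a rough illustration of the kind of measurement described here, the following sketch counts how often the trinucleotides read in a given frame fall into \(\mathcal{T}_0\), \(\mathcal{T}_1\) or \(\mathcal{T}_2\). It pools all positions of a set of sequences into one frequency per subset, which is a simplification of the position-by-position study the abstract reports. The construction of \(\mathcal{X}_1\) and \(\mathcal{X}_2\) by circular permutation and the function name `subset_frequencies` are assumptions, and the input sequence in the usage note is a toy string, not data from the paper.

```python
# X0 as listed in the abstract; X1, X2 built by circular permutation (assumption).
X0 = set(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def permute(word):
    """Circular permutation l1 l2 l3 -> l2 l3 l1."""
    return word[1:] + word[0]


T0 = X0 | {"AAA", "TTT"}
T1 = {permute(w) for w in X0} | {"CCC"}
T2 = {permute(permute(w)) for w in X0} | {"GGG"}


def subset_frequencies(sequences, frame):
    """Return (f(T0), f(T1), f(T2)) over all trinucleotides read in `frame`.

    `frame` is 0, 1 or 2: the offset of the first trinucleotide.
    Trinucleotides containing letters other than A, C, G, T are skipped.
    """
    counts = [0, 0, 0]
    for seq in sequences:
        seq = seq.upper()
        for i in range(frame, len(seq) - 2, 3):
            word = seq[i:i + 3]
            if word in T0:
                counts[0] += 1
            elif word in T1:
                counts[1] += 1
            elif word in T2:
                counts[2] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no valid trinucleotide found in this frame")
    return tuple(c / total for c in counts)
```

For the toy input `["ATGGAAGTTTAA"]`, frame 0 reads ATG, GAA, GTT, TAA, so `subset_frequencies(["ATGGAAGTTTAA"], 0)` gives one half for \(\mathcal{T}_0\) and one quarter each for \(\mathcal{T}_1\) and \(\mathcal{T}_2\).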

Several unexpected frequency asymmetries (e.g. the frequency difference between \(\mathcal{T}_1\) and \(\mathcal{T}_2\) in frame 0) are related to a new property of the subset \(\mathcal{T}_0\) involving substitutions. An evolutionary analytical model with three parameters (p, q, t), based on an independent mixing of the 22 codons (trinucleotides in frame 0) of \(\mathcal{T}_0\) with equiprobability (1/22) followed by t ≈ 4 substitutions per codon according to the proportions p ≈ 0.1, q ≈ 0.1 and r = 1 − p − q ≈ 0.8 in the three codon sites respectively, retrieves the frequencies of \(\mathcal{T}_0, \mathcal{T}_1, \mathcal{T}_2\) observed in the three frames of protein genes and explains these asymmetries. Furthermore, the same model (0.1, 0.1, t), after t ≈ 22 substitutions per codon, retrieves the statistical properties observed in the three frames of the 5′ and 3′ regions. The complex behaviour of these analytical curves is totally unexpected and a priori difficult to imagine.
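The two steps of the model (equiprobable mixing of the 22 codons of \(\mathcal{T}_0\), then substitutions distributed over the three codon sites in proportions p, q, r) can be imitated by a small Monte Carlo simulation. This is a stand-in for intuition, not the analytical model of the paper. It assumes that exactly t substitutions hit every codon, that each substitution replaces the nucleotide by one of the three others with equal probability, and that \(\mathcal{X}_1\) and \(\mathcal{X}_2\) are the circular permutations of \(\mathcal{X}_0\). None of these details is given in the abstract, and the names `simulate` and `frame_frequencies` are hypothetical.

```python
import random

X0 = (
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC"
).split()


def permute(word):
    """Circular permutation l1 l2 l3 -> l2 l3 l1 (assumed link between X0, X1, X2)."""
    return word[1:] + word[0]


T0 = set(X0) | {"AAA", "TTT"}
T1 = {permute(w) for w in X0} | {"CCC"}
T2 = {permute(permute(w)) for w in X0} | {"GGG"}
T0_CODONS = sorted(T0)  # fixed order so that a seed gives reproducible draws


def simulate(n_codons, t, p=0.1, q=0.1, seed=0):
    """Draw n_codons uniformly from T0, then apply t substitutions to each codon.

    Assumptions: t is an integer count applied to every codon; a substitution
    picks site 1, 2 or 3 with probabilities p, q, r = 1 - p - q and replaces
    the nucleotide there by one of the three other nucleotides uniformly.
    """
    rng = random.Random(seed)
    weights = (p, q, 1.0 - p - q)
    codons = []
    for _ in range(n_codons):
        codon = list(rng.choice(T0_CODONS))
        for _ in range(t):
            site = rng.choices((0, 1, 2), weights=weights)[0]
            others = [b for b in "ACGT" if b != codon[site]]
            codon[site] = rng.choice(others)
        codons.append("".join(codon))
    return "".join(codons)


def frame_frequencies(seq, frame):
    """Frequencies of T0, T1, T2 among the trinucleotides read in `frame`."""
    counts = [0, 0, 0]
    for i in range(frame, len(seq) - 2, 3):
        word = seq[i:i + 3]
        if word in T0:
            counts[0] += 1
        elif word in T1:
            counts[1] += 1
        elif word in T2:
            counts[2] += 1
    total = sum(counts)
    return tuple(c / total for c in counts)


if __name__ == "__main__":
    for t in (0, 4, 22):
        seq = simulate(20000, t)
        for frame in (0, 1, 2):
            print(t, frame, [round(f, 3) for f in frame_frequencies(seq, frame)])
```

With t = 0 the simulated sequence contains only codons of \(\mathcal{T}_0\) in frame 0, so the frequency of \(\mathcal{T}_0\) there is 1 by construction. Whether the simulated values for larger t come close to the percentages quoted in the abstract depends on how faithfully the assumptions above match the paper's substitution process.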

Author information

Authors and Affiliations

Equipe de Biologie Théorique, Université de Marne la Vallée, Institut Gaspard Monge, 2 rue de la Butte Verte, 93160, Noisy Le Grand, France
Didier G. Arqués
Equipe de Biologie Théorique, Institut Polytechnique de Sévenans, Rue du Château, Sévenans, 90010, Belfort, France
Jean-Paul Fallot & Christian J. Michel

Corresponding author

Correspondence to Christian J. Michel.

About this article

Cite this article

Arqués, D.G., Fallot, JP. & Michel, C.J. An evolutionary analytical model of a complementary circular code simulating the protein coding genes, the 5′ and 3′ regions. Bull. Math. Biol. 60, 163–194 (1998). https://doi.org/10.1006/bulm.1997.0033

Received: 16 February 1997
Accepted: 19 November 1997
Issue Date: January 1998
DOI: https://doi.org/10.1006/bulm.1997.0033