A model of DNA sequence evolution

Bulletin of Mathematical Biology

Abstract

Statistical studies of gene populations on the purine/pyrimidine alphabet have shown that the mean occurrence probability of the i-motif YRY(N)iYRY (R = purine, Y = pyrimidine, N = R or Y) is not uniform as i varies over the range [1, 99], but presents a maximum at i = 6 in the following populations: protein coding genes of eukaryotes, prokaryotes, chloroplasts and mitochondria, and also viral introns, ribosomal RNA genes and transfer RNA genes (Arquès and Michel, 1987b, J. theor. Biol. 128, 457–461). From the “universality” of this observation, we suggested that the oligonucleotide YRY(N)6 is a primitive one and that it has a central function in DNA sequence evolution (Arquès and Michel, 1987b, J. theor. Biol. 128, 457–461). Following this idea, we introduce a concept of a model of DNA sequence evolution, which will be validated according to a scheme presented in three parts.
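The i-motif statistic described above can be illustrated with a minimal sketch. It assumes a plain per-sequence frequency (the fraction of start positions at which YRY is followed, i bases later, by another YRY); the function names `to_ry`, `imotif_frequency` and `imotif_profile` are hypothetical, and averaging the profile over all genes of a population is left to the caller. It is not the authors' program.

```python
# Purine/pyrimidine alphabet: A, G -> R (purine); C, T, U -> Y (pyrimidine).
PURINES = set("AG")
PYRIMIDINES = set("CTU")


def to_ry(seq: str) -> str:
    """Translate a nucleotide sequence to the R/Y alphabet.

    Simplification (an assumption of this sketch): symbols other than
    A, C, G, T, U are dropped, which shifts downstream positions.
    """
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
    return "".join(out)


def imotif_frequency(ry: str, i: int) -> float:
    """Fraction of start positions p where YRY occurs at p and at p + 3 + i."""
    span = 3 + i + 3  # YRY, then i unspecified bases, then YRY
    n_positions = len(ry) - span + 1
    if n_positions <= 0:
        return 0.0
    hits = sum(
        1
        for p in range(n_positions)
        if ry[p:p + 3] == "YRY" and ry[p + 3 + i:p + span] == "YRY"
    )
    return hits / n_positions


def imotif_profile(ry: str, i_values=range(1, 100)) -> dict:
    """Frequency of YRY(N)iYRY for each i, by default over the range [1, 99]."""
    return {i: imotif_frequency(ry, i) for i in i_values}
```

Plotting the resulting profile against i for each gene of a population, and then averaging, would be one way to look for the maximum at i = 6 reported in the text.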

In the first part, using the latest version of the gene database, the YRY(N)6YRY preferential occurrence (maximum at i = 6) is confirmed for the populations mentioned above and is extended to some newly analysed populations: chloroplast introns, chloroplast 5′ regions, mitochondrial 5′ regions and small nuclear RNA genes. In addition, the YRY(N)6YRY preferential occurrence and periodicities are used to classify 18 gene populations.

In the second part, we demonstrate that several statistical features characterizing different gene populations (in particular the YRY(N)6YRY preferential occurrence and the periodicities) can be retrieved from a simple Markov model based on the mixing of the two oligonucleotides YRY(N)6 and YRY(N)3 and on the percentages of RYR and YRY in the unspecified trinucleotides (N)3 of YRY(N)6 and YRY(N)3. Several properties are identified; in particular, they show that the oligonucleotide mixing is an independent process and that several different features are functions of a single parameter.
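The abstract does not specify how the two oligonucleotides are mixed, so the following is only a sketch under stated assumptions: oligonucleotides are drawn independently and concatenated, YRY(N)6 is chosen with probability `p_long` and YRY(N)3 otherwise, and each unspecified trinucleotide (N)3 is RYR with probability `p_ryr`, YRY with probability `p_yry`, and otherwise three independent uniform R/Y draws. All names and parameters here are hypothetical and may differ from the model in the paper.

```python
import random


def random_n3(rng: random.Random, p_ryr: float, p_yry: float) -> str:
    """Draw one unspecified trinucleotide (N)3 on the R/Y alphabet."""
    u = rng.random()
    if u < p_ryr:
        return "RYR"
    if u < p_ryr + p_yry:
        return "YRY"
    # Assumption: the remaining probability mass is a uniform random triple.
    return "".join(rng.choice("RY") for _ in range(3))


def simulate_mixture(n_oligos: int, p_long: float, p_ryr: float,
                     p_yry: float, seed: int = 0) -> str:
    """Concatenate n_oligos independent draws of YRY(N)6 or YRY(N)3."""
    if not 0.0 <= p_long <= 1.0:
        raise ValueError("p_long must lie in [0, 1]")
    if p_ryr < 0.0 or p_yry < 0.0 or p_ryr + p_yry > 1.0:
        raise ValueError("p_ryr and p_yry must be non-negative and sum to at most 1")
    rng = random.Random(seed)  # seeded for reproducible simulations
    parts = []
    for _ in range(n_oligos):
        # YRY(N)6 carries two (N)3 blocks, YRY(N)3 carries one.
        n_blocks = 2 if rng.random() < p_long else 1
        tail = "".join(random_n3(rng, p_ryr, p_yry) for _ in range(n_blocks))
        parts.append("YRY" + tail)
    return "".join(parts)
```

Sweeping `p_long` while holding the trinucleotide percentages fixed is one way to explore how simulated statistics depend on the mixing proportion.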

In the third part, comparing the model with real sequences shows a strong correlation between reality and simulation concerning the presence of large alternating purine/pyrimidine stretches and of periodicities. It also contributes to a greater understanding of biological reality: for example, the presence or absence of large alternating purine/pyrimidine stretches can be explained as a simple consequence of the mixing of two particular oligonucleotides.
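One simple way to quantify an alternating purine/pyrimidine stretch, for either a real or a simulated sequence, is the length of the longest run in which consecutive symbols differ. The sketch below assumes the sequence is already on the R/Y alphabet; the function name `longest_alternating_stretch` is hypothetical, and the paper may use a different measure.

```python
def longest_alternating_stretch(ry: str) -> int:
    """Length of the longest run such as RYRYRY in which neighbours always differ."""
    if not ry:
        return 0
    best = current = 1
    for prev, cur in zip(ry, ry[1:]):
        # Extend the run while the symbol alternates, otherwise restart it.
        current = current + 1 if prev != cur else 1
        best = max(best, current)
    return best
```

For instance, the sequence `RRYRYRYY` contains the alternating run `RYRYRY`, so the function returns 6.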

Finally, we believe that such an approach is the first step toward a unified model of DNA sequence evolution allowing the molecular understanding of both the origin of life and present-day biological reality.

Literature

  • Arquès, D. G. and C. J. Michel. 1987a. Study of a perturbation in the coding periodicity. Math. Biosci. 86, 1–14.

  • Arquès, D. G. and C. J. Michel. 1987b. A purine-pyrimidine motif verifying an identical presence in almost all gene taxonomic groups. J. theor. Biol. 128, 457–461.

  • Arquès, D. G. and C. J. Michel. 1987c. Periodicities in introns. Nucl. Acids Res. 15, 7581–7592.

  • DeGroot, M. H. 1986. Probability and Statistics. Reading, MA: Addison-Wesley.

  • Eigen, M. and P. Schuster. 1978. The hypercycle: a principle of natural self-organization. Part C: the realistic hypercycle. Naturwissenschaften 65, 341–369.

  • Fickett, J. W. 1982. Recognition of protein coding regions in DNA sequences. Nucl. Acids Res. 10, 5303–5318.

  • Feller, W. 1968. An Introduction to Probability Theory and Its Applications. New York: Wiley.

  • Kimura, M. 1987. The Neutral Theory of Molecular Evolution. Cambridge University Press.

  • Lazowska, J., C. Jacq and P. P. Slonimski. 1980. Sequence of introns and flanking exons in wild-type and box3 mutants of cytochrome b reveals an interlaced splicing protein coded by an intron. Cell 22, 333–348.

  • Lewontin, R. C. and J. L. Hubby. 1966. A molecular approach to the study of genic heterozygosity in natural populations. II. Amount of variation and degree of heterozygosity in natural populations of Drosophila pseudoobscura. Genetics 54, 595–609.

  • Nei, M. 1987. Molecular Evolutionary Genetics. Washington, DC: Columbia University Press.

  • Shepherd, J. C. W. 1981. Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification. Proc. natl Acad. Sci. U.S.A. 78, 1596–1600.

  • Ziff, E. B. 1980. Transcription and RNA processing by the DNA tumour viruses. Nature 287, 491–499.

  • Zuckerkandl, E. and L. Pauling. 1965. Molecules as documents of evolutionary history. J. theor. Biol. 8, 357–366.

Arquès, D.G., Michel, C.J. A model of DNA sequence evolution. Bltn Mathcal Biology 52, 741–772 (1990). https://doi.org/10.1007/BF02460807

