Abstract
Statistical studies of gene populations on the purine/pyrimidine alphabet have shown that the mean occurrence probability of the i-motif YRY(N)iYRY (R=purine, Y=pyrimidine, N=R or Y) is not uniform when i varies in the range [1,99], but presents a maximum at i=6 in the following populations: protein coding genes of eukaryotes, prokaryotes, chloroplasts and mitochondria, and also viral introns, ribosomal RNA genes and transfer RNA genes (Arquès and Michel, 1987b, J. theor. Biol. 128, 457–461). From the “universality” of this observation, we suggested that the oligonucleotide YRY(N)6 is a primitive one and that it has a central function in DNA sequence evolution (Arquès and Michel, 1987b, J. theor. Biol. 128, 457–461). Following this idea, we introduce a concept of a model of DNA sequence evolution which will be validated according to a scheme presented in three parts.
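To make the i-motif statistic concrete, the following is a minimal sketch of how an occurrence profile of YRY(N)iYRY could be computed on the purine/pyrimidine alphabet. The function names (`to_ry`, `imotif_frequency`, `mean_imotif_profile`) are hypothetical, and the averaging convention (mean of per-sequence frequencies, skipping sequences too short for a given i) is an assumption, not necessarily the procedure used in the cited studies.

```python
def to_ry(seq):
    """Translate a nucleotide sequence to the R/Y alphabet (A, G -> R; C, T, U -> Y)."""
    table = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}
    return "".join(table[base] for base in seq.upper())


def imotif_frequency(ry, i):
    """Fraction of start positions in `ry` at which YRY(N)iYRY occurs."""
    span = i + 6                      # two YRY blocks plus i unspecified bases
    positions = len(ry) - span + 1    # number of possible start positions
    if positions <= 0:
        return 0.0
    hits = sum(
        1
        for k in range(positions)
        if ry[k:k + 3] == "YRY" and ry[k + 3 + i:k + 6 + i] == "YRY"
    )
    return hits / positions


def mean_imotif_profile(ry_sequences, max_i=99):
    """Mean frequency of YRY(N)iYRY for i = 1..max_i over a population.

    Assumption: sequences shorter than i + 6 are skipped for that i.
    """
    profile = {}
    for i in range(1, max_i + 1):
        usable = [s for s in ry_sequences if len(s) >= i + 6]
        if usable:
            profile[i] = sum(imotif_frequency(s, i) for s in usable) / len(usable)
        else:
            profile[i] = 0.0
    return profile
```

Under these assumptions, a maximum at i=6 would appear as the largest value of the profile returned by `mean_imotif_profile` for a given population.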
In the first part, using the latest version of the gene database, the YRY(N)6YRY preferential occurrence (maximum at i=6) is confirmed for the populations mentioned above and is extended to some newly analysed populations: chloroplast introns, chloroplast 5′ regions, mitochondrial 5′ regions and small nuclear RNA genes. In addition, the YRY(N)6YRY preferential occurrence and the periodicities are used to classify 18 gene populations.
In the second part, we demonstrate that several statistical features characterizing different gene populations (in particular the YRY(N)6YRY preferential occurrence and the periodicities) can be retrieved from a simple Markov model based on the mixing of the two oligonucleotides YRY(N)6 and YRY(N)3 and on the percentages of RYR and YRY in the unspecified trinucleotides (N)3 of YRY(N)6 and YRY(N)3. Several properties are identified; they prove in particular that the oligonucleotide mixing is an independent process and that several different features are functions of a unique parameter.
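The abstract does not specify the Markov model in detail, so the following is only a toy sketch of the mixing idea, under assumptions that are not taken from the paper: oligonucleotides are concatenated independently, each one is YRY(N)6 with probability `p_six` and YRY(N)3 otherwise, and each unspecified trinucleotide (N)3 is RYR with probability `p_ryr`, YRY with probability `p_yry`, and otherwise drawn uniformly on the R/Y alphabet. All names and parameters are hypothetical.

```python
import random


def random_trinucleotide(rng, p_ryr, p_yry):
    """Draw one unspecified trinucleotide (N)3 (toy assumption, see lead-in)."""
    u = rng.random()
    if u < p_ryr:
        return "RYR"
    if u < p_ryr + p_yry:
        return "YRY"
    # Remaining probability mass: three independent uniform R/Y draws.
    return "".join(rng.choice("RY") for _ in range(3))


def simulate_mixture(n_oligos, p_six, p_ryr, p_yry, seed=0):
    """Concatenate n_oligos oligonucleotides, mixing YRY(N)6 and YRY(N)3."""
    rng = random.Random(seed)  # seeded for reproducible simulations
    parts = []
    for _ in range(n_oligos):
        # YRY(N)6 carries two unspecified trinucleotides, YRY(N)3 carries one.
        n_tri = 2 if rng.random() < p_six else 1
        parts.append("YRY")
        parts.extend(random_trinucleotide(rng, p_ryr, p_yry) for _ in range(n_tri))
    return "".join(parts)
```

A simulated sequence such as `simulate_mixture(1000, 0.7, 0.3, 0.3)` can then be analysed with the same statistics as a real gene population, which is the kind of comparison between simulation and reality that the validation scheme relies on.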
In the third part, a return from the model to reality shows a strong correlation between reality and simulation concerning the presence of large alternating purine/pyrimidine stretches and of periodicities. It also contributes to a greater understanding of biological reality, e.g. the presence or the absence of large alternating purine/pyrimidine stretches can be explained as being a simple consequence of the mixing of two particular oligonucleotides.
Finally, we believe that such an approach is the first step toward a unified model of DNA sequence evolution allowing the molecular understanding of both the origin of life and the actual biological reality.
Literature
Arquès, D. G. and C. J. Michel. 1987a. Study of a perturbation in the coding periodicity. Math. Biosci. 86, 1–14.
Arquès, D. G. and C. J. Michel. 1987b. A purine-pyrimidine motif verifying an identical presence in almost all gene taxonomic groups. J. theor. Biol. 128, 457–461.
Arquès, D. G. and C. J. Michel. 1987c. Periodicities in introns. Nucl. Acids Res. 15, 7581–7592.
DeGroot, M. H. 1986. Probability and Statistics. Reading, MA: Addison-Wesley.
Eigen, M. and P. Schuster. 1978. The hypercycle: a principle of natural self-organization. Part C: the realistic hypercycle. Naturwissenschaften 65, 341–369.
Fickett, J. W. 1982. Recognition of protein coding regions in DNA sequences. Nucl. Acids Res. 10, 5303–5318.
Feller, W. 1968. An Introduction to Probability Theory and Its Applications. New York: Wiley.
Kimura, M. 1987. The Neutral Theory of Molecular Evolution. Cambridge University Press.
Lazowska, J., C. Jacq and P. P. Slonimski. 1980. Sequence of introns and flanking exons in wild-type and box3 mutants of cytochrome b reveals an interlaced splicing protein coded by an intron. Cell 22, 333–348.
Lewontin, R. C. and J. L. Hubby. 1966. A molecular approach to the study of genic heterozygosity in natural populations. II. Amount of variation and degree of heterozygosity in natural populations of Drosophila pseudoobscura. Genetics 54, 595–609.
Nei, M. 1987. Molecular Evolutionary Genetics. Washington, DC: Columbia University Press.
Shepherd, J. C. W. 1981. Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification. Proc. natl Acad. Sci. U.S.A. 78, 1596–1600.
Ziff, E. B. 1980. Transcription and RNA processing by the DNA tumour viruses. Nature 287, 491–499.
Zuckerkandl, E. and L. Pauling. 1965. Molecules as documents of evolutionary history. J. theor. Biol. 8, 357–366.
Cite this article
Arquès, D.G., Michel, C.J. A model of DNA sequence evolution. Bltn Mathcal Biology 52, 741–772 (1990). https://doi.org/10.1007/BF02460807