Abstract
We develop a method for measuring the inhomogeneity of the purine/pyrimidine distribution in nucleotide sequences. We show that this measure relates to the coding or non-coding character of the sequence under consideration. Coding sequences present a purine (Pu) or pyrimidine (Py) distribution that is close to random. This property is shared by both protein-coding DNA and functional RNA-coding DNA. Non-coding sequences, in contrast, present a highly clustered inhomogeneity. We propose the hypothesis, corroborated by appropriate computer simulations, that this clustering is due to the action of various transposition events accumulated over long time periods.
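The abstract does not spell out the measure itself, so the following is only an illustrative sketch of one way such clustering could be quantified, not the authors' method. It assumes a hypothetical `clustering_index` that reduces a sequence to a purine/pyrimidine string, counts purines in non-overlapping windows, and compares the variance of those counts with the binomial variance expected for a random arrangement. The function names and the window length are assumptions made for this example.

```python
import random

PURINES = set("AG")
PYRIMIDINES = set("CTU")


def to_pu_py(seq):
    """Map a nucleotide string to 1 (purine) / 0 (pyrimidine).

    Symbols other than A, G, C, T, U are skipped.
    """
    bits = []
    for base in seq.upper():
        if base in PURINES:
            bits.append(1)
        elif base in PYRIMIDINES:
            bits.append(0)
    return bits


def clustering_index(seq, window=20):
    """Hypothetical index (an assumption, not the paper's measure).

    Ratio of the observed variance of purine counts per window to the
    binomial variance expected if purines were placed at random.
    Values near 1 suggest a random-like arrangement; values well above 1
    suggest clustering.
    """
    bits = to_pu_py(seq)
    n_windows = len(bits) // window
    if n_windows < 2:
        raise ValueError("sequence too short for the chosen window")
    used = bits[: n_windows * window]
    p = sum(used) / len(used)  # overall purine fraction
    if p == 0.0 or p == 1.0:
        raise ValueError("sequence contains only purines or only pyrimidines")
    counts = [sum(used[i * window:(i + 1) * window]) for i in range(n_windows)]
    mean = sum(counts) / n_windows
    observed_var = sum((c - mean) ** 2 for c in counts) / (n_windows - 1)
    expected_var = window * p * (1 - p)  # binomial variance per window
    return observed_var / expected_var


if __name__ == "__main__":
    random.seed(0)
    random_like = "".join(random.choice("ACGT") for _ in range(20000))
    clustered = ("A" * 100 + "C" * 100) * 5
    print("random-like:", clustering_index(random_like))
    print("clustered:  ", clustering_index(clustered))
```

Under these assumptions, a sequence with long purine-only and pyrimidine-only runs yields an index well above 1, while a shuffled sequence stays close to 1. The window length controls the scale at which clustering is probed and would need tuning for real data.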
Cite this article
Almirantis, Y., Provata, A. The “clustered structure” of the purines/pyrimidines distribution in DNA distinguishes systematically between coding and non-coding sequences. Bltn Mathcal Biology 59, 975–992 (1997). https://doi.org/10.1007/BF02460002
DOI: https://doi.org/10.1007/BF02460002