The “clustered structure” of the purines/pyrimidines distribution in DNA distinguishes systematically between coding and non-coding sequences

Bulletin of Mathematical Biology

Abstract

A method for measuring the inhomogeneous distribution of purines and pyrimidines in nucleotide sequences is developed. We show that this measure relates to the coding or non-coding character of the sequence considered. Coding sequences present a nearly random purine (Pu) or pyrimidine (Py) distribution. This property is shared by both protein-coding DNA and functional RNA-coding DNA. Non-coding sequences present a highly clustered inhomogeneity. We propose the hypothesis, corroborated by appropriate computer simulations, that this is due to the action of various transposition events accumulated over long time periods.
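
The abstract does not specify the measure itself, so the following is only an illustrative stand-in and not the authors' method. It is a minimal Python sketch that maps a sequence to a purine/pyrimidine string and compares its mean run length with that of shuffled copies of the same composition. The names `to_pu_py`, `run_lengths`, `mean_run_length`, and `clustering_ratio`, as well as the choice of run length as the clustering statistic, are assumptions made for illustration.

```python
import random

PURINES = set("AG")
PYRIMIDINES = set("CTU")


def to_pu_py(seq):
    """Map nucleotides to 'R' (purine) or 'Y' (pyrimidine); skip other symbols."""
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
    return "".join(out)


def run_lengths(symbols):
    """Return the lengths of maximal runs of identical symbols."""
    runs = []
    if not symbols:
        return runs
    count = 1
    for prev, cur in zip(symbols, symbols[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs


def mean_run_length(symbols):
    """Mean run length: sequence length divided by the number of runs."""
    runs = run_lengths(symbols)
    return len(symbols) / len(runs) if runs else 0.0


def clustering_ratio(seq, n_shuffles=200, seed=0):
    """Observed mean run length divided by the shuffled-sequence average.

    Hypothetical statistic (an assumption, not the paper's measure):
    values near 1 suggest a near-random Pu/Py arrangement, while values
    well above 1 suggest clustering.
    """
    binary = to_pu_py(seq)
    if len(binary) < 2:
        raise ValueError("need at least two recognised nucleotides")
    rng = random.Random(seed)
    observed = mean_run_length(binary)
    symbols = list(binary)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(symbols)  # preserves composition, destroys order
        total += mean_run_length(symbols)
    return observed / (total / n_shuffles)


if __name__ == "__main__":
    clustered = "A" * 50 + "C" * 50   # one long purine block, one pyrimidine block
    alternating = "AC" * 50           # strictly alternating Pu/Py
    print(clustering_ratio(clustered))
    print(clustering_ratio(alternating))
```

Shuffling keeps the purine/pyrimidine composition fixed, so the ratio reflects only the arrangement of the symbols and not the base content. Under this stand-in statistic, a strongly clustered sequence gives a ratio well above 1 and a strictly alternating one gives a ratio below 1.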

Cite this article

Almirantis, Y., Provata, A. The “clustered structure” of the purines/pyrimidines distribution in DNA distinguishes systematically between coding and non-coding sequences. Bltn Mathcal Biology 59, 975–992 (1997). https://doi.org/10.1007/BF02460002

  • DOI: https://doi.org/10.1007/BF02460002