Abstract
We study the size distribution of coding and non-coding regions in DNA sequences. For most organisms, the size distribution $P_{c}(S)$ of coding regions of size $S$ is short-ranged, whereas the size distribution of non-coding regions follows a power-law decay, $P_{nc}(S) \sim S^{-1-\mu}$, with exponents indicating clear long-range behavior. We argue, using the Generalized Central Limit Theorem, that the long-range distributions observed in the non-coding regions are related to the lower-level clustering of purines and pyrimidines (1d islands), which follow similar long-range laws. We also address the clustering of coding segments in the two complementary strands of DNA. We observe a short-range clustering of coding regions in both strands, expressed by an exponential decay in the clustering size distribution. The decay exponent expresses the degree of short-range correlations and the deviation from random clustering.
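As an illustration of the two decay laws named in the abstract, the following minimal sketch estimates a power-law exponent μ and an exponential decay rate from lists of segment sizes by maximum likelihood. It is not the authors' procedure: the function names, the lower cutoff `s_min`, the continuous-size approximation, and the synthetic samples are all assumptions made here for illustration, and no data from the paper are used.

```python
import math
import random


def powerlaw_mu_mle(sizes, s_min):
    """Hill-type maximum-likelihood estimate of mu for P(S) ~ S^(-1-mu), S >= s_min.

    Assumes sizes can be treated as continuous above the (hypothetical) cutoff s_min.
    """
    tail = [s for s in sizes if s >= s_min]
    if not tail:
        raise ValueError("no sizes at or above s_min")
    return len(tail) / sum(math.log(s / s_min) for s in tail)


def exponential_rate_mle(sizes):
    """Maximum-likelihood estimate of the rate for P(S) ~ exp(-rate * S)."""
    if not sizes:
        raise ValueError("empty sample")
    return len(sizes) / sum(sizes)


def sample_powerlaw(mu, s_min, n, rng):
    """Draw n synthetic sizes with density ~ S^(-1-mu) by inverse-transform sampling."""
    # 1 - rng.random() lies in (0, 1], so the negative power is always finite.
    return [s_min * (1.0 - rng.random()) ** (-1.0 / mu) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for non-coding (power-law) and coding (exponential) sizes.
    noncoding_like = sample_powerlaw(mu=1.5, s_min=10.0, n=20000, rng=rng)
    coding_like = [rng.expovariate(0.02) for _ in range(20000)]
    print("estimated mu:", powerlaw_mu_mle(noncoding_like, s_min=10.0))
    print("estimated rate:", exponential_rate_mle(coding_like))
```

On real sequence annotations the choice of `s_min` matters, since a power law is typically expected to hold only in the tail; the estimate should be checked for stability as the cutoff is varied.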
Cite this article
Almirantis, Y., Provata, A. Long- and Short-Range Correlations in Genome Organization. Journal of Statistical Physics 97, 233–262 (1999). https://doi.org/10.1023/A:1004671119400