Il Nuovo Cimento D, Volume 16, Issue 9, pp 1339–1356

Statistical and linguistic features of noncoding DNA: A heterogeneous «Complex system»

  • H. E. Stanley
  • S. V. Buldyrev
  • A. L. Goldberger
  • S. Havlin
  • R. N. Mantegna
  • C. K. Peng
  • M. Simons
Article

Summary

We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range: indeed, base pairs separated by thousands of base pairs are correlated. We do not find such a long-range correlation in the coding regions of the gene; we utilize this fact to build a Coding Sequence Finder algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. We resolve the problem of the «non-stationarity» of the sequence of base pairs (the relative concentration of purines and pyrimidines changes in different regions of the mosaic-like chain) by describing a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model, based upon a generalization of the classic Lévy walk, to account for the presence of long-range power-law correlations (and for the systematic variation of the scaling exponent α with evolution). Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the «redundancy» of a linguistic text in terms of a measurable entropy function. We suggest that, for plants and invertebrates, noncoding regions in eukaryotes may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
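
As an illustration of the detrending idea named in the summary, the following is a minimal sketch of a DFA-style estimate of the scaling exponent α, not the authors' implementation. It makes several assumptions that are not stated in the summary: the sequence is mapped to a walk with steps of +1 for pyrimidines (C, T) and -1 for purines (A, G), the walk is cut into non-overlapping boxes, a straight line is fitted by least squares in each box, and α is taken as the slope of log F(n) against log n. All function names (`dna_walk`, `dfa_fluctuation`, `dfa_exponent`) are hypothetical.

```python
import math
import random

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def dna_walk(seq):
    """Cumulative walk: +1 per pyrimidine, -1 per purine (assumed mapping).

    Characters other than A, C, G, T are skipped.
    """
    walk, position = [], 0
    for base in seq.upper():
        if base in PYRIMIDINES:
            position += 1
        elif base in PURINES:
            position -= 1
        else:
            continue
        walk.append(position)
    return walk


def residual_sum_of_squares(ys):
    """Squared residuals around the least-squares line through ys at x = 0..n-1."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return sum((y - mean_y - slope * (x - mean_x)) ** 2 for x, y in enumerate(ys))


def dfa_fluctuation(walk, box_size):
    """Root-mean-square detrended fluctuation F(n) over non-overlapping boxes."""
    boxes = len(walk) // box_size
    total = sum(
        residual_sum_of_squares(walk[b * box_size:(b + 1) * box_size])
        for b in range(boxes)
    )
    return math.sqrt(total / (boxes * box_size))


def dfa_exponent(seq, box_sizes):
    """Slope of log F(n) versus log n, fitted by least squares."""
    walk = dna_walk(seq)
    log_n = [math.log(n) for n in box_sizes]
    log_f = [math.log(dfa_fluctuation(walk, n)) for n in box_sizes]
    mean_n = sum(log_n) / len(log_n)
    mean_f = sum(log_f) / len(log_f)
    num = sum((a - mean_n) * (b - mean_f) for a, b in zip(log_n, log_f))
    den = sum((a - mean_n) ** 2 for a in log_n)
    return num / den


if __name__ == "__main__":
    random.seed(0)
    # Synthetic uncorrelated sequence, used only to exercise the sketch.
    sequence = "".join(random.choice("ACGT") for _ in range(20000))
    print(dfa_exponent(sequence, [8, 16, 32, 64, 128]))
```

For an uncorrelated random sequence such as the synthetic one above, the estimate should lie close to 0.5; long-range power-law correlations of the kind described for noncoding regions would show up as a value clearly different from 0.5. The box sizes and the linear detrending order are illustrative choices only.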

PACS 87.10: General, theoretical, and mathematical biophysics (including logic of biosystems, quantum biology, and relevant aspects of thermodynamics, information theory, cybernetics, and bionics)

PACS 05.20: Statistical mechanics

PACS 01.30.Cc: Conference proceedings

Copyright information

© Società Italiana di Fisica 1994

Authors and Affiliations

  • H. E. Stanley (1)
  • S. V. Buldyrev (1)
  • A. L. Goldberger (3)
  • S. Havlin (1, 2)
  • R. N. Mantegna (1, 5)
  • C. K. Peng (1, 3)
  • M. Simons (4)

  1. Center for Polymer Studies and Department of Physics, Boston University, Boston, USA
  2. Department of Physics, Bar-Ilan University, Ramat-Gan, Israel
  3. Cardiovascular Div., Harvard Medical School, Beth Israel Hospital, Boston, USA
  4. Department of Biomedical Engineering, Boston University, Boston, USA
  5. Dipartimento di Energetica ed Applicazioni di Fisica dell’Università, Palermo, Italy
