Segmentation of time series with long-range fractal correlations

  • P. Bernaola-Galván (email author)
  • J. L. Oliver
  • M. Hackenberg
  • A. V. Coronado
  • P. Ch. Ivanov
  • P. Carpena
Regular Article

Abstract

Segmentation is a standard method of data analysis for identifying the change-points that divide a nonstationary time series into homogeneous segments. However, for series with long-range fractal correlations, most segmentation techniques detect spurious change-points that are due simply to the heterogeneities induced by the correlations and not to real nonstationarities. To avoid this oversegmentation, we present a segmentation algorithm that takes as its reference for homogeneity, instead of a random i.i.d. series, a correlated series modeled by a fractional noise with the same degree of correlations as the series to be segmented. We apply our algorithm to artificial series with long-range correlations and show that it systematically detects only the change-points produced by real nonstationarities and not those created by the correlations of the signal. Further, we apply the method to the sequence of the long arm of human chromosome 21, which is known to have long-range fractal correlations. We obtain only three segments, which clearly correspond to the three regions of different G + C composition revealed by a multi-scale wavelet plot. Similar results have been obtained when segmenting all human chromosome sequences, showing the existence of previously unknown huge compositional superstructures in the human genome.
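
The central idea of the abstract, judging a candidate change-point against correlated fractional noise rather than against an i.i.d. series, can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the test statistic or how significance is computed, so the maximum two-sample t-statistic, the Monte Carlo surrogates, the Cholesky-based noise generator, and all function names below are assumptions made for illustration. The Hurst exponent `hurst` is taken as given here; in practice it would have to be estimated from the series.

```python
import numpy as np

def fgn_cholesky_factor(n, hurst):
    """Lower Cholesky factor of the covariance of unit-variance fractional Gaussian noise."""
    k = np.arange(n, dtype=float)
    # Autocovariance of fractional Gaussian noise at lag k.
    gamma = 0.5 * ((k + 1) ** (2 * hurst) - 2 * k ** (2 * hurst)
                   + np.abs(k - 1) ** (2 * hurst))
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return np.linalg.cholesky(gamma[lags])

def max_t_statistic(x, min_len=10):
    """Return (split position, max |t|) over all splits with both sides >= min_len."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    csum, csum2 = np.cumsum(x), np.cumsum(x * x)
    i = np.arange(min_len, n - min_len + 1)        # length of the left segment
    n1, n2 = i, n - i
    s1, s2 = csum[i - 1], csum[-1] - csum[i - 1]
    q1, q2 = csum2[i - 1], csum2[-1] - csum2[i - 1]
    m1, m2 = s1 / n1, s2 / n2
    pooled_ss = (q1 - n1 * m1 ** 2) + (q2 - n2 * m2 ** 2)
    sd = np.sqrt(pooled_ss / (n - 2) * (1.0 / n1 + 1.0 / n2))
    t = np.abs(m1 - m2) / sd
    j = int(np.argmax(t))
    return int(i[j]), float(t[j])

def split_significance(x, hurst, n_surrogates=200, seed=0):
    """Fraction of fractional-noise surrogates whose max |t| is below the observed one."""
    rng = np.random.default_rng(seed)
    pos, t_obs = max_t_statistic(x)
    chol = fgn_cholesky_factor(len(x), hurst)
    t_null = np.array([
        max_t_statistic(chol @ rng.standard_normal(len(x)))[1]
        for _ in range(n_surrogates)
    ])
    return pos, float(np.mean(t_null < t_obs))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, hurst = 256, 0.8
    noise = fgn_cholesky_factor(n, hurst) @ rng.standard_normal(n)
    # Artificial nonstationarity: a jump in the mean at position 128.
    series = noise + np.where(np.arange(n) >= 128, 4.0, 0.0)
    print(split_significance(series, hurst))
```

A split would be accepted only if the returned fraction exceeds a chosen threshold (for example 0.95), and the procedure would then be repeated recursively on each resulting segment. Because the t-statistic is invariant under shifts and rescaling of the series, unit-variance surrogates are sufficient for the comparison.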

Keywords

Statistical and Nonlinear Physics 

Copyright information

© EDP Sciences, SIF, Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • P. Bernaola-Galván (1), email author
  • J. L. Oliver (2)
  • M. Hackenberg (2)
  • A. V. Coronado (1)
  • P. Ch. Ivanov (3, 4, 5)
  • P. Carpena (1)

  1. Dpto. de Física Aplicada II, Universidad de Málaga, Málaga, Spain
  2. Dpto. de Genética, Inst. de Biotecnología, Universidad de Granada, Granada, Spain
  3. Harvard Medical School, Division of Sleep Medicine, Boston, USA
  4. Department of Physics and Center for Polymer Studies, Boston University, Boston, USA
  5. Institute of Solid State Physics, Bulgarian Academy of Sciences, Sofia, Bulgaria
