Abstract
Segmentation is a standard method of data analysis for identifying the change-points that divide a nonstationary time series into homogeneous segments. However, for series with long-range fractal correlations, most segmentation techniques detect spurious change-points that arise from the heterogeneities induced by the correlations rather than from real nonstationarities. To avoid this oversegmentation, we present a segmentation algorithm whose reference for homogeneity is not a random i.i.d. series but a correlated series, modeled by a fractional noise with the same degree of correlations as the series to be segmented. We apply our algorithm to artificial series with long-range correlations and show that it systematically detects only the change-points produced by real nonstationarities and not those created by the correlations of the signal. Further, we apply the method to the sequence of the long arm of human chromosome 21, which is known to have long-range fractal correlations. We obtain only three segments, which clearly correspond to the three regions of different G + C composition revealed by a multi-scale wavelet plot. Similar results have been obtained when segmenting all human chromosome sequences, showing the existence of previously unknown huge compositional superstructures in the human genome.
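The idea of judging a candidate change-point against a correlated reference, rather than an i.i.d. one, can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it (`fractional_noise`, `max_t_split`, `significance`) is hypothetical. The sketch assumes that the candidate cut is the position maximizing Student's t between the left and right means, that the fractional noise can be approximated by a truncated moving-average form of an ARFIMA(0, d, 0) process, and that the significance of the cut is estimated by Monte Carlo over homogeneous surrogates. The correlation parameter `d` is taken as given here (for such a process the Hurst exponent is d + 1/2); estimating it from the data is outside the scope of the example.

```python
import math
import random


def fractional_noise(n, d, rng, memory=100):
    """Approximate ARFIMA(0, d, 0) noise with a truncated moving average (assumption)."""
    # Weights of (1 - B)^(-d): psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k
    psi = [1.0]
    for k in range(1, memory):
        psi.append(psi[-1] * (k - 1 + d) / k)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + memory)]
    return [sum(psi[k] * eps[t + memory - k] for k in range(memory))
            for t in range(n)]


def max_t_split(x, min_size=10):
    """Return (t_max, cut): largest Student's t between the means of x[:cut] and x[cut:]."""
    n = len(x)
    total = sum(x)
    total_sq = sum(v * v for v in x)
    best_t, best_cut = 0.0, None
    left = left_sq = 0.0
    for i in range(1, n):
        left += x[i - 1]
        left_sq += x[i - 1] ** 2
        if i < min_size or n - i < min_size:
            continue  # keep both segments reasonably long
        n_l, n_r = i, n - i
        mean_l, mean_r = left / n_l, (total - left) / n_r
        ss = (left_sq - n_l * mean_l ** 2) + (total_sq - left_sq - n_r * mean_r ** 2)
        pooled = ss / (n - 2)
        if pooled <= 0.0:
            continue  # degenerate split with zero within-segment variance
        t = abs(mean_l - mean_r) / math.sqrt(pooled * (1.0 / n_l + 1.0 / n_r))
        if t > best_t:
            best_t, best_cut = t, i
    return best_t, best_cut


def significance(x, d, n_surrogates=40, seed=0, memory=100):
    """Fraction of homogeneous correlated surrogates whose t_max is below the observed one."""
    rng = random.Random(seed)
    t_obs, cut = max_t_split(x)
    below = 0
    for _ in range(n_surrogates):
        # t is invariant to shift and scale, so surrogates need no rescaling
        t_sur = max_t_split(fractional_noise(len(x), d, rng, memory))[0]
        below += t_sur < t_obs
    return below / n_surrogates, cut


# Usage: correlated noise with a genuine jump in the mean at position 200
rng = random.Random(1)
noise = fractional_noise(400, 0.3, rng)
series = [v + (5.0 if t >= 200 else 0.0) for t, v in enumerate(noise)]
p, cut = significance(series, d=0.3)
print("significance:", p, "cut:", cut)
```

In a scheme of this kind, a cut would be accepted only when the estimated significance exceeds a chosen threshold, and the procedure would then be repeated on each resulting segment. Because the surrogates carry the same kind of correlations as the data, large values of t caused by correlations alone are less likely to be mistaken for real change-points.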
Cite this article
Bernaola-Galván, P., Oliver, J.L., Hackenberg, M. et al. Segmentation of time series with long-range fractal correlations. Eur. Phys. J. B 85, 211 (2012). https://doi.org/10.1140/epjb/e2012-20969-5
DOI: https://doi.org/10.1140/epjb/e2012-20969-5