Abstract
In this study, we propose a new segmentation algorithm that partitions univariate and multivariate time series and applies fuzzy clustering to the resulting segments. The clustering algorithm uses a new objective function that incorporates an extra variable related to segmentation, and dynamic time warping (DTW) is applied to measure distances between series of unequal length. Because optimizing this objective function is challenging, we put forward an approach based on a dynamic programming (DP) algorithm. A DP-based method is also developed to reduce the computational complexity of calculating the DTW distance. In a series of experiments, both synthetic and real-world time series are used to evaluate the performance of the proposed algorithm. The results indicate that the proposed algorithm is more effective than existing segmentation approaches.
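For readers unfamiliar with DTW, the following minimal sketch illustrates how a DP recurrence yields a distance between two series of unequal length. It is the classic textbook formulation for univariate series with an absolute-difference local cost, not the reduced-complexity method or the objective function developed in the paper; the function name `dtw_distance` and the choice of local cost are assumptions made for illustration only.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW between two univariate sequences (illustrative sketch).

    Runs in O(len(a) * len(b)) time; this is NOT the paper's
    reduced-complexity variant. The local cost |a_i - b_j| is an assumption.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] is matched to an already used b element
                cost[i][j - 1],      # b[j-1] is matched to an already used a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Two series that differ only by a repeated sample align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

Because the warping path may repeat elements of either series, segments of different lengths can be compared directly, which is the property the abstract relies on when clustering segments.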
Acknowledgments
This work is supported by the Natural Science Foundation of China under Grants 61175041 and 61533005, and Boshidian Funds 20110041110017. We are grateful to the editor and reviewers for their constructive comments. We would also like to thank Witold Pedrycz, whose suggestions helped us to improve this paper.
Appendix: Cubic-spline dynamic time warping (CDTW)
Cite this article
Guo, H., Liu, X. Dynamic programming-based optimization for segmentation and clustering of hydrometeorological time series. Stoch Environ Res Risk Assess 30, 1875–1887 (2016). https://doi.org/10.1007/s00477-015-1192-4
DOI: https://doi.org/10.1007/s00477-015-1192-4