Efficient segmentation-based methods for anomaly detection in static and streaming time series under dynamic time warping

Thuy, Huynh Thi Thu; Anh, Duong Tuan; Chau, Vo Thi Ngoc

doi:10.1007/s10844-020-00609-6

Published: 07 July 2020

Volume 56, pages 121–146, (2021)

Journal of Intelligent Information Systems

Huynh Thi Thu Thuy (ORCID: orcid.org/0000-0001-7108-8095)^1,2, Duong Tuan Anh^1,2 & Vo Thi Ngoc Chau^1,2

Abstract

Time series anomaly detection has attracted considerable attention because of its usefulness in many application domains. However, most methods proposed so far rely on Euclidean distance. Dynamic Time Warping (DTW) distance is more suitable than Euclidean distance in many practical fields, for example those with multimedia data, because it supports shape-based similarity checking. In this paper, we propose two efficient anomaly detection methods, EP-Leader-DTW and SEP-Leader-DTW, for static and streaming time series under DTW, respectively. Our methods are based on time series segmentation, subsequence clustering, and anomaly scoring. For segmentation, the major extrema method is used to obtain subsequences. For clustering, we apply the Leader algorithm to cluster the subsequences, along with a lower bounding technique to accelerate DTW distance computation. Experimental results on several benchmark time series datasets show that our method for anomaly detection in static time series under DTW is fast and accurate on large time series datasets. For streaming time series, our method can meet the requirement of instantaneous response with high accuracy. As a result, our anomaly detection methods are applicable to both static and streaming time series in practice.
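The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes equal-length subsequences, a Sakoe-Chiba band for DTW, and LB_Keogh as the lower bound (the abstract does not name which lower bounding technique is used). The function names `dtw`, `lb_keogh`, and `leader_cluster`, the threshold, and the toy data are all hypothetical.

```python
import math


def dtw(a, b, window):
    """DTW distance under a Sakoe-Chiba band (squared point costs, sqrt at the end)."""
    n, m = len(a), len(b)
    w = max(window, abs(n - m))  # band must be wide enough to reach the last cell
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def lb_keogh(q, c, window):
    """LB_Keogh lower bound on dtw(q, c, window); assumes len(q) == len(c)."""
    total = 0.0
    for i, x in enumerate(c):
        band = q[max(0, i - window):min(len(q), i + window + 1)]
        upper, lower = max(band), min(band)
        if x > upper:
            total += (x - upper) ** 2
        elif x < lower:
            total += (x - lower) ** 2
    return math.sqrt(total)


def leader_cluster(subsequences, threshold, window):
    """Single-pass Leader clustering: join the first leader within threshold,
    otherwise become a new leader. The lower bound prunes full DTW calls."""
    leaders, members = [], []
    for idx, s in enumerate(subsequences):
        for k, leader in enumerate(leaders):
            if lb_keogh(leader, s, window) > threshold:
                continue  # DTW cannot be within threshold, skip the costly call
            if dtw(leader, s, window) <= threshold:
                members[k].append(idx)
                break
        else:
            leaders.append(s)
            members.append([idx])
    return leaders, members


# Toy data (assumed for illustration): three similar shapes and one outlier.
subs = [
    [0, 1, 2, 1, 0],
    [0, 1, 2, 1, 0.1],
    [0, 1.1, 2, 1, 0],
    [5, 0, 5, 0, 5],
]
leaders, members = leader_cluster(subs, threshold=0.5, window=1)
print(members)
```

In this toy run the outlier ends up alone in its own cluster. Treating small clusters as anomaly candidates is a simplification for illustration only; the abstract does not specify how the anomaly score is computed.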

Acknowledgements

This research is funded by Ho Chi Minh City University of Technology (HCMUT), VNU-HCM, under grant number BK-SDH-2020-8141217.

Author information

Authors and Affiliations

Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology (HCMUT), Ho Chi Minh City, Vietnam
Huynh Thi Thu Thuy, Duong Tuan Anh & Vo Thi Ngoc Chau
Vietnam National University, Ho Chi Minh City, Vietnam
Huynh Thi Thu Thuy, Duong Tuan Anh & Vo Thi Ngoc Chau

Corresponding author

Correspondence to Huynh Thi Thu Thuy.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Thuy, H.T.T., Anh, D.T. & Chau, V.T.N. Efficient segmentation-based methods for anomaly detection in static and streaming time series under dynamic time warping. J Intell Inf Syst 56, 121–146 (2021). https://doi.org/10.1007/s10844-020-00609-6

Received: 06 February 2020
Revised: 26 May 2020
Accepted: 08 June 2020
Published: 07 July 2020
Issue Date: February 2021
DOI: https://doi.org/10.1007/s10844-020-00609-6
