
Efficient segmentation-based methods for anomaly detection in static and streaming time series under dynamic time warping

Journal of Intelligent Information Systems

Abstract

Time series anomaly detection has attracted considerable attention because of its usefulness in many application domains. However, most of the methods proposed so far rely on Euclidean distance. Dynamic Time Warping (DTW) distance is more suitable than Euclidean distance in many practical fields, for example those involving multimedia data, because it supports shape-based similarity checking. In this paper, we propose two efficient anomaly detection methods under DTW: EP-Leader-DTW for static time series and SEP-Leader-DTW for streaming time series. Both methods are based on time series segmentation, subsequence clustering, and anomaly scoring. For segmentation, the major extrema method is used to obtain subsequences. For clustering, we apply the Leader algorithm to cluster the subsequences, together with a lower bounding technique to accelerate DTW distance computation. Experimental results on several benchmark time series datasets show that our method for anomaly detection in static time series under DTW is both fast and accurate on large datasets. For streaming time series, our method meets the requirement for instantaneous response with high accuracy. As a result, our anomaly detection methods are applicable to both static and streaming time series in practice.
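To make the clustering step in the abstract more concrete, the following is a minimal Python sketch of Leader clustering under DTW with a lower bound used to skip full DTW computations. It is not the authors' implementation. The abstract does not say which lower bound or warping constraint is used, so LB_Keogh and a Sakoe-Chiba band are assumed here. The function names (`dtw`, `lb_keogh`, `leader_cluster`), the `window` and `threshold` parameters, and the toy data are all hypothetical. The sketch also assumes equal-length subsequences, and it omits the major extrema segmentation and the anomaly scoring steps.

```python
import math


def dtw(a, b, window):
    """DTW distance (squared-difference cost) inside a Sakoe-Chiba band."""
    n, m = len(a), len(b)
    w = max(window, abs(n - m))  # band must be wide enough to reach the end
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def lb_keogh(query, candidate, window):
    """LB_Keogh lower bound on dtw(query, candidate, window); equal lengths assumed."""
    n = len(candidate)
    total = 0.0
    for i, q in enumerate(query):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        upper = max(candidate[lo:hi])  # upper envelope of the candidate at i
        lower = min(candidate[lo:hi])  # lower envelope of the candidate at i
        if q > upper:
            total += (q - upper) ** 2
        elif q < lower:
            total += (lower - q) ** 2
    return math.sqrt(total)


def leader_cluster(subsequences, threshold, window):
    """Single-pass Leader clustering: join the first leader within `threshold`,
    otherwise become a new leader."""
    leaders = []  # index of each cluster's leader subsequence
    members = []  # members[k] holds the indices assigned to cluster k
    for idx, s in enumerate(subsequences):
        for k, lead in enumerate(leaders):
            ref = subsequences[lead]
            if lb_keogh(s, ref, window) > threshold:
                continue  # lower bound already too large, skip the full DTW
            if dtw(s, ref, window) <= threshold:
                members[k].append(idx)
                break
        else:
            leaders.append(idx)
            members.append([idx])
    return leaders, members


# Toy data (hypothetical): three similar bumps, one shifted in time, and a flat outlier.
subs = [
    [0, 1, 2, 1, 0],
    [0, 1, 2, 1, 0.1],
    [0, 0, 1, 2, 1],
    [5, 5, 5, 5, 5],
]
leaders, members = leader_cluster(subs, threshold=1.5, window=1)
print(leaders, members)
```

On this toy input the time-shifted bump joins the first cluster because DTW tolerates the shift, while the flat subsequence is rejected by the lower bound alone and becomes the leader of its own single-member cluster. A cluster-based scoring step would presumably treat such small, isolated clusters as anomaly candidates, but the abstract does not describe the scoring, so that part is left out.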

Acknowledgements

This research is funded by Ho Chi Minh City University of Technology (HCMUT), VNU-HCM, under grant number BK-SDH-2020-8141217.

Author information

Corresponding author

Correspondence to Huynh Thi Thu Thuy.

About this article

Cite this article

Thuy, H.T.T., Anh, D.T. & Chau, V.T.N. Efficient segmentation-based methods for anomaly detection in static and streaming time series under dynamic time warping. J Intell Inf Syst 56, 121–146 (2021). https://doi.org/10.1007/s10844-020-00609-6

  • DOI: https://doi.org/10.1007/s10844-020-00609-6
