Speeding up dynamic time warping distance for sparse time series data

Knowledge and Information Systems

Abstract

Dynamic time warping (DTW) distance has been used effectively to mine time series data in a multitude of domains. However, in its original formulation, DTW is extremely inefficient at comparing long sparse time series, which contain mostly zeros and some unevenly spaced nonzero observations. The original DTW distance does not take advantage of this sparsity, leading to redundant calculations and a prohibitively large computational cost for long time series. We derive a new time warping similarity measure (AWarp) for sparse time series that works on the run-length encoded representation of such series. The complexity of AWarp is quadratic in the number of observations rather than in the range of time that the series spans. Therefore, AWarp can be several orders of magnitude faster than DTW on sparse time series. AWarp is exact for binary-valued time series and a close approximation of the original DTW distance for any-valued series. We discuss useful variants of AWarp: bounded (both upper and lower), constrained, and multidimensional. We show applications of AWarp to three data mining tasks (clustering, classification, and outlier detection), which are otherwise not feasible using classic DTW, while producing equivalent results. Potential areas of application include bot detection, human activity classification, search trend analysis, seismic analysis, and unusual review pattern mining.
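The cost argument in the abstract can be illustrated with a minimal sketch. It is not the authors' AWarp implementation: it only shows a classic DTW table fill next to one possible run-length encoding of zero runs, so the difference in input size is visible. The function names `rle_zeros` and `dtw`, the `("zeros", count)` token format, and the squared pointwise cost are assumptions made for this illustration.

```python
from typing import List, Sequence, Tuple, Union

# Hypothetical token format: a nonzero value, or ("zeros", run_length).
Token = Union[float, Tuple[str, int]]


def rle_zeros(series: Sequence[float]) -> List[Token]:
    """Replace each maximal run of zeros with a ("zeros", run_length) token."""
    encoded: List[Token] = []
    run = 0
    for value in series:
        if value == 0:
            run += 1
        else:
            if run:
                encoded.append(("zeros", run))
                run = 0
            encoded.append(value)
    if run:  # flush a trailing run of zeros
        encoded.append(("zeros", run))
    return encoded


def dtw(x: Sequence[float], y: Sequence[float]) -> float:
    """Classic DTW: sum of squared differences along the best warping path.

    Fills a len(x) by len(y) table row by row, so the work grows with the
    full series lengths no matter how many entries are zero.
    """
    inf = float("inf")
    prev = [inf] * (len(y) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, len(x) + 1):
        cur = [inf] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(y)]


x = [0, 0, 0, 1, 0, 0, 0, 0, 2, 0]
y = [0, 1, 0, 0, 0, 0, 0, 2, 0, 0]

# DTW fills len(x) * len(y) cells; a measure defined on the encoded form
# would work with len(rle_zeros(x)) * len(rle_zeros(y)) entries instead.
print(len(x) * len(y), len(rle_zeros(x)) * len(rle_zeros(y)))
print(dtw(x, y))
```

In this toy pair the two nonzero observations are merely shifted in time, so warping aligns them at no cost, while the encoded forms have 5 tokens each instead of 10 samples. The gap widens as the zero runs grow longer, which is the situation the abstract describes.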

Notes

  1. http://www.cs.unm.edu/~mueen/Projects/AWarp/.

Acknowledgements

This work was supported by the NSF CCF Grant No. 1527127 and the NSF Graduate Research Fellowship under Grant No. DGE-0237002.

Author information

Corresponding author

Correspondence to Abdullah Mueen.

About this article

Cite this article

Mueen, A., Chavoshi, N., Abu-El-Rub, N. et al. Speeding up dynamic time warping distance for sparse time series data. Knowl Inf Syst 54, 237–263 (2018). https://doi.org/10.1007/s10115-017-1119-0

  • DOI: https://doi.org/10.1007/s10115-017-1119-0
