Using dynamic time warping distances as features for improved time series classification

Data Mining and Knowledge Discovery

Abstract

Dynamic time warping (DTW) has proven to be an exceptionally strong distance measure for time series. DTW combined with one-nearest neighbor, one of the simplest machine learning methods, has been difficult to convincingly outperform on the time series classification task. In this paper, we present a simple technique for time series classification that exploits DTW’s strength on this task. Instead of directly using DTW as a distance measure to find nearest neighbors, the technique uses DTW to create new features, which are then given to a standard machine learning method. We show experimentally that our technique improves over one-nearest neighbor DTW on 31 out of 47 UCR time series benchmark datasets. In addition, the method is easily extended to work in combination with other methods. In particular, we show that combining it with the symbolic aggregate approximation (SAX) method improves over SAX alone on 37 out of 47 UCR datasets. The proposed method thus also provides a mechanism for combining distance-based methods like DTW with feature-based methods like SAX. We also show that combining the proposed classifiers through ensembles further improves performance on time series classification.
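The core idea of the abstract, using DTW distances as features and not as a nearest-neighbor metric, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the paper's implementation: the function names (`dtw_distance`, `dtw_features`, `fit_centroids`, `predict`), the toy series, the squared pointwise cost, and in particular the nearest-centroid learner, which only stands in for whichever standard machine learning method is applied to the features.

```python
from math import inf

def dtw_distance(a, b):
    """Unconstrained DTW between two numeric sequences (squared pointwise cost)."""
    n, m = len(a), len(b)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # extend the cheapest of the three allowed warping steps
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m] ** 0.5

def dtw_features(series, train_series):
    """Represent a series by its DTW distance to every training series."""
    return [dtw_distance(series, t) for t in train_series]

def fit_centroids(features, labels):
    """Hypothetical stand-in learner: per-class mean of the feature vectors."""
    sums, counts = {}, {}
    for vec, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(vec))
        for k, v in enumerate(vec):
            acc[k] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, vec):
    """Assign the class whose centroid is closest in the DTW feature space."""
    def sq_dist(c):
        return sum((x - y) ** 2 for x, y in zip(vec, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

# Toy data (invented for this sketch): "bump" and "dip" shapes at shifted positions.
train_series = [
    [0, 0, 1, 2, 1, 0, 0],
    [0, 1, 2, 1, 0, 0, 0],
    [2, 2, 1, 0, 1, 2, 2],
    [2, 1, 0, 1, 2, 2, 2],
]
train_labels = ["bump", "bump", "dip", "dip"]

train_features = [dtw_features(s, train_series) for s in train_series]
centroids = fit_centroids(train_features, train_labels)

test_series = [0, 0, 0, 1, 2, 1, 0]
predicted = predict(centroids, dtw_features(test_series, train_series))
```

Each series becomes a vector with one entry per training example, so the learner sees all DTW distances at once and is not limited to the single nearest one. Features from another representation could be appended to the same vector, which is one way to read the abstract's remark about combining the method with SAX.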

Notes

  1. Using two-tailed Wilcoxon signed-rank test.

  2. Using more than one nearest neighbor has not been found to be helpful.

  3. We compare all classifiers together using Friedman test in Sect. 5.3.

  4. The same \(r\) values were also used in Feature-DTW-DTW-R as well as in other feature-based classifiers that used DTW-R.

  5. http://www.cs.gmu.edu/~jessica/sax.htm.

Acknowledgments

We thank editor Eamonn Keogh and the anonymous reviewers for their feedback to improve this paper.

Author information

Corresponding author

Correspondence to Rohit J. Kate.

Additional information

Responsible editor: Eamonn Keogh.

Cite this article

Kate, R.J. Using dynamic time warping distances as features for improved time series classification. Data Min Knowl Disc 30, 283–312 (2016). https://doi.org/10.1007/s10618-015-0418-x

  • DOI: https://doi.org/10.1007/s10618-015-0418-x