Data Mining and Knowledge Discovery, Volume 29, Issue 3, pp 565–592

Time series classification with ensembles of elastic distance measures

  • Jason Lines
  • Anthony Bagnall
Abstract

Several alternative distance measures for comparing time series have recently been proposed and evaluated on time series classification (TSC) problems. These include variants of dynamic time warping (DTW), such as weighted and derivative DTW, and edit distance-based measures, including longest common subsequence, edit distance with real penalty, time warp with edit, and move–split–merge. These measures have the common characteristic that they operate in the time domain and compensate for potential localised misalignment through some elastic adjustment. Our aim is to experimentally test two hypotheses related to these distance measures. Firstly, we test whether there is any significant difference in accuracy for TSC problems between nearest neighbour classifiers using these distance measures. Secondly, we test whether combining these elastic distance measures through simple ensemble schemes gives significantly better accuracy. We test these hypotheses by carrying out one of the largest experimental studies ever conducted into time series classification. Our first key finding is that there is no significant difference between the elastic distance measures in terms of classification accuracy on our data sets. Our second finding, and the major contribution of this work, is to define an ensemble classifier that significantly outperforms the individual classifiers. We also demonstrate that the ensemble is more accurate than approaches not based in the time domain. Nearly all TSC papers in the data mining literature cite DTW (with warping window set through cross validation) as the benchmark for comparison. We believe that our ensemble is the first ever classifier to significantly outperform DTW and as such raises the bar for future work in this area.
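To make the two ideas in the abstract concrete, the following is a minimal sketch of a DTW distance with a warping window, a one-nearest-neighbour classifier built on it, and a plain majority vote over several such classifiers. It is an illustration under stated assumptions and not the ensemble defined in the paper: the function names (`dtw_distance`, `nn_predict`, `ensemble_predict`), the squared pointwise cost, the fixed window sizes, and the unweighted vote are all choices made for this example. In particular, the abstract says the benchmark DTW window is set through cross validation, which this sketch does not do.

```python
import math
from collections import Counter
from functools import partial


def dtw_distance(a, b, window=None):
    """DTW distance between two sequences of numbers.

    `window` is the half-width of the warping band (an assumption of this
    sketch). None means unconstrained warping; 0 with equal-length series
    reduces to the Euclidean distance.
    """
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide, or no warping path exists.
    w = max(n, m) if window is None else max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return math.sqrt(cost[n][m])


def nn_predict(train, query, distance):
    """1-NN: return the label of the training series closest to `query`.

    `train` is a list of (series, label) pairs.
    """
    return min(train, key=lambda item: distance(item[0], query))[1]


def ensemble_predict(train, query, distances):
    """Unweighted majority vote over one 1-NN classifier per distance."""
    votes = Counter(nn_predict(train, query, d) for d in distances)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: one rising and one falling series.
train = [([0, 0, 1, 2, 3], "up"), ([3, 2, 1, 0, 0], "down")]
distances = [
    dtw_distance,                    # full warping
    partial(dtw_distance, window=1), # narrow band
    partial(dtw_distance, window=0), # Euclidean
]
print(ensemble_predict(train, [0, 1, 2, 3, 3], distances))
```

Other elastic measures named in the abstract could be added to the `distances` list in the same way, provided each is a function of two series that returns a number.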

Keywords

Time series classification, Elastic distance measures, Ensembles

Copyright information

© The Author(s) 2014

Authors and Affiliations

  1. University of East Anglia, Norwich, UK