A review on distance based time series classification

  • Amaia Abanda
  • Usue Mori
  • Jose A. Lozano
Article
Part of the following topical collections:
  1. Academic Surveys and Tutorials

Abstract

Time series classification is a growing research topic due to the vast amount of time series data being generated across a wide variety of fields. The particular nature of this data makes classification a challenging task, and different approaches have been taken, including the distance based approach. 1-NN has been widely used within distance based time series classification because it is simple yet performs well. However, its strong performance may be attributable to the use of distances designed specifically for time series rather than to the classifier itself. With the aim of exploiting these distances within more complex classifiers, new approaches have arisen in the past few years that are competitive with, or outperform, the 1-NN based approaches. In some cases, these new methods use the distance measure to transform the series into feature vectors, bridging the gap between time series and traditional classifiers. In other cases, the distances are employed to obtain a time series kernel and enable the use of kernel methods for time series classification. One of the main challenges is that a kernel function must be positive semi-definite, a matter that is also addressed within this review. The review presents a taxonomy of the methods that classify time series using a distance based approach, together with a discussion of the strengths and weaknesses of each method.
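
As a rough illustration of the three uses of a time series distance outlined in the abstract (1-NN classification, distances as feature vectors, and distances plugged into a kernel), the following minimal sketch may help. It is not taken from the paper: the choice of dynamic time warping (DTW) with a squared pointwise cost, the Gaussian form of the kernel, and all function names and the parameter sigma are assumptions made for illustration only.

```python
import math


def dtw(a, b):
    """Unconstrained DTW distance with squared pointwise cost (illustrative choice)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the cheapest cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def one_nn(train, labels, query, dist=dtw):
    """Return the label of the training series closest to the query."""
    best = min(range(len(train)), key=lambda i: dist(train[i], query))
    return labels[best]


def distance_features(train, series, dist=dtw):
    """Represent a series by its distances to every training series."""
    return [dist(series, t) for t in train]


def gaussian_distance_kernel(series, sigma=1.0, dist=dtw):
    """Gram matrix exp(-d(x, y)^2 / (2 sigma^2)); hypothetical helper.

    With DTW as the distance, this matrix is not guaranteed to be
    positive semi-definite.
    """
    return [
        [math.exp(-dist(x, y) ** 2 / (2 * sigma ** 2)) for y in series]
        for x in series
    ]


# Toy data, invented for this sketch.
train = [[0, 0, 1, 2, 1, 0], [0, 1, 2, 1, 0, 0], [5, 5, 4, 5, 5, 5]]
labels = ["bump", "bump", "flat"]
query = [0, 0, 0, 1, 2, 1]

predicted = one_nn(train, labels, query)
features = distance_features(train, query)
gram = gaussian_distance_kernel(train, sigma=2.0)
```

The feature vector returned by distance_features could be passed to any standard classifier, while the Gram matrix could be handed to a kernel method, subject to the definiteness caveat noted in the docstring.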

Keywords

Time series, Classification, Distance based, Kernel, Definiteness

Notes

Acknowledgements

The authors would like to thank the people who contributed to the UCR time series repository, and to express their sincere appreciation for the comments and advice provided by Eamonn Keogh and Lingfei Wu to improve this paper. This research is supported by the Basque Government through the BERC 2018-2021 program and by the Spanish Ministry of Economy and Competitiveness MINECO through BCAM Severo Ochoa excellence accreditation SEV-2013-0323 and through project TIN2017-82626-R funded by (AEI/FEDER, UE) and acronym GECECPAST. It is also supported by the Research Groups 2013-2018 (IT-609-13) program (Basque Government) and by project TIN2016-78365-R (Spanish Ministry of Economy, Industry and Competitiveness). A. Abanda is also supported by the Grant BES-2016-076890.

Copyright information

© The Author(s) 2018

Authors and Affiliations

  1. Basque Center for Applied Mathematics (BCAM), Bilbao, Spain
  2. Intelligent Systems Group (ISG), Department of Computer Science and Artificial Intelligence, University of the Basque Country UPV/EHU, Donostia-San Sebastian, Spain
  3. Department of Applied Mathematics, Statistics and Operational Research, University of the Basque Country UPV/EHU, Leioa, Spain