Data Mining and Knowledge Discovery, Volume 29, Issue 2, pp 400–422

Learning a symbolic representation for multivariate time series classification

  • Mustafa Gokce Baydogan
  • George Runger


Multivariate time series (MTS) classification has gained importance with the increase in the number of temporal datasets in domains such as medicine, finance, and multimedia. Similarity-based approaches, such as nearest-neighbor classifiers, are often used for univariate time series, but MTS are characterized not only by their individual attributes but also by the relationships among them. Here we provide a classifier based on a new symbolic representation for MTS (denoted as SMTS) with several important elements. SMTS considers all attributes of an MTS simultaneously, rather than separately, to extract the information contained in their relationships. Symbols are learned by a supervised algorithm that requires neither pre-defined intervals nor features. An elementary representation is used that consists of the time index and the values (and first differences for numerical attributes) of the individual time series as columns. That is, there is essentially no feature extraction (aside from first differences), and the local series values are fused to time position through the time index. The initial representation of the raw data is simple both conceptually and operationally. Still, a tree-based ensemble can detect interactions in the space of the time index and the time series values, and this is exploited to generate a high-dimensional codebook from the terminal nodes of the trees. Because the time index is included as an attribute, the trees learn to segment each MTS by time or by the value of one of its attributes. The codebook is then processed with a second ensemble, where implicit feature selection is exploited to handle the high-dimensional input. Together, these properties yield an algorithm that is distinctly different from existing approaches. Moreover, MTS with nominal and missing values are handled efficiently by tree learners. Experiments demonstrate the effectiveness of the proposed approach in terms of accuracy and computation time on a large collection of multivariate (and univariate) datasets.
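As a rough illustration of the representation described in the abstract, the sketch below builds the elementary table (time index, values, first differences) for one small MTS and counts how many rows fall into each symbol. It is not the authors' implementation. The function names are hypothetical, the first difference at the initial time step is assumed to be zero, and `toy_leaf` is a hand-written stand-in for the terminal nodes of a single learned tree, which SMTS would obtain from a supervised tree ensemble.

```python
from collections import Counter


def elementary_rows(series):
    """Turn one MTS into rows of [time index, values..., first differences...].

    `series` is a list of time steps, each a list of numeric attribute values.
    The first difference at t = 0 is set to 0.0 (an assumption of this sketch).
    """
    rows = []
    for t, values in enumerate(series):
        prev = series[t - 1] if t > 0 else values
        diffs = [v - p for v, p in zip(values, prev)]
        rows.append([t] + list(values) + diffs)
    return rows


def toy_leaf(row):
    """Hypothetical stand-in for one learned tree: map a row to a leaf id.

    It splits on the time index first and then on the first attribute, to show
    how a tree can segment a series by time or by value. A real tree would be
    learned from labeled data rather than written by hand.
    """
    t, x0 = row[0], row[1]
    if t < 2:
        return 0 if x0 < 0.5 else 1
    return 2


def symbol_histogram(rows, leaf_fn, n_leaves):
    """Count how many rows of one MTS land in each terminal node (symbol)."""
    counts = Counter(leaf_fn(r) for r in rows)
    return [counts.get(k, 0) for k in range(n_leaves)]


# One MTS with two attributes observed at four time steps.
mts = [[0.0, 1.0], [1.0, 1.5], [1.0, 0.5], [0.0, 0.5]]
rows = elementary_rows(mts)
hist = symbol_histogram(rows, toy_leaf, 3)
```

With an ensemble, the per-tree histograms would be concatenated into one long codebook vector per series, and that vector would be the input to the second classifier.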


Keywords: Supervised learning, Codebook, Decision trees



This research was partially supported by ONR Grant N00014-09-1-0656.



Copyright information

© The Author(s) 2014

Authors and Affiliations

  1. Department of Industrial Engineering, Boğaziçi University, Bebek, Istanbul, Turkey
  2. School of Computing, Informatics & Decision Systems Engineering, Arizona State University, Tempe, USA
