Learning a symbolic representation for multivariate time series classification

Data Mining and Knowledge Discovery


Multivariate time series (MTS) classification has gained importance as the number of temporal datasets has grown in domains such as medicine, finance, and multimedia. Similarity-based approaches, such as nearest-neighbor classifiers, are often used for univariate time series, but MTS are characterized not only by their individual attributes but also by the relationships among them. Here we provide a classifier based on a new symbolic representation for MTS (denoted SMTS) with several important elements. SMTS considers all attributes of an MTS simultaneously, rather than separately, to extract the information contained in their relationships. Symbols are learned by a supervised algorithm that requires neither pre-defined intervals nor features. The elementary representation consists of the time index and the values (and first differences, for numerical attributes) of the individual time series as columns. That is, there is essentially no feature extraction (aside from first differences), and the local series values are fused to time position through the time index. This initial representation of the raw data is simple both conceptually and operationally. Still, a tree-based ensemble can detect interactions in the space of the time index and the series values, and this is exploited to generate a high-dimensional codebook from the terminal nodes of the trees. Because the time index is included as an attribute, the trees learn to segment each MTS by time or by the value of one of its attributes. The codebook is processed by a second ensemble, whose implicit feature selection is exploited to handle the high-dimensional input. Together, these constituent properties produce a distinctly different algorithm. Moreover, MTS with nominal and missing values are handled efficiently by the tree learners. Experiments on a large collection of multivariate (and univariate) datasets demonstrate the effectiveness of the proposed approach in terms of accuracy and computation time.
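The elementary representation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `elementary_rows` and `bag_of_symbols` are hypothetical, the first time point is dropped because it has no first difference (an assumption), the raw integer time index is used without normalization (an assumption), and summarizing terminal-node symbols as a relative-frequency histogram is one plausible reading of "codebook", not a detail stated in the abstract. The tree ensembles themselves are omitted; the leaf identifiers are taken as given.

```python
from collections import Counter
from typing import List, Sequence


def elementary_rows(series: Sequence[Sequence[float]]) -> List[List[float]]:
    """Build one row per time point: [time index, values..., first differences...].

    series[t][m] is the value of numerical attribute m at time t.
    Assumption: the first time point is skipped because it has no difference.
    """
    rows = []
    for t in range(1, len(series)):
        values = list(series[t])
        diffs = [series[t][m] - series[t - 1][m] for m in range(len(values))]
        rows.append([t] + values + diffs)
    return rows


def bag_of_symbols(leaf_ids: Sequence[int], n_symbols: int) -> List[float]:
    """Relative frequency of each symbol (hypothetical terminal-node id) in one MTS.

    Assumption: symbols are integers in range(n_symbols), one per row of the MTS.
    """
    counts = Counter(leaf_ids)
    total = len(leaf_ids)
    return [counts.get(s, 0) / total for s in range(n_symbols)]


# A toy MTS with two attributes observed at three time points.
toy_series = [[1.0, 10.0], [2.0, 8.0], [4.0, 9.0]]
toy_rows = elementary_rows(toy_series)

# Hypothetical leaf assignments for four rows, with a codebook of four symbols.
toy_histogram = bag_of_symbols([0, 2, 2, 1], n_symbols=4)
```

In this sketch, each row of `toy_rows` would be passed to the first tree ensemble, and the resulting histogram over terminal nodes would be the input to the second ensemble.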




This research was partially supported by ONR Grant N00014-09-1-0656.

Author information


Corresponding author

Correspondence to Mustafa Gokce Baydogan.

Additional information

Responsible editor: M. J. Zaki.

About this article

Cite this article

Baydogan, M.G., Runger, G. Learning a symbolic representation for multivariate time series classification. Data Min Knowl Disc 29, 400–422 (2015).
