Stacking for multivariate time series classification

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

This work presents a novel approach to multivariate time series classification. The method exploits the multivariate structure of the time series and the capabilities of the stacking ensemble method. It can be described in three steps: first, decomposing the multivariate time series into its constituent univariate time series; second, inducing a classifier for each univariate time series plus an additional multivariate classifier for the whole time series; third, building the final multivariate time series classifier by stacking the previous classifiers. The resulting ensemble has the potential to improve on the accuracy of the single multivariate time series classifier. Several configurations of the stacking method have been tested on seven multivariate time series data sets. In five out of seven data sets, the proposed method obtains the smallest error rate. Moreover, in two out of seven data sets, stacking only the univariate time series classifiers provides the best results. The experimental results show that when a multivariate time series method does not produce an accurate classifier, stacking it with univariate time series classifiers is an alternative worthy of consideration.
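
The three steps can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid learner used at both levels, the use of negative squared distances as level-0 outputs, the fold scheme, and all names (`views`, `fit_stacking`, `predict_stacking`, `make_series`) are assumptions chosen only to keep the example short and dependency-free.

```python
import random

def fit_centroids(X, y):
    """Nearest-centroid learner (an assumed stand-in): mean vector per class."""
    cents = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def scores(cents, x):
    """Level-0 output: negative squared distance to each class centroid."""
    return [-sum((a - b) ** 2 for a, b in zip(x, c))
            for _, c in sorted(cents.items())]

def predict(cents, x):
    s = scores(cents, x)
    return sorted(cents)[s.index(max(s))]

def views(series):
    """Step 1: one view per univariate channel, plus the whole series flattened."""
    flat = [v for channel in series for v in channel]
    return [list(channel) for channel in series] + [flat]

def fit_stacking(data, y, folds=3):
    n = len(data)
    per_view = list(zip(*(views(s) for s in data)))  # per_view[k][i]
    meta = [[] for _ in range(n)]
    for X in per_view:
        # Step 2: out-of-fold outputs of each base classifier become meta-features.
        for f in range(folds):
            train = [i for i in range(n) if i % folds != f]
            cents = fit_centroids([X[i] for i in train], [y[i] for i in train])
            for i in range(f, n, folds):
                meta[i] += scores(cents, X[i])
    base = [fit_centroids(X, y) for X in per_view]  # refit on all training data
    # Step 3: the level-1 classifier is trained on the stacked outputs.
    return base, fit_centroids(meta, y)

def predict_stacking(model, series):
    base, meta_clf = model
    feats = [s for cents, v in zip(base, views(series)) for s in scores(cents, v)]
    return predict(meta_clf, feats)

def make_series(label, rng, length=8):
    """Hypothetical toy data: channel 0 carries the class, channel 1 is noise."""
    slope = 1.0 if label == "up" else -1.0
    informative = [slope * t + rng.gauss(0, 0.1) for t in range(length)]
    noise = [rng.gauss(0, 1) for _ in range(length)]
    return [informative, noise]

rng = random.Random(0)
labels = ["up" if i % 2 == 0 else "down" for i in range(12)]
model = fit_stacking([make_series(l, rng) for l in labels], labels)
print(predict_stacking(model, make_series("up", rng)))
```

Dropping the flattened view from `views` would give the configuration that stacks only the univariate classifiers. The level-1 classifier is trained on out-of-fold outputs so that it does not learn from base classifiers scoring their own training examples.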

Notes

  1. Auslan, Japanese vowels and pendigits are available at the UCI repository [30]. Auslan is available at http://sites.google.com/site/waleedkadous/data-1. ECG and wafer are available at http://www.cs.cmu.edu/bobski/data/data.html.

References

  1. Agrawal R, Lin KI, Sawhney HS, Shim K (1995) Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In: Dayal U, Gray PMD, Nishio S (eds) VLDB’95, Proceedings of 21th international conference on very large data bases, Zurich, September 1995. Morgan Kaufmann, Massachusetts, pp 490–501

  2. Alimoglu F (1996) Combining multiple classifiers for pen-based handwritten digit recognition. Master’s thesis, Institute of Graduate Studies in Science and Engineering, Bogazici University, Istanbul

  3. Alimoglu F, Alpaydin E (2001) Combining multiple representations for pen-based handwritten digit recognition. ELEKTRIK Turk J Electr Eng Comput Sci 9(1):1–12

  4. Alonso C, Prieto O, Rodríguez JJ, Bregón A (2008) Multivariate time series classification via stacking of univariate classifiers. In: Okun O, Valentini G (eds) Supervised and unsupervised ensemble methods and their applications, Springer, New York, pp 135–152

  5. Bahlmann C, Haasdonk B, Burkhardt H (2002) On-line handwriting recognition with support vector machines: a kernel approach. In: Proceedings of the 8th International workshop on frontiers in handwriting recognition (IWFHR), pp 49–54

  6. Bankó Z, Abonyi J (2012) Correlation based dynamic time warping of multivariate time series. Expert Syst Appl 39(17):12814–12823

  7. Baum LE, Petrie T (1966) Statistical inference for probabilistic functions of finite state Markov chains. Ann Math Stat 37:1554–1563

  8. Box G, Jenkins G (1976) Time series analysis forecasting and control. Prentice-Hall, USA

  9. Box G, Jenkins G, Reinsel G (1994) Time series analysis, forecasting and control, 3rd edn. Prentice Hall, Englewood Cliffs

  10. Bratko I, Mozetič I, Lavrač N (1989) KARDIO: a study in deep and qualitative knowledge for expert systems. MIT Press, Cambridge

  11. Bregón A, Simón A, Rodríguez JJ, Alonso CJ, Pulido B, Moro I (2006) Early fault classification in dynamic systems using case-based reasoning. In: Marín R, Onaindía E, Bugarín A, Santos J (eds) Current topics in artificial intelligence. 11th conference of the Spanish association for artificial intelligence, revised selected papers. Lecture notes in artificial intelligence, vol 4177. Springer, New York, pp 211–220

  12. Campbell JP (1997) Speaker recognition: a tutorial. Proc IEEE 85(9):1437–1462

  13. Chakrabarti C, Rammohan R, Luger GF (2005) A first-order stochastic prognostic system for the diagnosis of helicopter rotor systems for the US Navy. In: Prasad B (ed) IICAI, pp 3645–3656

  14. Chaovalitwongse W, Pardalos P (2008) On the time series support vector machine using dynamic time warping kernel for brain activity classification. Cybern Syst Anal 44(1):125–138. doi:10.1007/s10559-008-0012-y

  15. Chen L, Kamel MS (2007) A new design of multiple classifier system and its application to the classification of time series data. In: Proceedings of the ISIC. IEEE international conference on systems, man and cybernetics, pp 385–391

  16. Chen L, Kamel M, Jiang J (2004) A modular system for the classification of time series data. In: Roli F, Kittler J, Windeatt T (eds) Multiple classifier systems, pp 134–143

  17. Clancey WJ (1985) Heuristic classification. Artif Intell 27:289–350

  18. Cohen W (1995) Learning to classify English text with ILP methods. In: Raedt LD (ed) Proceedings of the 5th international workshop on inductive logic programming, pp 3–24

  19. Console L, Picardi C, Dupre DT (2003) Temporal decision trees: model-based diagnosis of dynamic systems on-board. J Artif Intell Res 19:469–512

  20. Cuturi M, Vert JP, Birkenes O, Matsui T (2007) A kernel for time series based on global alignments. In: Acoustics, speech and signal processing. Proceedings of the IEEE international conference on ICASSP 2007, vol 2. pp II-413–II-416. doi:10.1109/ICASSP.2007.366260

  21. Das G, Lin KI, Mannila H, Renganathan G, Smyth P (1998) Rule discovery from time series. In: Proceedings of the 4th international conference of knowledge discovery and data mining. AAAI Press, California, pp 16–22

  22. Dash M, Liu H (1997) Feature selection for classification. Int J Intell Data Anal 1(4):131–156

  23. Dean T, Kanazawa K (1989) A model for reasoning about persistence and causation. Comput Intell 5(3):142–150

  24. Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30

  25. Dietrich C, Schwenker F, Palm G (2001) Classification of time series utilizing temporal and decision fusion. In: MCS2001. Springer, New York, pp 378–387

  26. Dietterich TG (1997) Machine-learning research. AI Mag 18(4):97–136

  27. Dong M, He D (2007) Hidden semi-Markov model-based methodology for multi-sensor equipment health diagnosis and prognosis. Eur J Oper Res 178:858–878

  28. Dousson C, Duong TV (1999) Discovering chronicles with numerical time constraints from alarm logs for monitoring dynamic systems. In: Thomas D (ed) Proceedings of the 16th international joint conference on artificial intelligence (IJCAI-99), vol 1. pp 620–626

  29. Esmael B, Arnaout A, Fruhwirth RK, Thonhauser G (2012) Multivariate time series classification by combining trend-based and value-based approximations. In: Computational science and Its applications—ICCSA 2012. Lecture notes in computer science, vol 7336. pp 392–403. doi:10.1007/978-3-642-31128-4_29

  30. Frank A, Asuncion A (2010) UCI machine learning repository. http://archive.ics.uci.edu/ml

  31. Fu Tc (2011) A review on time series data mining. Eng Appl Artif Intell 24(1):164–181. doi:10.1016/j.engappai.2010.09.007

  32. Galar M, Fernandez A, Barrenechea E, Bustince H, Herrera F (2012) A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. Syst Man Cybern Part C Appl Rev IEEE Trans 42(4):463–484. doi:10.1109/TSMCC.2011.2161285

  33. García S, Herrera F (2008) An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons. J Mach Learn Res 9:2677–2694. http://www.jmlr.org/papers/volume9/garcia08a/garcia08a.pdf

  34. García S, Fernández A, Luengo J, Herrera F (2010) Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power. Inf Sci 180(10):2044–2064. doi:10.1016/j.ins.2009.12.010

  35. van Gerven MA, Taal BG, Lucas PJ (2008) Dynamic Bayesian networks as prognostic models for clinical patient management. J Biomed Inform 41(4):515–529

  36. Geurts P, Wehenkel L (2005) Segment and combine approach for non-parametric time-series classification. Knowl Discov Databases PKDD 478–485. doi:10.1007/11564126_48

  37. Ghalwash MF, Obradovic Z (2012) Early classification of multivariate temporal observations by extraction of interpretable shapelets. BMC Bioinform 13:195

  38. Ghosh J, Deuser L, Beck S (1992) A neural network based hybrid system for detection, characterization and classification of short-duration oceanic signals. IEEE J Ocean Eng 17(4):351–363

  39. Hansen JV, Nelson RD (2002) Data mining of time series using stacked generalizers. Neurocomputing 43:173–184

  40. Huang YS, Suen C (1995) A method of combining multiple experts for the recognition of unconstrained handwritten numerals. IEEE Trans Pattern Anal Mach Intell 17(1):90–94

  41. Jelinek F (1997) Statistical methods for speech recognition. MIT Press, Cambridge

  42. Kadous MW (2002) Temporal classification: extending the classification paradigm to multivariate time series. PhD thesis, The University of New South Wales, School of Computer Science and Engineering. http://sites.google.com/site/waleedkadous/publications

  43. Kadous MW, Sammut C (2005) Classification of multivariate time series and structured data using constructive induction. Mach Learn 58(2–3):179–216

  44. Keogh E, Pazzani M (2001) Derivative dynamic time warping. In: Proceedings of the first SIAM international conference on data mining. http://www.cs.ucr.edu/eamonn/sdm01.pdf

  45. Keogh E, Ratanamahatana CA (2005) Exact indexing of dynamic time warping. Knowl Inf Syst 7(3):358–386

  46. Kudo M, Toyama J, Shimbo M (1999) Multidimensional curve classification using passing-through regions. Pattern Recognit Lett 20(11–13):1103–1111

  47. Kuncheva L (2004) Combining pattern classifiers: methods and algorithms. Wiley-Interscience, New York

  48. Kuncheva LI (2001) Combining classifiers: soft computing solutions. In: Pal SK (eds) Pattern recognition: from classical to modern approaches, World Scientific, Singapore, pp 427–452

  49. Kuncheva LI (2005) Diversity in multiple classifier systems. Inf Fusion 6(1):3–4

  50. Lin HT, Li L (2008) Support vector machinery for infinite ensemble learning. J Mach Learn Res 9:285–312. http://www.jmlr.org/papers/volume9/lin08a/lin08a.pdf

  51. Lin J, Keogh E, Lonardi S, Chiu B (2003) A symbolic representation of time series, with implications for streaming algorithms. In: DMKD ’03 Proceedings of the 8th ACM SIGMOD workshop on research issues in data mining and knowledge discovery, ACM Press, New York, pp 2–11. http://portal.acm.org/citation.cfm?id=882086

  52. Lin J, Keogh E, Wei L, Lonardi S (2007) Experiencing SAX: a novel symbolic representation of time series. Data Min Knowl Discov 15:107–144

  53. Liu H, Yu L (2005) Toward integrating feature selection algorithms for classification and clustering. IEEE Trans Knowl Data Eng 17(4):491–502

  54. Mikut R, Burmeister O, Groll L, Reischl M (2008) Takagi–Sugeno–Kang Fuzzy classifiers for a special class of time-varying systems. Fuzzy Syst IEEE Trans 16(4):1038–1049. doi:10.1109/TFUZZ.2008.917291

  55. Minnen D, Zang P, Isbell C, Starner T (2007) Boosting diverse learners for domain agnostic time series classification. In: Proceedings of the workshop and challenge on time series classification at SIGKDD

  56. Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281

  57. Nakamura T, Taki K, Nomiya H, Seki K, Uehara K (2012) A shape-based similarity measure for time series data with ensemble learning. Pattern Anal Appl (in press). doi:10.1007/s10044-011-0262-6

  58. Olszewski RT (2001) Generalized feature extraction for structural pattern recognition in time-series data. PhD thesis, Computer Science Department, Carnegie Mellon University. http://reports-archive.adm.cs.cmu.edu/anon/2001/abstracts/01-108.html

  59. Orsenigo C, Vercellis C (2010) Combining discrete SVM and fixed cardinality warping distances for multivariate time series classification. Pattern Recognit 43(11):3787–3794

  60. Papadimitriou S, Sun J, Faloutsos C (2005) Streaming pattern discovery in multiple time-series. In: VLDB ’05: Proceedings of the 31st international conference on very large data bases, VLDB Endowment, pp 697–708

  61. Patton R, Chen J, Siew T (1994) Fault diagnosis in nonlinear dynamic systems via neural networks. In: Proceedings of the IEEE international conference control’94, vol 2. pp 1346–1351

  62. Petitjean B, Barut S, Rolet S, Simonet D (2006) Damage detection on aerospace structures. In: Gemes A (ed) Proceedings of the third European workshop on structural health monitoring, pp 159–166

  63. Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. In: Proceedings of the IEEE, vol 77(2)

  64. Rodríguez JJ, Alonso CJ (2004) Support vector machines of interval-based features for time series classification. In: Bramer M, Coenen F, Allen T (eds) Research and development in intelligent systems XXI, Springer, New York, pp 244–257

  65. Rodríguez JJ, Alonso CJ, Boström H (2001) Boosting interval based literals. Intell Data Anal 5(3):245–262

  66. Rodríguez JJ, Alonso CJ, Maestro JA (2005) Support vector machines of interval-based features for time series classification. Knowl-Based Syst 18(4–5):171–178. doi:10.1016/j.knosys.2004.10.007

  67. Ron D, Singer Y, Tishby N (1998) On the learnability and usage of acyclic probabilistic finite automata. J Comput Syst Sci 56:133–152

  68. Rooney N, Patterson DW, Nugent CD (2007) Non-strict heterogeneous stacking. Pattern Recognit Lett 28(9):1050–1061

  69. Roverso D (2000) Multivariate temporal classification by windowed wavelet decomposition and recurrent neural networks. In: Paper presented at the 3rd ANS international topical meeting on nuclear plant instrumentation, control and human-machine interface

  70. Roychoudhury I, Biswas G, Koutsoukos X (2008) Comprehensive diagnosis of continuous systems using dynamic Bayes nets. In: Proceedings of the 19th international workshop on principles of diagnosis

  71. Schreiber G, Akkermans H, Anjewierden A, de Hoog R, Shadbolt N, de Velde WV, Wielinga B (1999) Knowledge engineering and management, the CommonKADS methodology. The MIT Press, Cambridge

  72. Sivaramakrishnan KR, Karthik K, Bhattacharyya C (2007) Kernels for large margin time-series classification. In: Proceedings of the international joint conference on neural networks, IJCNN 2007, pp 2746–2751. doi:10.1109/IJCNN.2007.4371393

  73. Strickert M, Hammer B (2005) Merge SOM for temporal data. Neurocomputing 64:39–71. doi:10.1016/j.neucom.2004.11.014

  74. Ting KM, Witten IH (1999) Issues in stacked generalization. J Artif Intell Res (JAIR) 10:271–289

  75. Vincent RD, Pineau J, de Guzman P, Avoli M (2007) Recurrent boosting for classification of natural and synthetic time-series data. In: Proceedings of the Canadian conference on AI, pp 192–203

  76. Weng X, Shen J (2008a) Classification of multivariate time series using locality preserving projections. Knowl-Based Syst 21(7):581–587

  77. Weng X, Shen J (2008b) Classification of multivariate time series using two-dimensional singular value decomposition. Knowl-Based Syst 21(7):535–539

  78. Wolpert DH (1992) Stacked generalization. Neural Netw 5(2):241–259

  79. Yoon H, Yang K, Shahabi C (2005) Feature subset selection and feature ranking for multivariate time series. Trans Knowl Data Eng 17(9):1186–1198. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1490526

  80. Zang P, Isbell C (2007) Managing domain knowledge and multiple models with boosting. In: Proceedings of international joint conference on artificial intelligence, pp 1144–1149

  81. Zhang X, Xiao X, Tao L, Xu G (2007) Real-time situation detection based on Rao-Blackwellized particle filters in meetings. In: Proceedings of the 2007 IEEE international conference on robotics and biomimetics

  82. Zhao JH, Dong ZY, Xu Z (2006) Effective feature preprocessing for time series forecasting. In: ADMA, pp 769–781

  83. Zhao ZYJ, Sun J, Ge SS (2007) High performance quadratic classifier and the application on pendigits recognition. In: Proceedings of the 46th IEEE conference on decision and control, pp 3072–3077. doi:10.1109/CDC.2007.4434191

Acknowledgements

We express our gratitude to the donors of the different data sets.

Author information

Corresponding author

Correspondence to Juan J. Rodríguez.

About this article

Cite this article

Prieto, O.J., Alonso-González, C.J. & Rodríguez, J.J. Stacking for multivariate time series classification. Pattern Anal Applic 18, 297–312 (2015). https://doi.org/10.1007/s10044-013-0351-9

  • DOI: https://doi.org/10.1007/s10044-013-0351-9
