Stacking for multivariate time series classification

Prieto, Oscar J.; Alonso-González, Carlos J.; Rodríguez, Juan J.

doi:10.1007/s10044-013-0351-9

Theoretical Advances
Published: 05 September 2013

Pattern Analysis and Applications, volume 18, pages 297–312 (2015)

Abstract

This work presents a novel approach to multivariate time series classification. The method exploits the multivariate structure of the time series and the possibilities of the stacking ensemble method. The basics of the method may be described in three steps: first, decomposing the multivariate time series into its constituent univariate time series; second, inducing a classifier for each univariate time series plus an additional multivariate classifier for the whole time series; third, creating the final multivariate time series classifier by stacking the previous classifiers. The ensemble obtained has the potential to improve the accuracy of the single multivariate time series classifier. Several configurations of the stacking method have been tested on seven multivariate time series data sets. In five out of seven data sets, the proposed method obtains the smallest error rate. Moreover, in two out of seven data sets, stacking only the univariate time series classifiers provides the best results. The experimental results show that when a multivariate time series method does not produce an accurate classifier, stacking it with univariate time series classifiers is an alternative worthy of consideration.
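
The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the base learners or the meta-learner, so a nearest-centroid classifier is used here as a stand-in for both, the level-1 features are assumed to be negative centroid distances, and every name (NearestCentroid, views, fit_level0, level1_features, fit_stacking, predict_stacking, toy_data) as well as the fold count k is hypothetical.

```python
import math
import random


class NearestCentroid:
    """Stand-in base learner (an assumption): one mean vector per class."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.centroids = []
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.centroids.append([sum(col) / len(rows) for col in zip(*rows)])
        return self

    def scores(self, x):
        # Negative Euclidean distance to each class centroid (higher is closer).
        return [-math.dist(x, c) for c in self.centroids]

    def predict(self, x):
        s = self.scores(x)
        return self.classes[s.index(max(s))]


def views(series):
    """Step 1: the univariate series, followed by the whole series flattened."""
    return [list(var) for var in series] + [[v for var in series for v in var]]


def fit_level0(X, y):
    """Step 2: one classifier per univariate view plus one multivariate classifier."""
    per_view = zip(*(views(s) for s in X))
    return [NearestCentroid().fit(list(rows), y) for rows in per_view]


def level1_features(models, series):
    """Concatenate the class scores of every level-0 classifier."""
    return [s for m, v in zip(models, views(series)) for s in m.scores(v)]


def fit_stacking(X, y, k=3):
    """Step 3: train the meta-classifier on out-of-fold level-0 scores.

    Assumes every class appears in each training fold.
    """
    meta_X = [None] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        models = fit_level0([X[i] for i in train], [y[i] for i in train])
        for i in range(fold, len(X), k):
            meta_X[i] = level1_features(models, X[i])
    meta = NearestCentroid().fit(meta_X, y)
    return fit_level0(X, y), meta  # level-0 models are refit on all data


def predict_stacking(model, series):
    level0, meta = model
    return meta.predict(level1_features(level0, series))


def toy_data(n=12, length=4, seed=0):
    """Synthetic two-variable series: variable 0 carries the class, variable 1 is noise."""
    rng = random.Random(seed)
    noise = lambda: rng.uniform(-0.5, 0.5)
    y = [i % 2 for i in range(n)]
    X = [[[5.0 * c + noise() for _ in range(length)],
          [noise() for _ in range(length)]] for c in y]
    return X, y
```

Calling fit_stacking on the output of toy_data and then predict_stacking on a new series returns a class label. Dropping the last entry returned by views would give the variant mentioned in the abstract that stacks only the univariate classifiers.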

Notes

Auslan, Japanese vowels and pendigits are available at the UCI repository [30]. Auslan is available at http://sites.google.com/site/waleedkadous/data-1. ECG and wafer are available at http://www.cs.cmu.edu/bobski/data/data.html.

References

Agrawal R, Lin KI, Sawhney HS, Shim K (1995) Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In: Dayal U, Gray PMD, Nishio S (eds) VLDB’95, Proceedings of 21th international conference on very large data bases, Zurich, September 1995. Morgan Kaufmann, Massachusetts, pp 490–501
Alimoglu F (1996) Combining multiple classifiers for pen-based handwritten digit recognition. Master’s thesis, Institute of Graduate Studies in Science and Engineering, Bogazici University, Istanbul
Alimoglu F, Alpaydin E (2001) Combining multiple representations for pen-based handwritten digit recognition. ELEKTRIK Turk J Electr Eng Comput Sci 9(1):1–12
Alonso C, Prieto O, Rodríguez JJ, Bregón A (2008) Multivariate time series classification via stacking of univariate classifiers. In: Okun O, Valentini G (eds) Supervised and unsupervised ensemble methods and their applications, Springer, New York, pp 135–152
Bahlmann C, Haasdonk B, Burkhardt H (2002) On-line handwriting recognition with support vector machines: a kernel approach. In: Proceedings of the 8th International workshop on frontiers in handwriting recognition (IWFHR), pp 49–54
Bankó Z, Abonyi J (2012) Correlation based dynamic time warping of multivariate time series. Expert Syst Appl 39(17):12,814–12,823
Baum LE, Petrie T (1966) Statistical inference for probabilistic functions of finite state Markov chains. Ann Math Stat 37:1554–1563
Box G, Jenkins G (1976) Time series analysis forecasting and control. Prentice-Hall, USA
Box G, Jenkins G, Reinsel G (1994) Time series analysis, forecasting and control, 3rd edn. Prentice Hall, Englewood Cliffs
Bratko I, Mozetič I, Lavrač N (1989) KARDIO: a study in deep and qualitative knowledge for expert systems. MIT Press, Cambridge
Bregón A, Simón A, Rodríguez JJ, Alonso CJ, Pulido B, Moro I (2006) Early fault classification in dynamic systems using case-based reasoning. In: Marín R, Onaindía E, Bugarín A, Santos J (eds) Current topics in artificial intelligence. 11th conference of the Spanish association for artificial intelligence, revised selected papers. Lecture notes in artificial intelligence, vol 4177. Springer, New York, pp 211–220
Campbell JP (1997) Speaker recognition: a tutorial. Proc IEEE 85(9):1437–1462
Chakrabarti C, Rammohan R, Luger GF (2005) A first-order stochastic prognostic system for the diagnosis of helicopter rotor systems for the US Navy. In: Prasad B (ed) IICAI, pp 3645–3656
Chaovalitwongse W, Pardalos P (2008) On the time series support vector machine using dynamic time warping kernel for brain activity classification. Cybern Syst Anal 44(1):125–138. doi:10.1007/s10559-008-0012-y
Chen L, Kamel MS (2007) A new design of multiple classifier system and its application to the classification of time series data. In: Proceedings of the ISIC. IEEE international conference on systems, man and cybernetics, pp 385–391
Chen L, Kamel M, Jiang J (2004) A modular system for the classification of time series data. In: Roli F, Kittler J, Windeatt T (eds) Multiple classifier systems, pp 134–143
Clancey WJ (1985) Heuristic classification. Artif Intell 27:289–350
Cohen W (1995) Learning to classify English text with ILP methods. In: Raedt LD (ed) Proceedings of the 5th international workshop on inductive logic programming, pp 3–24
Console L, Picardi C, Dupre DT (2003) Temporal decision trees: model-based diagnosis of dynamic systems on-board. J Artif Intell Res 19:469–512
Cuturi M, Vert JP, Birkenes O, Matsui T (2007) A kernel for time series based on global alignments. In: Acoustics, speech and signal processing. Proceedings of the IEEE international conference on ICASSP 2007, vol 2. pp II-413–II-416. doi:10.1109/ICASSP.2007.366260
Das G, Lin KI, Mannila H, Renganathan G, Smyth P (1998) Rule discovery from time series. In: Proceedings of the 4th international conference of knowledge discovery and data mining. AAAI Press, California, pp 16–22
Dash M, Liu H (1997) Feature selection for classification. Int J Intell Data Anal 1(4):131–156
Dean T, Kanazawa K (1989) A model for reasoning about persistence and causation. Comput Intell 5(3):142–150
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Dietrich C, Schwenker F, Palm G (2001) Classification of time series utilizing temporal and decision fusion. In: MCS 2001. Springer, New York, pp 378–387
Dietterich TG (1997) Machine-learning research. AI Mag 18(4):97–136
Dong M, He D (2007) Hidden semi-Markov model-based methodology for multi-sensor equipment health diagnosis and prognosis. Eur J Oper Res 178:858–878
Dousson C, Duong TV (1999) Discovering chronicles with numerical time constraints from alarm logs for monitoring dynamic systems. In: Thomas D (ed) Proceedings of the 16th international joint conference on artificial intelligence (IJCAI-99), vol 1. pp 620–626
Esmael B, Arnaout A, Fruhwirth RK, Thonhauser G (2012) Multivariate time series classification by combining trend-based and value-based approximations. In: Computational science and Its applications—ICCSA 2012. Lecture notes in computer science, vol 7336. pp 392–403. doi:10.1007/978-3-642-31128-4_29
Frank A, Asuncion A (2010) UCI machine learning repository. http://archive.ics.uci.edu/ml
Fu Tc (2011) A review on time series data mining. Eng Appl Artif Intell 24(1):164–181. doi:10.1016/j.engappai.2010.09.007
Galar M, Fernandez A, Barrenechea E, Bustince H, Herrera F (2012) A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. Syst Man Cybern Part C Appl Rev IEEE Trans 42(4):463–484. doi:10.1109/TSMCC.2011.2161285
García S, Herrera F (2008) An extension on "statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons. J Mach Learn Res 9:2677–2694. http://www.jmlr.org/papers/volume9/garcia08a/garcia08a.pdf
García S, Fernández A, Luengo J, Herrera F (2010) Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power. Inf Sci 180(10):2044–2064. doi:10.1016/j.ins.2009.12.010
van Gerven MA, Taal BG, Lucas PJ (2008) Dynamic Bayesian networks as prognostic models for clinical patient management. J Biomed Inform 41(4):515–529
Geurts P, Wehenkel L (2005) Segment and combine approach for non-parametric time-series classification. Knowl Discov Databases PKDD 478–485. doi:10.1007/11564126_48
Ghalwash MF, Obradovic Z (2012) Early classification of multivariate temporal observations by extraction of interpretable shapelets. BMC Bioinform 13:195
Ghosh J, Deuser L, Beck S (1992) A neural network based hybrid system for detection, characterization and classification of short-duration oceanic signals. IEEE J Ocean Eng 17(4):351–363
Hansen JV, Nelson RD (2002) Data mining of time series using stacked generalizers. Neurocomputing 43:173–184
Huang YS, Suen C (1995) A method of combining multiple experts for the recognition of unconstrained handwritten numerals. IEEE Trans Pattern Anal Mach Intell 17(1):90–94
Jelinek F (1997) Statistical methods for speech recognition. MIT Press, Cambridge
Kadous MW (2002) Temporal classification: extending the classification paradigm to multivariate time series. PhD thesis, The University of New South Wales, School of Computer Science and Engineering. http://sites.google.com/site/waleedkadous/publications
Kadous MW, Sammut C (2005) Classification of multivariate time series and structured data using constructive induction. Mach Learn 58(2–3):179–216
Keogh E, Pazzani M (2001) Derivative dynamic time warping. In: Proceedings of the first SIAM international conference on data mining. http://www.cs.ucr.edu/eamonn/sdm01.pdf
Keogh E, Ratanamahatana CA (2005) Exact indexing of dynamic time warping. Knowl Inf Syst 7(3):358–386
Kudo M, Toyama J, Shimbo M (1999) Multidimensional curve classification using passing-through regions. Pattern Recognit Lett 20(11–13):1103–1111
Kuncheva L (2004) Combining pattern classifiers: methods and algorithms. Wiley-Interscience, New York
Kuncheva LI (2001) Combining classifiers: soft computing solutions. In: Pal SK (eds) Pattern recognition: from classical to modern approaches, World Scientific, Singapore, pp 427–452
Kuncheva LI (2005) Diversity in multiple classifier systems. Inf Fusion 6(1):3–4
Lin HT, Li L (2007) Support vector machinery for infinite ensemble learning. J Mach Learn Res 9:285–312. http://www.jmlr.org/papers/volume9/lin08a/lin08a.pdf
Lin J, Keogh E, Lonardi S, Chiu B (2003) A symbolic representation of time series, with implications for streaming algorithms. In: DMKD ’03 Proceedings of the 8th ACM SIGMOD workshop on research issues in data mining and knowledge discovery, ACM Press, New York, pp 2–11. http://portal.acm.org/citation.cfm?id=882086
Lin J, Keogh E, Wei L, Lonardi S (2007) Experiencing SAX: a novel symbolic representation of time series. Data Min Knowl Discov 15:107–144
Liu H, Yu L (2005) Toward integrating feature selection algorithms for classification and clustering. IEEE Trans Knowl Data Eng 14(4):491–502
Mikut R, Burmeister O, Groll L, Reischl M (2008) Takagi–Sugeno–Kang Fuzzy classifiers for a special class of time-varying systems. Fuzzy Syst IEEE Trans 16(4):1038–1049. doi:10.1109/TFUZZ.2008.917291
Minnen D, Zang P, Isbell C, Starner T (2007) Boosting diverse learners for domain agnostic time series classification. In: Proceedings of the workshop and challenge on time series classification at SIGKDD
Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281
Nakamura T, Taki K, Nomiya H, Seki K, Uehara K (2012) A shape-based similarity measure for time series data with ensemble learning. Pattern Anal Appl (in press). doi:10.1007/s10044-011-0262-6
Olszewski RT (2001) Generalized feature extraction for structural pattern recognition in time-series data. PhD thesis, Computer Science Department, Carnegie Mellon University. http://reports-archive.adm.cs.cmu.edu/anon/2001/abstracts/01-108.html
Orsenigo C, Vercellis C (2010) Combining discrete SVM and fixed cardinality warping distances for multivariate time series classification. Pattern Recognit 43(11):3787–3794
Papadimitriou S, Sun J, Faloutsos C (2005) Streaming pattern discovery in multiple time-series. In: VLDB ’05: Proceedings of the 31st international conference on very large data bases, VLDB Endowment, pp 697–708
Patton R, Chen J, Siew T (1994) Fault diagnosis in nonlinear dynamic systems via neural networks. In: Proceedings of the IEEE international conference control’94, vol 2. pp 1346–1351
Petitjean B, Barut S, Rolet S, Simonet D (2006) Damage detection on aerospace structures. In: Gemes A (ed) Proceedings of the third European workshop on structural health monitoring, pp 159–166
Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. In: Proceedings of the IEEE, vol 77(2)
Rodríguez JJ, Alonso CJ (2004) Support vector machines of interval-based features for time series classification. In: Bramer M, Coenen F, Allen T (eds) Research and development in intelligent systems XXI, Springer, New York, pp 244–257
Rodríguez JJ, Alonso CJ, Boström H (2001) Boosting interval based literals. Intell Data Anal 5(3):245–262
Rodríguez JJ, Alonso CJ, Maestro JA (2005) Support vector machines of interval-based features for time series classification. Knowl-Based Syst 18(4–5):171–178. doi:10.1016/j.knosys.2004.10.007
Ron D, Singer Y, Tishby N (1998) On the learnability and usage of acyclic probabilistic finite automata. J Comput Syst Sci 56:133–152
Rooney N, Patterson DW, Nugent CD (2007) Non-strict heterogeneous stacking. Pattern Recognit Lett 28(9):1050–1061
Roverso D (2000) Multivariate temporal classification by windowed wavelet decomposition and recurrent neural networks. In: Paper presented at the 3rd ANS international topical meeting on nuclear plant instrumentation, control and human-machine interface
Roychoudhury I, Biswas G, Koutsoukos X (2008) Comprehensive diagnosis of continuous systems using dynamic Bayes nets. In: Proceedings of the 19th international workshop on principles of diagnosis
Schreiber G, Akkermans H, Anjewierden A, de Hoog R, Shadbolt N, de Velde WV, Wielinga B (1999) Knowledge engineering and management, the CommonKADS methodology. The MIT Press, Cambridge
Sivaramakrishnan KR, Karthik K, Bhattacharyya C (2007) Kernels for large margin time-series classification. In: Proceedings of the international joint conference on neural networks, IJCNN 2007, pp 2746–2751. doi:10.1109/IJCNN.2007.4371393
Strickert M, Hammer B (2005) Merge SOM for temporal data. Neurocomputing 64:39–71. doi:10.1016/j.neucom.2004.11.014
Ting KM, Witten IH (1999) Issues in stacked generalization. J Artif Intell Res (JAIR) 10:271–289
Vincent RD, Pineau J, de Guzman P, Avoli M (2007) Recurrent boosting for classification of natural and synthetic time-series data. In: Proceedings of the Canadian conference on AI, pp 192–203
Weng X, Shen J (2008a) Classification of multivariate time series using locality preserving projections. Knowl-Based Syst 21(7):581–587
Weng X, Shen J (2008b) Classification of multivariate time series using two-dimensional singular value decomposition. Knowl-Based Syst 21(7):535–539
Wolpert DH (1992) Stacked generalization. Neural Netw 5(2):241–259
Yoon H, Yang K, Shahabi C (2005) Feature subset selection and feature ranking for multivariate time series. Trans Knowl Data Eng 17(9):1186–1198. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1490526
Zang P, Isbell C (2007) Managing domain knowledge and multiple models with boosting. In: Proceedings of international joint conference on artificial intelligence, pp 1144–1149
Zhang X, Xiao X, Tao L, Xu G (2007) Real-time situation detection based on Rao-Blackwellized particle filters in meetings. In: Proceedings of the 2007 IEEE international conference on robotics and biomimetics
Zhao JH, Dong ZY, Xu Z (2006) Effective feature preprocessing for time series forecasting. In: ADMA, pp 769–781
Zhao ZYJ, Sun J, Ge SS (2007) High performance quadratic classifier and the application on pendigits recognition. In: Proceedings of the 46th IEEE conference on decision and control, pp 3072–3077. doi:10.1109/CDC.2007.4434191

Acknowledgements

We express our gratitude to the donors of the different data sets.

Author information

Authors and Affiliations

Escuela Politécnica Superior, Universidad Europea Miguel de Cervantes, C/Padre Julio Chevalier, n2., 47012, Valladolid, Spain
Oscar J. Prieto
Departamento de Informática, ETSI Informática, University of Valladolid, Valladolid, Spain
Carlos J. Alonso-González
Departamento de Ingeniería Civil, Escuela Politécnica Superior, University of Burgos, Burgos, Spain
Juan J. Rodríguez

Corresponding author

Correspondence to Juan J. Rodríguez.

About this article

Cite this article

Prieto, O.J., Alonso-González, C.J. & Rodríguez, J.J. Stacking for multivariate time series classification. Pattern Anal Applic 18, 297–312 (2015). https://doi.org/10.1007/s10044-013-0351-9

Received: 21 April 2012
Accepted: 20 August 2013
Published: 05 September 2013
Issue Date: May 2015
DOI: https://doi.org/10.1007/s10044-013-0351-9
