Self-labeling techniques for semi-supervised time series classification: an empirical study

Abstract

The growing volume of unlabeled time series data makes the semi-supervised paradigm a suitable approach to classification problems in which only a small quantity of labeled data is available. Self-labeled techniques stand out among semi-supervised classification methods for their simplicity and for making no strong assumptions about the distribution of the labeled and unlabeled data. This paper assesses the relevance of these techniques in the time series classification context by means of an empirical study that compares successful self-labeled methods in conjunction with various learning schemes and dissimilarity measures. Our experiments involve 35 time series datasets with different ratios of labeled data and aim to measure the transductive and inductive classification capabilities of the self-labeled methods studied. The results show that the nearest-neighbor rule is a robust choice for the base classifier. In addition, the amending and multi-classifier self-labeled approaches prove promising for semi-supervised classification in the time series context.
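
To make the self-labeling idea concrete, the following is a minimal sketch of plain self-training with a 1-nearest-neighbor base classifier. It is not the implementation evaluated in the study. The function names (`euclidean`, `nearest`, `self_train`, `predict`), the `per_round` parameter, the use of Euclidean distance as the dissimilarity measure, and the choice of nearest-neighbor distance as the confidence score are all assumptions made for illustration. Labeling the unlabeled pool corresponds to the transductive setting; calling `predict` on unseen series corresponds to the inductive one.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series (illustrative choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(series, labeled, dist=euclidean):
    """Return (distance, label) of the labeled series closest to `series`."""
    return min(((dist(series, s), y) for s, y in labeled), key=lambda t: t[0])


def self_train(labeled, unlabeled, per_round=1, dist=euclidean):
    """Iteratively move the most confidently classified series into the labeled set.

    Confidence is assumed to be the distance to the nearest labeled series:
    the smaller the distance, the more confident the 1-NN prediction.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        # Score every remaining series with the current labeled set.
        scored = sorted(
            ((nearest(s, labeled, dist), i) for i, s in enumerate(pool)),
            key=lambda t: t[0][0],
        )
        chosen = scored[:per_round]
        # Add the most confident predictions to the labeled set.
        for (_, label), i in chosen:
            labeled.append((pool[i], label))
        taken = {i for _, i in chosen}
        pool = [s for i, s in enumerate(pool) if i not in taken]
    return labeled


def predict(series, labeled, dist=euclidean):
    """Inductive step: classify an unseen series with the enlarged labeled set."""
    return nearest(series, labeled, dist)[1]


# Toy data, invented for this example: two labeled series and four unlabeled ones.
seed = [([0, 0, 0], "flat"), ([0, 1, 2], "rising")]
pool = [[0, 0.1, 0], [0, 1, 2.1], [0, 0.6, 1.2], [0.1, 0.2, 0.1]]
enlarged = self_train(seed, pool)
```

Any other dissimilarity measure can be passed through the `dist` argument, and a stopping criterion could replace the `while pool` loop, which in this sketch simply labels every series in the pool.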

Acknowledgements

We thank the anonymous reviewers for their very useful comments and suggestions. This work was supported in part by “Proyecto de Investigación de Excelencia de la Junta de Andalucía, P12-TIC-2958,” “Proyecto de Investigación del Ministerio de Economía y Competitividad, TIN2013-47210-P” and TIN-2016-81113-R. This work was partly performed while M. González held a travel grant from the Asociación Iberoamericana de Postgrado (AUIP), supported by Junta de Andalucía, to undertake a research stay at the University of Granada.

Author information

Corresponding author

Correspondence to Christoph Bergmeir.

About this article

Cite this article

González, M., Bergmeir, C., Triguero, I. et al. Self-labeling techniques for semi-supervised time series classification: an empirical study. Knowl Inf Syst 55, 493–528 (2018). https://doi.org/10.1007/s10115-017-1090-9

Keywords

  • Semi-supervised classification
  • Self-labeled
  • Time series classification
  • Semi-supervised learning
  • Self-training