Error-space representations for multi-dimensional data streams with temporal dependence

  • Jesse Read
  • Nikolaos Tziortziotis
  • Michalis Vazirgiannis
Short Paper


In many application scenarios, data points are not only temporally dependent but also arrive as a fast-moving stream. A broad selection of efficient learning algorithms can be applied to data streams, but they typically do not take the temporal nature of the data into account. We motivate and design a method that creates an efficient representation of a data stream, in which temporal information is embedded into each instance via the error space of forecasting models. Unlike many other methods in the literature, our approach can be rapidly initialized and does not require iterations over the full data sequence, so it is suitable for a streaming scenario. This allows off-the-shelf data-stream methods to be applied, chosen according to the application domain. In this paper, we investigate classification. We compare against a large variety of methods (auto-encoders, HMMs, basis functions, clustering methodologies, and PCA) and find that our proposed methods perform very competitively and offer much promise for future work.
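As a rough illustration of the idea of embedding temporal information through forecasting errors, the sketch below augments each incoming instance with the one-step-ahead forecast error of a simple online model. The abstract does not specify the forecasting models, so the linear forecaster, its stochastic-gradient update, and the names `OnlineLinearForecaster` and `error_space_stream` are assumptions made for this example only, not the authors' implementation.

```python
from typing import Iterable, Iterator, List, Optional


class OnlineLinearForecaster:
    """Hypothetical one-step-ahead forecaster x_t ~ W x_{t-1}, updated by SGD."""

    def __init__(self, dim: int, learning_rate: float = 0.01) -> None:
        self.dim = dim
        self.learning_rate = learning_rate
        # Identity weights: the first forecast is simply the previous value.
        self.weights = [[1.0 if i == j else 0.0 for j in range(dim)]
                        for i in range(dim)]

    def predict(self, previous: List[float]) -> List[float]:
        # Matrix-vector product W @ previous.
        return [sum(w * x for w, x in zip(row, previous))
                for row in self.weights]

    def update(self, previous: List[float], error: List[float]) -> None:
        # Gradient step on the squared forecast error: W += lr * error * previous^T.
        for i in range(self.dim):
            for j in range(self.dim):
                self.weights[i][j] += self.learning_rate * error[i] * previous[j]


def error_space_stream(stream: Iterable[List[float]], dim: int,
                       learning_rate: float = 0.01) -> Iterator[List[float]]:
    """Yield each instance concatenated with its forecast error, in one pass."""
    forecaster = OnlineLinearForecaster(dim, learning_rate)
    previous: Optional[List[float]] = None
    for instance in stream:
        x = list(instance)
        if previous is None:
            # No history yet, so the error features start at zero.
            error = [0.0] * dim
        else:
            forecast = forecaster.predict(previous)
            error = [actual - predicted for actual, predicted in zip(x, forecast)]
            forecaster.update(previous, error)
        previous = x
        yield x + error


if __name__ == "__main__":
    toy_stream = [[1.0, 2.0], [1.5, 2.5], [1.5, 2.5]]
    for augmented in error_space_stream(toy_stream, dim=2):
        print(augmented)
```

Each augmented instance could then be passed to any off-the-shelf data-stream classifier. The generator touches every instance exactly once and needs no pass over the full sequence, which is the property the abstract emphasizes for streaming use.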


Data streams · Concept drift · Multi-dimensional data · Feature representations · Time series



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  1. LIX, École Polytechnique, Palaiseau, France
