Early classification of time series

Cost-based optimization criterion and algorithms

Abstract

A growing number of applications require recognizing the class of an incoming time series as quickly as possible without unduly compromising the accuracy of the prediction. In this paper, we put forward a new optimization criterion that takes into account both the cost of misclassification and the cost of delaying the decision. Based on this criterion, we derive a family of non-myopic algorithms that try to anticipate the expected future gain in information and balance it against the cost of waiting. In one class of algorithms, unsupervised-based, the expectations rely on a clustering of the time series, while in a second class, supervised-based, time series are grouped according to the confidence level of the classifier used to label them. Extensive experiments carried out on real datasets with a large range of delay cost functions show that the presented algorithms are able to solve the earliness vs. accuracy trade-off, with the supervised partition-based approaches faring better than the unsupervised partition-based ones. In addition, all these methods perform better, in a wide variety of conditions, than a state-of-the-art method based on a myopic strategy that is recognized as being very competitive. Furthermore, our experiments show that the non-myopic feature of the proposed approaches largely explains the performance obtained.
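
The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the array layout, and the linear delay cost are assumptions made for illustration only. The sketch supposes that, for each group of time series (a cluster or a confidence-level bin), an expected misclassification cost has already been estimated for each future horizon, and that the incoming partial series has membership probabilities over those groups. The decision is triggered when waiting is not expected to lower the total cost.

```python
import numpy as np


def expected_future_costs(group_probs, misclassification_cost, delay_cost, t):
    """Expected total cost of deciding at each horizon t, t+1, ..., t+H-1.

    group_probs: shape (K,), membership probabilities of the incoming
        partial series over K groups (hypothetical input).
    misclassification_cost: shape (H, K), expected misclassification cost
        inside group k if the decision is taken h steps from now
        (hypothetical, assumed to be estimated beforehand).
    delay_cost: callable mapping a time step to the cost of deciding then.
    """
    group_probs = np.asarray(group_probs, dtype=float)
    misclassification_cost = np.asarray(misclassification_cost, dtype=float)
    horizons = np.arange(misclassification_cost.shape[0])
    # Average the per-group error cost using the membership probabilities.
    expected_error = misclassification_cost @ group_probs
    # Add the cost of delaying the decision up to each horizon.
    waiting = np.array([delay_cost(t + h) for h in horizons], dtype=float)
    return expected_error + waiting


def should_decide_now(group_probs, misclassification_cost, delay_cost, t):
    """Non-myopic trigger: decide now if no future horizon looks cheaper."""
    costs = expected_future_costs(group_probs, misclassification_cost, delay_cost, t)
    return int(np.argmin(costs)) == 0


# Illustrative numbers only: 2 groups, 3 horizons (now, +1 step, +2 steps).
probs = [0.5, 0.5]
error_cost = [[0.5, 0.4], [0.3, 0.2], [0.1, 0.1]]

cheap_delay = lambda step: 0.01 * step   # waiting is cheap
costly_delay = lambda step: 0.30 * step  # waiting is expensive

wait_case = should_decide_now(probs, error_cost, cheap_delay, t=5)
decide_case = should_decide_now(probs, error_cost, costly_delay, t=5)
```

With the cheap delay cost the expected error reduction outweighs the cost of waiting, so the trigger defers the decision; with the costly one it decides immediately. A myopic strategy would compare only the current step with the next one, whereas this sketch looks at every horizon it has an estimate for.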

Notes

  1. This restricts these methods to binary classification problems.

  2. Available at: http://www.timeseriesclassification.com.

  3. More details are available in: https://docs.google.com/spreadsheets/d/13u7L_5IX3XxFuq_SnbOZF1dXQfcBB0wR3PXhvevhPYA/.

  4. XGBoost is available in: https://xgboost.readthedocs.io.

Acknowledgements

We thank Vincent Lemaire and Fabrice Clérot (Orange Labs) for their advice and interesting discussions. We also thank Orange Labs for supporting this research.

Author information

Corresponding author

Correspondence to Antoine Cornuéjols.

Additional information

Editors: Ira Assent, Carlotta Domeniconi, Aristides Gionis, Eyke Hüllermeier.

About this article

Cite this article

Achenchabe, Y., Bondu, A., Cornuéjols, A. et al. Early classification of time series. Mach Learn 110, 1481–1504 (2021). https://doi.org/10.1007/s10994-021-05974-z

Keywords

  • Early classification of time series
  • Cost estimation
  • Sequential decision making