A growing number of applications require recognizing the class of an incoming time series as early as possible without unduly compromising the accuracy of the prediction. In this paper, we put forward a new optimization criterion that takes into account both the cost of misclassification and the cost of delaying the decision. Based on this criterion, we derive a family of non-myopic algorithms that try to anticipate the expected future gain in information and balance it against the cost of waiting. In one class of algorithms, the unsupervised-based ones, the expectations rely on a clustering of the time series, while in a second class, the supervised-based ones, time series are grouped according to the confidence level of the classifier used to label them. Extensive experiments carried out on real datasets using a wide range of delay cost functions show that the presented algorithms are able to solve the earliness vs. accuracy trade-off, with the supervised partition-based approaches faring better than the unsupervised partition-based ones. In addition, all these methods perform better, in a wide variety of conditions, than a state-of-the-art method based on a myopic strategy which is recognized as being very competitive. Furthermore, our experiments show that the non-myopic feature of the proposed approaches explains in large part the obtained performance.
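To make the non-myopic idea concrete, the following is a minimal Python sketch written under simplifying assumptions; it is not the implementation evaluated in the paper. It assumes that the incoming partial series has already been assigned membership probabilities over K groups (clusters or confidence-level bins), that an error rate of the classifier has been estimated for each group at each time step, and that a single scalar misclassification cost is used. The names `should_decide_now`, `group_probs`, `group_error_rates`, `misclassification_cost` and `delay_cost` are hypothetical and introduced only for illustration.

```python
import numpy as np


def should_decide_now(group_probs, group_error_rates, t,
                      misclassification_cost, delay_cost):
    """Illustrative non-myopic trigger rule (a sketch, not the paper's code).

    group_probs: array of shape (K,), assumed membership probabilities of the
        partial series observed up to time t for each of K groups.
    group_error_rates: array of shape (K, T), assumed estimated error rate of
        the classifier for group k when deciding at time index s.
    t: current time index (0-based).
    misclassification_cost: assumed scalar cost of a wrong prediction.
    delay_cost: callable mapping a time index to the cost of deciding then.
    """
    group_probs = np.asarray(group_probs, dtype=float)
    group_error_rates = np.asarray(group_error_rates, dtype=float)
    horizon = group_error_rates.shape[1]

    # Candidate decision times: now and every remaining future time step.
    candidate_times = np.arange(t, horizon)

    # Expected error at each candidate time, averaged over group membership.
    expected_error = group_probs @ group_error_rates[:, t:]

    # Expected total cost: misclassification part plus delay part.
    delay = np.array([delay_cost(s) for s in candidate_times], dtype=float)
    expected_cost = misclassification_cost * expected_error + delay

    # Decide now only if no future time step is expected to be cheaper.
    best_time = int(candidate_times[np.argmin(expected_cost)])
    return best_time == t, best_time, expected_cost
```

In this sketch the rule is re-evaluated each time a new measurement arrives, so the anticipated best decision time can change as the group membership estimate is updated. A steep `delay_cost` pushes the decision earlier, while a flat one lets the rule wait for the anticipated drop in error.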
This restricts these methods to binary classification problems.
Available at: http://www.timeseriesclassification.com.
More details are available at: https://docs.google.com/spreadsheets/d/13u7L_5IX3XxFuq_SnbOZF1dXQfcBB0wR3PXhvevhPYA/.
XGBoost is available at: https://xgboost.readthedocs.io.
We thank Vincent Lemaire and Fabrice Clérot (Orange Labs) for their advice and interesting discussions. We also thank Orange Labs for supporting this research.
Editors: Ira Assent, Carlotta Domeniconi, Aristides Gionis, Eyke Hüllermeier.
Achenchabe, Y., Bondu, A., Cornuéjols, A. et al. Early classification of time series. Mach Learn 110, 1481–1504 (2021). https://doi.org/10.1007/s10994-021-05974-z
- Early classification of time series
- Cost estimation
- Sequential decision making