Abstract
Time series analysis has attracted much attention in recent years, with thousands of methods proposed for diverse tasks such as classification, clustering, prediction, and anomaly detection. Among these tasks, classification is likely the most prominent, accounting for most of the applications and attention from the research community. However, despite the large number of methods available, a significant body of empirical evidence indicates that the 1-nearest neighbor algorithm (\(1\)-NN) in the time domain is “extremely difficult to beat”. In this paper, we evaluate the use of different data representations in time series classification. Our work is motivated by methods used in related areas such as signal processing and music retrieval, where a change of representation frequently reveals features that are not apparent in the original data representation. Our approach uses representations such as frequency, wavelets, and autocorrelation to transform the time series into alternative decision spaces. A classifier is then used to classify each test time series in the alternative domain. We investigate how features provided in different domains can help in time series classification. We also experiment with different ensembles to investigate whether the data representations are a good source of diversity for time series classification. Our extensive experimental evaluation covers combinations of representation sets and ensemble strategies, resulting in over 300 ensemble configurations.
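The general idea of classifying in alternative representation domains and combining the results can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Euclidean \(1\)-NN, the choice of power spectrum and autocorrelation as transforms (wavelets are omitted for brevity), and the unweighted majority vote are all assumptions made for illustration, and the paper evaluates many more ensemble configurations than this one.

```python
import numpy as np

def identity(x):
    """Time-domain representation: the series itself."""
    return np.asarray(x, dtype=float)

def power_spectrum(x):
    """Frequency-domain representation: squared magnitude of the real FFT."""
    return np.abs(np.fft.rfft(np.asarray(x, dtype=float))) ** 2

def autocorrelation(x):
    """Autocorrelation at lags 0..n-1, scaled so that lag 0 equals 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    return acf / acf[0] if acf[0] != 0 else acf

def one_nn(train_X, train_y, query):
    """Label of the training instance closest to the query (Euclidean distance)."""
    dists = [np.linalg.norm(t - query) for t in train_X]
    return train_y[int(np.argmin(dists))]

def ensemble_predict(train_series, train_y, query, transforms):
    """Run 1-NN once per representation and combine the votes.

    Assumption: unweighted majority vote; ties go to the label that
    sorts first, which is an arbitrary choice for this sketch.
    """
    votes = []
    for transform in transforms:
        train_X = [transform(s) for s in train_series]
        votes.append(one_nn(train_X, train_y, transform(query)))
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]

# Toy data (hypothetical): two classes of sinusoids with different frequencies.
t = np.linspace(0, 1, 64, endpoint=False)
train_series = [
    np.sin(2 * np.pi * 2 * t),
    np.sin(2 * np.pi * 2 * t + 0.5),
    np.sin(2 * np.pi * 8 * t),
    np.sin(2 * np.pi * 8 * t + 0.5),
]
train_y = ["low", "low", "high", "high"]
query = np.sin(2 * np.pi * 2 * t + 2.0)  # low frequency, shifted phase

transforms = [identity, power_spectrum, autocorrelation]
print(ensemble_predict(train_series, train_y, query, transforms))
```

The toy query differs from the training series of its class only in phase. The power spectrum discards phase entirely, which is one example of a representation exposing a feature that is harder to see in the time domain.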
This work was partially funded by grants #2012/08923-8, #2013/26151-5, and #2015/07628-0 from the São Paulo Research Foundation (FAPESP), and by CNPq grants #446330/2014-0 and #303083/2013-1.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Giusti, R., Silva, D.F., Batista, G.E.A.P.A. (2015). Time Series Classification with Representation Ensembles. In: Fromont, E., De Bie, T., van Leeuwen, M. (eds) Advances in Intelligent Data Analysis XIV. IDA 2015. Lecture Notes in Computer Science, vol 9385. Springer, Cham. https://doi.org/10.1007/978-3-319-24465-5_10
DOI: https://doi.org/10.1007/978-3-319-24465-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24464-8
Online ISBN: 978-3-319-24465-5
eBook Packages: Computer Science, Computer Science (R0)