Forests of Randomized Shapelet Trees

  • Isak Karlsson (email author)
  • Panagotis Papapetrou
  • Henrik Boström
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9047)

Abstract

Shapelets have recently been proposed for data series classification because they capture phase-independent, local information. Decision trees based on shapelets have been shown to provide not only interpretable models but also, in many cases, state-of-the-art predictive performance. Shapelet discovery is, however, computationally costly, and although several techniques for speeding it up have been proposed, the cost remains prohibitive in many cases. This work proposes an ensemble-based method, the Random Shapelet Forest (RSF), which builds on the success of the random forest algorithm and is shown to have lower computational complexity than the original shapelet tree learning algorithm. An extensive empirical investigation shows that the algorithm provides competitive predictive performance and that a proposed way of calculating importance scores can be used to successfully identify influential regions.
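
As a rough illustration of the two ingredients the abstract combines, the sketch below shows a common definition of the distance between a shapelet and a data series (the minimum Euclidean distance over all equally long subsequences) and a random shapelet sampler of the kind a randomized tree could use instead of exhaustive discovery. It is an assumption-laden sketch, not the authors' implementation: the function names, the length bounds `min_len` and `max_len`, and the omission of any normalization are all hypothetical choices made for illustration.

```python
import math
import random


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between the shapelet and any equally
    long contiguous subsequence of the series (no normalization; this is
    an illustrative assumption, not taken from the paper)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    # Slide the shapelet over every start position in the series.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, shapelet)))
        best = min(best, dist)
    return best


def sample_random_shapelet(dataset, min_len, max_len, rng):
    """Draw a random series, then a random contiguous subsequence of it.
    The names and length bounds are hypothetical."""
    series = rng.choice(dataset)
    length = rng.randint(min_len, min(max_len, len(series)))
    start = rng.randint(0, len(series) - length)
    return series[start:start + length]
```

Because the distance takes the minimum over all alignments, a pattern is matched wherever it occurs in the series, which is one way to read the "phase-independent and local" property mentioned above. A randomized tree could evaluate only a handful of sampled candidates per node rather than every subsequence, which is where a saving in computation would come from under these assumptions.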

Keywords

Data series classification, Shapelets, Decision trees, Ensemble

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Isak Karlsson (1), email author
  • Panagotis Papapetrou (1)
  • Henrik Boström (1)
  1. Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden