Data Mining and Knowledge Discovery, Volume 30, Issue 5, pp 1053–1085

Generalized random shapelet forests

  • Isak Karlsson
  • Panagiotis Papapetrou
  • Henrik Boström

Abstract

Shapelets are discriminative subsequences of time series, usually embedded in shapelet-based decision trees. Enumerating time series shapelets is, however, computationally costly. Together with the inherent difficulty that decision tree learning algorithms have in handling high-dimensional data effectively, this severely limits the applicability of shapelet-based decision tree learning to large (multivariate) time series databases. This paper introduces a novel tree-based ensemble method for univariate and multivariate time series classification using shapelets, called the generalized random shapelet forest algorithm. The algorithm generates a set of shapelet-based decision trees, where both the choice of instances used for building a tree and the choice of shapelets are randomized. For univariate time series, an extensive empirical investigation demonstrates that the proposed algorithm yields predictive performance comparable to the current state of the art and significantly outperforms several alternative algorithms, while being at least an order of magnitude faster. Similarly, for multivariate time series, the algorithm is shown to be significantly less computationally costly and more accurate than the current state of the art.
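
The two sources of randomization named in the abstract, together with the distance that a shapelet-based split typically relies on, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names, the `min_len` parameter, the use of bootstrap sampling for choosing instances, and the unnormalized Euclidean distance are all assumptions made for the example.

```python
import math
import random


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    equal-length window of the series (assumed distance measure).
    Returns math.inf if the series is shorter than the shapelet."""
    m = len(shapelet)
    best = math.inf
    for start in range(len(series) - m + 1):
        d = math.sqrt(
            sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        )
        best = min(best, d)
    return best


def sample_shapelet(dataset, rng, min_len=2):
    """Draw a random contiguous subsequence from a randomly chosen series.
    Assumes every series has at least min_len points (hypothetical helper)."""
    series = rng.choice(dataset)
    length = rng.randint(min_len, len(series))
    start = rng.randint(0, len(series) - length)
    return series[start:start + length]


def bootstrap_indices(n, rng):
    """Sample n instance indices with replacement, one bag per tree.
    Bootstrap sampling is an assumption; the abstract only says the
    choice of instances is randomized."""
    return [rng.randrange(n) for _ in range(n)]
```

In a sketch like this, each tree would be grown on the instances selected by `bootstrap_indices`, and each candidate split would compare `shapelet_distance` values for a shapelet drawn by `sample_shapelet` against a threshold. Sampling candidates at random avoids enumerating every possible subsequence, which is the cost the abstract identifies.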

Keywords

  • Multivariate time series
  • Time series classification
  • Time series shapelets
  • Decision trees
  • Ensemble methods

Copyright information

© The Author(s) 2016

Authors and Affiliations

  1. Stockholm University, Stockholm, Sweden
