Data Mining and Knowledge Discovery, Volume 30, Issue 2, pp 476–509

Time series representation and similarity based on local autopatterns

Article

DOI: 10.1007/s10618-015-0425-y

Cite this article as:
Baydogan, M.G. & Runger, G. Data Min Knowl Disc (2016) 30: 476. doi:10.1007/s10618-015-0425-y

Abstract

Time series data mining has received growing interest as temporal data sets have multiplied across domains such as medicine, finance, and multimedia. Representations are important for reducing dimensionality and generating useful similarity measures. High-level representations such as Fourier transforms, wavelets, and piecewise polynomial models have been considered previously. Recently, autoregressive kernels were introduced to reflect the similarity of time series. We introduce a novel approach to modeling the dependency structure in time series that generalizes the concept of autoregression to local autopatterns. Our approach generates a pattern-based representation along with a similarity measure called learned pattern similarity (LPS). The approach is built on a tree-based ensemble-learning strategy that is fast and insensitive to parameter settings. A robust similarity measure based on the learned patterns is then presented. This unsupervised approach to representing time series and measuring their similarity applies generally to a number of data mining tasks (e.g., clustering, anomaly detection, classification). Furthermore, embedded learning of the representation avoids the pre-defined features and separate extraction step that are common in some feature-based approaches. The method generalizes in a straightforward manner to multivariate time series. The effectiveness of LPS is evaluated on time series classification problems from various domains. We compare LPS to eleven well-known similarity measures. Our experimental results show that LPS provides fast and competitive results on benchmark datasets from several domains. Furthermore, LPS provides a research direction and template approach that breaks from linear dependency models and may foster other promising nonlinear approaches.
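The abstract describes the idea only at a high level, so the following is a minimal sketch of the general flavor of a tree-based, pattern-count similarity, not the authors' LPS algorithm. Everything in it is an assumption made for illustration: the fixed lag window, the rule of predicting the next value from the previous `lag` values, the random choice of one lag per split, the leaf-count representation, and the sum of absolute count differences as the dissimilarity. All function names (`lagged_rows`, `fit_trees`, `representation`, `dissimilarity`) are hypothetical.

```python
import itertools
import math
import random
from collections import Counter


def lagged_rows(series, lag):
    """Rows of (previous `lag` values, next value), as in autoregression."""
    return [(series[i:i + lag], series[i + lag]) for i in range(len(series) - lag)]


def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def grow(rows, depth, rng, leaf_ids):
    """Grow a small regression tree; each node splits on one randomly chosen lag."""
    if depth > 0 and len(rows) >= 2:
        feat = rng.randrange(len(rows[0][0]))
        values = sorted({x[feat] for x, _ in rows})
        best = None
        # Try midpoints between distinct values; keep the lowest squared error.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            cost = (sse([y for x, y in rows if x[feat] <= thr])
                    + sse([y for x, y in rows if x[feat] > thr]))
            if best is None or cost < best[0]:
                best = (cost, thr)
        if best is not None:
            thr = best[1]
            left = [r for r in rows if r[0][feat] <= thr]
            right = [r for r in rows if r[0][feat] > thr]
            return ("split", feat, thr,
                    grow(left, depth - 1, rng, leaf_ids),
                    grow(right, depth - 1, rng, leaf_ids))
    return ("leaf", next(leaf_ids))


def fit_trees(dataset, lag=3, n_trees=10, depth=3, seed=0):
    """Fit an ensemble of trees on the pooled rows of all series (unsupervised)."""
    rng = random.Random(seed)
    rows = [row for series in dataset for row in lagged_rows(series, lag)]
    return [grow(rows, depth, rng, itertools.count()) for _ in range(n_trees)]


def leaf_of(tree, x):
    """Follow the splits until a leaf is reached and return its id."""
    while tree[0] == "split":
        _, feat, thr, left, right = tree
        tree = left if x[feat] <= thr else right
    return tree[1]


def representation(series, trees, lag):
    """Count how many rows of the series fall into each (tree, leaf) pair."""
    counts = Counter()
    for t, tree in enumerate(trees):
        for x, _ in lagged_rows(series, lag):
            counts[(t, leaf_of(tree, x))] += 1
    return counts


def dissimilarity(rep_a, rep_b):
    """Sum of absolute differences between leaf counts (an assumed measure)."""
    return sum(abs(rep_a[k] - rep_b[k]) for k in set(rep_a) | set(rep_b))


# Illustrative data only: two sine waves and one ramp.
sine_a = [math.sin(0.3 * i) for i in range(60)]
sine_b = [math.sin(0.3 * i + 0.5) for i in range(60)]
ramp = [i / 60 for i in range(60)]
trees = fit_trees([sine_a, sine_b, ramp], lag=3)
reps = [representation(s, trees, 3) for s in (sine_a, sine_b, ramp)]
print(dissimilarity(reps[0], reps[1]), dissimilarity(reps[0], reps[2]))
```

Because the trees are fit on the pooled windows without class labels, the representation in this sketch is unsupervised, and the resulting dissimilarity could be passed to a nearest-neighbor classifier or a clustering routine. The published method should be consulted for how LPS actually selects segments, builds its trees, and defines the similarity.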

Keywords

Time series, Similarity, Pattern discovery, Autoregression, Regression tree

Funding information

Funder Name: Scientific and Technological Research Council of Turkey (TUBITAK)
Grant Number: 114C103

Copyright information

© The Author(s) 2015

Authors and Affiliations

  1. Department of Industrial Engineering, Boğaziçi University, Istanbul, Turkey
  2. School of Computing, Informatics and Decision Systems Engineering, Arizona State University, Tempe, USA
