Comparing Several Parametric and Nonparametric Approaches to Time Series Clustering: A Simulation Study
- First Online:
- Cite this article as:
- Díaz, S.P. & Vilar, J.A. J Classif (2010) 27: 333. doi:10.1007/s00357-010-9064-6
- 449 Downloads
One key point in cluster analysis is to determine a similarity or dissimilarity measure between data objects. When working with time series, the concept of similarity can be established in different ways. In this paper, several non-parametric statistics originally designed to test the equality of the log-spectra of two stochastic processes are proposed as dissimilarity measures between time series data. Their behavior in time series clustering is analyzed throughout a simulation study, and compared with the performance of several model-free and model-based dissimilarity measures. Up to three different classification settings were considered: (i) to distinguish between stationary and non-stationary time series, (ii) to classify different ARMA processes and (iii) to classify several non-linear time series models. As it was expected, the performance of a particular dissimilarity metric strongly depended on the type of processes subjected to clustering. Among all the measures studied, the nonparametric distances showed the most robust behavior.