Clustering time series by linear dependency
- 299 Downloads
We present a new way to find clusters in large vectors of time series by using a measure of similarity between two time series, the generalized cross correlation. This measure compares the determinant of the correlation matrix until some lag k of the bivariate vector with those of the two univariate time series. A matrix of similarities among the series based on this measure is used as input of a clustering algorithm. The procedure is automatic, can be applied to large data sets and it is useful to find groups in dynamic factor models. The cluster method is illustrated with some Monte Carlo experiments and a real data example.
KeywordsUnsupervised learning Dynamic factor models Correlation matrix Correlation coefficient
This research has been supported by Consejo Superior de Investigaciones Científicas (Grant No. ECO2015-66593-P) of MINECO/FEDER/UE. We thanks to professor Tomohiro Ando for making available their code and kindly answer some questions regarding the implementation.
- García-Martos, C., Conejo, A.J.: Price forecasting techniques in power system. In: Webster, J. (ed.) Wiley Encyclopedia of Electrical and Electronics Engineering. Wiley, New York (2013)Google Scholar
- Larsen, B., Aone, C.: Fast and effective text mining using linear time document clustering. In: Proceedings of the Conference on Knowledge Discovery and Data Mining, pp. 16–22 (1999)Google Scholar