The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure

Abstract

We present a new method for time series clustering, which we call the Hierarchical Spectral Merger (HSM) method. The procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms. The similarity between a pair of time series is measured by the total variation distance between their estimated spectral densities. Each time two clusters merge, a new spectral density is estimated using all the information in both clusters, so that it is representative of every series in the new cluster. The method is implemented in the R package HSMClust. We present two applications of the HSM method: one to wave-height measurements from oceanography and the other to electroencephalogram (EEG) data.
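The merging loop described in the abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation and not the HSMClust API: the function names (`normalized_spectrum`, `tv_distance`, `hsm_sketch`), the moving-average smoother, its `bandwidth` parameter, and the rule of averaging the members' periodograms to re-estimate a merged cluster's spectrum are all hypothetical choices made for this example. Series are assumed to have equal length.

```python
import numpy as np


def normalized_spectrum(series_list, bandwidth=5):
    """Pooled, smoothed, normalized periodogram for equal-length series."""
    # Assumed pooling rule: average the raw periodograms of all members.
    pgrams = [np.abs(np.fft.rfft(x - np.mean(x))) ** 2 for x in series_list]
    pooled = np.mean(pgrams, axis=0)
    # Moving-average smoothing; `bandwidth` is a hypothetical tuning knob.
    kernel = np.ones(bandwidth) / bandwidth
    smooth = np.convolve(pooled, kernel, mode="same")
    # Normalize so the estimate sums to 1, like a probability mass function.
    return smooth / smooth.sum()


def tv_distance(f, g):
    """Total variation distance between two discretized normalized spectra."""
    return 0.5 * np.abs(f - g).sum()


def hsm_sketch(series, n_clusters=1, bandwidth=5):
    """Agglomerative merging driven by TV distance between cluster spectra."""
    clusters = [[i] for i in range(len(series))]
    spectra = [normalized_spectrum([x], bandwidth) for x in series]
    merges = []  # (members of first cluster, members of second, distance)
    while len(clusters) > n_clusters:
        # Find the pair of clusters whose spectra are closest in TV distance.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = tv_distance(spectra[a], spectra[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = clusters[a] + clusters[b]
        merges.append((clusters[a], clusters[b], d))
        # Re-estimate one spectrum from every series in the merged cluster.
        new_spec = normalized_spectrum([series[i] for i in merged], bandwidth)
        for idx in (b, a):  # delete the larger index first
            del clusters[idx]
            del spectra[idx]
        clusters.append(merged)
        spectra.append(new_spec)
    return clusters, merges


# Toy data: three noisy sinusoids at frequency 0.05 and three at 0.2.
rng = np.random.default_rng(0)
t = np.arange(512)
series = [
    np.sin(2 * np.pi * freq * t + rng.uniform(0, 2 * np.pi))
    + 0.5 * rng.standard_normal(t.size)
    for freq in (0.05, 0.05, 0.05, 0.2, 0.2, 0.2)
]
clusters, merges = hsm_sketch(series, n_clusters=2)
print(clusters)
```

Because each spectrum is normalized, the distance compares the shape of the oscillatory content rather than its overall power, and it always lies between 0 and 1. The recorded merge distances in `merges` could be inspected to help choose the number of clusters.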

Author information

Corresponding author

Correspondence to Carolina Euán.

Additional information

The authors would like to thank the reviewers for their comments which led to improvements in this work.

About this article

Cite this article

Euán, C., Ombao, H. & Ortega, J. The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure. J Classif 35, 71–99 (2018). https://doi.org/10.1007/s00357-018-9250-5

  • DOI: https://doi.org/10.1007/s00357-018-9250-5

Keywords

  • Hierarchical spectral merger clustering
  • Time series clustering
  • Hierarchical clustering
  • Total variation distance
  • Time series
  • Spectral analysis