Model-based clustering and segmentation of time series with changes in regime

  • Regular Article
  • Advances in Data Analysis and Classification

Abstract

Mixture model-based clustering, usually applied to multidimensional data, has become a popular approach in many data analysis problems, both for its good statistical properties and for the simplicity with which it can be implemented through the Expectation–Maximization (EM) algorithm. Motivated by a railway application, this paper introduces a novel mixture model for time series that are subject to changes in regime. The proposed approach, called ClustSeg, models each cluster by a regression model whose polynomial coefficients vary according to a discrete hidden process. In particular, logistic functions are used to model the (smooth or abrupt) transitions between regimes. The model parameters are estimated by maximum likelihood using an EM algorithm. The approach can also be regarded as a clustering method that groups time series sharing common changes in regime; in addition to a partition of the time series, it therefore provides a segmentation of them. The optimal numbers of clusters and segments are selected by means of the Bayesian Information Criterion. The ClustSeg approach is shown to be efficient on a variety of simulated time series and on real-world time series of electrical power consumption from rail switching operations.
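To make the structure described in the abstract more concrete, the following is a minimal sketch of how logistic functions of time can weight several polynomial regimes within a single cluster. It is not the authors' implementation: the function names, the parameter layout (`w` as intercept and slope per regime, `beta` as polynomial coefficients per regime), and the sign convention chosen for the information criterion are all assumptions made for illustration, and the EM estimation step is omitted entirely.

```python
import numpy as np

def logistic_weights(t, w):
    """Multinomial logistic (softmax) weights over K regimes.

    t : (m,) array of time points.
    w : (K, 2) array; row k holds a hypothetical (intercept, slope) pair.
    Returns an (m, K) array whose rows are non-negative and sum to 1.
    """
    scores = w[:, 0][None, :] + np.outer(t, w[:, 1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def polynomial_design(t, degree):
    """Design matrix with columns 1, t, t^2, ..., t^degree."""
    return np.vander(t, degree + 1, increasing=True)

def cluster_mean_curve(t, w, beta):
    """Mean curve of one cluster: regime polynomials mixed by logistic weights.

    beta : (K, p + 1) array of polynomial coefficients, one row per regime.
    """
    pi = logistic_weights(t, w)                      # (m, K)
    X = polynomial_design(t, beta.shape[1] - 1)      # (m, p + 1)
    return (pi * (X @ beta.T)).sum(axis=1)           # (m,)

def bic(log_likelihood, n_free_params, sample_size):
    """BIC in the common '-2 log L + k log n' form (lower is better).

    Assumption: the source may use a different but equivalent convention.
    """
    return -2.0 * log_likelihood + n_free_params * np.log(sample_size)

# Illustrative cluster with two constant regimes and a sharp switch at t = 0.5.
t = np.linspace(0.0, 1.0, 101)
w = np.array([[0.0, 0.0], [-50.0, 100.0]])   # larger slope gives a more abrupt transition
beta = np.array([[1.0, 0.0], [3.0, 0.0]])    # regime levels 1 and 3
mean_curve = cluster_mean_curve(t, w, beta)
```

In this sketch, reducing the slope in `w` (for example from 100 to 10) turns the abrupt switch into a smooth transition, which is the role the abstract attributes to the logistic functions. A full model would combine several such clusters in a mixture and fit all parameters by EM.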

Author information

Corresponding author

Correspondence to Allou Samé.

About this article

Cite this article

Samé, A., Chamroukhi, F., Govaert, G. et al. Model-based clustering and segmentation of time series with changes in regime. Adv Data Anal Classif 5, 301–321 (2011). https://doi.org/10.1007/s11634-011-0096-5

  • DOI: https://doi.org/10.1007/s11634-011-0096-5
