Abstract
In this paper, we propose a Bayesian framework for estimating the parameters of a mixture of autoregressive models for time series clustering. The proposed approach is based on variational principles and provides a tractable approximation to the true posterior density that minimizes the Kullback–Leibler (KL) divergence with respect to the prior distribution. This method simultaneously addresses the model complexity and parameter estimation problems. The proposed approach is applied to both simulated and real-world time series datasets. It is found to be useful in exploring and finding the true number of underlying clusters, starting from an arbitrarily large number of clusters.
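As a brief illustration of the variational principle the abstract refers to, the standard textbook identities below show how the log evidence relates to a tractable lower bound and to KL terms. This is a sketch under assumed notation, not the paper's own derivation: X (the observed time series), Θ (all latent variables and model parameters), q (the approximating density), K (the number of mixture components), and the AR order p are symbols introduced here for illustration only.

```latex
% Assumed notation: X = observed series, \Theta = latent variables and parameters,
% q(\Theta) = tractable approximating density.
\begin{align}
\ln p(X) &= \mathcal{L}(q) + \mathrm{KL}\bigl(q(\Theta)\,\|\,p(\Theta \mid X)\bigr), \\
\mathcal{L}(q) &= \int q(\Theta)\,\ln\frac{p(X,\Theta)}{q(\Theta)}\,d\Theta
  = \mathbb{E}_{q}\bigl[\ln p(X \mid \Theta)\bigr]
  - \mathrm{KL}\bigl(q(\Theta)\,\|\,p(\Theta)\bigr).
\end{align}

% One common way to write a mixture of K Gaussian AR(p) models for a series
% x_n = (x_{n,1}, \dots, x_{n,T}), conditioning on its first p values (assumed form):
\begin{equation}
p(x_n) = \sum_{k=1}^{K} \pi_k \prod_{t=p+1}^{T}
  \mathcal{N}\!\Bigl(x_{n,t} \,\Big|\, \sum_{j=1}^{p} a_{k,j}\,x_{n,t-j},\; \sigma_k^{2}\Bigr).
\end{equation}
```

Because ln p(X) does not depend on q, maximizing the lower bound over q is equivalent to minimizing the KL divergence to the true posterior, and the second form of the bound shows that the KL divergence to the prior acts as a complexity penalty. In variational mixture models this penalty is the mechanism that can drive the weights of unneeded components toward zero when the fit starts from a large K.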
Venkataramana Kini, B., Chandra Sekhar, C. Bayesian mixture of AR models for time series clustering. Pattern Anal Applic 16, 179–200 (2013). https://doi.org/10.1007/s10044-011-0247-5