Finite Mixture Modeling of Gaussian Regression Time Series with Application to Dendrochronology

Journal of Classification

Abstract

Finite mixture modeling is a popular statistical technique capable of accounting for various shapes in data. One popular application of mixture models is model-based clustering. This paper considers the problem of clustering regression time series with autoregressive moving average errors. Two novel estimation procedures for this framework are developed. The first yields conditional maximum likelihood estimates, which can be used when the time series are long. Simple analytical expressions make fast parameter estimation possible. The second method incorporates the Kalman filter and yields exact maximum likelihood estimates. A procedure for assessing the variability of the obtained estimates is discussed. We also show that the Bayesian information criterion can be used successfully to choose the optimal number of mixture components and to correctly assess the time series orders. The performance of the developed methodology is evaluated in simulation studies. An application to the analysis of tree ring data is considered in detail. The results are very promising, as the proposed approach overcomes the limitations of other methods developed so far.
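
For orientation, the setting described above can be written schematically as follows. The notation is an assumption made for illustration and is not taken from the paper: $K$ is the number of mixture components, $\tau_k$ are the mixing proportions, $f_k$ is the density of a Gaussian regression time series with parameters $\boldsymbol{\vartheta}_k$, $\hat{L}$ is the maximized likelihood, $M$ is the number of free parameters, and $n$ is the sample size entering the penalty.

```latex
% Schematic finite mixture density for the i-th observed series y_i
\[
  f(\mathbf{y}_i \mid \boldsymbol{\Theta})
    = \sum_{k=1}^{K} \tau_k \, f_k(\mathbf{y}_i \mid \boldsymbol{\vartheta}_k),
  \qquad \tau_k > 0, \quad \sum_{k=1}^{K} \tau_k = 1 .
\]
% Bayesian information criterion in its standard (Schwarz) form
\[
  \mathrm{BIC} = -2 \log \hat{L} + M \log n .
\]
```

Under this sign convention, candidate models that differ in the number of components or in the time series orders would be compared by their BIC values, with the smallest value preferred.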

Author information

Corresponding author

Correspondence to Semhar Michael.

About this article

Cite this article

Michael, S., Melnykov, V. Finite Mixture Modeling of Gaussian Regression Time Series with Application to Dendrochronology. J Classif 33, 412–441 (2016). https://doi.org/10.1007/s00357-016-9216-4

  • DOI: https://doi.org/10.1007/s00357-016-9216-4
