Abstract
An open issue in the statistical literature is how to select the number of components in model-based clustering of time series data with a finite number of states (categories) that are observed several times. We specify a finite mixture of Markov chains and compare the performance of selection methods based on different information criteria across a large experimental design. The results show that the performance of the information criteria varies across the design. Overall, AIC3 outperforms more widely used information criteria such as AIC and BIC for these finite mixture models.
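As a rough illustration of how the criteria named in the abstract differ, the sketch below computes AIC, AIC3 and BIC from a maximized log-likelihood using their standard definitions (penalties of 2, 3 and log n per free parameter). The parameter count assumes a mixture of first-order Markov chains in which each component has its own initial distribution and transition matrix, and the BIC sample size is assumed to be the number of observed sequences; the chapter itself may specify these differently. All function names are hypothetical.

```python
import math


def n_free_parameters(n_components: int, n_states: int) -> int:
    """Free parameters of a mixture of first-order Markov chains.

    Assumption (not taken from the chapter): every component has its own
    initial-state distribution and its own transition matrix.
    """
    mixing = n_components - 1                              # mixing proportions sum to 1
    initial = n_components * (n_states - 1)                # initial distributions
    transition = n_components * n_states * (n_states - 1)  # each matrix row sums to 1
    return mixing + initial + transition


def information_criteria(log_likelihood: float, n_params: int, n_obs: int) -> dict:
    """Return AIC, AIC3 and BIC for one fitted model (lower is better)."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "AIC3": deviance + 3 * n_params,
        "BIC": deviance + math.log(n_obs) * n_params,
    }


def select_n_components(log_likelihoods: dict, n_states: int, n_obs: int,
                        criterion: str = "AIC3") -> int:
    """Pick the number of components that minimizes the chosen criterion.

    `log_likelihoods` maps a candidate number of components to the maximized
    log-likelihood of the corresponding fitted mixture.
    """
    scores = {
        s: information_criteria(ll, n_free_parameters(s, n_states), n_obs)[criterion]
        for s, ll in log_likelihoods.items()
    }
    return min(scores, key=scores.get)
```

Because the three criteria differ only in the penalty per parameter, they can disagree on the same set of fitted models: a lighter penalty (AIC) tends to favor more components than a heavier one (AIC3, or BIC when the number of sequences is large).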
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dias, J.G. (2007). Model Selection Criteria for Model-Based Clustering of Categorical Time Series Data: A Monte Carlo Study. In: Decker, R., Lenz, H.J. (eds) Advances in Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70981-7_3
DOI: https://doi.org/10.1007/978-3-540-70981-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70980-0
Online ISBN: 978-3-540-70981-7
eBook Packages: Mathematics and Statistics (R0)