Abstract
We compare different selection criteria for choosing the number of latent states of a multivariate latent Markov model for longitudinal data. This model is based on an underlying Markov chain that represents the evolution of a latent characteristic of a group of individuals over time. The response variables observed at different occasions are assumed to be conditionally independent given this chain. Maximum likelihood estimation of the model is carried out through an Expectation–Maximization algorithm based on forward–backward recursions, which are well known in the hidden Markov model literature for time series. The selection criteria we consider are based either on penalized versions of the maximum log-likelihood or on the posterior probabilities of belonging to each latent state, that is, the conditional probability of the latent state given the observed data. Among the latter criteria, we propose an entropy measure tailored to latent Markov models. We report the results of a Monte Carlo simulation study that compares the performance of these state selection criteria over a wide set of model specifications.
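As a rough illustration of the two families of criteria mentioned above, the following Python sketch computes AIC and BIC from a maximized log-likelihood, together with a classification entropy built from posterior state probabilities and the normalized entropy criterion of Celeux and Soromenho. It is only a sketch under stated assumptions: the function names, the input layout (one row of posterior probabilities per unit) and all numerical values are hypothetical, and the entropy shown is the standard mixture-model version, not the tailored measure proposed in the article.

```python
import math


def aic(loglik, n_par):
    """Akaike information criterion: -2 * log-likelihood + 2 * number of free parameters."""
    return -2.0 * loglik + 2.0 * n_par


def bic(loglik, n_par, n):
    """Bayesian information criterion: the penalty grows with the log of the sample size n."""
    return -2.0 * loglik + n_par * math.log(n)


def entropy(post):
    """Classification entropy of posterior probabilities.

    `post` is a list of rows (hypothetical layout), each row holding the
    posterior probabilities of the latent states for one unit.
    Zero probabilities contribute nothing, by the convention 0 * log(0) = 0.
    """
    return -sum(p * math.log(p) for row in post for p in row if p > 0.0)


def nec(loglik_k, loglik_1, post):
    """Normalized entropy criterion: entropy scaled by the log-likelihood
    gain of the k-state model over the one-state model (needs k > 1)."""
    return entropy(post) / (loglik_k - loglik_1)


# Hypothetical fits: number of states -> (maximized log-likelihood, free parameters)
fits = {1: (-1250.0, 4), 2: (-1180.0, 11), 3: (-1175.0, 20)}
n = 200  # hypothetical sample size

# Smaller values are better for both AIC and BIC
best_aic = min(fits, key=lambda k: aic(*fits[k]))
best_bic = min(fits, key=lambda k: bic(*fits[k], n))
print(best_aic, best_bic)
```

With these made-up numbers both penalized criteria select two states, because the small log-likelihood gain from the third state does not offset the nine extra parameters. Smaller NEC values likewise indicate better-separated states.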
Acknowledgments
The authors thank Professor F. Bartolucci (University of Perugia, Italy) for useful insights on the topic. S. Bacci and F. Pennoni acknowledge the financial support from the grant “Finite mixture and latent variable models for causal inference and analysis of socio-economic data” (FIRB—Futuro in ricerca) funded by the Italian Government (RBFR12SHVV).
Cite this article
Bacci, S., Pandolfi, S. & Pennoni, F. A comparison of some criteria for states selection in the latent Markov model for longitudinal data. Adv Data Anal Classif 8, 125–145 (2014). https://doi.org/10.1007/s11634-013-0154-2
DOI: https://doi.org/10.1007/s11634-013-0154-2
Keywords
- Akaike information criterion
- Bayesian information criterion
- Entropy
- Mixture model
- Multivariate latent Markov model
- Normalized entropy criterion