Statistics and Computing, 19:381

A semiparametric approach to hidden Markov models under longitudinal observations

  • Antonello Maruotti
  • Tobias Rydén


We propose a hidden Markov model for longitudinal count data in which sources of unobserved heterogeneity make the data overdispersed. Conditionally on the hidden states, the observed process is assumed to follow an inhomogeneous Poisson kernel, and the unobserved heterogeneity is modeled in a generalized linear model (GLM) framework by adding individual-specific random effects to the linear predictor.
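
As a rough illustration, a model of this kind can be written as follows. The notation is assumed here and is not taken from the paper: $y_{it}$ is the count for individual $i$ at time $t$, $S_{it}$ the hidden state, $\mathbf{x}_{it}$ a covariate vector, $\boldsymbol{\beta}_j$ state-specific coefficients, $b_i$ an individual-specific random effect, and $\gamma_{jk}$ the transition probabilities of the hidden chain. Whether the random effect enters as a simple intercept, as sketched, is also an assumption.

```latex
\begin{aligned}
y_{it} \mid S_{it}=j,\; b_i &\sim \mathrm{Poisson}(\mu_{itj}), \\
\log \mu_{itj} &= \mathbf{x}_{it}^{\top}\boldsymbol{\beta}_j + b_i, \\
\Pr(S_{it}=k \mid S_{i,t-1}=j) &= \gamma_{jk}.
\end{aligned}
```

Given the state and $b_i$, the variance of $y_{it}$ equals its mean; once $b_i$ and the hidden states are integrated out, the marginal variance exceeds the marginal mean, which is the overdispersion the abstract refers to.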

Because the likelihood is complex within the GLM framework, model parameters may be estimated by numerical maximization of the log-likelihood function or by simulation methods; we propose a more flexible approach based on the expectation-maximization (EM) algorithm. Parameter estimation is carried out using a non-parametric maximum likelihood (NPML) approach in a finite mixture context. Simulation results and two empirical examples are provided.
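
To make the finite mixture idea concrete, the sketch below evaluates the marginal log-likelihood of one individual's count series when the random effect is given a discrete distribution on a few mass points, as in NPML. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the random-intercept form, the time-homogeneous transition matrix `Gamma`, and the strictly positive probabilities are all hypothetical choices made here. It only computes the quantity an EM or numerical optimizer would maximize; it does not perform the estimation itself.

```python
import math
import numpy as np


def poisson_logpmf(y, log_mu):
    """Elementwise Poisson log-pmf, parameterized by the log-mean."""
    y = np.asarray(y, dtype=float)
    log_fact = np.vectorize(math.lgamma)(y + 1.0)  # log(y!)
    return y * log_mu - np.exp(log_mu) - log_fact


def hmm_loglik(y, X, beta, b, delta, Gamma):
    """Log-likelihood of one series given a fixed random intercept b.

    y: (T,) counts; X: (T, p) covariates; beta: (m, p) state-specific
    coefficients; delta: (m,) initial distribution; Gamma: (m, m)
    transition matrix. Uses the forward recursion in log space.
    """
    y = np.asarray(y)
    log_mu = X @ beta.T + b                          # (T, m) log-means
    log_emis = poisson_logpmf(y[:, None], log_mu)    # (T, m)
    log_Gamma = np.log(Gamma)
    log_alpha = np.log(delta) + log_emis[0]
    for t in range(1, len(y)):
        # Sum over the previous state i for each current state j.
        log_alpha = np.logaddexp.reduce(
            log_alpha[:, None] + log_Gamma, axis=0
        ) + log_emis[t]
    return float(np.logaddexp.reduce(log_alpha))


def mixed_hmm_loglik(y, X, beta, mass_points, masses, delta, Gamma):
    """Marginal log-likelihood: mix the HMM likelihood over mass points."""
    per_point = np.array(
        [hmm_loglik(y, X, beta, b, delta, Gamma) for b in mass_points]
    )
    return float(np.logaddexp.reduce(np.log(np.asarray(masses)) + per_point))
```

Summing `mixed_hmm_loglik` over individuals would give the full-sample log-likelihood under these assumptions. Working in log space avoids numerical underflow in long series.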


Hidden Markov models; Longitudinal data; Mixed hidden Markov models; Random effects; NPML


Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. Dipartimento di Istituzioni Pubbliche Economia e Società, Università di Roma Tre, Rome, Italy
  2. Centre for Mathematical Sciences, Lund University, Lund, Sweden
