A semiparametric approach to hidden Markov models under longitudinal observations

Statistics and Computing


We propose a hidden Markov model for longitudinal count data in which sources of unobserved heterogeneity make the data overdispersed. Conditionally on the hidden states, the observed process is assumed to follow an inhomogeneous Poisson kernel, and the unobserved heterogeneity is modeled in a generalized linear model (GLM) framework by adding individual-specific random effects in the link function.

Due to the complexity of the likelihood within the GLM framework, model parameters may be estimated by numerical maximization of the log-likelihood function or by simulation methods; we propose a more flexible approach based on the Expectation Maximization (EM) algorithm. Parameter estimation is carried out using a non-parametric maximum likelihood (NPML) approach in a finite mixture context. Simulation results and two empirical examples are provided.
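The ingredients named above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a log link, a single state-specific effect per hidden state, a homogeneous transition matrix, and a random intercept whose distribution is approximated by a finite set of mass points, as in an NPML finite mixture. All function and variable names (`forward_likelihood`, `mixture_posterior`, `state_effects`, `mass_points`, `masses`) are hypothetical. The forward recursion is left unscaled for brevity, so it may underflow on long series.

```python
import math


def poisson_pmf(y, rate):
    # Poisson probability mass, evaluated through its logarithm.
    return math.exp(y * math.log(rate) - rate - math.lgamma(y + 1))


def forward_likelihood(counts, init, trans, state_effects, intercept):
    """Likelihood of one individual's counts for a fixed random intercept.

    Assumption: the rate in hidden state j is exp(state_effects[j] + intercept).
    """
    m = len(init)
    rates = [math.exp(state_effects[j] + intercept) for j in range(m)]
    # alpha[j] = P(y_1, ..., y_t, S_t = j | intercept)
    alpha = [init[j] * poisson_pmf(counts[0], rates[j]) for j in range(m)]
    for y in counts[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(m)) * poisson_pmf(y, rates[j])
            for j in range(m)
        ]
    return sum(alpha)


def mixture_posterior(counts, init, trans, state_effects, mass_points, masses):
    """Marginal likelihood and posterior weights over the mass points.

    The weights are the quantities an EM E-step would use for the
    mixture part of the model (one weight per mass point).
    """
    joint = [
        p * forward_likelihood(counts, init, trans, state_effects, b)
        for b, p in zip(mass_points, masses)
    ]
    total = sum(joint)
    return total, [v / total for v in joint]


# Illustrative values only; none of these come from the article.
counts = [0, 1, 7, 9, 2]
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
state_effects = [0.0, 1.5]
mass_points = [-0.5, 0.5]
masses = [0.6, 0.4]

lik, weights = mixture_posterior(
    counts, init, trans, state_effects, mass_points, masses
)
print(lik, weights)
```

In a full EM scheme these posterior weights would be combined with the usual forward-backward quantities for the hidden chain, and the M-step would update the mass points, their masses, the transition probabilities, and the regression parameters. That part is omitted here.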

Author information

Corresponding author

Correspondence to Antonello Maruotti.

About this article

Cite this article

Maruotti, A., Rydén, T. A semiparametric approach to hidden Markov models under longitudinal observations. Stat Comput 19, 381 (2009).
