Abstract
Traditional Markov chain Monte Carlo (MCMC) sampling of hidden Markov models (HMMs) samples the latent states underlying an imperfect observation process, generating posterior samples for top-level parameters concurrently with nuisance latent variables. When potentially many HMMs are embedded within a hierarchical model, this can result in prohibitively long MCMC runtimes. We study combinations of existing methods, which are shown to vastly improve computational efficiency for these hierarchical models while maintaining the modeling flexibility provided by embedded HMMs. The methods include discrete filtering of the HMM likelihood to remove latent states, reduced data representations, and a novel procedure for dynamic block sampling of posterior dimensions. The first two methods have been used in isolation in existing application-specific software, but are not generally available for incorporation in arbitrary model structures. Using the NIMBLE package for R, we develop and test combined computational approaches using three examples from ecological capture–recapture, although our methods are generally applicable to any embedded discrete HMMs. These combinations provide several orders of magnitude improvement in MCMC sampling efficiency, defined as the rate of generating effectively independent posterior samples. In addition to being computationally significant for this class of hierarchical models, this result underscores the potential for vast improvements in MCMC sampling efficiency that can result from combinations of known algorithms.
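As a rough illustration of two of the ideas named above, the following Python sketch computes an HMM likelihood by discrete filtering (the forward algorithm), so that no latent states need to be sampled, and then evaluates a data set through a reduced representation that stores each distinct observation history once with its count. It is not the NIMBLE implementation developed in the paper. The two-state alive/dead model, the function name `hmm_log_likelihood`, and the values `phi = 0.8` and `p = 0.5` are hypothetical choices made only for this example.

```python
import math
from collections import Counter


def hmm_log_likelihood(obs, init, trans, emit):
    """Log-likelihood of one observation history with latent states summed out.

    obs   : sequence of observation indices, one per time step
    init  : init[i]     = P(state at time 1 = i)
    trans : trans[i][j] = P(next state = j | current state = i)
    emit  : emit[i][k]  = P(observation = k | state = i)
    """
    n = len(init)
    # Forward probabilities at the first time step.
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for y in obs[1:]:
        # Rescale to avoid underflow; accumulate the log of the scale factor.
        total = sum(alpha)
        if total == 0.0:
            return -math.inf
        log_lik += math.log(total)
        alpha = [a / total for a in alpha]
        # Propagate through the transition matrix, then weight by the emission.
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][y]
            for j in range(n)
        ]
    total = sum(alpha)
    return -math.inf if total == 0.0 else log_lik + math.log(total)


# Hypothetical two-state model: state 0 = alive, state 1 = dead;
# observation 0 = not detected, observation 1 = detected.
phi, p = 0.8, 0.5                        # assumed survival and detection probabilities
init = [1.0, 0.0]                        # every individual starts alive
trans = [[phi, 1.0 - phi], [0.0, 1.0]]   # dead is an absorbing state
emit = [[1.0 - p, p], [1.0, 0.0]]        # dead individuals are never detected

# Made-up data: five individuals sharing only two distinct histories.
histories = [(1, 0, 1)] * 3 + [(1, 0, 0)] * 2

# Reduced representation: one likelihood evaluation per distinct history,
# weighted by the number of individuals that share it.
counts = Counter(histories)
reduced_log_lik = sum(
    n_ind * hmm_log_likelihood(h, init, trans, emit)
    for h, n_ind in counts.items()
)
```

In this toy data set the reduced form needs two likelihood evaluations instead of five. Inside an MCMC sampler, a function of this kind would be evaluated at each proposed value of `phi` and `p`, which is where the savings from filtering and data reduction would accumulate.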
Acknowledgments
This work was supported by the NSF under Grant DBI-1147230 and by support to DT from the Berkeley Institute for Data Science. We thank Marc Kéry, Byron Morgan, and Michael Schaub for reviewing earlier versions of the manuscript.
Additional information
Handling Editor: Bryan F. J. Manly.
Cite this article
Turek, D., de Valpine, P. & Paciorek, C.J. Efficient Markov chain Monte Carlo sampling for hierarchical hidden Markov models. Environ Ecol Stat 23, 549–564 (2016). https://doi.org/10.1007/s10651-016-0353-z
DOI: https://doi.org/10.1007/s10651-016-0353-z