Selecting the Number of States in Hidden Markov Models: Pragmatic Solutions Illustrated Using Animal Movement

Abstract

We discuss the notorious problem of order selection in hidden Markov models, that is, of selecting an adequate number of states, highlighting typical pitfalls and practical challenges that arise when analyzing real data. Extensive simulations are used to demonstrate why order selection is particularly challenging in practice despite the conceptual simplicity of the task. In particular, we demonstrate why well-established formal procedures for model selection, such as those based on standard information criteria, tend to favor models with undesirably large numbers of states in situations where the states are meant to be meaningful entities. We also offer a pragmatic step-by-step approach, together with comprehensive advice on how practitioners can implement order selection. Our proposed strategy is illustrated with a real-data case study on muskox movement.

Supplementary materials accompanying this paper appear online.
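
For readers unfamiliar with the information criteria mentioned in the abstract, the following is a minimal sketch of how AIC and BIC are typically computed for a set of candidate HMMs with different numbers of states. It is not the authors' procedure. The function names are hypothetical, and the parameter count assumes a stationary initial distribution and the same number of state-dependent parameters in every state. The maximized log-likelihoods are assumed to come from models fitted elsewhere.

```python
import math


def hmm_num_params(n_states, params_per_state):
    """Count free parameters of an N-state HMM (illustrative assumption:
    stationary initial distribution, so no extra initial-state parameters)."""
    # Each row of the N x N transition matrix sums to one,
    # leaving N * (N - 1) free transition probabilities.
    transition = n_states * (n_states - 1)
    # State-dependent (emission) parameters, e.g. 2 per state for a gamma.
    emission = n_states * params_per_state
    return transition + emission


def aic(log_lik, n_params):
    """Akaike information criterion: -2 log L + 2 p."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: -2 log L + p log T."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def select_order(log_liks, params_per_state, n_obs):
    """Return (scores, best_by_aic, best_by_bic).

    log_liks maps a candidate number of states to the maximized
    log-likelihood of the corresponding fitted model.
    """
    scores = {}
    for n_states, log_lik in log_liks.items():
        p = hmm_num_params(n_states, params_per_state)
        scores[n_states] = (aic(log_lik, p), bic(log_lik, p, n_obs))
    best_by_aic = min(scores, key=lambda n: scores[n][0])
    best_by_bic = min(scores, key=lambda n: scores[n][1])
    return scores, best_by_aic, best_by_bic
```

Both criteria reward any gain in log-likelihood, so extra states that merely absorb misfit in the assumed model structure can still lower the score. This is one reason the selected number of states should be checked against whether the states are interpretable.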

Author information

Corresponding author

Correspondence to Jennifer Pohle.

About this article

Cite this article

Pohle, J., Langrock, R., van Beest, F.M. et al. Selecting the Number of States in Hidden Markov Models: Pragmatic Solutions Illustrated Using Animal Movement. JABES 22, 270–293 (2017). https://doi.org/10.1007/s13253-017-0283-8

Keywords

  • Animal movement
  • Information criteria
  • Selection bias
  • Unsupervised learning