Methodology and Computing in Applied Probability

Volume 18, Issue 3, pp 597–627

On the Accuracy of the MAP Inference in HMMs

  • Kristi Kuljus
  • Jüri Lember


In a hidden Markov model, the underlying Markov chain is usually unobserved. Often, the state path with maximum posterior probability (the Viterbi path) is used as its estimate. Although it has the highest posterior probability, the Viterbi path can behave very atypically by passing through states of low marginal posterior probability. To avoid such situations, the Viterbi path can be modified to bypass such states. In this article, an iterative procedure for improving the Viterbi path in this way is proposed and studied. The iterative approach is compared with a simple batch approach, where all states with low probability are replaced at the same time. It is shown that the iterative way of adjusting the Viterbi state path is more efficient and has several advantages over the batch approach. The same iterative algorithm for improving the Viterbi path can be used when it is possible to reveal some hidden states, so that estimating the unobserved state sequence becomes an active learning task. Both the batch approach and the iterative approach are based on the classification probabilities of the Viterbi path. Classification probabilities play an important role in determining a suitable value for the threshold parameter used in both algorithms. Therefore, the properties of classification probabilities under different conditions on the model parameters are studied.
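The adjustment idea described in the abstract can be sketched in code. The following is a minimal illustration, not the paper's exact algorithm: it computes the Viterbi path and the marginal (smoothing) posterior probabilities by the forward-backward recursions, then iteratively replaces the path state with the lowest classification probability, when it falls below a threshold rho, by the state maximizing the marginal posterior at that position. Unlike the procedure studied in the article, this simplified replacement rule does not re-optimize neighbouring states or enforce admissibility of the resulting path; the model parameters below are hypothetical and serve only as a toy example.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most probable state path a posteriori (computed in the log domain)."""
    n, K = len(obs), len(pi)
    logd = np.log(pi) + np.log(B[:, obs[0]])
    back = np.zeros((n, K), dtype=int)
    for t in range(1, n):
        scores = logd[:, None] + np.log(A)   # scores[i, j]: from state i to state j
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = np.empty(n, dtype=int)
    path[-1] = logd.argmax()
    for t in range(n - 1, 0, -1):            # backtrack
        path[t - 1] = back[t, path[t]]
    return path

def smoothing_probs(pi, A, B, obs):
    """Marginal posteriors P(X_t = i | Y_1, ..., Y_n) via scaled forward-backward."""
    n, K = len(obs), len(pi)
    alpha = np.zeros((n, K)); beta = np.ones((n, K)); c = np.zeros(n)
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, n):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    for t in range(n - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

def adjust_path(path, gamma, rho):
    """Iteratively replace the path state of lowest classification probability
    (if below the threshold rho) by the state maximizing the marginal
    posterior at that position. Simplified illustration of the idea only."""
    path = path.copy()
    while True:
        probs = gamma[np.arange(len(path)), path]
        t = probs.argmin()
        if probs[t] >= rho or path[t] == gamma[t].argmax():
            return path
        path[t] = gamma[t].argmax()

# A toy two-state model (hypothetical parameters, for illustration only)
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])   # transition matrix
B = np.array([[0.8, 0.2], [0.2, 0.8]])   # emission matrix
obs = np.array([0, 0, 1, 0, 1, 1])

path = viterbi(pi, A, B, obs)
gamma = smoothing_probs(pi, A, B, obs)
adjusted = adjust_path(path, gamma, rho=0.5)
```

Because each position is replaced at most once (a replaced state already maximizes its marginal), the loop terminates after at most n iterations, and the lowest classification probability along the adjusted path is never smaller than along the original Viterbi path.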


Keywords

Hidden Markov model · Viterbi state path · Segmentation · Active learning · Classification probability

Mathematics Subject Classification (2010)

60J10 · 60J22 · 62M05





Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Mathematics and Mathematical Statistics, Umeå University, Umeå, Sweden
  2. Institute of Mathematical Statistics, University of Tartu, Tartu, Estonia
