# Hidden Markov Models with Confidence

## Abstract

We consider the problem of training a Hidden Markov Model (HMM) from fully observable data and predicting the hidden states of an observed sequence. We focus on applications that require a list of candidate sequences as a prediction. We propose a novel method based on Conformal Prediction (CP) that, for an arbitrary confidence level \(1-\varepsilon \), produces a list of candidate sequences that contains the correct sequence of hidden states with probability at least \(1-\varepsilon \). We present experimental results that confirm this guarantee holds in practice. We compare our method with the standard approach (i.e., Maximum Likelihood estimation combined with the List–Viterbi algorithm), which suffers when the assumed distribution is violated. We discuss the advantages and limitations of our method, and suggest future directions.
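To make the \(1-\varepsilon \) guarantee concrete, the following is a minimal sketch of how a generic split (inductive) conformal predictor builds a prediction set for a single label. It is not the sequence-level construction proposed in the paper, which this section does not detail. The function names, the calibration scores, and the candidate labels are all hypothetical, and the nonconformity scores are assumed to have been computed elsewhere.

```python
from typing import Dict, Hashable, List, Sequence


def conformal_p_value(calibration_scores: Sequence[float],
                      candidate_score: float) -> float:
    """p-value of a candidate: the share of scores (calibration scores plus
    the candidate's own) that are at least as nonconforming as the candidate."""
    at_least_as_strange = sum(1 for s in calibration_scores if s >= candidate_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_scores: Sequence[float],
                   candidate_scores: Dict[Hashable, float],
                   epsilon: float) -> List[Hashable]:
    """Keep every candidate whose p-value exceeds the significance level epsilon."""
    return [
        label
        for label, score in candidate_scores.items()
        if conformal_p_value(calibration_scores, score) > epsilon
    ]


# Hypothetical nonconformity scores: higher means "stranger".
calibration = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
candidates = {"A": 0.25, "B": 0.55, "C": 0.95}

print(prediction_set(calibration, candidates, epsilon=0.2))   # ['A', 'B']
print(prediction_set(calibration, candidates, epsilon=0.05))  # ['A', 'B', 'C']
```

Lowering \(\varepsilon \) raises the confidence level and can only enlarge the set. Under the exchangeability assumption of CP, the set omits the true label with probability at most \(\varepsilon \).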

## Keywords

Conformal Prediction, Hidden Markov Models, List–Viterbi algorithm

## Notes

### Acknowledgements

Giovanni Cherubin was supported by the EPSRC and the UK government as part of the Centre for Doctoral Training in Cyber Security at Royal Holloway, University of London (EP/K035584/1). This project has received funding from the European Union's Horizon 2020 Research and Innovation programme under Grant Agreement no. 671555 (ExCAPE). This work was also supported by EPSRC grant EP/K033344/1 (“Mining the Network Behaviour of Bots”); by a Thales grant (“Development of automated methods for detection of anomalous behaviour”); by the National Natural Science Foundation of China (grant No. 61128003); and by the grant “Development of New Venn Prediction Methods for Osteoporosis Risk Assessment” from the Cyprus Research Promotion Foundation.

We are grateful to Alexander Gammerman, Kenneth Paterson, and Vladimir Vovk for useful discussions. We would also like to thank the anonymous reviewers for their insightful comments.
