Encyclopedia of Operations Research and Management Science

2001 Edition
Editors: Saul I. Gass, Carl M. Harris

Hidden Markov models

  • Yariv Ephraim
Reference work entry
DOI: https://doi.org/10.1007/1-4020-0611-X_417


Hidden Markov models (HMMs) constitute a family of versatile statistical models that have proven useful in many applications. HMMs were introduced in their full generality in 1966 by Baum and Petrie (Baum and Petrie, 1966; Baum et al., 1970). Baum, Petrie and other colleagues at the Institute for Defense Analyses also developed and analyzed a maximum likelihood (ML) procedure for efficient estimation of the HMM parameters from a training sequence. This procedure turned out to be an instance of the now well-known EM (Expectation-Maximization) algorithm of Dempster, Laird and Rubin (1977). A form of HMM, referred to as a Markov source, was introduced as early as 1948 by Shannon in developing a model for the English language (Shannon, 1948).

Baum et al. (1970) referred to HMMs as probabilistic functions of Markov chains. Indeed, an HMM process comprises a Markov chain in which each state is associated with a probability distribution. For example, the Markov states may be...
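The structure described above, a Markov chain whose states each carry a probability distribution, can be illustrated with a minimal sampling sketch. The two-state chain, the Gaussian form of the per-state distributions, the function name `sample_hmm`, and all numerical values below are assumptions chosen purely for illustration; none of them comes from the entry itself.

```python
import random


def sample_hmm(initial, transition, emission_params, length, seed=None):
    """Draw (states, observations) from a finite-state HMM.

    Assumption for illustration: each state emits a Gaussian observation
    described by a (mean, std) pair in emission_params.
    """
    rng = random.Random(seed)
    n_states = len(initial)
    states, observations = [], []
    # Draw the first hidden state from the initial distribution.
    state = rng.choices(range(n_states), weights=initial)[0]
    for _ in range(length):
        mean, std = emission_params[state]
        states.append(state)
        # The observation depends only on the current hidden state.
        observations.append(rng.gauss(mean, std))
        # The next hidden state depends only on the current one (Markov property).
        state = rng.choices(range(n_states), weights=transition[state])[0]
    return states, observations


# Hypothetical two-state model; every number here is illustrative only.
initial = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
emission_params = [(0.0, 1.0), (5.0, 1.0)]  # (mean, std) for each state

states, observations = sample_hmm(
    initial, transition, emission_params, length=10, seed=0
)
```

In an application only `observations` would be available; the sequence `states` is the hidden part that gives the model its name.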



  1. Baum, L.E. and Petrie, T. (1966). “Statistical inference for probabilistic functions of finite state Markov chains,” Ann. Math. Statist., 37, 1554–1563.
  2. Baum, L.E., Petrie, T., Soules, G., and Weiss, N. (1970). “A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains,” Ann. Math. Statist., 41, 164–171.
  3. Couvreur, C. (1996). Hidden Markov Models and Their Mixtures, Department of Mathematics, Université Catholique de Louvain, Belgium [http://thor.fpms.ac.be/~couvreur/listpub.html].
  4. Dempster, A.P., Laird, N.M., and Rubin, D.B. (1977). “Maximum likelihood from incomplete data via the EM algorithm,” Jl. Royal Stat. Soc. B, 39, 1–38.
  5. Ferguson, J.D., editor (1980). Proc. of the Symposium on the applications of hidden Markov models to text and speech. IDA-CRD, Princeton, New Jersey.
  6. Grimmett, G.R. and Stirzaker, D.R. (1995). Probability and Random Processes. Oxford Science Publications, Oxford, UK.
  7. Jelinek, F. (1974). “Continuous speech recognition by statistical methods,” Proc. IEEE, 64, 532–556.
  8. Leroux, B.G. (1992). “Maximum likelihood estimation for hidden Markov models,” Stochastic Processes and Their Applications, 40, 127–143.
  9. Rabiner, L.R. (1989). “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE, 257–286.
  10. Shannon, C.E. (1948). “A mathematical theory of communication,” Bell Syst. Tech. Jl., 27, 379–423, 623–656.
  11. Wu, C.F.J. (1983). “On the convergence properties of the EM algorithm,” Ann. Statist., 11, 95–103.

Copyright information

© Kluwer Academic Publishers 2001

Authors and Affiliations

  • Yariv Ephraim, George Mason University, Fairfax, USA