PAC-Learning of Markov Models with Hidden State

  • Ricard Gavaldà
  • Philipp W. Keller
  • Joelle Pineau
  • Doina Precup
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4212)

Abstract

The standard approach for learning Markov models with hidden state uses the Expectation-Maximization framework. While this approach has had a significant impact on several practical applications (e.g., speech recognition, biological sequence alignment), it has two major limitations: it requires a known model topology, and learning is only locally optimal. We propose a new PAC framework for learning both the topology and the parameters of partially observable Markov models. Our algorithm learns a Probabilistic Deterministic Finite Automaton (PDFA) which approximates a Hidden Markov Model (HMM) up to some desired degree of accuracy. We discuss theoretical conditions under which the algorithm produces an optimal solution (in the PAC sense) and demonstrate promising performance on simple dynamical systems.
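The abstract describes the approach only at a high level. As a concrete illustration, the following is a minimal Python sketch of the state-merging style of PDFA learning that PAC frameworks of this kind build on (in the spirit of Clark and Thollard's PAC algorithm for PDFAs), not the paper's exact procedure. The names learn_pdfa, mu, and min_count, the simple L-infinity merge test, and the toy data source are all illustrative assumptions.

```python
from collections import Counter

# Sketch of state-merging PDFA learning; NOT the paper's exact algorithm.
# Parameter names (mu, min_count) are illustrative assumptions.

def next_symbol_dist(suffixes):
    """Empirical distribution over the next symbol of each suffix;
    '$' marks string termination (the stopping probability)."""
    counts = Counter(s[0] if s else '$' for s in suffixes)
    total = sum(counts.values())
    return {a: n / total for a, n in counts.items()}

def linf(p, q):
    """L-infinity distance between two distributions given as dicts."""
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

def learn_pdfa(sample, mu=0.1, min_count=20):
    """Greedy state-merging sketch. Each state is identified with the
    multiset of suffixes observed after reaching it; two candidate
    states are merged when their next-symbol distributions are within
    mu in L-infinity distance. A full PAC analysis would replace this
    crude test with one whose confidence is controlled explicitly."""
    states = [list(sample)]        # state 0: suffixes seen at the start
    trans = {}                     # (state, symbol) -> state
    frontier = [0]
    while frontier:
        q = frontier.pop(0)
        for a in sorted({s[0] for s in states[q] if s}):
            succ = [s[1:] for s in states[q] if s and s[0] == a]
            if len(succ) < min_count:
                continue           # too few samples to test reliably
            d = next_symbol_dist(succ)
            target = next((i for i, st in enumerate(states)
                           if linf(next_symbol_dist(st), d) <= mu), None)
            if target is None:     # distinguishable from all states: new one
                states.append(succ)
                target = len(states) - 1
                frontier.append(target)
            trans[(q, a)] = target
    probs = {q: next_symbol_dist(st) for q, st in enumerate(states)}
    return trans, probs

if __name__ == "__main__":
    import random
    random.seed(0)

    def sample_string():
        # Toy 2-state hidden process: alternates emissions 'a'/'b',
        # stopping with probability 0.2 at each step.
        state, out = 0, []
        while random.random() < 0.8:
            out.append("ab"[state])
            state = 1 - state
        return "".join(out)

    data = [sample_string() for _ in range(5000)]
    trans, probs = learn_pdfa(data)
    print(len(probs), "states;", trans)
```

On the toy source above, the sketch recovers a two-state PDFA that cycles between 'a' and 'b' emissions, mirroring how the hidden state of the generating process is reconstructed from observable suffix statistics.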



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ricard Gavaldà (1)
  • Philipp W. Keller (2)
  • Joelle Pineau (2)
  • Doina Precup (2)
  1. Universitat Politècnica de Catalunya, Barcelona, Spain
  2. School of Computer Science, McGill University, Montreal, Canada
