Inducing Hidden Markov Models to Model Long-Term Dependencies

  • Jérôme Callut
  • Pierre Dupont
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3720)


We propose in this paper a novel approach to the induction of the structure of Hidden Markov Models. The induced model is viewed as a lumped process of an underlying Markov chain. It is constructed to fit the dynamics of the target machine, that is, to best approximate the stationary distribution and the mean first passage times observed in the sample. The induction relies on non-linear optimization and iterative state splitting, starting from an order-one Markov chain.
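The two quantities the induced model is fitted to, the stationary distribution and the mean first passage times (MFPT), are standard Markov-chain statistics. As a minimal illustration (not the paper's induction algorithm itself), the sketch below computes both from a transition matrix using the fundamental matrix of Kemeny and Snell; the example chain is assumed for illustration only.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution pi of an ergodic chain: pi = pi P, sum(pi) = 1."""
    n = P.shape[0]
    # Solve the left-eigenvector problem as an overdetermined linear system.
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def mean_first_passage_times(P):
    """MFPT matrix M via the fundamental matrix Z = (I - P + 1*pi)^-1
    (Kemeny & Snell): M[i, j] = (Z[j, j] - Z[i, j]) / pi[j] for i != j."""
    n = P.shape[0]
    pi = stationary_distribution(P)
    Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))
    M = (np.diag(Z)[None, :] - Z) / pi[None, :]
    np.fill_diagonal(M, 0.0)  # convention here; the mean recurrence time is 1/pi[i]
    return pi, M

# Toy two-state chain (illustrative values only).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi, M = mean_first_passage_times(P)
print(pi)       # ~ [0.8333, 0.1667]
print(M[0, 1])  # ~ 10.0 expected steps from state 0 to state 1
print(M[1, 0])  # ~ 2.0
```

For this chain the MFPT from state 0 to state 1 is 10 steps, reflecting the small transition probability 0.1; it is exactly such long-range timing statistics that the induced HMM structure is optimized to reproduce.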


Keywords: HMM topology induction · Partially observable Markov models · Mean first passage times · Lumped Markov process · State splitting algorithm



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Jérôme Callut¹
  • Pierre Dupont¹

  1. Department of Computing Science and Engineering (INGI), Université catholique de Louvain, Louvain-la-Neuve, Belgium
