Abstract
We present a simplified EM algorithm and an approximate algorithm for training hierarchical hidden Markov models (HHMMs), an extension of hidden Markov models. Unlike the existing algorithm, called the generalized Baum-Welch algorithm, the EM algorithm we present is proved to increase the likelihood of the training sentences at each iteration. The approximate algorithm is applicable to tasks such as robot navigation, in which sentences are observed and parameters are trained simultaneously. Both algorithms and their derivations are simplified by making use of stochastic context-free grammars.
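The monotonicity property claimed above — the likelihood of the training data never decreases across EM iterations — can be seen in miniature with plain (non-hierarchical) Baum-Welch for a discrete HMM. This is a minimal sketch for illustration only, not the paper's HHMM/SCFG algorithm; all variable names and the toy data are assumptions of this sketch.

```python
# Toy Baum-Welch EM for a discrete HMM, illustrating the EM guarantee that
# the training-sequence likelihood is non-decreasing per iteration.
# This is NOT the paper's HHMM algorithm; it is a hedged illustrative sketch.
import numpy as np

def forward_backward(pi, A, B, obs):
    """Scaled forward-backward pass (Rabiner-style scaling).

    Returns the log-likelihood of `obs`, the state posteriors `gamma`,
    and the pairwise-transition posteriors `xi`.
    """
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N)); beta = np.zeros((T, N)); c = np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta                     # posteriors; each row sums to 1
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = (alpha[t][:, None] * A
                 * (B[:, obs[t + 1]] * beta[t + 1])[None, :]) / c[t + 1]
    return np.log(c).sum(), gamma, xi

def em_step(pi, A, B, obs):
    """One EM iteration; returns updated parameters and the pre-update LL."""
    ll, gamma, xi = forward_backward(pi, A, B, obs)
    pi_new = gamma[0]                                        # already sums to 1
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    obs_arr = np.array(obs)
    B_new = np.zeros_like(B)
    for k in range(B.shape[1]):
        B_new[:, k] = gamma[obs_arr == k].sum(axis=0)
    B_new /= gamma.sum(axis=0)[:, None]
    return pi_new, A_new, B_new, ll

# Toy run: the per-iteration log-likelihoods form a non-decreasing sequence.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.5], [0.2, 0.8]])
obs = [0, 1, 0, 0, 1, 1, 1, 0]
lls = []
for _ in range(10):
    pi, A, B, ll = em_step(pi, A, B, obs)
    lls.append(ll)
```

The paper's contribution is proving this same per-iteration guarantee for HHMMs (which the generalized Baum-Welch algorithm lacks), by routing the derivation through stochastic context-free grammars rather than working on the hierarchy directly.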
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Ueda, N., Sato, T. (2001). Simplified Training Algorithms for Hierarchical Hidden Markov Models. In: Jantke, K.P., Shinohara, A. (eds) Discovery Science. DS 2001. Lecture Notes in Computer Science(), vol 2226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45650-3_34
Print ISBN: 978-3-540-42956-2
Online ISBN: 978-3-540-45650-6