Abstract
The algorithms for the evaluation and decoding of HMMs and n-gram models presented so far represent only the basic methods for handling these models. To achieve the efficiency necessary in practical applications, these methods have to be extended and modified so that as many “unnecessary” computations as possible are avoided. This can be achieved by a suitable reorganization of the data structures involved or by explicitly discarding “less promising” solutions early in the evaluation process.
This chapter gives an overview of the most important methods for the efficient evaluation of Markov models. First, methods for speeding up the computation of output probability densities on the basis of mixture models are presented. Then the standard method for efficiently applying Viterbi decoding to larger HMMs is described. The following section presents techniques for efficiently generating the first-best segmentation result as well as alternative solutions organized in the form of so-called n-best lists. Subsequently, methods are explained that apply search-space pruning techniques to accelerate the parameter training of HMMs. The chapter concludes with a section on tree-like model structures, which can be used in both HMMs and n-gram models to increase efficiency when processing these models.
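As a first impression of the pruning idea that runs through the chapter, the following minimal Python sketch shows one pruning step of time-synchronous beam search; the function name, state labels, and beam width are hypothetical illustrations, not definitions taken from the chapter.

```python
def beam_prune(scores, beam_width):
    """Keep only those active states whose partial path score
    (log domain) lies within beam_width of the locally best score."""
    best = max(scores.values())
    return {state: s for state, s in scores.items() if s >= best - beam_width}

# Hypothetical partial path scores after one Viterbi time step:
active = {"s1": -12.3, "s2": -14.0, "s3": -25.7}
print(beam_prune(active, beam_width=5.0))  # 's3' falls outside the beam
```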
Notes
- 1.
The choice of this lower bound for vanishing density values is actually quite a critical parameter for the overall performance of the system. An empirical investigation of its influence is presented, for example, in [156].
- 2.
Whether or not a density candidate is promising can be determined by a pruning strategy similar to the beam-search method presented in Sect. 10.2; the mixture-evaluation sketch following these notes illustrates this pruning together with the flooring of Note 1.
- 3.
Despite the enormous practical relevance of the method, hardly any descriptions of it can be found in the relevant monographs. Instead, the interested reader is referred to the original work of Lowerre [185], which, unfortunately, is rather difficult to access.
- 4.
This computation scheme is conceptually similar to the propagation of tokens representing partial path hypotheses in the so-called token-passing framework [326]; see the token-passing sketch following these notes.
- 5.
This additional back-linking can, for example, be represented by individual path identifiers (cf. [326]) or by pairs of state identifier and end time that augment the information associated with active states during model decoding.
- 6.
In fact, the segmentation of a larger sample set into partial observation sequences already applies this principle.
- 7.
A further considerable compression of n-gram models can be achieved if rare events are neglected. For singletons, i.e. n-grams observed only once, this results from applying absolute discounting with a discounting constant β=1. Parameters of other rarely observed n-grams can also be eliminated from the model if modeling quality is not of primary concern and a representation as compact as possible is to be generated; the discounting sketch following these notes illustrates the effect of β=1 on singletons.
- 8.
If the more general distributions are not combined with the special model by interpolation but via backing-off, the normalization factor K_y must be taken into account (cf. Eq. (6.8)).
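The flooring of vanishing density values (Note 1) and the pruning of unpromising density candidates (Note 2) can be combined when evaluating a mixture density. The following Python sketch is only a schematic illustration under assumed parameters; LOG_FLOOR and LOG_BEAM are hypothetical constants, not values from the text.

```python
import math

LOG_FLOOR = -100.0  # hypothetical lower bound for vanishing log-densities (Note 1)
LOG_BEAM = 10.0     # hypothetical beam width for pruning density candidates (Note 2)

def log_gaussian_diag(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def log_mixture_density(x, weights, means, variances):
    """Evaluate a mixture density in the log domain, flooring vanishing
    component scores and pruning candidates far below the best one."""
    scores = [max(math.log(w) + log_gaussian_diag(x, m, v), LOG_FLOOR)
              for w, m, v in zip(weights, means, variances)]
    best = max(scores)
    # Beam-style pruning of density candidates (cf. Note 2).
    kept = [s for s in scores if s >= best - LOG_BEAM]
    # Log-sum-exp over the surviving candidates.
    return best + math.log(sum(math.exp(s - best) for s in kept))
```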
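Notes 4 and 5 refer to the token-passing view of Viterbi decoding [326]. The sketch below, using assumed data structures (a dict of active tokens, nested dicts of log transition scores, and per-state lists of log observation scores), propagates tokens over one time step and carries the back-linking information as a history of identifier/end-time pairs; it is an illustration, not the formulation of [326].

```python
from dataclasses import dataclass

@dataclass
class Token:
    """Partial path hypothesis propagated through the model states."""
    log_score: float
    history: tuple  # back-links, e.g. (model identifier, end time) pairs (Note 5)

def propagate(active, log_trans, log_obs, t):
    """One time-synchronous step: every active state passes its token
    along all outgoing transitions, and each successor state keeps
    only the best-scoring incoming token (Viterbi recombination)."""
    new_active = {}
    for state, tok in active.items():
        for succ, log_a in log_trans.get(state, {}).items():
            cand = Token(tok.log_score + log_a + log_obs[succ][t], tok.history)
            # At model (e.g. word) boundaries the history would be
            # extended by a (model identifier, end time) pair.
            best = new_active.get(succ)
            if best is None or cand.log_score > best.log_score:
                new_active[succ] = cand
    return new_active
```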
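To make Notes 7 and 8 concrete, the following sketch applies absolute discounting with β=1 within a single context y, so that singletons drop out of the model, and computes a backing-off normalization factor for the freed probability mass. The notation K_y follows the note, but the code is a generic illustration and need not match Eq. (6.8) exactly.

```python
BETA = 1.0  # discounting constant; BETA = 1 removes all singletons (Note 7)

def backoff_context(counts, general_prob):
    """Absolute discounting with backing-off for one context y.
    `counts` maps successor words z to counts c(y, z); `general_prob`
    is the more general distribution P(z). Returns the retained
    conditional probabilities and the normalization factor K_y that
    scales P(z) for successors without retained parameters (Note 8)."""
    total = sum(counts.values())
    # Discount every count; singletons (c = 1) yield zero and are dropped.
    probs = {z: (c - BETA) / total for z, c in counts.items() if c > BETA}
    freed = 1.0 - sum(probs.values())                # mass freed by discounting
    residual = 1.0 - sum(general_prob[z] for z in probs)
    return probs, freed / residual                   # (P(z | y), K_y)

# Hypothetical example: 'dog' is a singleton and vanishes from the model.
probs, K_y = backoff_context({"cat": 3, "dog": 1},
                             {"cat": 0.6, "dog": 0.3, "fish": 0.1})
print(probs, K_y)  # {'cat': 0.5} and K_y = 1.25
```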
References
Bocchieri, E.: Vector quantization for efficient computation of continuous density likelihoods. In: Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, Minneapolis, vol. 2, pp. 692–695 (1993)
Chow, Y.-L., Schwartz, R.: The N-best algorithm. In: Speech and Natural Language Workshop, pp. 199–202. Morgan Kaufmann, San Mateo (1989)
Davenport, J., Nguyen, L., Matsoukas, S., Schwartz, R., Makhoul, J.: The 1998 BBN BYBLOS 10x real time system. In: Proc. DARPA Broadcast News Workshop, Herndon, VA (1999)
Deng, L.: The semi-relaxed algorithm for estimating parameters of Hidden Markov Models. Comput. Speech Lang. 5(3), 231–236 (1991)
Fritsch, J., Rogina, I.: The bucket box intersection (BBI) algorithm for fast approximative evaluation of diagonal mixture Gaussians. In: Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, Atlanta, vol. 1, pp. 837–840 (1996)
Huang, X., Acero, A., Hon, H.-W.: Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Prentice Hall, Englewood Cliffs (2001)
Huang, X.D., Ariki, Y., Jack, M.A.: Hidden Markov Models for Speech Recognition. Information Technology Series, vol. 7. Edinburgh University Press, Edinburgh (1990)
Knill, K.M., Gales, M.J.F., Young, S.J.: Use of Gaussian selection in large vocabulary continuous speech recognition using HMMs. In: International Conference on Spoken Language Processing, Philadelphia, PA, Oct 1996, vol. 1, pp. 470–473 (1996)
Lowerre, B., Reddy, R.: The Harpy speech understanding system. In: Lea, W.A. (ed.) Trends in Speech Recognition, pp. 340–360. Prentice-Hall, Englewood Cliffs (1980)
Lowerre, B.T.: The HARPY speech recognition system. PhD thesis, Carnegie-Mellon University, Department of Computer Science, Pittsburgh (1976)
Ney, H., Haeb-Umbach, R., Tran, B.H., Oerder, M.: Improvements in beam search for 10000-word continuous speech recognition. In: Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, San Francisco, vol. 1, pp. 9–12 (1992)
Ney, H., Ortmanns, S.: Dynamic programming search for continuous speech recognition. IEEE Signal Process. Mag. 16(5), 64–83 (1999)
Nilsson, N.J.: Artificial Intelligence: A New Synthesis. Morgan Kaufmann, San Francisco (1998)
Ortmanns, S., Firzlaff, T., Ney, H.: Fast likelihood computation methods for continuous mixture densities in large vocabulary speech recognition. In: Proc. European Conf. on Speech Communication and Technology, Rhodes, vol. 1, pp. 139–142 (1997)
Ortmanns, S., Ney, H.: Look-ahead techniques for fast beam search. Comput. Speech Lang. 14, 15–32 (2000)
Paul, D.: An investigation of Gaussian shortlists. In: Furui, S., Juang, B.-H., Chou, W. (eds.) Proc. Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society, Piscataway (1997)
Schukat-Talamazzini, E.G., Bielecki, M., Niemann, H., Kuhn, T., Rieck, S.: A non-metrical space search algorithm for fast Gaussian vector quantization. In: Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, Minneapolis, pp. 688–691 (1993)
Schwartz, R., Austin, S.: A comparison of several approximate algorithms for finding multiple (n-best) sentence hypotheses. In: Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, Toronto, pp. 701–704 (1991)
Schwartz, R., Chow, Y.-L.: The N-best algorithm: an efficient and exact procedure for finding the N most likely sentence hypotheses. In: Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, Albuquerque, vol. 1, pp. 81–84 (1990)
Soong, F.K., Huang, E.-F.: A tree-trellis based fast search for finding the n best sentence hypotheses in continuous speech recognition. In: Speech and Natural Language Workshop, pp. 12–19. Morgan Kaufmann, Hidden Valley (1990)
Wessel, F., Ortmanns, S., Ney, H.: Implementation of word based statistical language models. In: Proc. SQEL Workshop on Multi-Lingual Information Retrieval Dialogs, Plzen, pp. 55–59 (1997)
Young, S.J., Russell, N.H., Thornton, J.H.S.: Token passing: a simple conceptual model for connected speech recognition systems. Technical report, Cambridge University Engineering Department (1989)
Fink, G.A. (2014). Efficient Model Evaluation. In: Markov Models for Pattern Recognition. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-6308-4_10