Abstract
Word pre-selection by means of partial phonetic descriptions is a method of lexical access in speech recognition systems for very large vocabularies that is receiving particular attention. It can be effective provided that segmentation errors are taken into account within the lexical access procedure, and that the resulting candidate word set is reasonably sized. As errors in the segmentation of input utterances are unavoidable, even when only a limited number of phonetic categories must be discriminated, a lattice of segmentation hypotheses is generated. Word pre-selection is therefore obtained by matching a lattice of phonetic hypotheses against a graph structure that represents a generic word. A Dynamic Programming procedure is introduced that solves this problem. A sub-optimal solution and heuristic constraints that improve the algorithm's efficiency have been investigated. In the second step, word verification, a detailed representation of the phonemic structure of word candidates is used for estimating the most likely words. Words are modeled by sequences of sub-word units represented by Hidden Markov Models, and a beam-search Viterbi algorithm estimates their likelihood. Experimental results on large vocabularies demonstrate the effectiveness of the method.
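The pre-selection step can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes each word is a plain sequence of broad phonetic classes (rather than the graph structure mentioned above), that lattice nodes are numbered in time order, and that insertion, deletion and substitution carry fixed penalties. The labels "F", "V", "N", "P", the toy lexicon and all cost values are hypothetical.

```python
from math import inf

def lattice_match_cost(edges, n_nodes, word, sub=1.0, ins=1.0, dele=1.0):
    """Cheapest alignment of any full path through the lattice with `word`.

    edges: (start_node, end_node, label, acoustic_cost) tuples with
    start_node < end_node, so node numbers give a topological order.
    word: sequence of broad-class labels (an assumed representation).
    """
    m = len(word)
    # best[n][j]: lowest cost to reach lattice node n having consumed word[:j]
    best = [[inf] * (m + 1) for _ in range(n_nodes)]
    best[0][0] = 0.0
    outgoing = [[] for _ in range(n_nodes)]
    for s, e, label, cost in edges:
        outgoing[s].append((e, label, cost))
    for n in range(n_nodes):
        # Deletion: skip a word symbol without advancing in the lattice.
        for j in range(m):
            best[n][j + 1] = min(best[n][j + 1], best[n][j] + dele)
        for e, label, cost in outgoing[n]:
            for j in range(m + 1):
                if best[n][j] == inf:
                    continue
                # Insertion: the lattice segment matches no word symbol.
                best[e][j] = min(best[e][j], best[n][j] + cost + ins)
                if j < m:
                    # Match or substitution against the next word symbol.
                    penalty = 0.0 if label == word[j] else sub
                    best[e][j + 1] = min(best[e][j + 1],
                                         best[n][j] + cost + penalty)
    return best[n_nodes - 1][m]

def preselect(edges, n_nodes, lexicon, max_cost):
    """Return the words whose match cost is at most max_cost, best first."""
    scored = {w: lattice_match_cost(edges, n_nodes, d) for w, d in lexicon.items()}
    return sorted((w for w, c in scored.items() if c <= max_cost), key=scored.get)

# Toy lattice with two competing segmentations of the same utterance.
edges = [(0, 1, "F", 0.0), (1, 2, "V", 0.0), (2, 3, "N", 0.0), (1, 3, "V", 0.5)]
lexicon = {"sun": ["F", "V", "N"], "sea": ["F", "V"], "at": ["V", "P"]}
candidates = preselect(edges, 4, lexicon, max_cost=1.0)
```

Because the lattice keeps both segmentations, a word can survive pre-selection through whichever path fits it best; the threshold `max_cost` controls the size of the candidate set passed on to verification.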
Copyright information
© 1988 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Laface, P., Micca, G., Pieraccini, R. (1988). Recognition of Words in Very Large Vocabulary. In: Niemann, H., Lang, M., Sagerer, G. (eds) Recent Advances in Speech Understanding and Dialog Systems. NATO ASI Series, vol 46. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-83476-9_21
DOI: https://doi.org/10.1007/978-3-642-83476-9_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-83478-3
Online ISBN: 978-3-642-83476-9
eBook Packages: Springer Book Archive