Abstract
This paper presents a combination of out-of-vocabulary (OOV) word modeling and rejection techniques in an attempt to accept utterances embedding a keyword and reject utterances with nonkeywords. The goal of this research is to develop a robust, task-independent Spanish keyword spotter and to develop a method for optimizing confidence thresholds for a particular context. To model OOV words, we employed both word and sub-word units as fillers, combined with n-gram language models. We also introduce a methodology for optimizing confidence thresholds to control the tradeoffs between acceptance, confirmation, and rejection of utterances. Our experiments are based on a Mexican Spanish auto-attendant system using the Speech Works recognizer release 6.5 Second Edition, in which we achieved a reduction in error of 8.9% as compared to the baseline system. Most of the error reduction is attributed to better keyword detection in utterances that contain both keywords and OOV words.
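The abstract does not spell out how the confidence thresholds are optimized, so the following is only a minimal sketch of one plausible reading: two thresholds split the confidence range into reject, confirm, and accept regions, and a grid search picks the pair with the lowest total cost on labeled development data. The function names, the cost values, and the grid search itself are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product

# Hypothetical costs per (decision, hypothesis_is_correct) pair.
# These numbers are illustrative assumptions, not values from the paper.
COST = {
    ("accept", True): 0.0,   # correct keyword accepted
    ("accept", False): 1.0,  # wrong keyword accepted (false acceptance)
    ("confirm", True): 0.2,  # extra dialogue turn spent on a correct keyword
    ("confirm", False): 0.2, # extra dialogue turn, but the error is caught
    ("reject", True): 1.0,   # correct keyword thrown away (false rejection)
    ("reject", False): 0.0,  # wrong hypothesis rejected
}


def decide(score, reject_below, accept_from):
    """Map a confidence score to reject / confirm / accept."""
    if score < reject_below:
        return "reject"
    if score < accept_from:
        return "confirm"
    return "accept"


def total_cost(samples, reject_below, accept_from):
    """Sum the cost over (score, hypothesis_is_correct) samples."""
    return sum(
        COST[(decide(score, reject_below, accept_from), ok)]
        for score, ok in samples
    )


def optimize_thresholds(samples, grid):
    """Exhaustive search over threshold pairs with reject_below <= accept_from.

    Returns (cost, reject_below, accept_from); the first pair found wins ties.
    """
    best = None
    for low, high in product(grid, repeat=2):
        if low > high:
            continue
        cost = total_cost(samples, low, high)
        if best is None or cost < best[0]:
            best = (cost, low, high)
    return best


if __name__ == "__main__":
    # Toy development set: (confidence score, recognized keyword was correct).
    dev = [(0.1, False), (0.2, False), (0.5, True),
           (0.55, False), (0.9, True), (0.95, True)]
    print(optimize_thresholds(dev, [0.0, 0.25, 0.5, 0.75, 1.0]))
```

Changing the entries in `COST` shifts the balance between the three outcomes: a cheaper confirmation widens the confirm band, while a higher false-acceptance cost pushes the upper threshold up. A per-context threshold pair would come from running the same search on development data drawn from that context.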
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Heriberto, C., Ben, S. (2002). Out-of-Vocabulary Word Modeling and Rejection for Spanish Keyword Spotting Systems. In: Coello Coello, C.A., de Albornoz, A., Sucar, L.E., Battistutti, O.C. (eds) MICAI 2002: Advances in Artificial Intelligence. MICAI 2002. Lecture Notes in Computer Science, vol 2313. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46016-0_17
DOI: https://doi.org/10.1007/3-540-46016-0_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43475-7
Online ISBN: 978-3-540-46016-9
eBook Packages: Springer Book Archive