
Out-of-Vocabulary Word Modeling and Rejection for Spanish Keyword Spotting Systems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2313)

Abstract

This paper presents a combination of out-of-vocabulary (OOV) word modeling and rejection techniques in an attempt to accept utterances embedding a keyword and reject utterances with nonkeywords. The goal of this research is to develop a robust, task-independent Spanish keyword spotter and to develop a method for optimizing confidence thresholds for a particular context. To model OOV words, we employed both word and sub-word units as fillers, combined with n-gram language models. We also introduce a methodology for optimizing confidence thresholds to control the tradeoffs between acceptance, confirmation, and rejection of utterances. Our experiments are based on a Mexican Spanish auto-attendant system using the SpeechWorks recognizer release 6.5 Second Edition, in which we achieved a reduction in error of 8.9% as compared to the baseline system. Most of the error reduction is attributed to better keyword detection in utterances that contain both keywords and OOV words.

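The abstract mentions confidence thresholds that trade off acceptance, confirmation, and rejection, but the page does not describe the authors' procedure. As a rough illustration only, the sketch below assumes a common two-threshold scheme: a score at or above an upper threshold is accepted, a score between the two thresholds triggers a confirmation prompt, and anything lower is rejected. The function names, the cost values, and the grid search are all hypothetical choices made for this example and are not taken from the paper.

```python
from itertools import product

def decide(score, t_reject, t_accept):
    """Map a recognizer confidence score to one of three actions."""
    if score >= t_accept:
        return "accept"
    if score >= t_reject:
        return "confirm"
    return "reject"

# Hypothetical per-decision costs, chosen only for illustration.
# Keys are (action, keyword_was_correct); missing pairs cost nothing.
COSTS = {
    ("accept", False): 5.0,   # accepted an utterance without a correct keyword
    ("reject", True): 2.0,    # rejected a correctly recognized keyword
    ("confirm", True): 0.5,   # confirmation prompt that was not needed
    ("confirm", False): 0.5,  # confirmation prompt that catches an error
}

def total_cost(samples, t_reject, t_accept):
    """Sum the cost over (score, keyword_was_correct) pairs."""
    return sum(
        COSTS.get((decide(score, t_reject, t_accept), correct), 0.0)
        for score, correct in samples
    )

def optimize_thresholds(samples, grid):
    """Exhaustively search threshold pairs with t_reject <= t_accept."""
    candidates = [(lo, hi) for lo, hi in product(grid, grid) if lo <= hi]
    return min(candidates, key=lambda pair: total_cost(samples, *pair))

if __name__ == "__main__":
    # Made-up development data: (confidence score, keyword_was_correct).
    samples = [(0.95, True), (0.9, True), (0.6, True),
               (0.55, False), (0.2, False), (0.1, False)]
    grid = [0.0, 0.25, 0.5, 0.75, 1.0]
    t_reject, t_accept = optimize_thresholds(samples, grid)
    print(t_reject, t_accept, total_cost(samples, t_reject, t_accept))
```

Under these assumptions, raising the cost of a false acceptance pushes the upper threshold higher and sends more utterances to confirmation, which is the kind of context-dependent tradeoff the abstract refers to.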




Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Heriberto, C., Ben, S. (2002). Out-of-Vocabulary Word Modeling and Rejection for Spanish Keyword Spotting Systems. In: Coello Coello, C.A., de Albornoz, A., Sucar, L.E., Battistutti, O.C. (eds) MICAI 2002: Advances in Artificial Intelligence. MICAI 2002. Lecture Notes in Computer Science, vol 2313. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46016-0_17


  • DOI: https://doi.org/10.1007/3-540-46016-0_17


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43475-7

  • Online ISBN: 978-3-540-46016-9

  • eBook Packages: Springer Book Archive
