Abstract
In this paper we present the creation of a Mexican Spanish version of the CMU Sphinx-III speech recognition system. We trained acoustic models and N-gram language models using a phonetic set of 23 phonemes. Our speech data for training and testing were collected from an auto-attendant system in telephone environments. We present experiments with different language models. Our best result was an overall error rate of 6.32%. With this version, it is now possible to develop speech applications for Spanish-speaking communities. This version of the CMU Sphinx system is freely available for non-commercial use upon request.
Keywords
- Automatic Speech Recognition
- Automatic Gain Control
- Speech Data
- Speech Recognition System
- Word Error Rate
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Varela, A., Cuayáhuitl, H., Nolazco-Flores, J.A. (2003). Creating a Mexican Spanish Version of the CMU Sphinx-III Speech Recognition System. In: Sanfeliu, A., Ruiz-Shulcloper, J. (eds) Progress in Pattern Recognition, Speech and Image Analysis. CIARP 2003. Lecture Notes in Computer Science, vol 2905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24586-5_30
DOI: https://doi.org/10.1007/978-3-540-24586-5_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20590-6
Online ISBN: 978-3-540-24586-5
eBook Packages: Springer Book Archive