Scaling Acoustic and Language Model Probabilities in a CSR System

Varona, Amparo; Torres, M. Inés

doi:10.1007/978-3-540-30463-0_49

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3287)

Included in the following conference series:

Iberoamerican Congress on Pattern Recognition

Abstract

It is well known that directly integrating acoustic and language models (LMs) into a Continuous Speech Recognition (CSR) system leads to poor performance. This work analyzes that problem as a practical numerical one. There are two ways to reach optimum system performance: scaling the acoustic model probabilities or scaling the language model probabilities. Both approaches are analyzed from a numerical point of view and tested experimentally on a CSR system over two Spanish databases. The experiments show similar reductions in word error rates but very different computational cost behaviors. They also show that the scaling factor values required for optimum CSR performance are closely related to other heuristic parameters of the system, such as the beam search width.
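
The two scaling options named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hypothesis list, the scale factor of 5, the beam width of 10, and the function names are all assumptions chosen for illustration. It combines log-probabilities either as log P(X|W) + alpha * log P(W) (LM scaling) or as (1/alpha) * log P(X|W) + log P(W) (acoustic scaling). The two forms differ only by a constant factor, so they rank hypotheses identically, but they produce different score ranges, so a fixed beam width prunes differently.

```python
# Toy hypotheses: (label, log acoustic probability, log LM probability).
# These numbers are invented for illustration; they are not from the paper.
HYPS = [
    ("A", -120.0, -9.0),
    ("B", -118.0, -14.0),
    ("C", -126.0, -8.0),
]


def combined_score(log_acoustic, log_lm, lm_scale=1.0, acoustic_scale=1.0):
    """Weighted sum of log-probabilities used to rank a hypothesis."""
    return acoustic_scale * log_acoustic + lm_scale * log_lm


def best_hypothesis(hyps, lm_scale=1.0, acoustic_scale=1.0):
    """Label of the hypothesis with the highest combined score."""
    return max(
        hyps,
        key=lambda h: combined_score(h[1], h[2], lm_scale, acoustic_scale),
    )[0]


def survivors(hyps, beam, lm_scale=1.0, acoustic_scale=1.0):
    """Labels kept by beam pruning: scores within `beam` of the best score."""
    scores = {
        label: combined_score(ac, lm, lm_scale, acoustic_scale)
        for label, ac, lm in hyps
    }
    top = max(scores.values())
    return sorted(label for label, s in scores.items() if s >= top - beam)


if __name__ == "__main__":
    alpha = 5.0   # assumed scale factor
    beam = 10.0   # assumed beam width, in log-probability units

    # Both scalings select the same best hypothesis ...
    print(best_hypothesis(HYPS, lm_scale=alpha))
    print(best_hypothesis(HYPS, acoustic_scale=1.0 / alpha))

    # ... but the same beam keeps different numbers of hypotheses.
    print(survivors(HYPS, beam, lm_scale=alpha))
    print(survivors(HYPS, beam, acoustic_scale=1.0 / alpha))
```

In this toy setup, scaling the acoustic scores compresses the score range, so the same beam keeps more hypotheses alive than LM scaling does. This is one way to see why the scale factor and the beam width would need to be tuned together.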

This work has been partially supported by the Spanish CICYT under grant TIC2002-04103-C03-02 and by the Basque Country University (00224.310-13566/2001).

Author information

Authors and Affiliations

Departamento de Electricidad y Electrónica, Fac. de Ciencia y Tecnología, UPV/EHU, Apartado 644, 48080, Bilbao, Spain
Amparo Varona & M. Inés Torres

Editor information

Editors and Affiliations

Dept. System Engineering and Automation, Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
Alberto Sanfeliu
Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, 72840, Sta. Maria Tonantzintla, Puebla, Mexico
José Francisco Martínez Trinidad
Computer Science Department, National Institute of Astrophysics, Optics and Electronics, (INAOE), Luis Enrique Erro No.1, 72840, Sta. Maria Tonantzintla, Puebla, Mexico
Jesús Ariel Carrasco Ochoa

Cite this paper

Varona, A., Torres, M.I. (2004). Scaling Acoustic and Language Model Probabilities in a CSR System. In: Sanfeliu, A., Martínez Trinidad, J.F., Carrasco Ochoa, J.A. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2004. Lecture Notes in Computer Science, vol 3287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30463-0_49

DOI: https://doi.org/10.1007/978-3-540-30463-0_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23527-9
Online ISBN: 978-3-540-30463-0
eBook Packages: Springer Book Archive
