
Scaling Smoothed Language Models

International Journal of Speech Technology

Abstract

In Continuous Speech Recognition (CSR) systems, a Language Model (LM) is required to represent the syntactic constraints of the language. A smoothing technique must then be applied to avoid null LM probabilities, and each smoothing technique leads to a different LM probability distribution. Smoothing techniques are usually evaluated by test-set perplexity, but this measure does not take the relationship with the acoustic models into account. In fact, it is well known that to obtain optimum CSR performance an exponential scaling parameter must be applied to the LM in Bayes' rule. This scaling factor implies a new redistribution of the smoothed LM probabilities, so the shape of the final probability distribution is determined both by the smoothing technique used when designing the language model and by the scaling factor required to achieve optimum system performance when the LM is integrated into the CSR system. The main goal of this work is to study the relationship between these two factors, whose effects are not independent. An experimental evaluation is carried out over two Spanish speech application tasks, comparing classical smoothing techniques that represent very different degrees of smoothing; a new proposal, Delimited discounting, is also considered. The experiments showed a strong dependence between the amount of smoothing given by the smoothing technique and the way the LM probabilities need to be scaled to obtain the best system performance, which in many cases is independent of perplexity. This relationship also depends on the task and on the amount of available training data.
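The exponential LM scaling and the perplexity measure the abstract refers to can be made explicit. The formulation below is the standard one; the symbols (X for the acoustic observation sequence, W for a word sequence, α for the scaling exponent, N for the test-set length) are notation introduced here for illustration, not necessarily the paper's own:

```latex
% Bayes decision rule with an exponential LM scaling factor \alpha:
\hat{W} = \operatorname*{argmax}_{W} \, P(X \mid W)\, P(W)^{\alpha}
        = \operatorname*{argmax}_{W} \left[ \log P(X \mid W) + \alpha \log P(W) \right]

% Test-set perplexity of a smoothed LM over w_1 \dots w_N:
\mathrm{PP} = P(w_1, \dots, w_N)^{-1/N}
            = \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \log P(w_i \mid w_1^{\,i-1}) \right)
```

Because α only rescales the LM term in the log-domain sum, it reshapes the smoothed LM distribution relative to the acoustic scores without changing the LM's own perplexity, which is why perplexity alone cannot predict the best operating point.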
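As a concrete illustration of the quantities involved, the sketch below builds a bigram LM with add-one (Laplace) smoothing, used here only as a stand-in for the smoothing techniques the paper compares, computes test-set perplexity, and applies an exponential scaling factor α to the LM log-probability as it would be combined with an acoustic score. The toy corpora, the α value, and all function names are invented for this example:

```python
import math
from collections import Counter

# Toy corpora invented for this sketch; add-one (Laplace) smoothing stands in
# for the smoothing techniques compared in the paper.
train = "the cat sat on the mat the cat ate".split()
test = "the cat sat on the mat".split()

vocab = set(train) | set(test)
V = len(vocab)

unigrams = Counter(train)
bigrams = Counter(zip(train, train[1:]))

def bigram_prob(prev, w):
    # Add-one smoothed conditional probability P(w | prev); never zero.
    return (bigrams[(prev, w)] + 1) / (unigrams[prev] + V)

def perplexity(words):
    # PP = exp(-(1/N) * sum_i log P(w_i | w_{i-1})) over the N bigram events.
    log_sum = sum(math.log(bigram_prob(p, w)) for p, w in zip(words, words[1:]))
    return math.exp(-log_sum / (len(words) - 1))

def scaled_lm_logprob(words, alpha):
    # LM contribution to the log-domain Bayes rule: alpha * log P(W).
    # In a recogniser this term is added to the acoustic log-likelihood.
    return alpha * sum(math.log(bigram_prob(p, w)) for p, w in zip(words, words[1:]))

print("test-set perplexity:", round(perplexity(test), 2))
print("scaled LM log-prob (alpha=2):", round(scaled_lm_logprob(test, 2.0), 2))
```

Note that changing α rescales every sentence's LM score by the same factor in the log domain, so it alters the balance against the acoustic model without affecting the perplexity computed above.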



Corresponding author

Correspondence to A. Varona.


Cite this article

Varona, A., Torres, I. Scaling Smoothed Language Models. Int J Speech Technol 8, 341–361 (2005). https://doi.org/10.1007/s10772-006-9047-5
