Topic-Dependent Language Model Switching for Embedded Automatic Speech Recognition

  • Marcos Santos-Pérez
  • Eva González-Parada
  • José Manuel Cano-García
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 153)

Abstract

Thanks to their increasing computational power, embedded devices gain new applications in different domains every day. Many of these applications have a voice interface that uses Automatic Speech Recognition (ASR). When the language model is highly complex, recognition is commonly offloaded to an external server, at the cost of certain limitations (network availability, latency, etc.). This paper presents a proposal to use the language model more efficiently in a recognizer that covers multiple domains. The idea is to select an appropriate language model for each domain within the ASR system.
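
The selection idea can be illustrated with a minimal sketch. The abstract does not say how the domain is detected, so the keyword-overlap classifier below is only an assumed stand-in, and the domain names, keyword sets, and language model file names are all hypothetical.

```python
# Hypothetical per-domain keyword sets (not taken from the paper).
DOMAIN_KEYWORDS = {
    "weather": {"rain", "forecast", "temperature", "sunny", "wind"},
    "travel": {"flight", "hotel", "ticket", "airport", "depart"},
}

# Hypothetical language model files, one small model per domain.
DOMAIN_LM = {
    "weather": "weather.lm",
    "travel": "travel.lm",
}

# Hypothetical fallback model used when no domain matches.
GENERIC_LM = "generic.lm"


def classify_topic(text):
    """Return the domain whose keywords overlap most with the text, or None."""
    words = text.lower().split()
    scores = {
        domain: sum(word in keywords for word in words)
        for domain, keywords in DOMAIN_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def select_language_model(previous_hypothesis):
    """Pick the language model to load for the next utterance."""
    topic = classify_topic(previous_hypothesis)
    return DOMAIN_LM.get(topic, GENERIC_LM)


if __name__ == "__main__":
    for utterance in ("what is the forecast for tomorrow",
                      "book a flight to boston",
                      "hello there"):
        print(utterance, "->", select_language_model(utterance))
```

In this sketch the recognizer would only need to hold one small domain model in memory at a time, which is the kind of saving that matters on an embedded device; a real system would likely replace the keyword lookup with a trained topic classifier.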

Keywords

Speech Recognition; Language Model; Automatic Speech Recognition; Word Error Rate; Automatic Speech Recognition System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Marcos Santos-Pérez (1), email author
  • Eva González-Parada (1)
  • José Manuel Cano-García (1)
  1. Electronic Technology Department, School of Telecommunications Engineering, University of Malaga, Malaga, Spain
