Pattern Recognition and Image Analysis, Volume 17, Issue 2, pp 321–336

Russian voice interface

  • A. L. Ronzhin
  • A. A. Karpov
Software and Hardware for Pattern Recognition and Image Analysis

Abstract

This paper describes SIRIUS, a system for recognizing continuous Russian speech that is being developed by the Speech Informatics Group at SPIIRAS. Its distinguishing feature is that language and speech are represented at the morphemic level, which significantly reduces the size of the recognition vocabulary and increases processing speed. We describe how the Russian speech recognition system was introduced into infotelecommunications to provide voice access to the Internet version of the electronic catalogue “Yellow Pages of Saint Petersburg”, with the aim of creating an automated call center that answers subscribers’ calls. We also present the results of testing the system on speech samples recorded both in office conditions and over telephone channels.
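To give a rough sense of why a morphemic representation can shrink the recognition vocabulary, the following minimal sketch counts distinct word forms against distinct stem and ending units for a handful of Russian nouns. It is a toy illustration under stated assumptions, not the SIRIUS morphemic analyser: the word list, the ending inventory `ENDINGS`, and the helper names `split_word` and `vocabulary_sizes` are all hypothetical, and simple suffix stripping stands in for real morphemic analysis.

```python
from typing import List, Tuple

# Hypothetical ending inventory, ordered longest first so that
# "ами" is tried before "а". Not taken from the paper.
ENDINGS = ["ами", "ах", "ам", "ой", "ей", "а", "у", "ы", "и", "е"]

# Hypothetical sample: case forms of "улица", "аптека", "школа".
WORD_FORMS = [
    "улица", "улицы", "улице", "улицу", "улицей", "улицам", "улицами", "улицах",
    "аптека", "аптеки", "аптеке", "аптеку", "аптекой", "аптекам", "аптеками", "аптеках",
    "школа", "школы", "школе", "школу", "школой", "школам", "школами", "школах",
]


def split_word(word: str) -> Tuple[str, str]:
    """Split a word form into (stem, ending) by stripping the first matching ending."""
    for ending in ENDINGS:
        if word.endswith(ending) and len(word) > len(ending):
            return word[: -len(ending)], ending
    return word, ""  # no known ending: keep the whole form as the stem


def vocabulary_sizes(word_forms: List[str]) -> Tuple[int, int]:
    """Return (distinct word forms, distinct stem-or-ending units)."""
    units = set()
    for form in word_forms:
        stem, ending = split_word(form)
        units.add(stem)
        if ending:
            units.add("-" + ending)  # prefix keeps endings apart from stems
    return len(set(word_forms)), len(units)


if __name__ == "__main__":
    n_forms, n_units = vocabulary_sizes(WORD_FORMS)
    print(n_forms, n_units)  # 24 word forms versus 13 units
```

In this sample the endings are shared across stems, so each additional noun adds one stem unit rather than eight whole-word entries. The actual reduction for a real system depends on the morpheme inventory and the vocabulary, which this sketch does not model.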

Keywords

Speech Recognition; Speech Signal; Knowledge Domain; Speech Recognition System; Russian Language

Copyright information

© Pleiades Publishing, Ltd. 2007

Authors and Affiliations

  • A. L. Ronzhin (1)
  • A. A. Karpov (1)
  1. St. Petersburg Institute for Informatics and Automation, Russian Academy of Sciences, St. Petersburg, Russia
