Flexible Context Extraction for Keywords in Russian Automatic Speech Recognition Results

  • Olga Khomitsevich
  • Kirill Boyarsky
  • Eugeny Kanevsky
  • Anna Bulusheva
  • Valentin Mendelev
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 661)


This paper addresses the extraction of contexts for keywords found in text, in particular in Automatic Speech Recognition (ASR) output. We propose using a syntactic parser to find contexts by analysing the sentence structure, rather than simply taking a window of several words to the left and right of the keyword, or the whole sentence. This method yields concise but meaningful contexts that are easy for humans to read and can also be used in applications such as thematic clustering. We describe SemSin, a system for Russian that combines a syntactic dependency parser with elements of a semantic ontology. We demonstrate the use of SemSin for our task on both ordinary text and recognition output, and outline suggestions for future development of the method.
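
The contrast between a fixed word window and a parser-derived context can be sketched as follows. This is a minimal illustration in Python and not the SemSin implementation: the `Token` structure, the English example sentence with its hand-written dependency heads, and the selection rule "keyword subtree plus the keyword's head" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int    # 0-based position in the sentence
    text: str
    head: int   # index of the syntactic head, -1 for the root


def window_context(tokens, kw, size=2):
    """Baseline: `size` words on each side of the keyword at index `kw`."""
    lo = max(0, kw - size)
    return [t.text for t in tokens[lo:kw + size + 1]]


def syntactic_context(tokens, kw):
    """Hypothetical rule: the keyword, all its dependents, and its head."""
    children = {}
    for t in tokens:
        children.setdefault(t.head, []).append(t.idx)

    keep = set()
    stack = [kw]
    while stack:                       # collect the keyword's subtree
        i = stack.pop()
        if i not in keep:
            keep.add(i)
            stack.extend(children.get(i, []))

    if tokens[kw].head >= 0:           # add the governing word, if any
        keep.add(tokens[kw].head)
    return [tokens[i].text for i in sorted(keep)]


# Hand-annotated toy sentence (assumed parse, for illustration only).
words = ["The", "operator", "finally", "changed", "my", "old",
         "tariff", "yesterday", "evening"]
heads = [1, 3, 3, -1, 6, 6, 3, 8, 3]
tokens = [Token(i, w, h) for i, (w, h) in enumerate(zip(words, heads))]

KEYWORD = 6  # "tariff"
print(window_context(tokens, KEYWORD))     # window of two words per side
print(syntactic_context(tokens, KEYWORD))  # structure-based context
```

For the keyword "tariff", the window picks up the unrelated "yesterday evening", while the structure-based rule returns the governing verb and the keyword's own modifiers. With real ASR output the parse would come from a dependency parser rather than a hand-written `heads` list.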


Keywords: Syntactic parsing, Keyword extraction, Contexts, Russian, Speech-to-text, ASR, Speech recognition



The work was financially supported by the Ministry of Education and Science of the Russian Federation, Contract 14.579.21.0008, ID RFMEFI57914X0008.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Olga Khomitsevich (1)
  • Kirill Boyarsky (2)
  • Eugeny Kanevsky (3)
  • Anna Bulusheva (4)
  • Valentin Mendelev (1)

  1. Speech Technology Center Ltd, St. Petersburg, Russia
  2. ITMO University, St. Petersburg, Russia
  3. St. Petersburg Institute for Economics and Mathematics, RAS, St. Petersburg, Russia
  4. STC-Innovations Ltd, St. Petersburg, Russia
