Automatic Speech Recognition Texts Clustering

  • Svetlana Popova
  • Ivan Khodyrev
  • Irina Ponomareva
  • Tatiana Krivosheeva
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8655)


Abstract. This paper addresses the task of clustering Russian texts obtained through automatic speech recognition (ASR). The input consists of recognition results for phone call recordings and manual transcripts of the same calls. We compare clustering results for the recognized texts and the manual transcripts, evaluate how recognition quality affects clustering, and explore ways to improve clustering quality using stop words and Latent Semantic Indexing (LSI).


Keywords: clustering, speech-to-text, recognition result clustering, Latent Semantic Indexing, information retrieval, stop words





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Svetlana Popova (1, 2)
  • Ivan Khodyrev (2)
  • Irina Ponomareva (3)
  • Tatiana Krivosheeva (3)

  1. Saint-Petersburg State University, Saint-Petersburg, Russia
  2. ITMO University, Saint-Petersburg, Russia
  3. Speech Technology Center, Saint-Petersburg, Russia
