Russian Q&A Method Study: From Naive Bayes to Convolutional Neural Networks

  • Kirill Nikolaev
  • Alexey Malafeev
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11179)


This paper deals with automatic classification of questions in Russian. In contrast to previously used methods, we introduce a convolutional neural network (CNN) for question classification. We took advantage of an existing corpus of 2008 questions, manually annotated in accordance with a pragmatic 14-class typology. We modified the data by reducing the typology to 13 classes, expanding the dataset and improving the representativeness of some of the question types. For supervised learning, each question was represented as a combination of word embeddings and binary features based on regular expressions. As the baseline, we used a state-of-the-art Russian question classification algorithm: an SVM classifier with a linear kernel and questions represented as word trigram counts (60.22% accuracy on the new dataset). We also tested several widely used machine learning methods (logistic regression, Bernoulli Naïve Bayes) trained on the new question representation. The best result, 72.38% accuracy (micro), was achieved with the CNN model. We also ran experiments on relevant feature selection with a simple Multinomial Naïve Bayes classifier, using word features only, add-1 smoothing and no strategy for out-of-vocabulary words. Surprisingly, the setting with the 1200 most informative word features (ranked by PPMI) and equal priors achieved only slightly lower accuracy, 70.72%, which also beats the baseline by a large margin.
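The Multinomial Naïve Bayes setting described above (word features only, add-1 smoothing, equal priors, out-of-vocabulary words ignored, features ranked by PPMI) can be illustrated with a minimal sketch. The abstract does not give the exact PPMI formulation or preprocessing, so this sketch assumes document-level co-occurrence counts and takes the maximum PPMI over classes as a word's score. The function names, class labels and transliterated toy questions are hypothetical and are not taken from the paper or its corpus.

```python
import math
from collections import Counter, defaultdict


def ppmi_scores(docs, labels):
    """Score each word by its maximum positive PMI with any class (assumed variant)."""
    n_docs = len(docs)
    class_count = Counter(labels)
    word_count = Counter()
    joint_count = Counter()
    for tokens, label in zip(docs, labels):
        for word in set(tokens):  # document-level presence, not token frequency
            word_count[word] += 1
            joint_count[(word, label)] += 1
    scores = defaultdict(float)
    for (word, label), n_wc in joint_count.items():
        pmi = math.log2(n_wc * n_docs / (word_count[word] * class_count[label]))
        scores[word] = max(scores[word], max(pmi, 0.0))  # clip negatives to 0
    return dict(scores)


def select_top_k(scores, k):
    """Keep the k highest-scoring words; ties are broken alphabetically."""
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return set(ranked[:k])


def train_nb(docs, labels, vocab):
    """Estimate per-class word log-likelihoods with add-1 smoothing."""
    classes = sorted(set(labels))
    counts = {c: Counter() for c in classes}
    for tokens, label in zip(docs, labels):
        counts[label].update(t for t in tokens if t in vocab)
    log_lik = {}
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)  # add-1 denominator
        log_lik[c] = {w: math.log((counts[c][w] + 1) / total) for w in vocab}
    return classes, log_lik


def classify(tokens, classes, log_lik):
    """Equal priors: the prior term is constant and omitted. OOV words are skipped."""
    def score(c):
        return sum(log_lik[c][t] for t in tokens if t in log_lik[c])
    return max(classes, key=score)


# Hypothetical toy data (transliterated Russian), for illustration only.
docs = [
    ["kto", "napisal", "roman"],
    ["kto", "otkryl", "ameriku"],
    ["gde", "nakhoditsya", "muzey"],
    ["gde", "zhivet", "avtor"],
    ["kogda", "otkryli", "muzey"],
    ["kogda", "napisan", "roman"],
]
labels = ["PERSON", "PERSON", "PLACE", "PLACE", "TIME", "TIME"]

scores = ppmi_scores(docs, labels)
vocab = select_top_k(scores, k=11)  # the paper's setting uses the top 1200
classes, log_lik = train_nb(docs, labels, vocab)
prediction = classify(["kto", "postroil", "muzey"], classes, log_lik)
```

On this toy data, words shared across classes ("roman", "muzey") receive lower scores than class-specific ones and fall outside the selected vocabulary, so the prediction rests on "kto" alone. Words seen only once score as high as genuine question words here, which reflects the known bias of PMI towards rare events; a frequency cutoff is one common mitigation, though the abstract does not say whether the authors used one.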


Natural language processing, Question answering, Machine learning, Deep learning, Convolutional neural networks, Relevant feature selection



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. National Research University Higher School of Economics, Nizhny Novgorod, Russia
