Application of a Hybrid Bi-LSTM-CRF Model to the Task of Russian Named Entity Recognition
Named Entity Recognition (NER) is one of the most common tasks in natural language processing. The purpose of NER is to find tokens in text documents and classify them into predefined categories called tags, such as person names, quantities, percentages, locations, organizations, as well as expressions of time, currency, and others. Although a number of approaches have been proposed for this task in the Russian language, there is still substantial room for better solutions. In this work, we studied several deep neural network models, starting from a vanilla Bi-directional Long Short-Term Memory (Bi-LSTM) network, then supplementing it with Conditional Random Fields (CRF) as well as highway networks, and finally adding external word embeddings. All models were evaluated on three datasets: Gareev’s, Person-1000, and FactRuEval 2016. We found that extending the Bi-LSTM model with CRF significantly increased the quality of predictions. Encoding input tokens with external word embeddings reduced training time and allowed us to achieve state-of-the-art results on the Russian NER task.
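To illustrate why a CRF on top of per-token Bi-LSTM scores can improve predictions, the following is a minimal sketch of Viterbi decoding for a linear-chain CRF. It is not the implementation evaluated in this work: the function name `viterbi_decode`, the tag set, and all emission and transition scores are hypothetical values chosen for illustration. The emission scores stand in for what a Bi-LSTM layer might output for each token.

```python
from typing import Dict, List, Sequence, Tuple


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[i][tag]   -- per-token score (e.g. from a Bi-LSTM layer)
    transitions[(a, b)] -- CRF score for moving from tag a to tag b;
                           pairs that are not listed default to 0.0
    """
    if not emissions:
        return []

    # best[i][tag] = best score of any tag path ending in `tag` at token i
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[i - 1][tag] = best previous tag for `tag` at token i

    for i in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[i][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag to the first token.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for a three-token sentence (illustrative values only).
tags = ["O", "B-PER", "I-PER"]
emissions = [
    {"O": 0.6, "B-PER": 0.5, "I-PER": 0.1},
    {"O": 0.2, "B-PER": 0.3, "I-PER": 1.0},
    {"O": 1.0, "B-PER": 0.1, "I-PER": 0.2},
]
transitions = {
    ("O", "I-PER"): -5.0,     # I-PER should not directly follow O
    ("B-PER", "I-PER"): 1.0,  # I-PER naturally continues B-PER
}

# Picking the best tag per token independently gives O, I-PER, O,
# which is not a valid BIO sequence.
greedy = [max(tags, key=lambda t: e[t]) for e in emissions]
decoded = viterbi_decode(emissions, transitions, tags)
print(greedy)
print(decoded)
```

With these assumed scores, independent per-token prediction yields the inconsistent sequence O, I-PER, O, while joint decoding with transition scores yields B-PER, I-PER, O. In a trained model the transition scores are learned together with the network rather than set by hand.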
Keywords: NER, Bi-LSTM, CRF
Statement of author contributions. AL conducted the initial literature review, selected a baseline (Bi-LSTM + CRF) model, prepared the datasets, and ran the experiments under the supervision of MB. AM implemented and studied extensions of the NeuroNER model. AL drafted the first version of the paper. AM added a review of works related to Russian NER and materials related to the NeuroNER modifications. MB, AL, and AM edited and extended the manuscript.
This work was supported by the National Technology Initiative and PAO Sberbank, project ID 0000000007417F630002.