Abstract
This paper addresses the task of morphological segmentation for the Russian language. We show that deep convolutional neural networks solve this problem with an F1-score of 98% over morpheme boundaries and outperform existing non-neural approaches.
The work is partially supported by the National Technological Initiative and Sberbank, project identifier 0000000007417F630002.
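The boundary-level F1-score mentioned in the abstract can be illustrated with a minimal sketch. It assumes segmentations are written with a hyphen between morphemes and that only word-internal boundaries are counted; the function names `boundary_positions` and `boundary_f1` and the toy word pairs are hypothetical and are not taken from the paper, whose exact evaluation protocol may differ.

```python
def boundary_positions(segmented, sep="-"):
    """Offsets (in the unsegmented word) at which a word-internal morpheme boundary falls."""
    positions = set()
    offset = 0
    # The last morpheme ends at the word end, which is not counted here (assumption).
    for morpheme in segmented.split(sep)[:-1]:
        offset += len(morpheme)
        positions.add(offset)
    return positions


def boundary_f1(gold, predicted, sep="-"):
    """Micro-averaged F1 over morpheme boundaries for parallel lists of segmented words."""
    tp = fp = fn = 0
    for gold_word, pred_word in zip(gold, predicted):
        gold_b = boundary_positions(gold_word, sep)
        pred_b = boundary_positions(pred_word, sep)
        tp += len(gold_b & pred_b)   # boundaries found in both
        fp += len(pred_b - gold_b)   # predicted but not in gold
        fn += len(gold_b - pred_b)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data for illustration only; not a claim about the reference segmentation.
gold = ["учи-тель", "пере-ход-и-ть"]
pred = ["учи-тель", "пере-ходи-ть"]
score = boundary_f1(gold, pred)
```

In this toy case the prediction misses one of four gold boundaries and adds no spurious ones, so precision is 1 and recall is 3/4.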
Notes
- 1.
- 2. The model equipped with Harris features takes more than 2 h.
References
Botha, J., Blunsom, P.: Compositional morphology for word representations and language modelling. In: International Conference on Machine Learning, pp. 1899–1907 (2014)
Creutz, M., Lagus, K.: Unsupervised morpheme segmentation and morphology induction from text corpora using Morfessor 1.0. Helsinki University of Technology, Helsinki (2005)
Harris, Z.S.: Morpheme boundaries within words: report on a computer test. In: Harris, Z.S. (ed.) Papers in Structural and Transformational Linguistics. FLIS, pp. 68–77. Springer, Dordrecht (1970). https://doi.org/10.1007/978-94-017-6059-1_3
Ruokolainen, T., et al.: Painless semi-supervised morphological segmentation using conditional random fields. In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, vol. 2: Short Papers, pp. 84–89 (2014)
Ruzsics, T., Samardzic, T.: Neural sequence-to-sequence learning of internal word structure. In: Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pp. 184–194 (2017)
Sirts, K., Goldwater, S.: Minimally-supervised morphological segmentation using adaptor grammars. Trans. Assoc. Comput. Linguist. 1, 255–266 (2013)
Shao, Y.: Cross-lingual word segmentation and morpheme segmentation as sequence labelling. arXiv preprint arXiv:1709.03756 (2017)
Tikhonov, A.N.: Morphemno-orfograficheskij slovar, 704 pp. AST Publishing (2002). (in Russian)
Vylomova, E., et al.: Word representation models for morphologically rich languages in neural machine translation. arXiv preprint arXiv:1606.04217 (2016)
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Sorokin, A., Kravtsova, A. (2018). Deep Convolutional Networks for Supervised Morpheme Segmentation of Russian Language. In: Ustalov, D., Filchenkov, A., Pivovarova, L., Žižka, J. (eds) Artificial Intelligence and Natural Language. AINL 2018. Communications in Computer and Information Science, vol 930. Springer, Cham. https://doi.org/10.1007/978-3-030-01204-5_1
DOI: https://doi.org/10.1007/978-3-030-01204-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01203-8
Online ISBN: 978-3-030-01204-5
eBook Packages: Computer Science, Computer Science (R0)