Improving Neural Models for Natural Language Processing in Russian with Synonyms

Journal of Mathematical Sciences

Large-scale neural network models, including models for natural language processing (NLP), require large training datasets, which may be unavailable for low-resource languages or specialized domains. We address the problem of small size and poor variability of the data available for training NLP models by augmenting it with synonyms. We design a novel augmentation scheme based on replacing words with their synonyms, apply it to the Russian language, and report improved results on the sentiment analysis task.
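As a rough illustration of the general idea (a minimal sketch, not the exact scheme described in the paper), the code below augments a tokenized sentence by replacing words with synonyms drawn from a dictionary. The synonym mapping, the replacement probability p, and the helper name augment_with_synonyms are hypothetical placeholders introduced here for illustration.

    import random

    def augment_with_synonyms(tokens, synonyms, p=0.2, rng=random):
        """Return a copy of `tokens` with some words replaced by synonyms.

        `synonyms` maps a lowercased word to a list of candidate replacements;
        each word that has candidates is swapped with probability `p`.
        """
        augmented = []
        for token in tokens:
            candidates = synonyms.get(token.lower(), [])
            if candidates and rng.random() < p:
                augmented.append(rng.choice(candidates))
            else:
                augmented.append(token)
        return augmented

    # Hypothetical usage: generate extra training examples for a sentiment classifier.
    synonyms = {"хороший": ["отличный", "прекрасный"], "фильм": ["кинофильм", "кинолента"]}
    sentence = ["очень", "хороший", "фильм"]
    extra_examples = [augment_with_synonyms(sentence, synonyms, p=0.5) for _ in range(3)]

In a setup like this, the augmented copies would be added to the training set alongside the original sentences with the same labels, increasing the size and lexical variability of the data.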



Author information


Corresponding author

Correspondence to A. M. Alekseev.

Additional information

Published in Zapiski Nauchnykh Seminarov POMI, Vol. 499, 2021, pp. 206–221.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Galinsky, R.B., Alekseev, A.M. & Nikolenko, S.I. Improving Neural Models for Natural Language Processing in Russian with Synonyms. J Math Sci 273, 583–594 (2023). https://doi.org/10.1007/s10958-023-06520-z