Abstract
Insulting speech acts have become a subject of public discussion in the media and on social media, a basis for speculation in political communication, and a working concept in the legal environment. This article explores insulting speech acts on the social network site “VKontakte” with the aim of developing an algorithm for the automatic classification of text data. We conducted a semantic analysis of the text of Article 5.61 of the Code of Administrative Offenses of the Russian Federation, which made it possible to formulate inclusion criteria for formal classification. We applied three common word embedding models (BERT, ELMo, and fastText) to an original Russian-language dataset of 4596 annotated messages perceived as insulting speech acts. The general findings indicate that even in a specialized dataset the share of messages that meet the inclusion criteria is negligible. This suggests a low probability that speech communication on social network sites will lead to court proceedings for an administrative offense under Article 5.61, even though such communication is public in nature and is automatically recorded in writing. The machine learning text classifier based on the BERT model showed the best performance.
Funding
The research for this work was supported by the 1st Workshop at the Mathematical Center in Akademgorodok (project No. 26 “Mathematical support for linguistic expertise”, 13 July–14 August 2020), http://mca.nsu.ru/workshopen/. The authors express their sincere gratitude to the students of the Engineering School of Novosibirsk State University, especially M.V. Fedorova and E.V. Timofeeva, and to M.O. Maslova, a student of the Higher School of Economics, who made an invaluable contribution to collecting the dataset and acted as annotators.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Komalova, L., Glazkova, A., Morozov, D., Epifanov, R., Motovskikh, L., Mayorova, E. (2022). Automated Classification of Potentially Insulting Speech Acts on Social Network Sites. In: Alexandrov, D.A., et al. Digital Transformation and Global Society. DTGS 2021. Communications in Computer and Information Science, vol 1503. Springer, Cham. https://doi.org/10.1007/978-3-030-93715-7_26
DOI: https://doi.org/10.1007/978-3-030-93715-7_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93714-0
Online ISBN: 978-3-030-93715-7
eBook Packages: Computer Science, Computer Science (R0)