Abstract
This paper addresses automatic recognition of animacy in Russian. We propose a recognizer that analyzes a word's co-occurrence with the most frequent words and is trained on data from the Russian subcorpus of Google Books Ngram. The recognizer achieves 94.3% accuracy on the test sample. We also apply the trained recognizer to diachronic data. The analysis shows that high recognition accuracy can be obtained even from corpus data for a single year. This makes it possible, first, to investigate diachronically how the perception of words with variable animacy changes. Second, the examples considered show that a change in the perception of an object as animate or inanimate can mark semantic change and, in particular, the emergence of new meanings of the word denoting that object. The recognizer is therefore a useful tool for studying language evolution.
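The feature idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature layout, so the use of bigram counts, the left/right split, the normalization, the function names, and the toy English data are all assumptions made for illustration. The classifier itself is omitted; only the per-year input vector is shown.

```python
from collections import Counter


def most_frequent_words(bigram_counts, k):
    """Return the k most frequent words, counted by first bigram position.

    Hypothetical helper: the real corpus provides unigram counts directly.
    """
    unigram = Counter()
    for (first, _second), n in bigram_counts.items():
        unigram[first] += n
    return [word for word, _ in unigram.most_common(k)]


def cooccurrence_vector(target, bigram_counts, context_words):
    """Build a normalized co-occurrence vector for `target`.

    The vector has 2 * len(context_words) entries: counts of
    (context, target) bigrams followed by counts of (target, context)
    bigrams, divided by their common total (assumed normalization).
    """
    left = [bigram_counts.get((c, target), 0) for c in context_words]
    right = [bigram_counts.get((target, c), 0) for c in context_words]
    total = sum(left) + sum(right)
    if total == 0:
        return [0.0] * (2 * len(context_words))
    return [n / total for n in left + right]


# Toy bigram counts keyed by year (invented numbers, English for readability).
counts_by_year = {
    1990: {
        ("the", "dog"): 40,
        ("dog", "who"): 10,
        ("the", "table"): 45,
        ("table", "of"): 5,
        ("who", "ran"): 3,
        ("of", "wood"): 2,
    },
}

context = ["the", "who", "of"]  # stand-in for the most frequent words
vectors = {
    year: cooccurrence_vector("dog", counts, context)
    for year, counts in counts_by_year.items()
}
```

Because each vector is built from one year's counts only, a trained classifier could in principle be applied year by year, which is the diachronic use the abstract describes.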
Acknowledgements
This research was financially supported by the Russian Science Foundation, grant № 20-18-00206.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Bochkarev, V., Achkeev, A., Shevlyakova, A., Khristoforov, S. (2022). Diachronic Neural Network Predictor of Word Animacy. In: Pichardo Lagunas, O., Martínez-Miranda, J., Martínez Seis, B. (eds) Advances in Computational Intelligence. MICAI 2022. Lecture Notes in Computer Science, vol 13613. Springer, Cham. https://doi.org/10.1007/978-3-031-19496-2_16
DOI: https://doi.org/10.1007/978-3-031-19496-2_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19495-5
Online ISBN: 978-3-031-19496-2
eBook Packages: Computer Science, Computer Science (R0)