AINL 2017: Artificial Intelligence and Natural Language, pp. 121–127
Corpus of Syntactic Co-Occurrences: A Delayed Promise
Conference paper
Abstract
This paper gives a technical description of CoSyCo, a corpus of syntactic co-occurrences that provides information on syntactically connected words in Russian. It presents an overview of the corpora collected to build CoSyCo and the number of word combinations gathered, and provides a short evaluation of the collected data.
Keywords
Corpora creation; Shallow parsing; Grammatically ambiguous text; Words combinations; The Russian language
Copyright information
© Springer International Publishing AG 2018