Human and Machine Judgements for Russian Semantic Relatedness

  • Alexander Panchenko
  • Dmitry Ustalov
  • Nikolay Arefyev
  • Denis Paperno
  • Natalia Konstantinova
  • Natalia Loukachevitch
  • Chris Biemann
Conference paper

DOI: 10.1007/978-3-319-52920-2_21

Part of the Communications in Computer and Information Science book series (CCIS, volume 661)
Cite this paper as:
Panchenko A. et al. (2017) Human and Machine Judgements for Russian Semantic Relatedness. In: Ignatov D. et al. (eds) Analysis of Images, Social Networks and Texts. AIST 2016. Communications in Computer and Information Science, vol 661. Springer, Cham

Abstract

Semantic relatedness of terms represents similarity of meaning by a numerical score. On the one hand, humans easily make judgements about semantic relatedness. On the other hand, this kind of information is useful in language processing systems. While semantic relatedness has been extensively studied for English using numerous language resources, such as associative norms, human judgements and datasets generated from lexical databases, no evaluation resources of this kind have been available for Russian to date. Our contribution addresses this problem. We present five language resources of different scale and purpose for Russian semantic relatedness, each being a list of triples \((word_{i}, word_{j}, similarity_{ij})\). Four of them are designed for evaluation of systems for computing semantic relatedness, complementing each other in terms of the semantic relation type they represent. These benchmarks were used to organise a shared task on Russian semantic relatedness, which attracted 19 teams. We use one of the best approaches identified in this competition to generate the fifth high-coverage resource, the first open distributional thesaurus of Russian. Multiple evaluations of this thesaurus, including a large-scale crowdsourcing study involving native speakers, indicate its high accuracy.
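
As a rough illustration of how a resource made of such triples could be used to evaluate a relatedness system, the sketch below compares a system's scores with the scores in the triples using Spearman's rank correlation. The choice of metric is an assumption made for this example, since the abstract does not name the evaluation measures. The word pairs, their scores, and the names `toy_gold`, `toy_system` and `spearman` are all hypothetical and are not taken from the paper's datasets.

```python
from typing import Callable, List, Tuple

# One entry of a resource: (word_i, word_j, similarity_ij).
Triple = Tuple[str, str, float]


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of the 1-based positions
        for k in order[start:end + 1]:
            ranks[k] = shared
        start = end + 1
    return ranks


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; raises if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (vx * vy) ** 0.5


def spearman(gold: List[Triple], score: Callable[[str, str], float]) -> float:
    """Spearman correlation between gold scores and a system's scores."""
    human = [s for _, _, s in gold]
    system = [score(a, b) for a, b, _ in gold]
    return pearson(average_ranks(human), average_ranks(system))


# Hypothetical toy data: made-up pairs and scores, for illustration only.
toy_gold: List[Triple] = [
    ("чай", "кофе", 0.9),
    ("кошка", "собака", 0.8),
    ("книга", "библиотека", 0.6),
    ("кошка", "стол", 0.1),
]

# Hypothetical system: a lookup table standing in for a real model.
_toy_scores = {
    ("чай", "кофе"): 0.7,
    ("кошка", "собака"): 0.75,
    ("книга", "библиотека"): 0.4,
    ("кошка", "стол"): 0.05,
}


def toy_system(word_i: str, word_j: str) -> float:
    return _toy_scores[(word_i, word_j)]


print(spearman(toy_gold, toy_system))
```

A rank-based measure is used here because it only requires the system to order the pairs in the same way as the human scores, not to reproduce their absolute values.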

Keywords

  • Semantic similarity
  • Semantic relatedness
  • Evaluation
  • Distributional thesaurus
  • Crowdsourcing
  • Language resources

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Alexander Panchenko (1)
  • Dmitry Ustalov (2)
  • Nikolay Arefyev (3)
  • Denis Paperno (4)
  • Natalia Konstantinova (5)
  • Natalia Loukachevitch (3)
  • Chris Biemann (1)

  1. TU Darmstadt, Darmstadt, Germany
  2. Ural Federal University, Yekaterinburg, Russia
  3. Moscow State University, Moscow, Russia
  4. University of Trento, Rovereto, Italy
  5. University of Wolverhampton, Wolverhampton, UK
