Semantic Similarity Functions in Word Sense Disambiguation

  • Łukasz Kobyliński
  • Mateusz Kopeć
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7499)


This paper presents a method of improving the results of automatic Word Sense Disambiguation (WSD) by generalizing nouns appearing in the disambiguated context to concepts. For that purpose, a corpus-based semantic similarity function is used to substitute occurrences of particular nouns with a set of their most closely related words. We show that this approach may be applied to both supervised and unsupervised WSD methods and that in both cases it leads to an improvement in disambiguation accuracy. We evaluate the proposed approach by conducting a series of lexical sample WSD experiments on both a domain-restricted dataset and a general, balanced Polish-language text corpus.
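The noun generalization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the similarity table `SIMILAR`, its scores, the function name `generalize_context`, and the cutoff `k` are all hypothetical, and English toy words stand in for Polish data. The sketch assumes that a similarity function has already produced a scored neighbour list for each noun, and that nouns without such a list are left unchanged.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy output of a corpus-based similarity function:
# noun -> list of (similar word, similarity score). Values are invented.
SIMILAR: Dict[str, List[Tuple[str, float]]] = {
    "car": [("vehicle", 0.81), ("automobile", 0.78), ("truck", 0.64)],
    "bank": [("institution", 0.70), ("lender", 0.66), ("shore", 0.31)],
}


def generalize_context(
    tokens: List[str],
    nouns: Set[str],
    similar: Dict[str, List[Tuple[str, float]]],
    k: int = 2,
) -> List[str]:
    """Replace each known noun with its k most similar words.

    Tokens that are not nouns, or nouns with no entry in `similar`,
    are copied to the output unchanged (an assumption of this sketch).
    """
    generalized: List[str] = []
    for token in tokens:
        if token in nouns and token in similar:
            # Rank neighbours by descending similarity and keep the top k.
            ranked = sorted(similar[token], key=lambda pair: pair[1], reverse=True)
            generalized.extend(word for word, _ in ranked[:k])
        else:
            generalized.append(token)
    return generalized


context = ["the", "car", "stopped", "near", "the", "bank"]
features = generalize_context(context, {"car", "bank"}, SIMILAR, k=2)
print(features)
```

The resulting token list could then serve as the context features passed to a supervised or unsupervised WSD method. Whether the original noun should be kept alongside its neighbours, and how large `k` should be, are design choices this sketch does not settle.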





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Łukasz Kobyliński (1)
  • Mateusz Kopeć (1)
  1. Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland
