Rank-Based Transformation in Measuring Semantic Relatedness

  • Bartosz Broda
  • Maciej Piasecki
  • Stan Szpakowicz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5549)

Abstract

Rank weight functions have been shown to increase the accuracy of measures of semantic relatedness for Polish. We present a generalised ranking principle and demonstrate its effect on a range of established measures of semantic relatedness and on a different language. The results confirm that the generalised rank-based transformation improves on several well-known measures.
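
The abstract does not spell out the transformation itself. As a rough illustration of the general idea only, the sketch below replaces each feature weight of a word by its rank among that word's features and then compares the ranked vectors. The function names, the toy weights, the tie-breaking rule, and the use of cosine as the comparison are all assumptions made for this example, not the authors' formulation.

```python
import math

def rank_transform(weights):
    """Replace each positive feature weight by its rank (1 = smallest positive weight).

    Assumption for this sketch: non-positive weights are dropped, and ties are
    broken by feature name so that the result is deterministic.
    """
    positive = sorted((w, f) for f, w in weights.items() if w > 0)
    return {f: float(rank) for rank, (w, f) in enumerate(positive, start=1)}

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[f] * v[f] for f in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy feature weights (e.g. from some association measure).
car = {"drive": 9.0, "wheel": 4.0, "red": 0.5}
truck = {"drive": 3.0, "wheel": 2.5, "load": 2.0}

car_ranked = rank_transform(car)
truck_ranked = rank_transform(truck)

print("raw cosine:   ", cosine(car, truck))
print("ranked cosine:", cosine(car_ranked, truck_ranked))
```

In this toy setting the ranked vectors depend only on the order of the features within each word, not on the magnitude of the original weights, which is the property a rank-based transformation relies on.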

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Bartosz Broda (1)
  • Maciej Piasecki (1)
  • Stan Szpakowicz (2, 3)

  1. Institute of Informatics, Wrocław University of Technology, Poland
  2. School of Information Technology and Engineering, University of Ottawa, Canada
  3. Institute of Computer Science, Polish Academy of Sciences, Poland
