
Comparisons of Relatedness Measures Through a Word Sense Disambiguation Task

Language Production, Cognition, and the Lexicon

Part of the book series: Text, Speech and Language Technology (TLTB, volume 48)


Abstract

In recent years, Michael Zock’s work has focussed on finding the most appropriate word when writing or speaking. Semantic relatedness between words can play an important role in this context. Previous studies have identified three kinds of approaches for evaluating relatedness measures: a theoretical examination of whether certain mathematical properties are desirable (for example, in mathematically defined measures such as distances, similarities, or scores); a comparison with human judgement; and an evaluation through NLP applications. In this article, we present a novel approach to analysing semantic relatedness measures, based on their relevance at the global level of a word sense disambiguation task. More specifically, for a given selection of senses for a text, a global similarity can be computed by combining the pairwise similarities between all the selected senses through a particular function (a sum, for example). This global similarity value can then be compared with other values pertaining to the selection, for example the F1 measure obtained by evaluating the selection against a gold standard reference annotation. We use several classical local semantic similarity measures, as well as measures built by our team, and study the correlation between the global score and the F1 values obtained against a gold standard. We are thus able to locate the typical output of an algorithm relative to an exhaustive evaluation, and so to optimise the measures and the sense selection process in general.
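The global score described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the chapter's actual setup: the local measure `gloss_overlap` is a toy word-overlap count, the glosses and F1 values are invented, and Pearson's coefficient stands in for whichever correlation statistic the authors used.

```python
from itertools import combinations
from math import sqrt


def gloss_overlap(gloss_a, gloss_b):
    """Toy local measure (assumption): number of words shared by two glosses."""
    return len(set(gloss_a.split()) & set(gloss_b.split()))


def global_score(selected_senses, local_sim):
    """Sum the local similarity over every unordered pair of selected senses."""
    return sum(local_sim(a, b) for a, b in combinations(selected_senses, 2))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical sense selections for a three-word text; each sense is
# represented by an invented gloss. None of this data comes from the chapter.
configurations = [
    ["bank money institution", "money deposit account", "account money record"],
    ["bank river slope", "money deposit account", "account money record"],
    ["bank river slope", "deposit sediment layer", "account story report"],
]
# Invented F1 values, as if each selection had been scored against a gold standard.
f1_values = [0.9, 0.6, 0.2]

scores = [global_score(c, gloss_overlap) for c in configurations]
print("global scores:", scores)
print("correlation with F1:", pearson(scores, f1_values))
```

A high positive correlation on real data would suggest that maximising the global score is a reasonable proxy for maximising F1 for that local measure; a weak or negative one would suggest that the measure is a poor guide for the sense selection process.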

Notes

  1. Sufficient in the sense of permitting the exhibition of statistical significance, even though in practice we generate several orders of magnitude more samples than the bare minimum necessary to obtain statistically significant differences in the average values.

  2. The article is available at https://wiki.csc.calpoly.edu/CSC-581-S11-06/browser/trunk/treebank_paper/buraw/wsj_0105.ready.buraw.

  3. http://getalp.imag.fr/static/wsd/Schwab-et-al-SemanticSimilarity2014.html.

  4. http://wn-similarity.sourceforge.net.

  5. Given that we have over a million configurations and that the correlation is calculated in chunks of 100 scores, each group contains over 10,000 samples, which at a 10⁻⁴ difference range should guarantee sufficient statistical power.

Corresponding author

Correspondence to Didier Schwab.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Schwab, D., Tchechmedjiev, A., Goulian, J., Sérasset, G. (2015). Comparisons of Relatedness Measures Through a Word Sense Disambiguation Task. In: Gala, N., Rapp, R., Bel-Enguix, G. (eds) Language Production, Cognition, and the Lexicon. Text, Speech and Language Technology, vol 48. Springer, Cham. https://doi.org/10.1007/978-3-319-08043-7_13

  • DOI: https://doi.org/10.1007/978-3-319-08043-7_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08042-0

  • Online ISBN: 978-3-319-08043-7

  • eBook Packages: Computer Science, Computer Science (R0)
