Comparisons of Relatedness Measures Through a Word Sense Disambiguation Task
In recent years, Michael Zock’s work has focused on finding the most appropriate word when writing or speaking. The semantic relatedness between words can play an important role in this context. Previous studies have identified three kinds of approaches for evaluating relatedness measures: a theoretical examination of whether certain mathematical properties are desirable (for example, for mathematically defined measures such as distances, similarities, or scores); a comparison with human judgement; and an evaluation through NLP applications. In this article, we present a novel approach to analysing the semantic relatedness between words, based on how relevant semantic relatedness measures are at the global level of a word sense disambiguation task. More specifically, for a given selection of senses for a text, a global similarity can be computed by combining the pairwise similarities between all the selected senses through a particular function (a sum, for example). This global similarity value can then be compared with other values pertaining to the same selection, for example the F1 measure obtained by evaluating the selection against a gold standard reference annotation. We use several classical local semantic similarity measures, as well as measures built by our team, and study the correlation between the global score and the F1 values obtained against a gold standard. We are thus able to locate the typical output of an algorithm relative to an exhaustive evaluation, and hence to optimise the measures and the sense selection process in general.
Keywords: Semantic relatedness; Word sense disambiguation; Semantic similarity measures; Evaluation of semantic similarity measures; Best attainable score; Correlation global score/F1 measure; Lesk measures; Gloss overlap measures; Tversky’s similarity measure; Gloss vector measure
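As an illustration of the global score described in the abstract, the following minimal Python sketch sums a local similarity over all pairs of selected senses, computes an F1 value against a gold annotation, and correlates the two across several selections. The gloss-overlap measure is a toy stand-in for a Lesk-style measure, and all function names (`overlap_similarity`, `global_score`, `f1`, `pearson`) are hypothetical; none of this is taken from the authors' implementation.

```python
from itertools import combinations
from math import sqrt


def overlap_similarity(gloss_a, gloss_b):
    """Toy Lesk-style local measure: number of distinct words shared by two glosses."""
    return len(set(gloss_a.split()) & set(gloss_b.split()))


def global_score(selected_glosses, local_sim=overlap_similarity):
    """Combine pairwise similarities with a sum over all unordered pairs of selected senses."""
    return sum(local_sim(a, b) for a, b in combinations(selected_glosses, 2))


def f1(selection, gold):
    """F1 of a sense selection against a gold annotation (None = word left unannotated)."""
    answered = sum(1 for s in selection if s is not None)
    correct = sum(1 for s, g in zip(selection, gold) if s is not None and s == g)
    if answered == 0 or correct == 0:
        return 0.0
    precision = correct / answered
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences with non-zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)
```

Given a set of candidate sense selections for a text, one would collect `global_score` and `f1` for each selection and pass the two lists to `pearson`; a correlation close to 1 would suggest that maximising the global score with that local measure also tends to maximise F1. The sum is only one possible combining function, as the abstract notes.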