Language Resources and Evaluation, Volume 47, Issue 3, pp 579–605

Evaluating Word Sense Induction and Disambiguation Methods

  • Ioannis P. Klapaftis
  • Suresh Manandhar
Original Paper

Abstract

Word Sense Induction (WSI) is the task of identifying the different uses (senses) of a target word in a given text in an unsupervised manner, i.e. without relying on any external resources such as dictionaries or sense-tagged data. This paper presents a thorough description of the SemEval-2010 WSI task and a new evaluation setting for sense induction methods. Our contributions are twofold. First, we provide a detailed analysis of the SemEval-2010 WSI task evaluation results and identify the shortcomings of current evaluation measures. Second, we present a new evaluation setting that assesses participating systems’ performance according to the skewness of the target words’ sense distributions, showing that some methods perform well above the Most Frequent Sense (MFS) baseline on highly skewed distributions.
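
To illustrate why the MFS baseline is hard to beat on skewed words, the following minimal Python sketch tags every test instance with the most frequent sense seen in training. The function name, the sense labels and the toy data are hypothetical and are not taken from the SemEval-2010 data or from the paper's evaluation code.

```python
from collections import Counter


def mfs_baseline_accuracy(train_senses, test_senses):
    """Tag every test instance with the most frequent training sense
    and return the fraction of test instances tagged correctly."""
    mfs, _ = Counter(train_senses).most_common(1)[0]
    correct = sum(1 for sense in test_senses if sense == mfs)
    return correct / len(test_senses)


# Hypothetical gold sense labels for two target words.
skewed_train = ["s1"] * 9 + ["s2"]
skewed_test = ["s1"] * 8 + ["s2"] * 2
balanced_train = ["s1"] * 4 + ["s2"] * 3 + ["s3"] * 3
balanced_test = ["s1"] * 3 + ["s2"] * 4 + ["s3"] * 3

skewed_acc = mfs_baseline_accuracy(skewed_train, skewed_test)
balanced_acc = mfs_baseline_accuracy(balanced_train, balanced_test)
print(skewed_acc, balanced_acc)
```

On this toy data the baseline is far stronger for the skewed word than for the balanced one, which is why performance above MFS on highly skewed distributions is a demanding criterion.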

Keywords

Word Sense Induction · Word Sense Disambiguation · Lexical Semantics

Notes

Acknowledgments

We gratefully acknowledge the support of the EU FP7 INDECT project, Grant No. 218086, the National Science Foundation Grant NSF-0715078, Consistent Criteria for Word Sense Disambiguation, and the GALE program of the Defense Advanced Research Projects Agency, Contract No. HR0011-06-C-0022, a subcontract from the BBN-AGILE Team.

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  1. Microsoft Corporation, Redmond, USA
  2. Department of Computer Science, University of York, York, UK