Evaluating Word Sense Induction and Disambiguation Methods

  • Original Paper
Language Resources and Evaluation

Abstract

Word Sense Induction (WSI) is the task of identifying the different uses (senses) of a target word in a given text in an unsupervised manner, i.e. without relying on external resources such as dictionaries or sense-tagged data. This paper presents a thorough description of the SemEval-2010 WSI task and a new evaluation setting for sense induction methods. Our contributions are twofold. First, we provide a detailed analysis of the SemEval-2010 WSI task evaluation results and identify the shortcomings of current evaluation measures. Second, we present a new evaluation setting that assesses participating systems’ performance according to the skewness of the target words’ sense distributions, showing that some methods perform well above the Most Frequent Sense (MFS) baseline on highly skewed distributions.
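To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the paper's evaluation code. It assumes that the MFS baseline tags every test instance with the sense seen most often in training, and that skewness is the Fisher-Pearson coefficient (third standardized moment) of the per-sense frequency counts. The preview does not state the exact skewness definition used in the paper, so that choice is an assumption, and the function names `mfs_baseline_accuracy` and `sense_skewness` are hypothetical.

```python
from collections import Counter


def mfs_baseline_accuracy(train_senses, test_senses):
    """Accuracy obtained by tagging every test instance with the
    most frequent sense observed in the training instances."""
    mfs = Counter(train_senses).most_common(1)[0][0]
    hits = sum(1 for sense in test_senses if sense == mfs)
    return hits / len(test_senses)


def sense_skewness(senses):
    """Fisher-Pearson skewness of the per-sense frequency counts.

    ASSUMPTION: this is one common definition of skewness; the paper
    may define the skewness of a sense distribution differently.
    """
    counts = list(Counter(senses).values())
    n = len(counts)
    mean = sum(counts) / n
    m2 = sum((c - mean) ** 2 for c in counts) / n  # second central moment
    m3 = sum((c - mean) ** 3 for c in counts) / n  # third central moment
    if m2 == 0:
        return 0.0  # all senses equally frequent: no skew
    return m3 / m2 ** 1.5


# Toy data (invented for illustration): one dominant sense and two rare ones.
train = ["s1"] * 8 + ["s2", "s3"]
test = ["s1", "s1", "s2", "s1"]
baseline = mfs_baseline_accuracy(train, test)
skew = sense_skewness(train)
```

Under these assumptions, target words can be grouped by `sense_skewness` and each system's score compared with `mfs_baseline_accuracy` within each group.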

Notes

  1. http://en.wikipedia.org/wiki/Snood_(video_game) [Access:09/12/2011].

  2. http://elt.oup.com/teachers/ocd/ [Access:09/12/2011].

  3. An act that fails.

  4. An event that does not accomplish its intended purpose.

  5. http://developer.yahoo.com/search/ [Access:10/04/2010].

  6. http://www.cs.york.ac.uk/semeval2010_WSI/files/training_data.tar.gz.

  7. http://www.cs.york.ac.uk/semeval2010_WSI/files/test_data.tar.gz.

  8. http://mallet.cs.umass.edu.

  9. http://www.cs.york.ac.uk/semeval2010_WSI/files/evaluation.zip.

  10. http://www.cs.york.ac.uk/semeval2010_WSI/files/evaluation.zip.

  11. The number of clusters of each system is shown in Table 9.

Acknowledgments

We gratefully acknowledge the support of the EU FP7 INDECT project, Grant No. 218086, the National Science Foundation Grant NSF-0715078, Consistent Criteria for Word Sense Disambiguation, and the GALE program of the Defense Advanced Research Projects Agency, Contract No. HR0011-06-C-0022, a subcontract from the BBN-AGILE Team.

Author information

Corresponding author

Correspondence to Ioannis P. Klapaftis.

About this article

Cite this article

Klapaftis, I.P., Manandhar, S. Evaluating Word Sense Induction and Disambiguation Methods. Lang Resources & Evaluation 47, 579–605 (2013). https://doi.org/10.1007/s10579-012-9205-0

  • DOI: https://doi.org/10.1007/s10579-012-9205-0
