Framework and Results for English SENSEVAL

Abstract

Senseval was the first open, community-based evaluation exercise for Word Sense Disambiguation programs. It adopted the quantitative approach to evaluation developed in MUC and other ARPA evaluation exercises. It took place in 1998. In this paper we describe the structure, organisation and results of the SENSEVAL exercise for English. We present and defend various design choices for the exercise, describe the data and gold-standard preparation, consider issues of scoring strategies and baselines, and present the results for the 18 participating systems. The exercise identifies the state-of-the-art for fine-grained word sense disambiguation, where training data is available, as 74–78% correct, with a number of algorithms approaching this level of performance. For systems that did not assume the availability of training data, performance was markedly lower and also more variable. Human inter-tagger agreement was high, with the gold standard taggings being around 95% replicable.



Cite this article

Kilgarriff, A., Rosenzweig, J. Framework and Results for English SENSEVAL. Computers and the Humanities 34, 15–48 (2000). https://doi.org/10.1023/A:1002693207386

  • DOI: https://doi.org/10.1023/A:1002693207386

  • evaluation
  • SENSEVAL
  • word sense disambiguation