Abstract
SENSEVAL set itself the task of evaluating automatic word sense disambiguation programs (see Kilgarriff and Rosenzweig, this volume, for an overview of the framework and results). In order to do this, it was necessary to provide a 'gold standard' dataset of 'correct' answers. This paper describes the lexicographic part of the process involved in creating that dataset. The primary objective was for a group of lexicographers to manually examine keywords in a large number of corpus contexts and assign to each context a sense-tag for the keyword, taken from the Hector dictionary. Corpus contexts also had to be manually part-of-speech (POS) tagged. Various observations made and insights gained by the lexicographers during this process are presented, including a critique of the resources and the methodology.
Cite this article
Krishnamurthy, R., Nicholls, D. Peeling an Onion: The Lexicographer's Experience of Manual Sense-Tagging. Computers and the Humanities 34, 85–97 (2000). https://doi.org/10.1023/A:1002407003264