The Lexico-Semantic Annotation of PDT: Some Results, Problems and Solutions

  • Eduard Bejček
  • Petra Möllerová
  • Pavel Straňák
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4188)

Abstract

This paper presents our experience with the lexico-semantic annotation of the Prague Dependency Treebank (PDT). We used the Czech WordNet (CWN) as the annotation lexicon (a repository of lexical meanings) and annotated every word that is included in the CWN. Based on an error analysis, we performed experiments in which we modified the annotation lexicon and subsequently re-annotated the occurrences of selected lemmas. We present the annotation results and the improvements achieved by our corrections.

Keywords

Semantic Annotation; Word Sense Disambiguation; Lexical Unit; Lexical Meaning; Linguistic Data Consortium
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Eduard Bejček (1)
  • Petra Möllerová (1)
  • Pavel Straňák (1)
  1. Institute of Formal and Applied Linguistics, Charles University, Prague, Czech Republic
