Abstract
This paper presents our experience with the lexico-semantic annotation of the Prague Dependency Treebank (PDT). We use the Czech WordNet (CWN) as the annotation lexicon (a repository of lexical meanings) and annotate every word that is included in the CWN. Based on an error analysis, we carried out experiments in which we modified the annotation lexicon (CWN) and then re-annotated the occurrences of selected lemmas. We present the results of the annotation and the improvements achieved by our corrections.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bejček, E., Möllerová, P., Straňák, P. (2006). The Lexico-Semantic Annotation of PDT: Some Results, Problems and Solutions. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science, vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_3
DOI: https://doi.org/10.1007/11846406_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-39090-9
Online ISBN: 978-3-540-39091-6
eBook Packages: Computer Science (R0)