Extensive Study on Automatic Verb Sense Disambiguation in Czech

  • Jiří Semecký
  • Petr Podveský
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4188)

Abstract

In this paper we compare automatic methods for verb sense disambiguation; in particular, we investigate a Naïve Bayes classifier, decision trees, and a rule-based method. We propose several types of features, including morphological, syntax-based, idiomatic, animacy, and WordNet-based features. We evaluate the methods, together with the individual feature types, on two essentially different Czech corpora: VALEVAL and the Prague Dependency Treebank. We then discuss the best-performing methods and features.
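
To illustrate the kind of Naïve Bayes classification over categorical features that the abstract mentions, a minimal Python sketch follows. It is not the authors' implementation: the feature names (`reflexive`, `object_case`), the sense labels, the training examples, and the use of add-one smoothing are all assumptions made for illustration, and none of the data comes from VALEVAL or the Prague Dependency Treebank.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSenseClassifier:
    """Toy Naive Bayes over categorical features with add-one smoothing."""

    def __init__(self):
        self.sense_counts = Counter()
        # (sense, feature name) -> counts of observed feature values
        self.feature_counts = defaultdict(Counter)
        # feature name -> set of values seen in training
        self.feature_values = defaultdict(set)

    def train(self, examples):
        """examples: iterable of (feature dict, sense label) pairs."""
        for features, sense in examples:
            self.sense_counts[sense] += 1
            for name, value in features.items():
                self.feature_counts[(sense, name)][value] += 1
                self.feature_values[name].add(value)

    def predict(self, features):
        """Return the sense with the highest smoothed log-probability."""
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, float("-inf")
        for sense, count in self.sense_counts.items():
            score = math.log(count / total)  # log prior of the sense
            for name, value in features.items():
                seen = self.feature_counts[(sense, name)][value]
                # One extra slot reserves probability mass for unseen values.
                vocab = len(self.feature_values[name]) + 1
                score += math.log((seen + 1) / (count + vocab))
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Hypothetical training data: each verb occurrence is a dict of features.
training = [
    ({"reflexive": "se", "object_case": "acc"}, "sense_a"),
    ({"reflexive": "se", "object_case": "none"}, "sense_a"),
    ({"reflexive": "none", "object_case": "dat"}, "sense_b"),
    ({"reflexive": "none", "object_case": "acc"}, "sense_b"),
]

clf = NaiveBayesSenseClassifier()
clf.train(training)
print(clf.predict({"reflexive": "se", "object_case": "acc"}))
print(clf.predict({"reflexive": "none", "object_case": "dat"}))
```

In a setting like the one described, one such classifier would typically be trained per verb lemma, with the morphological, syntactic, and other feature types supplying the entries of the feature dictionary.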

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jiří Semecký (1)
  • Petr Podveský (1)
  1. Institute of Formal and Applied Linguistics, Prague, Czech Republic
