Three Approaches to Word Sense Disambiguation for Czech

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2166)

Abstract

Building a full word sense disambiguation (WSD) system requires a balanced and representative corpus annotated with sense tags. This requirement is certainly not met for Czech. We therefore decided to develop methods for annotating texts, starting with the most common nouns. Our approach uses a disambiguation algorithm based on sets of words, called bags. Its advantage is that the bags can be filled in various ways. Our ultimate goal is to reduce manual work as much as possible. Here we present three basic ways of filling bags: the first is based on the machine-readable version of SSJČ, the second learns from manually annotated text, and the third uses a pseudoclustering strategy.
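The abstract does not spell out the bag-based algorithm, so the following is only a minimal sketch of one plausible reading: each sense of a noun has a bag of words, and a context is assigned to the sense whose bag overlaps it most (a Lesk-style overlap count, which is an assumption here, not the paper's stated scoring). The function names, the sense labels, and the example words are hypothetical, and the context is assumed to be already lemmatized. The second function illustrates the idea of filling bags from manually annotated text.

```python
from collections import Counter


def fill_bags_from_annotated(examples):
    """Build one bag (set of words) per sense from (sense, context) pairs.

    Illustrative only: every context word of an annotated occurrence
    is added to the bag of the sense it was tagged with.
    """
    bags = {}
    for sense, context in examples:
        bags.setdefault(sense, set()).update(w.lower() for w in context)
    return bags


def disambiguate(context, bags):
    """Return the sense whose bag shares the most words with the context.

    Returns None when no bag overlaps the context at all. On a tie,
    the sense that comes first in `bags` is kept.
    """
    counts = Counter(w.lower() for w in context)
    best_sense, best_score = None, 0
    for sense, bag in bags.items():
        # Assumed scoring: number of context tokens found in the bag.
        score = sum(counts[w] for w in bag)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical bags for two senses of the Czech noun "koruna".
bags = {
    "currency": {"banka", "platit", "cena"},
    "tree_top": {"strom", "větev", "list"},
}
print(disambiguate(["platit", "vysoký", "cena", "za", "zboží"], bags))  # currency
```

Because the scoring only looks up words in sets, the same function works no matter how the bags were filled, which is the flexibility the abstract points to.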

References

  1. Ide, N., Véronis, J.: Introduction to the Special Issue on Word Sense Disambiguation: The State of the Art, Computational Linguistics, Vol. 24, Num. 1, 1998.

  2. Lesk, M.: Automatic Sense Disambiguation Using Machine Readable Dictionaries: How to Tell a Pine Cone from an Ice Cream Cone. Proceedings of SIGDOC, Toronto, 1986, pp. 1–9.

  3. Pala, K.: Word Senses and Semantic Representations. Can We Have Both?, Text, Speech and Dialogue: Proceedings of TSD’00 Workshop, LNAI 1902, Springer, 2000, pp. 109–114.

  4. Pala, K., Rychlý, P., Smrž, P.: DESAM — An Annotated Corpus for Czech, Proceedings of SOFSEM’98, Springer, 1998.

  5. Sedláček, R., Smrž, P.: Automatic Processing of Czech Inflectional and Derivative Morphology, Technical Report, Faculty of Informatics, Masaryk University, Brno, 2001.

  6. Veber, M.: CED-Program for Corpora Editing, Technical Report, Faculty of Informatics, Masaryk University, Brno, 1999.

  7. Vossen, P., et al.: Set of Common Base Concepts in EuroWordNet-2, Final Report, 2D001, Amsterdam, October 1998.

  8. Wilks, Y., Stevenson, M.: Sense Tagging: Semantic Tagging with a Lexicon, Proceedings of the SIGLEX Workshop on Tagging Text with Lexical Semantics: Why, What and How?, Washington, D.C., 1997.

  9. Slovník spisovného jazyka českého (Dictionary of literary Czech), Akademia, Praha, 1960, electronic version, Praha, Brno, 2000.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Král, R. (2001). Three Approaches to Word Sense Disambiguation for Czech. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_22

  • DOI: https://doi.org/10.1007/3-540-44805-5_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42557-1

  • Online ISBN: 978-3-540-44805-1

  • eBook Packages: Springer Book Archive
