DESAM — Annotated corpus for Czech
This paper describes DESAM, a disambiguated corpus of Czech. It is a tagged corpus that has been disambiguated manually and can be used in a variety of applications. We discuss the structure of the corpus, the tools used to manage it, its linguistic applications, and the possible use of machine learning techniques that rely on the disambiguated data. We also consider possible ways of developing procedures for fully automatic disambiguation.