Partial Grammar Checking for Czech Using the SET Parser

  • Vojtěch Kovář
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8655)

Abstract

Checking people’s writing for correctness is one of the prominent applications of language technology. In Czech, punctuation errors and mistakes in subject-predicate agreement are among the most severe and most frequent errors people make, as the rules governing both phenomena are complex and non-intuitive. At the same time, these phenomena involve numerous syntactic, semantic and pragmatic aspects, which makes them very difficult to formalize for automatic checking. In this paper, we present an automatic method for fixing errors in commas and subject-predicate agreement, using the pattern-matching, rule-based syntactic analysis provided by the SET parsing system. We explain the method and present a first evaluation of its overall accuracy.

Keywords

parser, SET, Czech, grammar checking, punctuation detection, syntactic analysis

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Vojtěch Kovář (1)
  1. NLP Centre, Faculty of Informatics, Masaryk University, Brno, Czech Republic
