
Parsing Polish as a Context-Free Language

  • Stanisław Galus
Conference paper
Part of the Advances in Soft Computing book series (AINSC, volume 35)

Abstract

A set of 974 lexical symbols that may appear in Polish text is defined. Based on this set, a context-free grammar is constructed whose Chomsky normal form has 755 variables, 490 terminals and 1790 productions. The probabilities of these productions are estimated from over 40000 unparsed sentences. A parsing algorithm using the resulting probabilistic context-free grammar parses about one quarter of the sentences correctly.
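As an illustration of what parsing with a probabilistic context-free grammar in Chomsky normal form can look like, the following is a minimal sketch of a Viterbi-style CYK parser. It is not the paper's algorithm or grammar: the function `viterbi_cyk`, the toy rules in `LEXICAL` and `BINARY`, the tag names and all probabilities are hypothetical and chosen only to keep the example small.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not the paper's grammar).
# Lexical rules A -> terminal and binary rules A -> B C, each with a probability.
LEXICAL = {
    ("NP", "noun"): 0.7,
    ("VP", "verb"): 0.4,
    ("V", "verb"): 1.0,
    ("ADJ", "adj"): 1.0,
}
BINARY = {
    ("S", "NP", "VP"): 1.0,
    ("NP", "ADJ", "NP"): 0.3,
    ("VP", "V", "NP"): 0.6,
}

def viterbi_cyk(tokens, lexical, binary, start="S"):
    """Return (probability, tree) of the most probable parse, or (0.0, None)."""
    n = len(tokens)
    # best[(i, j)][A] = (probability, backpointer) for A deriving tokens[i:j]
    best = defaultdict(dict)
    for i, tok in enumerate(tokens):
        for (a, t), p in lexical.items():
            if t == tok:
                best[(i, i + 1)][a] = (p, tok)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for (a, b, c), p in binary.items():
                    if b in best[(i, k)] and c in best[(k, j)]:
                        prob = p * best[(i, k)][b][0] * best[(k, j)][c][0]
                        if prob > best[(i, j)].get(a, (0.0, None))[0]:
                            best[(i, j)][a] = (prob, (k, b, c))
    if start not in best[(0, n)]:
        return 0.0, None

    def build(a, i, j):
        # Follow backpointers to rebuild the tree as nested tuples.
        _, back = best[(i, j)][a]
        if isinstance(back, str):
            return (a, back)
        k, b, c = back
        return (a, build(b, i, k), build(c, k, j))

    return best[(0, n)][start][0], build(start, 0, n)

# The input is a sequence of lexical symbols (tags), not raw words.
prob, tree = viterbi_cyk(["adj", "noun", "verb", "noun"], LEXICAL, BINARY)
print(prob, tree)
```

The table is filled bottom-up over span widths, so the running time grows cubically with sentence length and linearly with the number of binary productions. A sequence with no derivation from the start symbol yields `(0.0, None)`.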

Keywords

Grammatical Form; Phrase Structure Grammar; Parsing Algorithm; Subjunctive Mood; Statistical Natural Language Processing
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer 2006

Authors and Affiliations

  • Stanisław Galus (1)
  1. Wyższa Szkoła Finansów i Administracji, Gdańsk, Poland
