A Method of Accurate Robust Parsing of Czech

  • Vladislav Kuboň
  • Martin Plátek
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2166)

Abstract

This paper argues that the robustness of an automatic natural language parser is something other than the simple ability to construct a syntactic structure for every sequence of word forms (sentence) of a given language.

Robustness in our terminology should be more accurate, in the sense that a robust parser should be able to distinguish between “good” and “bad” ill-formed sentences. We propose two measures for this purpose: the node-gap complexity, which describes the complexity of a sentence with regard to nonprojective constructions, and the degree of robustness, which takes into account the number of syntactic inconsistencies encountered in the process of robust parsing. These measures make it possible to develop a scale of global constraints that allows a kind of gradual parsing of both syntactically well-formed and ill-formed sentences of a natural language.
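The abstract does not define node-gap complexity, so the following is only an illustrative sketch under an assumed definition: the coverage of a node is the set of word positions in its subtree, a gap is a maximal run of positions missing from that coverage, and the complexity of a tree is the largest number of gaps found at any node. The `heads` encoding and all function names are hypothetical and are not taken from the paper.

```python
def coverage(heads, node):
    """Return the sorted word positions of `node` and all its descendants.

    `heads[i]` is the position of the head of word i, or None for the root
    (an assumed encoding of a dependency tree, chosen for this sketch).
    """
    children = {}
    for child, head in enumerate(heads):
        if head is not None:
            children.setdefault(head, []).append(child)
    stack, seen = [node], []
    while stack:
        current = stack.pop()
        seen.append(current)
        stack.extend(children.get(current, []))
    return sorted(seen)


def node_gaps(heads, node):
    """Count the gaps (maximal missing runs) in the coverage of `node`."""
    cov = coverage(heads, node)
    return sum(1 for left, right in zip(cov, cov[1:]) if right - left > 1)


def tree_node_gap_complexity(heads):
    """Assumed tree-level measure: the maximum gap count over all nodes."""
    return max(node_gaps(heads, node) for node in range(len(heads)))


# Projective tree: words 0 and 2 both depend on word 1.
projective = [1, None, 1]
# Nonprojective tree: word 0 depends on word 2, skipping over the root at 1.
nonprojective = [2, None, 1, 1]

print(tree_node_gap_complexity(projective))
print(tree_node_gap_complexity(nonprojective))
```

Under this assumed definition a projective tree has complexity 0, and the second tree has one gap because the subtree of word 2 covers positions 0 and 2 but not 1. A bound on this value is one way a parser could impose a graded global constraint of the kind the abstract mentions.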

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Vladislav Kuboň (1)
  • Martin Plátek (2)
  1. Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
  2. Dept. of Theoretical Computer Science and Logic, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic