A Method of Accurate Robust Parsing of Czech

Kuboň, Vladislav; Plátek, Martin

doi:10.1007/3-540-44805-5_12

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2166)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper argues that the robustness of an automatic natural language parser is something different from the mere ability to construct a syntactic structure for every sequence of word forms (sentence) of a given language.

Robustness in our terminology should be more accurate, in the sense that a robust parser should be able to distinguish between “good” and “bad” ill-formed sentences. We propose two measures for this purpose: the node-gap complexity, which describes the complexity of a sentence with regard to nonprojective constructions, and the degree of robustness, which takes into account the number of syntactic inconsistencies encountered in the process of robust parsing. These measures make it possible to develop a scale of global constraints that allows a kind of gradual parsing of both syntactically well-formed and ill-formed sentences of a natural language.
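As an illustration of the first measure, the sketch below counts the gaps in the word-order coverage of each node of a dependency tree and takes the maximum over all nodes. This is one plausible reading of a gap-based nonprojectivity measure and is an assumption: the abstract does not give the definition, and the names `heads`, `subtree_coverage`, `count_gaps`, and `node_gap_measure` are hypothetical, chosen only for this example.

```python
from typing import Dict, List, Set


def subtree_coverage(heads: List[int]) -> Dict[int, Set[int]]:
    """Map each word position to the set of positions in its subtree.

    heads[i] is the position of the head of word i, or -1 for the root.
    Assumes the input is a well-formed tree (no cycles).
    """
    coverage: Dict[int, Set[int]] = {i: {i} for i in range(len(heads))}
    for i in range(len(heads)):
        node = heads[i]
        # Walk up to the root, adding word i to every ancestor's coverage.
        while node != -1:
            coverage[node].add(i)
            node = heads[node]
    return coverage


def count_gaps(positions: Set[int]) -> int:
    """Number of interruptions in a set of word positions."""
    ordered = sorted(positions)
    return sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > 1)


def node_gap_measure(heads: List[int]) -> int:
    """Maximum number of gaps in any node's coverage (0 for a projective tree)."""
    return max(
        (count_gaps(c) for c in subtree_coverage(heads).values()),
        default=0,
    )


# Projective tree: word 1 is the root, words 0 and 2 depend on it.
projective = [1, -1, 1]
# Nonprojective tree: word 0 depends on word 2, but the root (word 1)
# stands between them, so the subtree of word 2 covers {0, 2}.
nonprojective = [2, -1, 1, 1]

print(node_gap_measure(projective), node_gap_measure(nonprojective))
```

Under this reading, a bound on the measure could serve as one of the global constraints mentioned above: a parser would first accept only analyses with a low value and relax the bound step by step.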

Author information

Authors and Affiliations

Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Malostranské nám. 25, 118 01, Prague, Czech Republic
Vladislav Kuboň
Dept. of Theoretical Computer Science and Logic, Faculty of Mathematics and Physics, Charles University, Malostranské nám. 25, 118 01, Prague, Czech Republic
Martin Plátek

Editor information

Editors and Affiliations

Dept. of Computer Science and Engineering, University of West Bohemia in Plzeň, Faculty of Applied Sciences, Univerzitní 22, 306-14, Plzeň, Czech Republic
Václav Matoušek, Pavel Mautner, Roman Mouček & Karel Taušer

About this paper

Cite this paper

Kuboň, V., Plátek, M. (2001). A Method of Accurate Robust Parsing of Czech. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_12

DOI: https://doi.org/10.1007/3-540-44805-5_12
Published: 24 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42557-1
Online ISBN: 978-3-540-44805-1
eBook Packages: Springer Book Archive
