Syntactic Analysis Using Finite Patterns: A New Parsing System for Czech

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6562)

Abstract

Syntactic analysis of natural languages is considered one of the basic steps toward advanced natural language processing, such as logical analysis or information retrieval over natural language texts. Czech is a morphologically rich language with relatively free word order, which further complicates syntactic analysis. Current parsing systems for Czech face many problems, including low precision or high ambiguity of the parser output. In this paper, we present a new approach to syntactic analysis of free-word-order languages based on pattern-matching linking rules. The system, named SET, is currently being developed and tested on Czech as a representative of free-word-order languages with a very rich morphological system. We briefly review current approaches and parsing systems for Czech. We then describe the basic ideas and the details of SET’s prototype implementation of the pattern-matching approach to syntactic analysis. We also offer a preliminary analysis of the system’s parsing precision and discuss the advantages and disadvantages of this approach.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kovář, V., Horák, A., Jakubíček, M. (2011). Syntactic Analysis Using Finite Patterns: A New Parsing System for Czech. In: Vetulani, Z. (ed.) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2009. Lecture Notes in Computer Science (LNAI), vol 6562. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20095-3_15

  • DOI: https://doi.org/10.1007/978-3-642-20095-3_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20094-6

  • Online ISBN: 978-3-642-20095-3

  • eBook Packages: Computer Science, Computer Science (R0)
