Segmentation Charts for Czech – Relations among Segments in Complex Sentences

  • Markéta Lopatková
  • Tomáš Holan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5457)

Abstract

Syntactic analysis of natural languages is a fundamental requirement of many applied tasks. We propose a new module, placed between morphological and syntactic analysis, that aims to determine the overall structure of a sentence prior to its complete analysis.

We exploit the concept of segments: linguistically motivated units that are easy to detect automatically. The output of the module, the so-called ‘segmentation chart’, describes the relations among segments, in particular coordination, apposition, and subordination.
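To make the notion of a segment more concrete, the following is a minimal sketch of splitting a sentence into segments at boundary tokens. It is only an illustration: the boundary set (punctuation plus a handful of Czech conjunctions), the function name `split_into_segments`, and the example sentence are assumptions made here, not the authors' boundary inventory or implementation. The sketch does not attempt to build a segmentation chart, that is, it does not determine the relations among the segments.

```python
import re

# Hypothetical boundary tokens: some punctuation plus a few Czech conjunctions.
# This set is an illustrative assumption, not the inventory used in the paper.
BOUNDARIES = {",", ";", ":", "(", ")", "a", "nebo", "ale", "že", "protože"}


def split_into_segments(sentence):
    """Return a list of (boundary_tokens, segment_tokens) pairs.

    Adjacent boundary tokens (e.g. a comma followed by a conjunction) are
    merged into one boundary that precedes the segment it introduces.
    """
    # Words and single punctuation characters become separate tokens.
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    segments = []
    boundary, words = [], []
    for tok in tokens:
        if tok.lower() in BOUNDARIES:
            if words:
                # A boundary after some words closes the current segment.
                segments.append((boundary, words))
                boundary, words = [], []
            boundary.append(tok)
        else:
            words.append(tok)
    if words or boundary:
        # Keep the last segment (or a trailing boundary with no words).
        segments.append((boundary, words))
    return segments


if __name__ == "__main__":
    for boundary, words in split_into_segments("Petr řekl, že přijde, ale nepřišel."):
        print(boundary, words)
```

In this sketch the first segment has an empty boundary, and each later segment carries the boundary tokens that introduce it. Those boundaries are the kind of information a rule for relating adjacent segments could inspect.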

In this text we present a framework that enables us to develop and test rules for automatic identification of segmentation charts. We describe two basic experiments – an experiment with segmentation patterns obtained from the Prague Dependency Treebank and an experiment with the segmentation rules applied to plain text. Further, we discuss the evaluation measures suitable for our task.

Keywords

Elementary Boundary; Adjacent Segment; Plain Text; Quotation Mark; Individual Segment
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Markéta Lopatková (1)
  • Tomáš Holan (1)
  1. Charles University in Prague, Czech Republic
