Characterizing Discontinuity in Constituent Treebanks

  • Wolfgang Maier
  • Timm Lichte
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5591)


Measures for the degree of non-projectivity of dependency structures have received attention on both the formal and the empirical side. The empirical characterization of discontinuity in constituent treebanks annotated with crossing branches has, by contrast, been neglected so far. In this paper, we present two measures that characterize both the discontinuity of constituent structures and the non-projectivity of dependency structures. An empirical evaluation on German data, together with an investigation of the relation between the measures and grammars extracted from treebanks, shows their relevance.
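To make the notion of "degree of non-projectivity" concrete, the following is a minimal sketch of one measure commonly used in the dependency literature, the gap degree. The abstract does not name the two measures presented in the paper, so this is an illustration only and is not claimed to be the authors' definition. The input encoding (a list `heads` in which `heads[i]` is the index of the head of word `i`, and `-1` marks the root) and the function names are assumptions made for this example.

```python
from typing import List, Set


def yields(heads: List[int]) -> List[Set[int]]:
    """Return, for every word, the set of positions it dominates (itself included).

    Assumes `heads` encodes a well-formed tree: exactly one root (-1), no cycles.
    """
    result = [{i} for i in range(len(heads))]
    for i in range(len(heads)):
        # Walk from word i up to the root, adding i to each ancestor's yield.
        h = heads[i]
        while h != -1:
            result[h].add(i)
            h = heads[h]
    return result


def gap_degree(heads: List[int]) -> int:
    """Maximum number of gaps (interruptions) in the yield of any node."""
    best = 0
    for y in yields(heads):
        pos = sorted(y)
        # A gap is a jump of more than one position between neighbours.
        gaps = sum(1 for a, b in zip(pos, pos[1:]) if b - a > 1)
        best = max(best, gaps)
    return best


def is_projective(heads: List[int]) -> bool:
    """A structure is projective exactly when no yield has a gap."""
    return gap_degree(heads) == 0


# Projective: word 1 is the root and heads words 0 and 2.
print(gap_degree([1, -1, 1]))        # 0
# Non-projective: word 3 dominates word 1, but word 2 lies in between.
print(gap_degree([2, 3, -1, 2]))     # 1
```

The same counting can be applied to a constituent tree with crossing branches by taking the yield of a node to be the set of terminal positions it dominates, which is one way a single measure can cover both kinds of structure.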





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Wolfgang Maier (1)
  • Timm Lichte (1)
  1. University of Tübingen, Germany
