Chapter

Modeling, Learning, and Processing of Text Technological Data Structures

Volume 370 of the series Studies in Computational Intelligence pp 35-58

Processing Text-Technological Resources in Discourse Parsing

  • Henning Lobin, Applied and Computational Linguistics, Justus-Liebig-Universität Gießen
  • Harald Lüngen, Applied and Computational Linguistics, Justus-Liebig-Universität Gießen
  • Mirco Hilbert, Applied and Computational Linguistics, Justus-Liebig-Universität Gießen
  • Maja Bärenfänger, Applied and Computational Linguistics, Justus-Liebig-Universität Gießen

Abstract

Discourse parsing of complex text types such as scientific research articles requires analyzing an input document on linguistic and structural levels that go beyond the lexical discourse markers traditionally employed. This chapter describes a text-technological approach to discourse parsing. Discourse parsing, whose aim is to provide a discourse structure, is treated as the addition of a new annotation layer to input documents that are already marked up on several linguistic annotation levels. The discourse parser generates discourse structures according to Rhetorical Structure Theory (RST). An overview of the knowledge sources and components for parsing scientific journal articles is given. The parser's core consists of cascaded applications of the GAP, a Generic Annotation Parser. Details of the chart parsing algorithm are provided, together with a brief evaluation that compares the parser's output with reference annotations from our corpus and with recently developed systems for a similar task.
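
The abstract does not reproduce the GAP's algorithm, so the following is only a minimal sketch of generic bottom-up chart parsing over a sequence of discourse segments, given to illustrate the general idea of building larger discourse spans from adjacent smaller ones. The `Edge` class, the `chart_parse` function, and the rule table are hypothetical names introduced for illustration; the two rules are invented and are not taken from the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A labelled span [start, end) over the input segments."""
    start: int
    end: int
    label: str
    children: tuple = ()


# Hypothetical binary rules (assumption, not from the chapter):
# (left label, right label) -> label of the combined span.
RULES = {
    ("segment", "segment"): "ELABORATION",
    ("ELABORATION", "segment"): "LIST",
}


def chart_parse(labels, rules):
    """Fill a chart bottom-up by combining adjacent spans (CYK-style)."""
    n = len(labels)
    chart = {}
    # Width-1 cells hold the input segments themselves.
    for i, label in enumerate(labels):
        chart.setdefault((i, i + 1), set()).add(Edge(i, i + 1, label))
    # Wider cells are built from every split into two adjacent sub-spans.
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            cell = chart.setdefault((start, end), set())
            for mid in range(start + 1, end):
                for left in chart.get((start, mid), ()):
                    for right in chart.get((mid, end), ()):
                        parent = rules.get((left.label, right.label))
                        if parent is not None:
                            cell.add(Edge(start, end, parent, (left, right)))
    return chart


chart = chart_parse(["segment", "segment", "segment"], RULES)
full_span_labels = {edge.label for edge in chart[(0, 3)]}
```

In a cascaded setup of the kind the abstract mentions, the edges produced by one pass could serve as the input items of the next pass, for example moving from clause-level to paragraph-level spans; how the GAP actually organizes its cascade is not stated here.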