
Abstract

In the following, we present the experimental setup for evaluating the performance of our proposed discourse-aware TS approach with regard to the two subtasks of

  1. splitting and rephrasing syntactically complex input sentences into a set of minimal propositions (Hypotheses 1.1, 1.2 and 1.3), and

  2. setting up a semantic hierarchy between the split components, based on the tasks of constituency type classification and rhetorical relation identification (Hypotheses 1.4 and 1.5).

Moreover, we describe the setting for assessing the use of discourse-aware sentence splitting as a support for the task of Open IE (Hypotheses 2.1 and 2.2).


Notes

  1. For a detailed analysis, see Section 21 in the online supplemental material.

  2. Available at https://github.com/Lambda-3/DiscourseSimplification/blob/master/supplemental_material/dataset_pattern_analysis.zip

  3. https://catalog.ldc.upenn.edu/LDC2002T07

  4. https://www.isi.edu/~marcu/discourse/Corpora.html

  5. In our implementation, we use the “” class from the Python module “”.

  6. The 19 classes of rhetorical relations can be found in Table 15.17. Most importantly, Taboada and Das (2013) map the List and Disjunction relationships to a common Joint class and integrate the Result relation into the semantic category of Cause relationships. Moreover, both Temporal-After and Temporal-Before become part of the Temporal class of rhetorical relations.

  7. For the sake of simplicity, we do not judge the output for minimality here.

  8. In a pre-study with one annotator, we further examined whether the content of a simplified sentence that is flagged as a contextual proposition by our TS approach is indeed limited to background information. This analysis revealed that only 0.5% of them were misclassified, i.e. they convey some piece of key information from the input, while 90.9% were correctly classified as context sentences. Instead, we found that malformed output is a bigger issue (9.6%). Therefore, we decided not to investigate the question of limitation to background information any further, and rather focus on the soundness of the contextual propositions in our human evaluation.

  9. We use the default configuration of each system.

  10. We used the latest version of the OIE2016 benchmark (commit on their GitHub repository).

  11. Though the authors of ClausIE specify a method for calculating the confidence scores of their system’s extractions, it is not implemented in the published source code. Therefore, we make use of Graphene’s confidence score function for ClausIE’s extractions, too.

  12. The original scoring function presented in Stanovsky and Dagan (2016) was more restrictive. Here, a prediction was classified as being correct if the predicted and the reference tuple agree on the grammatical head of all their elements, i.e. the predicate and the arguments. However, this scheme was later relaxed in their GitHub repository to the lexical matching function described above, which has become the OIE2016 benchmark’s default scorer in the current literature.

  13. The results of the comparative analysis on the original CaRB dataset can be found in Section 22 in the online supplemental material.

  14. CaRB is built over a subset of OIE2016’s original sentences.
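The class mapping described in note 6 can be sketched as a simple lookup table. This is an illustrative sketch only; the dictionary below covers just the merges explicitly listed in the note, and the function name `normalize_relation` is our own.

```python
# Sketch of the rhetorical-relation class mapping from note 6 (Taboada and
# Das, 2013): List and Disjunction merge into Joint, Result folds into Cause,
# and Temporal-After/Temporal-Before collapse into Temporal.
RELATION_MAPPING = {
    "List": "Joint",
    "Disjunction": "Joint",
    "Result": "Cause",
    "Temporal-After": "Temporal",
    "Temporal-Before": "Temporal",
}

def normalize_relation(label: str) -> str:
    """Map a fine-grained rhetorical relation to its merged class;
    labels without an entry are kept unchanged."""
    return RELATION_MAPPING.get(label, label)
```

Unlisted labels pass through unchanged, so the mapping can be applied uniformly to all 19 relation classes.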
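The relaxed tuple matching mentioned in note 12 can be illustrated with a minimal sketch. The whitespace tokenizer and the per-element token-overlap criterion below are simplifying assumptions for exposition, not the OIE2016 benchmark's exact implementation.

```python
def tokens(text: str) -> set[str]:
    """Lowercase whitespace tokenization; a stand-in for the benchmark's tokenizer."""
    return set(text.lower().split())

def lexical_match(pred: tuple[str, ...], gold: tuple[str, ...]) -> bool:
    """Illustrative relaxed matcher: a predicted tuple (predicate, arg0, arg1, ...)
    counts as correct if each of its elements shares at least one token with the
    corresponding element of the reference tuple."""
    if len(pred) != len(gold):
        return False
    return all(tokens(p) & tokens(g) for p, g in zip(pred, gold))

# Heads agree while modifiers differ, so the relaxed match succeeds
# where strict head-based matching on exact spans might be stricter.
pred = ("was founded", "the company", "in 1998")
gold = ("founded", "company", "1998")
print(lexical_match(pred, gold))  # True
```

In contrast, the original head-based scheme from Stanovsky and Dagan (2016) required agreement on the grammatical head of every element, which a token-overlap test only approximates.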


Electronic supplementary material

Supplementary material 1 (zip 973 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature

About this chapter


Cite this chapter

Niklaus, C. (2022). Experimental Setup. In: From Complex Sentences to a Formal Semantic Representation using Syntactic Text Simplification and Open Information Extraction. Springer Vieweg, Wiesbaden. https://doi.org/10.1007/978-3-658-38697-9_14

  • DOI: https://doi.org/10.1007/978-3-658-38697-9_14

  • Publisher Name: Springer Vieweg, Wiesbaden

  • Print ISBN: 978-3-658-38696-2

  • Online ISBN: 978-3-658-38697-9

  • eBook Packages: Education (R0)
