Abstract
In the following, we present the experimental setup for evaluating the performance of our proposed discourse-aware TS approach with regard to the two subtasks of
- (1) splitting and rephrasing syntactically complex input sentences into a set of minimal propositions (Hypotheses 1.1, 1.2 and 1.3), and
- (2) setting up a semantic hierarchy between the split components, based on the tasks of constituency type classification and rhetorical relation identification (Hypotheses 1.4 and 1.5).
Moreover, we describe the setting for assessing the use of discourse-aware sentence splitting as a support for the task of Open IE (Hypotheses 2.1 and 2.2).
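To make the two subtasks concrete, the following minimal sketch illustrates the kind of output structure the evaluation targets. The data structure, the example sentence, and the relation label are illustrative assumptions for this sketch, not the approach's actual implementation or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical data structure illustrating the two subtasks:
# (1) splitting a complex sentence into minimal propositions, and
# (2) linking them in a semantic hierarchy via a constituency type
#     ("core" vs. "context") and a rhetorical relation.
@dataclass
class Proposition:
    text: str                       # a minimal, self-contained proposition
    constituency: str               # "core" or "context"
    relation: Optional[str] = None  # rhetorical relation to the parent
    children: List["Proposition"] = field(default_factory=list)

# Illustrative input:
#   "The study, which analyzed 500 sentences, was published in 2020."
# Subtask (1): split into two minimal propositions.
# Subtask (2): subordinate the contextual proposition to the core one
# under a rhetorical relation (here "Elaboration", chosen for illustration).
core = Proposition(text="The study was published in 2020.",
                   constituency="core")
core.children.append(
    Proposition(text="The study analyzed 500 sentences.",
                constituency="context",
                relation="Elaboration"))
```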
Notes
- 1.
For a detailed analysis, see Section 21 in the online supplemental material.
- 2.
- 3.
- 4.
- 5.
In our implementation, we use the “” class from the Python module “”.
- 6.
The 19 classes of rhetorical relations can be found in Table 15.17. Most importantly, Taboada and Das (2013) map the List and Disjunction relations to a common Joint class and integrate the Result relation into the semantic category of Cause relations. Moreover, both Temporal-After and Temporal-Before become part of the Temporal class of rhetorical relations (this mapping is sketched as a lookup table after these notes).
- 7.
For the sake of simplicity, we do not judge the output for minimality here.
- 8.
In a pre-study with one annotator, we further examined whether the content of a simplified sentence that is flagged as a contextual proposition by our TS approach is indeed limited to background information. This analysis revealed that only 0.5% of them were misclassified, i.e. they convey some piece of key information from the input, while 90.9% were correctly classified as context sentences. We found instead that malformed output is a bigger issue (9.6%). Therefore, we decided not to investigate the limitation to background information any further, focusing instead on the soundness of the contextual propositions in our human evaluation.
- 9.
We use the default configuration of each system.
- 10.
We used the latest version of the OIE2016 benchmark (commit on their GitHub repository).
- 11.
Though the authors of ClausIE specify a method for calculating the confidence scores of their system’s extractions, it is not implemented in the published source code. Therefore, we make use of Graphene’s confidence score function for ClausIE’s extractions, too.
- 12.
The original scoring function presented in Stanovsky and Dagan (2016) was more restrictive. Under that scheme, a prediction was classified as correct if the predicted and the reference tuple agree on the grammatical heads of all their elements, i.e. the predicate and the arguments. However, this scheme was later relaxed in their GitHub repository to the lexical matching function described above, which has become the OIE2016 benchmark's default scorer in the current literature (both criteria are sketched after these notes).
- 13.
The results of the comparative analysis on the original CaRB dataset can be found in Section 22 in the online supplemental material.
- 14.
CaRB is built over a subset of OIE2016’s original sentences.
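The relation mapping described in footnote 6 can be summarized as a simple lookup table. The sketch below encodes only the three mappings stated there, passing all remaining relations through unchanged; the full 19-class inventory is in Table 15.17, and the function name is an illustrative assumption.

```python
# Mapping of rhetorical relations following Taboada and Das (2013),
# as described in footnote 6. Only the mappings stated there are
# encoded; any relation not listed is kept as-is.
RELATION_MAPPING = {
    "List": "Joint",
    "Disjunction": "Joint",
    "Result": "Cause",
    "Temporal-After": "Temporal",
    "Temporal-Before": "Temporal",
}

def map_relation(relation: str) -> str:
    """Collapse a fine-grained rhetorical relation into its mapped class."""
    return RELATION_MAPPING.get(relation, relation)

assert map_relation("Disjunction") == "Joint"
assert map_relation("Elaboration") == "Elaboration"  # passed through unchanged
```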
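Footnote 12 contrasts two matching criteria. The sketch below illustrates that contrast under stated assumptions: the lexical criterion is modeled as token overlap between the predicted and the reference tuple with an assumed threshold, and the head-based criterion is only indicated in a comment. The benchmark's actual scorer lives in the OIE2016 GitHub repository and may differ in its details.

```python
def lexical_match(predicted: list[str], reference: list[str],
                  threshold: float = 0.5) -> bool:
    """Lenient criterion (sketch): count a prediction as correct if the
    bag of words of the predicted tuple sufficiently overlaps with that
    of the reference tuple. The overlap measure and threshold are
    assumptions for illustration only."""
    pred_tokens = {tok.lower() for elem in predicted for tok in elem.split()}
    ref_tokens = {tok.lower() for elem in reference for tok in elem.split()}
    if not ref_tokens:
        return False
    return len(pred_tokens & ref_tokens) / len(ref_tokens) >= threshold

# The stricter original criterion of Stanovsky and Dagan (2016) would
# instead require the grammatical head of every element (predicate and
# arguments) to agree, conceptually:
#   all(head(p) == head(r) for p, r in zip(predicted, reference))
# where head() extracts the syntactic head of a phrase.

# Usage example with hypothetical tuples (predicate, arg1, arg2):
print(lexical_match(["published", "the study", "in 2020"],
                    ["was published", "the study", "in 2020"]))  # True
```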