A Symbolic Approach for Automatic Detection of Nuclearity and Rhetorical Relations among Intra-sentence Discourse Segments in Spanish
Automatic discourse analysis is currently a prominent research topic because it is useful for developing several applications, such as automatic summarization, automatic translation, and information extraction. Rhetorical Structure Theory (RST) is the most widely used theory. Nevertheless, few studies address this subject for Spanish. In this paper we present the first system that assigns nuclearity and rhetorical relations to intra-sentence discourse segments in Spanish texts. To carry out the research, we analyze the learning corpus of the RST Spanish Treebank, a corpus of manually annotated specialized texts, in order to build a list of lexical and syntactic patterns that mark rhetorical relations. The system is implemented using this list of patterns and a discourse segmenter called DiSeg. To evaluate the system, we apply it to the test corpus of the RST Spanish Treebank. The automatic and manual rhetorical analyses of each sentence are compared in terms of recall and precision, with positive results.
Keywords: Nuclearity, Rhetorical Relations, Intra-sentence Discourse Segments, Rhetorical Structure Theory, Corpus, Symbolic Approach, Spanish
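The evaluation described in the abstract compares automatic and manual analyses by precision and recall. As a minimal sketch of one way such a comparison could be computed, the Python below counts an automatic decision as correct only when both the relation label and the nuclearity pattern match the manual annotation. The data layout, the function name `precision_recall`, and the sample labels are illustrative assumptions and are not taken from the paper or from the RST Spanish Treebank; the authors' exact matching criteria may differ.

```python
from typing import Dict, Tuple

# Hypothetical representation (an assumption, not the paper's format):
# key   = (sentence id, left segment index, right segment index)
# value = (relation label, nuclearity pattern such as "NS", "SN" or "NN")
Analysis = Dict[Tuple[int, int, int], Tuple[str, str]]


def precision_recall(automatic: Analysis, manual: Analysis) -> Tuple[float, float]:
    """Compare automatic decisions against manual (gold) annotations."""
    # A decision is correct when the same segment pair carries the same
    # relation label and nuclearity pattern in the manual analysis.
    correct = sum(1 for key, value in automatic.items() if manual.get(key) == value)
    precision = correct / len(automatic) if automatic else 0.0
    recall = correct / len(manual) if manual else 0.0
    return precision, recall


# Invented toy data, for illustration only.
manual = {
    (1, 0, 1): ("Cause", "NS"),
    (2, 0, 1): ("Contrast", "NN"),
    (3, 0, 1): ("Purpose", "NS"),
    (4, 0, 1): ("Condition", "SN"),
}
automatic = {
    (1, 0, 1): ("Cause", "NS"),
    (2, 0, 1): ("Concession", "NS"),  # wrong relation and nuclearity
    (3, 0, 1): ("Purpose", "NS"),
}

p, r = precision_recall(automatic, manual)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case two of the three automatic decisions are correct and the manual analysis contains four relations, so precision is 2/3 and recall is 1/2. Scoring nuclearity and relation labels separately would only require comparing one element of each tuple.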