Abstract
Empirical studies of text coherence often use tree-like structures in the spirit of Rhetorical Structure Theory (RST) as a representational device. This paper identifies several sources of ambiguity in RST-inspired trees and argues that such structures are therefore not as explanatory as a text representation should be. As an alternative, an approach to multi-level annotation (MLA) of texts is proposed, which separates the information into distinct levels of representation, in particular referential structure, thematic structure, conjunctive relations, and intentional structure. The levels conceptually build upon each other, and human annotators can produce them using a dedicated software environment. We argue that the resulting multi-level corpora are descriptively more adequate and, as a resource, more useful than RST-style treebanks.
About this article
Cite this article
Stede, M. Disambiguating Rhetorical Structure. Res on Lang and Comput 6, 311–332 (2008). https://doi.org/10.1007/s11168-008-9053-7
DOI: https://doi.org/10.1007/s11168-008-9053-7