Explicit and Implicit Discourse Relations in the Prague Discourse Treebank
Abstract
The coherence of a text is supported by various linguistic means, including discourse connectives (coordinating and subordinating conjunctions, adverbs, etc.). However, semantic relations between text segments can also be inferred without an explicit discourse connective (so-called implicit discourse relations, cf. He missed his train. 0 He had to take a taxi.). In this paper, we introduce a corpus of Czech texts annotated for implicit discourse relations (Enriched Discourse Annotation of Prague Discourse Treebank Subset 1.0) and analyze some of the factors that influence whether a discourse relation is expressed explicitly or implicitly, such as text genre, the semantic type of the discourse relation, and the presence of negation in the discourse arguments.
Keywords
Implicit discourse relations; Text genre; Negation
Notes
Acknowledgments
This work has been supported by the project “Implicit relations in text coherence” (GA17-03461S) of the Czech Science Foundation. The research team has been using language resources and tools distributed by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (projects LM2015071 and OP VVV VI CZ.02.1.01/0.0/0.0/16_013/0001781).