Text Integrity Assessment: Sentiment Profile vs Rhetoric Structure

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9042)

Abstract

We formulate the problem of text integrity assessment as learning the discourse structure of a text, given a dataset of texts with high integrity and texts with low integrity. We use two approaches to formalizing discourse structure, sentiment profiles and rhetoric structures, relying on a sentence-level sentiment classifier and rhetoric structure parsers, respectively. To learn discourse structures, we use a graph-based nearest-neighbor approach, which allows for explicit feature engineering, and SVM tree kernel-based learning. Both learning approaches operate on graphs (parse thickets): sets of parse trees whose nodes carry additional sentiment labels, or which are extended with additional arcs for rhetoric relations between different sentences. Evaluation in the domain of valid vs. invalid customer complaints (the latter having flawed argumentation, lacking cohesion, or indicating a bad mood of the complainant) shows a stronger contribution from rhetoric structure information than from sentiment profile information. Both learning approaches demonstrate that the discourse structure obtained by an RST parser is sufficient for text integrity assessment. At the same time, the sentiment profile-based approach shows much weaker results and does not strongly complement the rhetoric structure-based one.
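
To make the notion of a sentiment profile concrete, the following is a minimal sketch, not the authors' implementation. It assumes a profile is simply the sequence of sentence-level sentiment labels of a text, and it classifies a new text by its nearest labeled profile under edit distance. The keyword-based classifier toy_sentiment, the edit-distance metric, and the toy training data are all hypothetical stand-ins; the paper itself learns on parse thickets with graph-based nearest-neighbor and SVM tree kernel methods, which this sketch does not reproduce.

```python
from typing import Callable, List, Sequence, Tuple

Profile = List[int]  # one label per sentence: -1 (negative), 0 (neutral), +1 (positive)


def toy_sentiment(sentence: str) -> int:
    """Hypothetical stand-in for a real sentence-level sentiment classifier."""
    negative = {"never", "refused", "rude", "terrible"}
    positive = {"thanks", "resolved", "helpful", "refund"}
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    score = len(words & positive) - len(words & negative)
    return (score > 0) - (score < 0)  # sign of the score


def sentiment_profile(
    sentences: Sequence[str],
    classify: Callable[[str], int] = toy_sentiment,
) -> Profile:
    """Map a text (given as a list of sentences) to its sequence of labels."""
    return [classify(s) for s in sentences]


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance between two label sequences (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def nearest_label(query: Profile, training: Sequence[Tuple[Profile, str]]) -> str:
    """1-nearest-neighbor: return the label of the closest training profile."""
    return min(training, key=lambda item: edit_distance(query, item[0]))[1]


if __name__ == "__main__":
    complaint = [
        "The agent was rude.",
        "They refused to help.",
        "Thanks, it is resolved.",
    ]
    # Toy training profiles; which shape counts as "valid" is an assumption.
    training = [([-1, 0, 1], "valid"), ([-1, -1, -1, -1], "invalid")]
    print(nearest_label(sentiment_profile(complaint), training))
```

In practice the toy classifier would be replaced by a trained sentence-level sentiment model, and the distance function by whatever similarity measure suits the profile representation.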

Author information

Corresponding author

Correspondence to Boris Galitsky.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Galitsky, B., Ilvovsky, D., Kuznetsov, S.O. (2015). Text Integrity Assessment: Sentiment Profile vs Rhetoric Structure. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9042. Springer, Cham. https://doi.org/10.1007/978-3-319-18117-2_10

  • DOI: https://doi.org/10.1007/978-3-319-18117-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18116-5

  • Online ISBN: 978-3-319-18117-2

  • eBook Packages: Computer Science, Computer Science (R0)