How Did the Discussion Go: Discourse Act Classification in Social Media Conversations

Part of the book series: Unsupervised and Semi-Supervised Learning (UNSESUL)

Abstract

Over the last two decades, social media has emerged as almost an alternate world where people communicate with each other and express opinions about almost anything. This makes platforms like Facebook, Reddit, Twitter, and Myspace a rich bank of heterogeneous data, primarily expressed via text but reflecting all the textual and non-textual data that human interaction can produce. We propose a novel attention-based hierarchical LSTM model to classify discourse act sequences in social media conversations, aimed at mining data from online discussions using textual meaning beyond the sentence level. What makes the task unique is its complete categorization of the possible pragmatic roles in informal textual discussions, in contrast to question–answer extraction, stance detection, or sarcasm identification, which are highly role-specific tasks. An earlier attempt was made on a Reddit discussion dataset. We train our model on the same data and present test results on two different datasets, one from Reddit and one from Facebook. Our proposed model outperformed the previous one in terms of domain independence: without using platform-dependent structural features, our hierarchical LSTM with a word relevance attention mechanism achieved F1-scores of 71% and 66% in predicting the discourse roles of comments in Reddit and Facebook discussions, respectively. We present and analyze the efficiency of recurrent and convolutional architectures in learning discursive representations on the same task, with different word and comment embedding schemes. Our attention mechanism enables us to inquire into the relevance ordering of text segments according to their roles in discourse. We present a human annotator experiment that reveals important observations about modeling and data annotation. Equipped with our text-based discourse identification model, we inquire into how heterogeneous non-textual features such as location, time, and leaning of information play their roles in characterizing online discussions on Facebook.
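The word relevance attention mentioned above can be illustrated with a minimal sketch. The abstract does not give the chapter's actual formulation, so everything here is an assumption: the function name `attention_pool`, the dot-product scoring against a hypothetical `context` vector, and the toy numbers standing in for LSTM hidden states. The sketch only shows the general idea of softmax-weighted pooling, where the weights also yield a relevance ordering of the words.

```python
import math

def attention_pool(word_states, context):
    """Softmax-weighted sum of word vectors (illustrative, not the chapter's model)."""
    # One relevance score per word: dot product with the context vector (assumed scoring).
    scores = [sum(h * c for h, c in zip(state, context)) for state in word_states]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Comment representation: weighted sum of the word vectors.
    dim = len(context)
    pooled = [sum(w * state[i] for w, state in zip(weights, word_states))
              for i in range(dim)]
    return pooled, weights

# Toy stand-ins for per-word hidden states of a three-word comment (hypothetical values).
word_states = [[0.1, 0.3], [0.9, -0.2], [0.4, 0.4]]
context = [1.0, 0.5]

comment_vec, weights = attention_pool(word_states, context)
# Words sorted from most to least relevant according to the attention weights.
ranking = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
```

In a hierarchical setup of the kind the abstract describes, a vector like `comment_vec` would presumably feed a second, comment-level sequence model, while `ranking` is the kind of relevance ordering the attention weights make inspectable.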

Notes

  1. https://github.com/dmorr-google/coarse-discourse

  2. http://scikit-learn.org

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Dutta, S., Chakraborty, T., Das, D. (2019). How Did the Discussion Go: Discourse Act Classification in Social Media Conversations. In: P, D., Jurek-Loughrey, A. (eds) Linking and Mining Heterogeneous and Multi-view Data. Unsupervised and Semi-Supervised Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-01872-6_6

  • DOI: https://doi.org/10.1007/978-3-030-01872-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01871-9

  • Online ISBN: 978-3-030-01872-6

  • eBook Packages: Engineering, Engineering (R0)
