Abstract
Over the last two decades, social media has emerged as almost an alternate world where people communicate with each other and express opinions about almost anything. This makes platforms like Facebook, Reddit, Twitter, Myspace, etc., a rich bank of heterogeneous data, primarily expressed via text but reflecting all textual and non-textual data that human interaction can produce. We propose a novel attention-based hierarchical LSTM model to classify discourse act sequences in social media conversations, aimed at mining data from online discussion using textual meanings beyond sentence level. The very uniqueness of the task is the complete categorization of possible pragmatic roles in informal textual discussions, contrary to extraction of question–answers, stance detection, or sarcasm identification which are very much role specific tasks. Early attempt was made on a Reddit discussion dataset. We train our model on the same data, and present test results on two different datasets, one from Reddit and one from Facebook. Our proposed model outperformed the previous one in terms of domain independence; without using platform-dependent structural features, our hierarchical LSTM with word relevance attention mechanism achieved F1-scores of 71% and 66%, respectively, to predict discourse roles of comments in Reddit and Facebook discussions. Efficiency of recurrent and convolutional architectures in order to learn discursive representation on the same task has been presented and analyzed, with different word and comment embedding schemes. Our attention mechanism enables us to inquire into relevance ordering of text segments according to their roles in discourse. We present a human annotator experiment to unveil important observations about modeling and data annotation. Equipped with our text-based discourse identification model, we inquire into how heterogeneous non-textual features like location, time, leaning of information, etc. play their roles in characterizing online discussions on Facebook.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Arguello, J., Shaffer, K.: Predicting speech acts in MOOC forum posts. In: ICWSM, pp. 2–11 (2015)
Bhatia, S., Biyani, P., Mitra, P.: Identifying the role of individual user messages in an online discussion and its use in thread retrieval. J. Assoc. Inf. Sci. Technol. 67(2), 276–288 (2016)
Bollen, J., Mao, H., Zeng, X.: Twitter mood predicts the stock market. J. Comput. Sci. 2(1), 1–8 (2011)
Bunt, H.: A methodology for designing semantic annotation languages exploring semantic-syntactic isomorphisms. In: Proceedings of the Second International Conference on Global Interoperability for Language Resources (ICGL 2010), Hong Kong, pp. 29–46 (2010)
Chakraborty, T., Dalmia, A., Mukherjee, A., Ganguly, N.: Metrics for community analysis: a survey. ACM Comput. Surv. 50(4), 54 (2017)
Clark, A., Popescu-Belis, A.: Multi-level dialogue act tags. In: SIGDIAL, Cambridge, pp. 163–170 (2004)
Cohen, W.W., Carvalho, V.R., Mitchell, T.M.: Learning to classify email into “speech acts”. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, pp. 1–8 (2004)
Ding, S., Cong, G., Lin, C.Y., Zhu, X.: Using conditional random fields to extract contexts and answers of questions from online forums. In: Proceedings of ACL-08: HLT, pp. 710–718 (2008)
Du, J., Xu, R., He, Y., Gui, L.: Stance classification with target-specific neural attention networks. In: International Joint Conferences on Artificial Intelligence, pp. 3988–3994 (2017)
Dutta, S., Das, D.: Dialogue modelling in multi-party social media conversation. In: International Conference on Text, Speech, and Dialogue, pp. 219–227. Springer, Berlin (2017)
Eisenlauer, V.: Facebook as a third author—(semi-)automated participation framework in social network sites. J. Pragmat. 72, 73–85 (2014)
Hasan, K.S., Ng, V.: Why are you taking this stance? Identifying and classifying reasons in ideological debates. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 751–762 (2014)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Hochreiter, S., Bengio, Y., Frasconi, P., Schmidhuber, J., et al.: Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies. IEEE Press, New York (2001)
Kalchbrenner, N., Blunsom, P.: Recurrent convolutional neural networks for discourse compositionality. Preprint, arXiv:1306.3584 (2013)
Lai, M., Farías, D.I.H., Patti, V., Rosso, P.: Friends and enemies of Clinton and Trump: using context for detecting stance in political tweets. In: Mexican International Conference on Artificial Intelligence, pp. 155–168. Springer, New York (2016)
Larson, M.L.: Meaning-Based Translation: A Guide to Cross-Language Equivalence. University press of America, Lanham (1984)
Le, Q., Mikolov, T.: Distributed representations of sentences and documents. In: International Conference on Machine Learning, pp. 1188–1196 (2014)
Lotan, G.: Mapping information flows on twitter. In: Proceedings of the ICWSM Workshop on the Future of the Social Web (2011)
Malhotra, P., Vig, L., Shroff, G., Agarwal, P.: Long short term memory networks for anomaly detection in time series. In: European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, pp. 89–94. Presses universitaires de Louvain, Louvain-la-Neuve (2015)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Misra, A., Walker, M.: Topic independent identification of agreement and disagreement in social media dialogue. Preprint, arXiv:1709.00661 (2017)
O’Connor, B., Balasubramanyan, R., Routledge, B.R., Smith, N.A., et al.: From tweets to polls: linking text sentiment to public opinion time series. International Conference on Weblogs and Social Media, vol. 11, no. 122–129, pp. 1–2 (2010)
Scott, K.: The pragmatics of hashtags: inference and conversational style on twitter. J. Pragmat. 81, 8–20 (2015)
Somasundaran, S., Namata, G., Wiebe, J., Getoor, L.: Supervised and unsupervised methods in employing discourse relations for improving opinion polarity classification. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, vol. 1, pp. 170–179. Association for Computational Linguistics, Morristown (2009)
Suh, B., Hong, L., Pirolli, P., Chi, E.H.: Want to be retweeted? Large scale analytics on factors impacting retweet in twitter network. In: 2010 IEEE Second International Conference on Social Computing (socialcom), pp. 177–184. IEEE, Piscataway (2010)
Sundermeyer, M., Schlüter, R., Ney, H.: LSTM neural networks for language modeling. In: Thirteenth Annual Conference of the International Speech Communication Association (2012)
Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems, pp. 3104–3112 (2014)
Tan, C., Lee, L., Tang, J., Jiang, L., Zhou, M., Li, P.: User-level sentiment analysis incorporating social networks. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1397–1405. ACM, New York (2011)
Trevithick, P., Clippinger, J.H.: Method and system for characterizing relationships in social networks (2008). US Patent 7,366,759
Wang, H., Can, D., Kazemzadeh, A., Bar, F., Narayanan, S.: A system for real-time twitter sentiment analysis of 2012 US presidential election cycle. In: Proceedings of the ACL 2012 System Demonstrations, pp. 115–120. Association for Computational Linguistics, Morristown (2012)
Wang, K.C., Lai, C.M., Wang, T., Wu, S.F.: Bandwagon effect in Facebook discussion groups. In: Proceedings of the ASE BigData & Social Informatics 2015, p. 17. ACM, New York (2015)
Wang, Y., Huang, M., Zhao, L., et al.: Attention-based LSTM for aspect-level sentiment classification. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 606–615 (2016)
Wong, F., Tan, C.W., Sen, S., Chiang, M.: Media, pundits and the US presidential election: quantifying political leanings from tweets. In: Proceedings of the International Conference on Weblogs and Social Media, pp. 640–649 (2013)
Zhang, A.X., Culbertson, B., Paritosh, P.: Characterizing online discussion using coarse discourse sequences. In: Proceedings of the Eleventh International Conference on Web and Social Media, pp. 1–10. AAAI Press, Palo Alto (2017)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Dutta, S., Chakraborty, T., Das, D. (2019). How Did the Discussion Go: Discourse Act Classification in Social Media Conversations. In: P, D., Jurek-Loughrey, A. (eds) Linking and Mining Heterogeneous and Multi-view Data. Unsupervised and Semi-Supervised Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-01872-6_6
Download citation
DOI: https://doi.org/10.1007/978-3-030-01872-6_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01871-9
Online ISBN: 978-3-030-01872-6
eBook Packages: EngineeringEngineering (R0)