Empirical study of shallow and deep learning models for sarcasm detection using context in benchmark datasets

Kumar, Akshi; Garg, Geetanjali

doi:10.1007/s12652-019-01419-7

Original Research
Published: 12 August 2019

Volume 14, pages 5327–5342, (2023)

Journal of Ambient Intelligence and Humanized Computing

Abstract

Sarcastic expressions tend to flip the polarity of posts analyzed for sentiment. Detecting a sarcastic tone, which conveys a sharp, bitter, or cutting remark or taunt in natural language, is tricky even for humans, which makes its automated detection harder still. Computational models for sarcasm detection have often relied on the content of utterances in isolation, whereas using contextual information improves detection. This work is a preliminary step toward understanding the what, how and why of using context in sentiment analysis. The concept of ‘context in use’ is described by exemplifying content-based local and global context to predict sarcasm in user-generated social textual data. In this research, Twitter data from the benchmark SemEval 2015 Task 11 and nearly 20k posts from Reddit are classified as sarcastic or non-sarcastic using three predictive learning models. The first model is based on conventional Term Frequency-Inverse Document Frequency (TF-IDF) weighting: three classifiers, namely Multinomial Naïve Bayes, Gradient Boosting and Random Forest, are trained on these features, and Ensemble Voting is used to generate the output. In the second model, a combination of semantic (sentiment) and pragmatic (punctuation) features is considered to model the context along with the top-200 TF-IDF features, and results are observed using five baseline classifiers (Decision Tree, Support Vector Machine, Random Forest, K-Nearest Neighbour and Multi-Layer Perceptron). The final model uses deep learning, namely Long Short-Term Memory (LSTM) and its variant Bi-directional LSTM, with GloVe (Global Vectors for Word Representation) used to build semantic word embeddings and learn context. An empirical study using training-set and test-set performance metrics (Accuracy, Recall, Precision, F1 Score) compares the three learning models for sarcasm classification on the two datasets. The Bi-directional LSTM model achieves the highest accuracy, 86.32% and 82.91% for the Twitter and Reddit datasets respectively.
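To make two ideas named in the abstract concrete, the following is a minimal sketch of ensemble voting over three classifiers' outputs and of the four reported metrics. It is not the authors' implementation: it assumes hard (majority) voting, which the abstract does not specify, and the function names `majority_vote` and `binary_metrics` as well as the toy labels are hypothetical, invented only for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard voting: for each sample, return the label most models predicted.

    `predictions` is a list of per-model label lists of equal length.
    Assumption: hard voting; the abstract only says "Ensemble Voting".
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary task.

    `positive` marks the sarcastic class; zero denominators yield 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (1 = sarcastic, 0 = non-sarcastic); invented, not from the paper.
y_true = [1, 0, 1, 1, 0, 0]
nb_pred = [1, 0, 0, 1, 0, 1]  # stand-in for Multinomial Naive Bayes output
gb_pred = [1, 1, 1, 0, 0, 0]  # stand-in for Gradient Boosting output
rf_pred = [0, 0, 1, 1, 0, 1]  # stand-in for Random Forest output

ensemble_pred = majority_vote([nb_pred, gb_pred, rf_pred])
scores = binary_metrics(y_true, ensemble_pred)
```

With three binary voters a tie cannot occur, so the majority label is always well defined. A soft-voting variant would average class probabilities instead of counting labels.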

Notes

https://www.nltk.org/.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Delhi Technological University, Delhi, India
Akshi Kumar & Geetanjali Garg

Corresponding author

Correspondence to Akshi Kumar.

Ethics declarations

Conflict of interest

The authors certify that there is no conflict of interest in the subject matter discussed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kumar, A., Garg, G. Empirical study of shallow and deep learning models for sarcasm detection using context in benchmark datasets. J Ambient Intell Human Comput 14, 5327–5342 (2023). https://doi.org/10.1007/s12652-019-01419-7

Received: 21 February 2019
Accepted: 01 August 2019
Published: 12 August 2019
Issue Date: May 2023
DOI: https://doi.org/10.1007/s12652-019-01419-7
