Abstract
Sarcasm often causes widespread confusion among the general public. It is typically associated with a mocking tone, a trenchant facial expression, or unusual language. The existing literature on sarcasm detection has mainly focused on text-based input containing sarcastic comments or on facial-expression analysis, i.e., image input. However, text and image inputs alone are not sufficient to analyze the underlying sarcasm in a scene. Such analysis can also be misleading, because emotional expression can change with social circumstances (i.e., audio tone) over time. To address these challenges, “A Smart Video Analytical framework for Sarcasm Detection using Deep Learning” is introduced, in which sarcasm detection is performed on the video modality. The proposed model extracts three important features from the video: text using the proposed Enhanced-BERT, image using ImageNet, and audio using Librosa. After extraction, each modality is processed individually, and the modalities are finally fused using the proposed adaptive early fusion approach. The final classification is performed by a novel deep neural network called “SarcasNet-99”, which detects sarcasm in video over the distributed framework Apache Storm. The TedX and GIF Reply datasets, with around 10,000+ video clips, are used for model training and testing. Compared against existing state-of-the-art techniques such as AlexNet, DenseNet, SqueezeNet and ResNet, the proposed model achieved an accuracy of 99.005% with the LeakyReLU activation function.
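The general idea of early fusion followed by a LeakyReLU classifier can be illustrated with a minimal sketch. This is not the authors' adaptive fusion network or SarcasNet-99, whose details are not given in the abstract. It assumes that each modality has already been reduced to a fixed-length feature vector, that fusion is plain concatenation after per-modality normalization, and that a single hidden layer with random, untrained weights stands in for the classifier. All function names and dimensions are hypothetical.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length so no modality dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def early_fuse(text_feat, image_feat, audio_feat):
    """Early fusion (assumed here to be concatenation) of the three modalities."""
    return (l2_normalize(text_feat)
            + l2_normalize(image_feat)
            + l2_normalize(audio_feat))


def leaky_relu(x, alpha=0.01):
    """LeakyReLU: identity for positive inputs, small slope alpha otherwise."""
    return x if x > 0 else alpha * x


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """Fully connected layer: one output per weight row."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def predict_sarcasm(fused, hidden_w, hidden_b, out_w, out_b):
    """Hidden LeakyReLU layer followed by a sigmoid output in (0, 1)."""
    hidden = dense(fused, hidden_w, hidden_b, leaky_relu)
    return dense(hidden, [out_w], [out_b], sigmoid)[0]


# Placeholder feature vectors; real ones would come from the text, image,
# and audio extractors. The sizes 8, 6, and 4 are arbitrary assumptions.
rng = random.Random(0)
text_feat = [rng.gauss(0, 1) for _ in range(8)]
image_feat = [rng.gauss(0, 1) for _ in range(6)]
audio_feat = [rng.gauss(0, 1) for _ in range(4)]

fused = early_fuse(text_feat, image_feat, audio_feat)

# Untrained random weights, used only to show the data flow.
hidden_w = [[rng.gauss(0, 0.1) for _ in fused] for _ in range(4)]
hidden_b = [0.0] * 4
out_w = [rng.gauss(0, 0.1) for _ in range(4)]
out_b = 0.0

score = predict_sarcasm(fused, hidden_w, hidden_b, out_w, out_b)
print(len(fused), score)
```

Because the weights are random, the printed score carries no meaning. The sketch only shows how three per-modality vectors become one joint input before classification, which is what distinguishes early fusion from late fusion of per-modality predictions.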
Data availability
The datasets generated during and/or analyzed during the current study are available in the MultiComp Lab repository, http://multicomp.cs.cmu.edu/resources/.
Acknowledgements
This research was supported by Ramaiah Institute of Technology (MSRIT), Bangalore-560054 and Visvesvaraya Technological University, Jnana Sangama, Belagavi-590018.
Ethics declarations
Conflict of interest
There is no conflict of interest.
Ethics approval
No animals or human participants were used in the study reported in this work.
Informed consent
For this type of study, informed consent is not required.
Consent for publication
For this type of study, consent for publication is not required.
About this article
Cite this article
Murthy, J.S., Siddesh, G.M. A smart video analytical framework for sarcasm detection using novel adaptive fusion network and SarcasNet-99 model. Vis Comput (2024). https://doi.org/10.1007/s00371-023-03224-y
DOI: https://doi.org/10.1007/s00371-023-03224-y