Abstract
The paper presents a novel data augmentation-based approach to developing explainable deep learning models for hate speech detection. Hate speech is widely prevalent on online social media but difficult to detect automatically, owing to the challenges of natural language processing and the complexity of hate speech. Moreover, the decisions of existing solutions offer limited explainability because only limited annotated data are available for training and testing models. This work therefore proposes text-based data augmentation to improve both the performance and the explainability of deep learning models. Techniques based on easy data augmentation, bidirectional encoder representations from transformers, and back translation are used for data augmentation. Convolutional neural network and long short-term memory models are trained on the augmented data and evaluated on two publicly available hate speech detection datasets. LIME and integrated gradients are used to retrieve explanations of the deep learning models. A diagnostic study is conducted on test samples to check whether the models improve as a result of the data augmentation. The experimental results verify that the proposed approach improves both the explainability and the accuracy of hate speech detection.
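To make the augmentation idea concrete, the sketch below illustrates two of the easy data augmentation (EDA) operations the abstract refers to, random swap and random deletion, which need no external lexicon. This is a minimal illustration, not the authors' implementation; the function names and parameter defaults are our own assumptions.

```python
import random

def random_swap(words, n=1):
    # EDA random swap: exchange two randomly chosen word positions n times.
    words = words.copy()
    for _ in range(n):
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p=0.1):
    # EDA random deletion: drop each word independently with probability p,
    # keeping at least one word so the example is never emptied.
    kept = [w for w in words if random.random() > p]
    return kept if kept else [random.choice(words)]

def augment(sentence, n_aug=4, p=0.1):
    # Produce n_aug augmented variants of one labeled training example
    # by applying a randomly chosen EDA operation to each copy.
    words = sentence.split()
    variants = []
    for _ in range(n_aug):
        if random.random() < 0.5:
            variants.append(" ".join(random_swap(words)))
        else:
            variants.append(" ".join(random_deletion(words, p)))
    return variants

random.seed(0)
print(augment("this comment is offensive and should be flagged"))
```

In a full EDA pipeline these would be combined with synonym replacement and random insertion (which require a thesaurus such as WordNet), and each augmented sentence would inherit the label of its source example.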
Data Availability
The datasets analyzed during the current study are available at https://www.dropbox.com/s/21wtzy9arc5skr8/ICWSM18 and https://data.mendeley.com/datasets/jf4pzyvnpj/1.
References
Wright, M.F.: Cyberbullying in cultural context. J. Cross-Cultural Psychol. 48(8), 1136–1137 (2017). https://doi.org/10.1177/0022022117723107
MacAvaney, S.; Yao, H.-R.; Yang, E.; Russell, K.; Goharian, N.; Frieder, O.: Hate speech detection: challenges and solutions. PLoS ONE 14(8), 1–16 (2019). https://doi.org/10.1371/journal.pone.0221152
Agrawal, S.; Awekar, A.: Deep learning for detecting cyberbullying across multiple social media platforms. arXiv:1801.06482 (2018)
Dadvar, M.; Eckert, K.: Cyberbullying detection in social networks using deep learning based models; a reproducibility study. arxiv:1812.08046 (2018)
Zhang, Z.; Robinson, D.; Tepper, J.A.: Detecting hate speech on Twitter using a convolution-GRU based deep neural network. In: ESWC (2018)
Phanomtip, A.; Sueb-in, T.; Vittayakorn, S.: Cyberbullying detection on tweets. In: 2021 18th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), pp. 295–298 (2021). https://doi.org/10.1109/ECTI-CON51831.2021.9454848
Mishra, P.; Del Tredici, M.; Yannakoudakis, H.; Shutova, E.: Author profiling for abuse detection. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 1088–1098. Association for Computational Linguistics, Santa Fe, New Mexico, USA (2018). https://aclanthology.org/C18-1093
Waseem, Z.; Hovy, D.: Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In: Proceedings of the NAACL Student Research Workshop, pp. 88–93. Association for Computational Linguistics, San Diego, California (2016). https://doi.org/10.18653/v1/N16-2013. https://aclanthology.org/N16-2013
Mathew, B.; Saha, P.; Yimam, S.M.; Biemann, C.; Goyal, P.; Mukherjee, A.: HateXplain: a benchmark dataset for explainable hate speech detection (2020)
Ribeiro, M.T.; Singh, S.; Guestrin, C.: "Why should I trust you?": explaining the predictions of any classifier (2016)
Simonyan, K.; Vedaldi, A.; Zisserman, A.: Deep inside convolutional networks: visualising image classification models and saliency maps (2014)
Zeiler, M.D.; Fergus, R.: Visualizing and understanding convolutional networks (2013)
Castro, J.; Gomez, D.; Tejada, J.: Polynomial calculation of the Shapley value based on sampling. Comput. Oper. Res. 36, 1726–1730 (2009). https://doi.org/10.1016/j.cor.2008.04.004
Atanasova, P.; Simonsen, J.G.; Lioma, C.; Augenstein, I.: A diagnostic study of explainability techniques for text classification (2020)
DeYoung, J.; Jain, S.; Rajani, N.F.; Lehman, E.; Xiong, C.; Socher, R.; Wallace, B.C.: ERASER: a benchmark to evaluate rationalized NLP models. arXiv:1911.03429 (2019)
Beddiar, D.R.; Jahan, M.S.; Oussalah, M.: Data expansion using back translation and paraphrasing for hate speech detection. Online Soc. Netw. Media 24, 100153 (2021). https://doi.org/10.1016/j.osnem.2021.100153
Feng, S.Y.; Gangal, V.; Wei, J.; Chandar, S.; Vosoughi, S.; Mitamura, T.; Hovy, E.: A survey of data augmentation approaches for NLP (2021)
Chen, H.; Ji, Y.: Improving the explainability of neural sentiment classifiers via data augmentation. arXiv:1909.04225 (2019)
Doran, D.; Schulz, S.; Besold, T.R.: What does explainable AI really mean? A new conceptualization of perspectives. arXiv:1710.00794 (2017)
Hagras, H.: Toward human-understandable, explainable AI. Computer 51(9), 28–36 (2018)
Došilović, F.K.; Brčić, M.; Hlupić, N.: Explainable artificial intelligence: a survey. In: 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp. 0210–0215 (2018). IEEE
Samek, W.; Müller, K.-R.: Towards explainable artificial intelligence. In: Samek, W., Montavon, G., Vedaldi, A., Hansen, L.K., Müller, K.-R. (eds.) Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, pp. 5–22. Springer, Cham (2019)
Lundberg, S.M.; Lee, S.-I.: A unified approach to interpreting model predictions. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Ribeiro, M.T.; Singh, S.; Guestrin, C.: "Why should I trust you?": explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144 (2016)
Kindermans, P.-J.; Schütt, K.T.; Alber, M.; Müller, K.-R.; Erhan, D.; Kim, B.; Dähne, S.: Learning how to explain neural networks: patternnet and patternattribution. arXiv:1705.05598 (2017)
Saxena, C.; Garg, M.; Saxena, G.: Explainable causal analysis of mental health on social media data. arXiv:2210.08430 (2022)
Garg, M.; Saxena, C.; Saha, S.; Krishnan, V.; Joshi, R.; Mago, V.: CAMS: an annotated corpus for causal analysis of mental health issues in social media posts. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 6387–6396 (2022)
Lei, T.; Barzilay, R.; Jaakkola, T.: Rationalizing neural predictions. arXiv:1606.04155 (2016)
Caruana, R.; Lou, Y.; Gehrke, J.; Koch, P.; Sturm, M.; Elhadad, N.: Intelligible models for healthcare: predicting pneumonia risk and hospital 30-day readmission. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1721–1730 (2015)
Amann, J.; Blasimme, A.; Vayena, E.; Frey, D.; Madai, V.I.: Explainability for artificial intelligence in healthcare: a multidisciplinary perspective. BMC Med. Inform. Decis. Mak. 20(1), 1–9 (2020)
Pope, P.E.; Kolouri, S.; Rostami, M.; Martin, C.E.; Hoffmann, H.: Explainability methods for graph convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10772–10781 (2019)
Zablocki, É.; Ben-Younes, H.; Pérez, P.; Cord, M.: Explainability of vision-based autonomous driving systems: review and challenges. arXiv:2101.05307 (2021)
Mahajan, A.; Shah, D.; Jafar, G.: Explainable AI approach towards toxic comment classification, pp. 849–858 (2021). https://doi.org/10.1007/978-981-33-4367-2_81
Danilevsky, M.; Qian, K.; Aharonov, R.; Katsis, Y.; Kawas, B.; Sen, P.: A survey of the state of explainable AI for natural language processing. arXiv:2010.00711 (2020)
Badimala, P.; Mishra, C.; Modam Venkataramana, R.K.; Bukhari, S.; Dengel, A.: A study of various text augmentation techniques for relation classification in free text, pp. 360–367 (2019). https://doi.org/10.5220/0007311003600367
Feng, S.Y.; Gangal, V.; Wei, J.; Chandar, S.; Vosoughi, S.; Mitamura, T.; Hovy, E.: A survey of data augmentation approaches for NLP. In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp. 968–988. Association for Computational Linguistics, Online (2021). https://doi.org/10.18653/v1/2021.findings-acl.84. https://aclanthology.org/2021.findings-acl.84
Wei, J.; Zou, K.: EDA: easy data augmentation techniques for boosting performance on text classification tasks. arXiv:1901.11196 (2019)
Ng, N.; Yee, K.; Baevski, A.; Ott, M.; Auli, M.; Edunov, S.: Facebook FAIR's WMT19 news translation task submission. arXiv:1907.06616 (2019)
Kobayashi, S.: Contextual augmentation: data augmentation by words with paradigmatic relations (2018)
Kumar, V.; Choudhary, A.; Cho, E.: Data augmentation using pre-trained transformer models. In: Proceedings of the 2nd Workshop on Life-long Learning for Spoken Language Systems, pp. 18–26. Association for Computational Linguistics, Suzhou, China (2020). https://www.aclweb.org/anthology/2020.lifelongnlp-1.3
Kim, Y.: Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746–1751. Association for Computational Linguistics, Doha, Qatar (2014). https://doi.org/10.3115/v1/D14-1181. https://aclanthology.org/D14-1181
van Aken, B.; Risch, J.; Krestel, R.; Löser, A.: Challenges for toxic comment classification: an in-depth error analysis. In: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2) (2018)
Sundararajan, M.; Taly, A.; Yan, Q.: Axiomatic attribution for deep networks. In: International Conference on Machine Learning, pp. 3319–3328. PMLR (2017)
Nguyen, D.: Comparing automatic and human evaluation of local explanations for text classification. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 1069–1078. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1097. https://aclanthology.org/N18-1097
Chen, J.; Song, L.; Wainwright, M.J.; Jordan, M.I.: L-shapley and c-shapley: efficient model interpretation for structured data. arXiv:1808.02610 (2018)
Baccianella, S.; Esuli, A.; Sebastiani, F.: SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10). European Language Resources Association (ELRA), Valletta, Malta (2010)
Salminen, J.; Almerekhi, H.; Milenković, M.; Jung, S.-G.; An, J.; Kwak, H.; Jansen, B.J.: Anatomy of online hate: developing a taxonomy and machine learning models for identifying and classifying hate in online news media. In: Twelfth International AAAI Conference on Web and Social Media (2018)
Ansari, G.; Garg, M.; Saxena, C.: Data augmentation for mental health classification on social media. arXiv:2112.10064 (2021)
Acknowledgements
Not applicable.
Funding
Not applicable.
Author information
Contributions
Authors 1 and 2 conceived and designed the analysis and wrote the manuscript. Author 3 performed the analysis and compiled the results of the implementation. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical Approval and Consent to participate
Not applicable.
Human and Animal Ethics
Not applicable.
Consent for Publication
All authors have given consent to submit the manuscript in its present form. Consent from others is not applicable.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ansari, G., Kaur, P. & Saxena, C. Data Augmentation for Improving Explainability of Hate Speech Detection. Arab J Sci Eng 49, 3609–3621 (2024). https://doi.org/10.1007/s13369-023-08100-4