HatEmoTweet: low-level emotion classifications and spatiotemporal trends of hate and offensive COVID-19 tweets

Adesokan, Ademola; Madria, Sanjay; Nguyen, Long

doi:10.1007/s13278-023-01132-6

Original Article
Published: 18 October 2023

Volume 13, article number 136, (2023)

Social Network Analysis and Mining

Ademola Adesokan¹, Sanjay Madria¹,² & Long Nguyen²

Abstract

Social media platforms such as Twitter affect users in diverse societies both positively and negatively; one of Twitter’s negative effects is the use of hate and offensive language. Hate speech fosters prejudice and harms the vulnerable, and hateful and offensive actions are always associated with emotions. This work addresses hate and offensive tweet detection and low-level emotion classification, using 28 labels to train transformer models in three ways (model 1: \({BERT}_{{G28}}\), model 2: \({BERT}_{{G27}}\), and model 3: \({RoBERTa}_{{G27}}\)) before predicting the emotions of the hateful and offensive tweets. Model 1 was trained on all 28 low-level labels, and models 2 and 3 were trained on 27 labels, excluding the neutral label. For the identified hate and offensive tweets, this study also performed topic modeling to extract the themes discussed, spatiotemporal trend analysis to determine where and when the tweets occurred, and event summarization. GoEmotions and Ekman were used for direct and indirect assessment, respectively, of the models’ precision, recall, and F1-score. In terms of precision, model 1 outperformed Google Research on GoEmotions. Furthermore, model 2 and model 3 outperformed Google Research on both the GoEmotions and Ekman evaluations in terms of precision and F1-score. Overall, model 2 was the best model for both recall and F1-score, while model 3 performed better for precision. Because models 2 and 3 were trained on samples without the neutral label, out of the 30% of samples that model 1 predicted as neutral for hate and offensive tweets, model 2 obtained 27% and model 3 achieved 29% label prediction. This is a significant improvement: eliminating the false neutral class recovers emotion labels for tweets that are not truly neutral.
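To make the setup in the abstract more concrete, the following is a minimal sketch of two of the steps it describes: removing the neutral label before training (the 27-label setting) and scoring predictions both directly on low-level labels and indirectly after grouping them into Ekman categories. It is not the authors' code. The function names, the toy tweets, and the small label-to-Ekman table (an illustrative subset, not the full 27-label grouping) are all assumptions, and the transformer training itself is omitted.

```python
NEUTRAL = "neutral"

# Illustrative subset of a low-level -> Ekman grouping (assumption, not the
# full table used in the paper).
FINE_TO_EKMAN = {
    "anger": "anger", "annoyance": "anger",
    "disgust": "disgust",
    "fear": "fear", "nervousness": "fear",
    "joy": "joy", "gratitude": "joy",
    "sadness": "sadness", "grief": "sadness",
    "surprise": "surprise", "confusion": "surprise",
}


def drop_neutral(samples):
    """Strip the neutral label; discard samples left with no label at all."""
    kept = []
    for text, labels in samples:
        remaining = [label for label in labels if label != NEUTRAL]
        if remaining:
            kept.append((text, remaining))
    return kept


def to_ekman(labels):
    """Map low-level labels to their Ekman group (unknown labels are skipped)."""
    return sorted({FINE_TO_EKMAN[l] for l in labels if l in FINE_TO_EKMAN})


def per_label_prf(gold, pred, label):
    """Precision, recall, and F1 for one label over multi-label samples."""
    tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
    fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
    fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def macro_prf(gold, pred, labels):
    """Unweighted mean of per-label precision, recall, and F1."""
    scores = [per_label_prf(gold, pred, label) for label in labels]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


# Hypothetical toy data: (text, labels) pairs.
samples = [("tweet a", ["anger", "neutral"]),
           ("tweet b", ["neutral"]),
           ("tweet c", ["fear"])]
print(drop_neutral(samples))

gold = [["annoyance"], ["grief"], ["fear"]]
pred = [["anger"], ["sadness"], ["joy"]]
print("direct:", macro_prf(gold, pred, ["annoyance", "grief", "fear"]))
print("indirect:", macro_prf([to_ekman(g) for g in gold],
                             [to_ekman(p) for p in pred],
                             ["anger", "sadness", "fear"]))
```

In this toy data the indirect score is higher than the direct one, because confusing two labels in the same Ekman group (for example, annoyance and anger) counts as an error at the low level but as a match after grouping.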

Funding

This work was partially supported by NSF—USA CNS-2219614, CNS-2219615 and the Missouri University of Science and Technology’s Kummer Institute for Student Success, Research and Economic Development through the Kummer Innovation and Entrepreneurship Doctoral Fellowship.

Author information

Sanjay Madria and Long Nguyen have contributed equally to this work.

Authors and Affiliations

Department of Computer Science, Missouri University of Science and Technology, Rolla, MO, 65401, USA
Ademola Adesokan & Sanjay Madria
School of Applied Computational Sciences, Meharry Medical College, Nashville, TN, 37208, USA
Sanjay Madria & Long Nguyen

Contributions

AA was involved in methodology, formal analysis, software, writing—original draft, writing—review and editing, investigation. SM helped in conceptualization, data curation, writing—review and editing, supervision, funding acquisition, project administration. LN contributed to conceptualization, writing—review and editing, supervision, funding acquisition, project administration.

Corresponding author

Correspondence to Ademola Adesokan.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Adesokan, A., Madria, S. & Nguyen, L. HatEmoTweet: low-level emotion classifications and spatiotemporal trends of hate and offensive COVID-19 tweets. Soc. Netw. Anal. Min. 13, 136 (2023). https://doi.org/10.1007/s13278-023-01132-6

Received: 15 March 2023
Revised: 21 May 2023
Accepted: 08 September 2023
Published: 18 October 2023
DOI: https://doi.org/10.1007/s13278-023-01132-6
