
HatEmoTweet: low-level emotion classifications and spatiotemporal trends of hate and offensive COVID-19 tweets

  • Original Article
Social Network Analysis and Mining

Abstract

Social media platforms such as Twitter affect users in diverse societies both positively and negatively; one of Twitter’s negative effects is the use of hateful and offensive language. Hate speech fosters prejudice and harms the vulnerable, and hateful and offensive actions always carry associated emotions. This work addresses hate and offensive tweet detection and low-level emotion classification. Transformer models were trained on the 28 emotion labels in three ways (model 1: \({BERT}_{{G28}}\), model 2: \({BERT}_{{G27}}\), and model 3: \({RoBERTa}_{{G27}}\)) and then used to predict the emotions of the hateful and offensive tweets. Model 1 was trained on all 28 low-level labels, while models 2 and 3 were trained on 27 labels, excluding the neutral label. The study also performed topic modeling to extract the themes discussed, spatiotemporal trend analysis to determine where and when these tweets occurred, and event summarization for the identified hate and offensive tweets. GoEmotions and Ekman were used for direct and indirect assessment, respectively, of the models’ precision, recall, and F1-score. On precision, model 1 outperformed Google Research on GoEmotions. Models 2 and 3 outperformed Google Research on both the GoEmotions and Ekman’s evaluation in terms of precision and F1-score. Overall, model 2 was the best model for recall and F1-score, while model 3 performed better on precision. Because they were trained on samples without the neutral label, model 2 obtained 27% and model 3 achieved 29% label prediction out of the 30% of samples that model 1 predicted as neutral among the hate and offensive tweets. This is a significant improvement: eliminating the false neutral class yields emotion labels for tweets that are not truly neutral.
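The data handling and scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`drop_neutral`, `coarsen`, `macro_prf`) are hypothetical, the rule of discarding examples whose only label is neutral is an assumption, macro averaging is an assumption, and `COARSE_MAP` is only a small illustrative subset of a fine-to-coarse emotion grouping.

```python
from typing import Dict, List, Set, Tuple

NEUTRAL = "neutral"

# Illustrative subset of a fine-to-coarse grouping (assumption, not the full map).
COARSE_MAP: Dict[str, str] = {
    "anger": "anger",
    "annoyance": "anger",
    "fear": "fear",
    "nervousness": "fear",
}


def drop_neutral(examples: List[Tuple[str, Set[str]]]) -> List[Tuple[str, Set[str]]]:
    """Build a 27-label style training set: strip the neutral label and
    discard examples left with no label (assumed filtering rule)."""
    kept = []
    for text, labels in examples:
        remaining = labels - {NEUTRAL}
        if remaining:
            kept.append((text, remaining))
    return kept


def coarsen(label_sets: List[Set[str]], mapping: Dict[str, str]) -> List[Set[str]]:
    """Map fine-grained labels to coarse groups for indirect evaluation;
    labels missing from the mapping are dropped."""
    return [{mapping[label] for label in labels if label in mapping}
            for labels in label_sets]


def macro_prf(gold: List[Set[str]], pred: List[Set[str]],
              labels: List[str]) -> Tuple[float, float, float]:
    """Macro-averaged precision, recall, and F1 over multi-label predictions."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

Under these assumptions, direct assessment would call `macro_prf` on the fine-grained label sets, and indirect assessment would pass both gold and predicted label sets through `coarsen` first.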




Funding

This work was partially supported by NSF (USA) grants CNS-2219614 and CNS-2219615 and by the Missouri University of Science and Technology’s Kummer Institute for Student Success, Research and Economic Development through the Kummer Innovation and Entrepreneurship Doctoral Fellowship.

Author information

Contributions

AA was involved in methodology, formal analysis, software, writing—original draft, writing—review and editing, investigation. SM helped in conceptualization, data curation, writing—review and editing, supervision, funding acquisition, project administration. LN contributed to conceptualization, writing—review and editing, supervision, funding acquisition, project administration.

Corresponding author

Correspondence to Ademola Adesokan.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Adesokan, A., Madria, S. & Nguyen, L. HatEmoTweet: low-level emotion classifications and spatiotemporal trends of hate and offensive COVID-19 tweets. Soc. Netw. Anal. Min. 13, 136 (2023). https://doi.org/10.1007/s13278-023-01132-6


  • DOI: https://doi.org/10.1007/s13278-023-01132-6
