
Evaluation of Word Embeddings for Toxic Span Prediction

  • Conference paper
Proceedings of the International Conference on Cognitive and Intelligent Computing

Abstract

The toxic span prediction task is to predict the spans of toxic content in social media posts, so that the objectionable parts of a post can be identified. Performance on this task depends both on the word embeddings that represent tokens as vectors and on the model used for span prediction. This paper describes our experiments evaluating word embeddings, ranging from static and contextual embeddings to transformer-based ones, for toxic span detection. Motivated by the top performers of SemEval-2021 Task 5, we follow a sequence-tagging approach to span prediction: a stacked BiLSTM layer followed by a CRF predicts the toxicity of each token. Our model with RoBERTa embeddings fine-tuned on a toxicity classification task achieved the best F1 score of 70.26 among all the word embeddings, on par with the top two performers on this task, both of which are ensemble models. Our experiments compare the word embeddings in terms of F1 score and the training and inference time of the model.
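The sequence-tagging architecture described in the abstract can be summarized in a short sketch. The following is a minimal PyTorch illustration, not the authors' released code: pre-computed word embeddings feed a stacked BiLSTM whose per-token scores are decoded by a CRF into toxic/non-toxic tags. It assumes the third-party pytorch-crf package for the CRF layer; the class name, dimensions, and two-tag scheme are illustrative assumptions.

    # Minimal sketch of a stacked BiLSTM + CRF token tagger (illustrative only).
    # Assumes: pip install pytorch-crf
    import torch
    import torch.nn as nn
    from torchcrf import CRF

    class BiLSTMCRFTagger(nn.Module):
        def __init__(self, embedding_dim: int, hidden_dim: int = 256,
                     num_layers: int = 2, num_tags: int = 2):
            super().__init__()
            # Stacked bidirectional LSTM over the embedded tokens; each
            # direction gets hidden_dim // 2 units so outputs are hidden_dim.
            self.bilstm = nn.LSTM(embedding_dim, hidden_dim // 2,
                                  num_layers=num_layers, bidirectional=True,
                                  batch_first=True)
            # Projects BiLSTM states to per-token tag scores (toxic / non-toxic).
            self.emission = nn.Linear(hidden_dim, num_tags)
            self.crf = CRF(num_tags, batch_first=True)

        def loss(self, embeddings, tags, mask):
            # embeddings: (batch, seq_len, embedding_dim) from any embedder;
            # tags: (batch, seq_len) gold labels; mask: (batch, seq_len) bool.
            states, _ = self.bilstm(embeddings)
            emissions = self.emission(states)
            # CRF returns the log-likelihood; negate it for a training loss.
            return -self.crf(emissions, tags, mask=mask)

        def predict(self, embeddings, mask):
            states, _ = self.bilstm(embeddings)
            emissions = self.emission(states)
            # Viterbi decoding of the best tag sequence per post.
            return self.crf.decode(emissions, mask=mask)

In a setup like the one compared here, the embeddings tensor would come from whichever embedder is being evaluated (e.g., GloVe, ELMo, or a fine-tuned RoBERTa), so the tagger stays fixed while only the representation varies.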




Author information


Corresponding author

Correspondence to K. Hima Bindu.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Kiran Babu, N., Hima Bindu, K. (2022). Evaluation of Word Embeddings for Toxic Span Prediction. In: Kumar, A., Ghinea, G., Merugu, S., Hashimoto, T. (eds) Proceedings of the International Conference on Cognitive and Intelligent Computing. Cognitive Science and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-19-2350-0_17


  • DOI: https://doi.org/10.1007/978-981-19-2350-0_17


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-2349-4

  • Online ISBN: 978-981-19-2350-0

  • eBook Packages: Computer Science, Computer Science (R0)
