Convolutional Neural Networks for Twitter Text Toxicity Analysis

  • Spiros V. Georgakopoulos
  • Sotiris K. Tasoulis
  • Aristidis G. Vrahatis
  • Vassilis P. Plagianakos (Email author)
Conference paper
Part of the Proceedings of the International Neural Networks Society book series (INNS, volume 1)

Abstract

Toxic comment classification is an emerging research field, in which several studies have addressed tasks related to detecting unwanted messages on communication platforms. Although sentiment analysis is an accurate approach for observing crowd behavior, it cannot discover other types of information in text, such as toxicity, which can often reveal hidden information. In this direction, a model for the temporal tracking of comment toxicity is proposed, using tweets related to the hashtag under study. More specifically, a classifier is trained to predict toxic comments using a Convolutional Neural Network model. Next, given a hashtag, all relevant tweets are parsed and used as input to the classifier; in this way, the knowledge about toxic texts is transferred to a new dataset for categorization. At the same time, an adapted change detection approach is applied to monitor changes in the toxicity trend over time within the hashtag's tweets. Our experimental results show that toxic comment classification on Twitter conversations can reveal significant knowledge, and that changes in toxicity are accurately identified over time.
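
The monitoring step described above can be illustrated with a minimal sketch. It assumes that the classifier output has already been reduced to one toxicity score per time window (for example, the fraction of tweets predicted toxic), and it uses a plain two-sided CUSUM detector as a stand-in. The abstract does not specify the adapted change detection method, so the function name `cusum_alarms` and the parameters `target`, `slack`, and `threshold` are illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Sequence


def cusum_alarms(
    scores: Sequence[float],
    target: float,
    slack: float,
    threshold: float,
) -> List[int]:
    """Two-sided CUSUM over a stream of per-window toxicity scores.

    scores    : toxicity score per time window (assumed input format)
    target    : expected score while no change has occurred (assumption)
    slack     : deviation from target that is tolerated without accumulating
    threshold : cumulative deviation that triggers an alarm

    Returns the indices of the windows at which an alarm is raised.
    """
    upper = 0.0  # accumulated evidence for an upward shift
    lower = 0.0  # accumulated evidence for a downward shift
    alarms: List[int] = []
    for i, x in enumerate(scores):
        upper = max(0.0, upper + (x - target - slack))
        lower = max(0.0, lower + (target - x - slack))
        if upper > threshold or lower > threshold:
            alarms.append(i)
            # Restart both sums so later alarms need fresh evidence.
            upper = 0.0
            lower = 0.0
    return alarms


if __name__ == "__main__":
    # Synthetic stream: low toxicity for 10 windows, then a jump.
    stream = [0.1] * 10 + [0.6] * 10
    print(cusum_alarms(stream, target=0.1, slack=0.05, threshold=1.0))
```

Because `target` stays fixed in this sketch, a persistent shift keeps raising alarms after each restart; a practical detector would typically re-estimate the baseline after the first alarm.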

Keywords

Convolutional neural networks, Toxic comments, Twitter conversations, Change detection

Notes

Acknowledgment

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research. This project has received funding from the Hellenic Foundation for Research and Innovation (HFRI) and the General Secretariat for Research and Technology (GSRT), under grant agreement No 1901.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Spiros V. Georgakopoulos (1)
  • Sotiris K. Tasoulis (1)
  • Aristidis G. Vrahatis (1)
  • Vassilis P. Plagianakos (1) (Email author)

  1. Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece