Verbal aggression detection on Twitter comments: convolutional neural network for short-text sentiment analysis

Abstract

Cyberbullying and hate speech are common breaches of online etiquette. To address this pressing problem, we propose a text classification model based on convolutional neural networks for the de facto verbal aggression dataset built in our previous work. We observe a significant improvement, which we attribute to the proposed 2D TF-IDF features used in place of pre-trained word vectors. Experiments show that the proposed system outperforms our previous methods and other existing methods. A case study of word vectors examines the difficulty of using pre-trained word vectors for our short-text classification task and demonstrates the need for 2D TF-IDF features. We also conduct a visual analysis of the convolutional and pooling layers of the trained convolutional neural networks.
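The abstract does not define the 2D TF-IDF features, so the following is only an illustrative sketch of one possible reading, not the authors' implementation. It assumes that each comment becomes a matrix with one row per token position and one column per vocabulary word, where each row holds that token's TF-IDF weight. The smoothed IDF formula, the function names (`tfidf_weights`, `to_2d_features`, `conv_max_pool`), the toy corpus, and the hand-made kernel are all assumptions introduced for illustration. A single convolution filter with ReLU and max-over-time pooling stands in for the convolutional and pooling layers.

```python
import math
from collections import Counter


def tfidf_weights(corpus):
    """Return one {token: tf-idf} dict per tokenized document (smoothed IDF, an assumed variant)."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for doc in corpus for tok in set(doc))
    weights = []
    for doc in corpus:
        counts = Counter(doc)
        weights.append({
            tok: (cnt / len(doc)) * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, cnt in counts.items()
        })
    return weights


def to_2d_features(doc, weights, vocab, max_len):
    """Build a max_len x len(vocab) matrix; row i carries token i's TF-IDF weight in its vocabulary column."""
    index = {tok: j for j, tok in enumerate(vocab)}
    matrix = [[0.0] * len(vocab) for _ in range(max_len)]
    for i, tok in enumerate(doc[:max_len]):
        matrix[i][index[tok]] = weights[tok]
    return matrix


def conv_max_pool(matrix, kernel):
    """Slide a (window x vocab) kernel over token positions, apply ReLU, then take the maximum activation."""
    window = len(kernel)
    activations = []
    for start in range(len(matrix) - window + 1):
        total = sum(
            matrix[start + r][c] * kernel[r][c]
            for r in range(window)
            for c in range(len(kernel[r]))
        )
        activations.append(max(0.0, total))
    return max(activations)


# Toy corpus (hypothetical, not taken from the paper's dataset).
corpus = [
    "you are a kind person".split(),
    "you are a stupid idiot".split(),
    "have a nice day".split(),
]
vocab = sorted({tok for doc in corpus for tok in doc})
weights = tfidf_weights(corpus)
features = [to_2d_features(doc, w, vocab, max_len=6) for doc, w in zip(corpus, weights)]

# Hand-made kernel that responds to "stupid" immediately followed by "idiot".
# In a trained network these values would be learned, not set by hand.
kernel = [[0.0] * len(vocab) for _ in range(2)]
kernel[0][vocab.index("stupid")] = 1.0
kernel[1][vocab.index("idiot")] = 1.0

scores = [conv_max_pool(f, kernel) for f in features]
print(scores)
```

In this sketch only the second comment produces a non-zero pooled activation, because it is the only one containing the bigram the kernel looks for. A real model would learn many such filters of several window sizes and feed the pooled values to a classifier layer.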

Acknowledgements

The work described in this paper was substantially supported by two grants from the Research Grants Council of the Hong Kong Special Administrative Region (CityU 21200816 and CityU 11203217).

Author information

Correspondence to Ka-Chun Wong.

Ethics declarations

Conflict of interest

All the authors declare that they have no conflict of interest.

About this article

Cite this article

Chen, J., Yan, S. & Wong, K. Verbal aggression detection on Twitter comments: convolutional neural network for short-text sentiment analysis. Neural Comput & Applic (2018). https://doi.org/10.1007/s00521-018-3442-0

Keywords

  • Aggression detection
  • Sentiment analysis
  • Machine learning
  • Convolutional neural network