Graph-Based Features for Automatic Online Abuse Detection

  • Etienne Papegnies
  • Vincent Labatut
  • Richard Dufour
  • Georges Linarès
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10583)


While online communities have become increasingly important over the years, the moderation of user-generated content is still performed mostly manually. Automating this task is an important step in reducing the financial cost associated with moderation, but the majority of automated approaches strictly based on message content are highly vulnerable to intentional obfuscation. In this paper, we discuss methods for extracting conversational networks based on raw multi-participant chat logs, and we study the contribution of graph features to a classification system that aims to determine if a given message is abusive. The conversational graph-based system yields unexpectedly high performance, with results comparable to those previously obtained with a content-based approach.
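The idea of extracting a conversational network from a raw chat log and deriving graph features for a classifier can be illustrated with a minimal sketch. The abstract does not specify the extraction method, so everything here is an assumption for illustration: the function names, the `window` parameter, the rule that a message is addressed to the authors of the preceding messages, and the 1/distance weighting are all hypothetical.

```python
from collections import defaultdict


def build_conversation_graph(messages, window=3):
    """Build a weighted directed graph from (author, text) pairs.

    Assumption (not from the paper): each message is treated as addressed
    to the authors of the `window` preceding messages, with an edge weight
    of 1/distance so that closer messages count more.
    """
    edges = defaultdict(float)
    for i, (sender, _text) in enumerate(messages):
        for j in range(max(0, i - window), i):
            receiver = messages[j][0]
            if receiver != sender:  # ignore self-loops
                edges[(sender, receiver)] += 1.0 / (i - j)
    return dict(edges)


def node_features(edges, node):
    """Compute a few simple graph features for one participant."""
    out_strength = sum(w for (s, _r), w in edges.items() if s == node)
    in_strength = sum(w for (_s, r), w in edges.items() if r == node)
    neighbors = {r for (s, r) in edges if s == node}
    neighbors |= {s for (s, r) in edges if r == node}
    return {
        "out_strength": out_strength,
        "in_strength": in_strength,
        "degree": len(neighbors),
    }


# Hypothetical chat log: (author, message text)
chat_log = [
    ("alice", "hi"),
    ("bob", "hello"),
    ("carol", "hey"),
    ("bob", "go away"),
]
graph = build_conversation_graph(chat_log, window=2)
features = node_features(graph, "bob")
```

A feature dictionary like this one, computed for the author of the message under consideration, could then be fed to any standard classifier. The message text is never inspected, which is why such features are not affected by obfuscated wording.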


Text categorization, Abuse detection, Online communities, Moderation



This work was financed by a grant from the Provence-Alpes-Côte d’Azur region (France) and the Nectar de Code company.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Etienne Papegnies (1, 2)
  • Vincent Labatut (1)
  • Richard Dufour (1)
  • Georges Linarès (1)
  1. LIA – EA 4128, University of Avignon, Avignon, France
  2. Nectar de Code, Barbentane, France
