On the Design and Tuning of Machine Learning Models for Language Toxicity Classification in Online Platforms

  • Maciej Rybinski
  • William Miller
  • Javier Del Ser
  • Miren Nekane Bilbao
  • José F. Aldana-Montes
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 798)


One of the most concerning drawbacks of the lack of supervision in online platforms is their exploitation by misbehaving users to deliver offensive (toxic) messages while remaining anonymous. Given the huge volumes of data handled by these platforms, detecting toxicity in exchanged comments and messages has naturally called for machine learning models that automate the task. In the last few years, Deep Learning models and related techniques have played a major role in this regard due to their superior modeling capabilities, which have made them the prevailing choice in the related literature. By addressing a toxicity classification problem over a real dataset, this work aims to shed light on two aspects of this dominance of Deep Learning models: (1) an empirical assessment of their predictive gains with respect to traditional Shallow Learning models; and (2) the impact of different text embedding methods and data augmentation techniques on this classification task. Our findings reveal that, in our case study, non-optimized Shallow and Deep Learning models attain very competitive accuracy scores, leaving only a narrow margin of improvement for fine-grained refinement of the models or the addition of data augmentation techniques.


Natural language processing, Deep learning, Online platforms, Multilabel classification



The work of Maciej Rybinski has been partially funded by grant TIN2017-86049-R (Ministerio de Economía, Industria y Competitividad, Spain). Javier Del Ser also thanks the Basque Government for its funding support through the EMAITEK program.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Maciej Rybinski (1)
  • William Miller (2)
  • Javier Del Ser (3, 4, 5)
  • Miren Nekane Bilbao (5)
  • José F. Aldana-Montes (1)

  1. University of Málaga, Málaga, Spain
  2. Anami Precision, San Sebastián, Spain
  3. TECNALIA, Bizkaia, Spain
  4. Basque Center for Applied Mathematics (BCAM), Bizkaia, Spain
  5. University of the Basque Country (UPV/EHU), Bilbao, Spain
