Content-Based Classification Approach for Video-Spam Identification

  • Palak Agarwal
  • Mahak Sharma
  • Gagandeep Kaur
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 736)


This paper addresses comment spam on YouTube. The work was carried out on a large, labeled dataset of text comments. The data were filtered and pre-processed to speed up detection, eliminate redundancies, and increase accuracy. The spam flags on the text comments were used to check the accuracy of the implemented classification techniques. An improved algorithm based on term frequencies is also proposed. Results were compared using accuracy score and F-score, computed against the spam flag corresponding to each comment. Further, the accuracy of the SVM model was compared across dataset sizes, with and without pre-processing, and against XGBoost.
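The evaluation described above, scoring predictions against per-comment spam flags, can be illustrated with a minimal sketch. It assumes binary labels (1 = spam, 0 = ham) and takes the F-score to be the standard F1 measure with spam as the positive class; the function name and the toy data are hypothetical and are not taken from the paper.

```python
def accuracy_and_f_score(spam_flags, predictions):
    """Return (accuracy, F1), treating spam (1) as the positive class.

    Assumption: labels are binary, 1 = spam and 0 = ham.
    """
    if len(spam_flags) != len(predictions):
        raise ValueError("spam_flags and predictions must have the same length")
    if not spam_flags:
        raise ValueError("at least one comment is required")

    pairs = list(zip(spam_flags, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # spam caught
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # ham flagged as spam
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # spam missed
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # ham passed through

    accuracy = (tp + tn) / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives.
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy data: ground-truth spam flags and classifier output.
flags = [1, 0, 1, 1, 0, 0]
preds = [1, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f_score(flags, preds)
print(f"accuracy={acc:.3f}  F1={f1:.3f}")
```

Reporting both numbers matters when spam and ham are imbalanced: a classifier that labels every comment as ham can score high accuracy while its F-score for the spam class is zero.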


Spam/Ham XGBoost TF-IDF RCA SVM LDA Video Security 



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of CSE & IT, JIIT, Noida, India
