Topic-Specific YouTube Crawling to Detect Online Radicalization

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNISA, volume 8999)

Abstract

Online video sharing platforms such as YouTube host many videos and users promoting hate and extremism. Because of the low barrier to publication and the anonymity it offers, YouTube is misused by some users and communities to post videos disseminating hatred against a particular religion, country or person. We formulate the identification of such malicious videos as a search problem and present a focused-crawler based approach built from several components: a search strategy or algorithm, a node similarity computation metric, learning from exemplary profiles that serve as training data, a stopping criterion, a node classifier and a queue manager. We implement two versions of the focused crawler: best-first search and shark search. We conduct a series of experiments by varying the seed, the number of n-grams in the language model based comparer and the similarity threshold for the classifier, and we report the results using standard Information Retrieval metrics such as precision, recall and F-measure. The accuracy of the proposed solution on the sample dataset is 69% for best-first search and 74% for shark search. We perform a characterization study (by manual and visual inspection) of the anti-India hate and extremism promoting videos retrieved by the focused crawler, based on the terms present in the video titles, YouTube category, average video length, content focus and target audience. We also present the results of applying Social Network Analysis based measures to extract communities and identify core and influential users.
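
The best-first variant described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy link graph, the profile texts, the character-trigram cosine similarity, the threshold of 0.3 and all function names are assumptions chosen for illustration, and no YouTube API is called. The sketch only shows how a similarity metric, a threshold classifier, a priority queue and a visit budget fit together.

```python
import heapq
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_first_crawl(graph, profiles, seed, exemplar, threshold=0.3, max_visits=100):
    """Visit nodes in descending order of similarity to the exemplar text.

    graph:    hypothetical dict mapping node id -> list of linked node ids
    profiles: hypothetical dict mapping node id -> profile text
    Returns the nodes classified as relevant, in visit order.
    """
    target = char_ngrams(exemplar)

    def score(node):
        return cosine(char_ngrams(profiles[node]), target)

    frontier = [(-score(seed), seed)]  # heapq is a min-heap, so negate scores
    seen = {seed}
    relevant = []
    visits = 0
    while frontier and visits < max_visits:  # stopping criterion: visit budget
        neg_score, node = heapq.heappop(frontier)
        visits += 1
        if -neg_score < threshold:  # node classifier: below threshold, do not expand
            continue
        relevant.append(node)
        for neighbour in graph.get(node, []):  # queue manager: enqueue unseen links
            if neighbour not in seen:
                seen.add(neighbour)
                heapq.heappush(frontier, (-score(neighbour), neighbour))
    return relevant


# Toy data on a neutral topic (assumed, not from the paper).
graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
profiles = {
    "a": "pasta recipes and cooking tips",
    "b": "cooking pasta at home",
    "c": "football match highlights",
    "d": "easy pasta cooking recipes",
}
found = best_first_crawl(graph, profiles, seed="a", exemplar="cooking pasta recipes")
print(found)  # ['a', 'b', 'd']
```

In this sketch an irrelevant node is simply dropped. Shark search, as described by Hersovici et al., differs in that a link's priority also inherits a decayed share of its parent's relevance score, so links reached through weakly relevant nodes can still be explored.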

Keywords

  • Mining User Generated Content
  • Social Media Analytics
  • Information Retrieval
  • Focused Crawler
  • Social Network Analysis
  • Hate and Extremism Detection
  • Video Sharing Website
  • Online Radicalization

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Agarwal, S., Sureka, A. (2015). Topic-Specific YouTube Crawling to Detect Online Radicalization. In: Chu, W., Kikuchi, S., Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2015. Lecture Notes in Computer Science, vol 8999. Springer, Cham. https://doi.org/10.1007/978-3-319-16313-0_10

  • DOI: https://doi.org/10.1007/978-3-319-16313-0_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16312-3

  • Online ISBN: 978-3-319-16313-0

  • eBook Packages: Computer Science, Computer Science (R0)