Co-Detection of crowdturfing microblogs and spammers in online social networks

  • Bo Liu (corresponding author)
  • Xiangguo Sun
  • Zeyang Ni
  • Jiuxin Cao
  • Junzhou Luo
  • Benyuan Liu
  • Xinwen Fu
Article
Part of the following topical collections:
  1. Special Issue on Trust, Privacy, and Security in Crowdsourcing Computing

Abstract

The rise of online crowdsourcing services has driven an evolution from traditional spamming accounts, which spread unwanted advertisements and fraudulent content, to novel spammers whose behavior resembles that of normal users. Prior research has mainly studied machine accounts and spam separately, but the characteristics of these new types of spammers and spam make it difficult for traditional methods to perform well. In this paper, we integrate the study of these new types of spammers with the study of crowdturfing microblogs, investigating the mechanism of crowdsourcing and the close relationship between crowdturfing spammers and microblogs in order to detect both more precisely. We propose a novel semi-supervised learning framework for co-detecting crowdturfing microblogs and spammers by jointly modeling user behavior, message content, and users’ following and retweeting networks. To meet the challenge of sparsely labeled datasets, we design a co-detection objective function that minimizes empirical error and allows sparse labels to propagate to unlabeled samples. The advantage of this framework is threefold. First, through in-depth mining of new-type spammers, we aggregate a number of newly identified features that help distinguish new-type spammers from normal users. Second, by modeling both following networks and retweeting networks, we characterize the crowdsourcing mechanism that spammers abuse in crowdturfing microblog diffusion, which markedly increases detection performance. Third, through our semi-supervised objective function, we overcome the problem of label sparseness and thus deal more reliably with large, sparsely labeled data. Extensive experiments on real datasets demonstrate that our method outperforms four baselines on various metrics (including Precision-Recall, AUC, and Precision@K). We also develop a robust system whose functions include data collection and availability analysis, spam and spammer detection, and visualization. To make our experiments replicable, we have made our dataset and code openly available at https://github.com/sunxiangguo/Crowdturfing.
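
The idea of letting a few known labels spread to unlabeled samples over a network can be illustrated with a minimal sketch. The code below is not the objective function proposed in the paper; it is a generic iterative label propagation over a small undirected graph, swapped in purely for illustration. The function name `propagate_labels`, the parameters `alpha` and `iterations`, and the toy graph are all assumptions made for this example.

```python
def propagate_labels(adjacency, seeds, alpha=0.8, iterations=50):
    """Spread sparse seed scores over a graph (illustrative sketch only).

    adjacency: dict mapping each node to a list of its neighbours
               (hypothetical toy input, treated as undirected).
    seeds:     dict mapping the few labeled nodes to a score,
               +1.0 for a known spammer and -1.0 for a known normal user.
    alpha:     weight given to the neighbours' average score; the
               remaining (1 - alpha) pulls a node back to its seed label.
    """
    nodes = list(adjacency)
    # Unlabeled nodes start from a neutral score of 0.0.
    scores = {node: seeds.get(node, 0.0) for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            neighbours = adjacency[node]
            if neighbours:
                mean = sum(scores[m] for m in neighbours) / len(neighbours)
            else:
                mean = 0.0
            updated[node] = alpha * mean + (1 - alpha) * seeds.get(node, 0.0)
        scores = updated
    return scores


# Hypothetical chain a-b-c-f-e-d with one labeled node at each end.
toy_graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "f"],
    "f": ["c", "e"],
    "e": ["f", "d"],
    "d": ["e"],
}
toy_seeds = {"a": 1.0, "d": -1.0}

scores = propagate_labels(toy_graph, toy_seeds)
for node in sorted(scores):
    label = "spammer-like" if scores[node] > 0 else "normal-like"
    print(node, round(scores[node], 3), label)
```

In this toy run only two of the six nodes carry a label, yet every node ends with a signed score: nodes closer to the labeled spammer lean positive and nodes closer to the labeled normal user lean negative. A real system would combine such propagation with content and behavior features, as the abstract describes.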

Keywords

Crowdsourcing; Spammer detection; Semi-supervised learning; Online social networks

Notes

Acknowledgments

This work is supported by the National Key R&D Program of China (2017YFB1003000); the National Natural Science Foundation of China under Grants No. 61972087, No. 61772133, No. 61472081, and No. 61402104; Jiangsu Provincial Key Project BE2018706; the Key Laboratory of Computer Network Technology of Jiangsu Province; the Jiangsu Provincial Key Laboratory of Network and Information Security under Grant No. BM2003201; and the Key Laboratory of Computer Network and Information Integration of the Ministry of Education of China under Grant No. 93K-9.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science and Engineering, Southeast University, Nanjing, China
  2. School of Cyber Science and Engineering, Southeast University, Nanjing, China
  3. Department of Computer Science, University of Central Florida, Orlando, USA
  4. Department of Computer Science, University of Massachusetts Lowell, Lowell, USA
