Detecting Spammers with Changing Strategies via a Transfer Distance Learning Method

  • Hao Chen
  • Jun Liu
  • Yanzhang Lv
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11323)

Abstract

Social spammers harm both social networking sites and their legitimate users, and it is widely agreed that they should be detected and filtered. Existing social spammer detection approaches mainly focus on discovering discriminative features and organizing them in a proper way to improve detection performance, e.g., by combining multiple features. However, spammers can easily evade detection by changing their spamming strategies. Varying spamming strategies cause the data distribution to differ between training and testing data, so previous fixed approaches struggle to achieve the desired performance in real applications. To address this, we present a transfer distance learning approach, which combines distance learning and transfer learning to extract informative knowledge underlying training and testing instances in a unified framework. The proposed approach is validated on large real-world data. Empirical results provide evidence that our method is efficient at detecting spammers with changing spamming strategies.

Keywords

Transfer distance learning · Social spammer detection · Spamming strategies

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. National Engineering Lab for Big Data Analytics, Xi’an Jiaotong University, Xi’an, China
  2. School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
  3. Shaanxi Province Key Laboratory of Satellite and Terrestrial Network Tech. R&D, Xi’an, China
