World Wide Web, Volume 16, Issue 5–6, pp 729–748

Shilling attack detection utilizing semi-supervised learning method for collaborative recommender system

  • Jie Cao
  • Zhiang Wu
  • Bo Mao
  • Yanchun Zhang


Collaborative filtering (CF) techniques can generate personalized recommendations. However, recommender systems that use CF as their key algorithm are vulnerable to shilling attacks, which insert malicious user profiles into the system to push or nuke the reputations of targeted items. Most practical recommender systems have only a small number of labeled users, while a large number of users remain unlabeled because it is expensive to obtain their identities. This paper proposes Semi-SAD, a new semi-supervised learning based shilling attack detection algorithm that takes advantage of both types of data. It first trains a naïve Bayes classifier on a small set of labeled users, and then incorporates unlabeled users with EM-λ to improve the initial naïve Bayes classifier. Experiments on MovieLens datasets compare the efficiency of Semi-SAD with that of a supervised learning based detector and an unsupervised learning based detector. The results indicate that Semi-SAD detects various kinds of shilling attacks better than the other two detectors, especially obfuscated and hybrid shilling attacks.
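The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the user features or the likelihood model, so the sketch assumes hypothetical numeric per-user features with Gaussian class-conditional densities, and it assumes that EM-λ scales the contribution of unlabeled users in the M-step by a weight `lam` between 0 and 1. All function names, parameter values, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def fit_weighted_gnb(X, R, w):
    """M-step: fit Gaussian naive Bayes from soft labels R (n x k) and sample weights w (n,)."""
    W = R * w[:, None]                      # effective weight of each user for each class
    Nk = W.sum(axis=0) + 1e-9               # weighted class counts
    prior = Nk / Nk.sum()
    mean = (W.T @ X) / Nk[:, None]
    second = (W.T @ X**2) / Nk[:, None]
    var = np.maximum(second - mean**2, 0.0) + 1e-3   # floor keeps variances positive
    return prior, mean, var

def predict_proba(X, model):
    """E-step: posterior class probabilities under the current model."""
    prior, mean, var = model
    diff2 = (X[:, None, :] - mean[None]) ** 2
    # Naive Bayes assumption: per-feature log densities are summed
    ll = -0.5 * (np.log(2 * np.pi * var)[None] + diff2 / var[None]).sum(axis=2)
    logp = ll + np.log(prior)[None]
    logp -= logp.max(axis=1, keepdims=True)  # stabilise before exponentiating
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)

def semi_supervised_nb(X_lab, y_lab, X_unl, lam=0.5, n_iter=20, n_classes=2):
    """Train on labeled users, then refine with unlabeled users down-weighted by lam."""
    R_lab = np.eye(n_classes)[y_lab]         # hard one-hot labels stay fixed
    model = fit_weighted_gnb(X_lab, R_lab, np.ones(len(X_lab)))
    X_all = np.vstack([X_lab, X_unl])
    w = np.concatenate([np.ones(len(X_lab)), np.full(len(X_unl), lam)])
    for _ in range(n_iter):
        R_unl = predict_proba(X_unl, model)              # E-step on unlabeled users
        model = fit_weighted_gnb(X_all, np.vstack([R_lab, R_unl]), w)  # M-step
    return model

# Synthetic demo (assumption): class 0 = genuine users, class 1 = attack profiles.
rng = np.random.default_rng(0)
X_lab = np.vstack([rng.normal(0, 1, (10, 2)), rng.normal(3, 1, (10, 2))])
y_lab = np.array([0] * 10 + [1] * 10)
X_unl = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (200, 2))])
y_unl = np.array([0] * 200 + [1] * 200)      # used only to score the demo

model = semi_supervised_nb(X_lab, y_lab, X_unl, lam=0.5)
probs = predict_proba(X_unl, model)
acc = (probs.argmax(axis=1) == y_unl).mean()
print(f"accuracy on unlabeled users: {acc:.3f}")
```

Setting `lam=0` in this sketch reduces it to the purely supervised naïve Bayes classifier, while `lam=1` treats unlabeled users the same as labeled ones; intermediate values limit how far the unlabeled users can pull the model away from the labeled data.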


Keywords: semi-supervised learning; shilling attack detection; collaborative filtering; naïve Bayes; EM





Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Jiangsu Provincial Key Laboratory of E-Business, Nanjing University of Finance and Economics, Nanjing, China
  2. School of Computer Science & Mathematics, Victoria University, Melbourne, Australia
