
Shilling attack detection utilizing semi-supervised learning method for collaborative recommender system

World Wide Web


Collaborative filtering (CF) can generate personalized recommendations. However, recommender systems that use CF as their key algorithm are vulnerable to shilling attacks, which insert malicious user profiles into the system to push or nuke the reputations of targeted items. Most practical recommender systems have only a small number of labeled users, while a large number of users remain unlabeled because it is expensive to obtain their identities. This paper proposes Semi-SAD, a new shilling attack detection algorithm based on semi-supervised learning that takes advantage of both types of data. It first trains a naïve Bayes classifier on a small set of labeled users, and then incorporates unlabeled users with EM-λ to improve the initial naïve Bayes classifier. Experiments on MovieLens datasets compare the efficiency of Semi-SAD with that of a supervised learning based detector and an unsupervised learning based detector. The results indicate that Semi-SAD detects various kinds of shilling attacks better than these detectors, especially obfuscated and hybrid shilling attacks.
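The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the preview does not specify the user features or the exact form of the classifier, so the sketch assumes continuous per-user features, a Gaussian naïve Bayes model, and two classes (0 = genuine, 1 = attacker). The function names `fit_em_lambda` and `predict`, the variance floor, and the default `lam` value are all hypothetical. The weighting of unlabeled users by λ in the M-step follows the general EM-λ idea of down-weighting unlabeled data relative to labeled data.

```python
import numpy as np

def _m_step(X, resp, weights, var_floor=1e-3):
    """Re-estimate Gaussian naive Bayes parameters from weighted soft labels.

    resp    : (n, 2) class responsibilities for each user
    weights : (n,) per-user weight (1 for labeled, lambda for unlabeled)
    """
    w = resp * weights[:, None]                      # weighted responsibilities
    totals = w.sum(axis=0) + 1e-12                   # effective count per class
    priors = (totals + 1.0) / (totals.sum() + 2.0)   # Laplace-smoothed priors
    means = (w.T @ X) / totals[:, None]
    var = (w.T @ (X ** 2)) / totals[:, None] - means ** 2
    return priors, means, np.maximum(var, var_floor)  # floor avoids zero variance

def _e_step(X, priors, means, var):
    """Posterior class probabilities under the current parameters."""
    diff = X[:, None, :] - means[None, :, :]
    log_lik = -0.5 * (np.log(2 * np.pi * var)[None] + diff ** 2 / var[None]).sum(axis=2)
    log_post = np.log(priors)[None] + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def fit_em_lambda(X_lab, y_lab, X_unl, lam=0.5, n_iter=20):
    """Train on labeled users, then refine with lambda-weighted unlabeled users."""
    resp_lab = np.eye(2)[y_lab]                       # hard labels as one-hot rows
    # Stage 1: initial naive Bayes classifier from labeled users only.
    params = _m_step(X_lab, resp_lab, np.ones(len(X_lab)))
    X_all = np.vstack([X_lab, X_unl])
    weights = np.concatenate([np.ones(len(X_lab)), np.full(len(X_unl), lam)])
    # Stage 2: EM iterations; labeled users keep their known labels.
    for _ in range(n_iter):
        resp_unl = _e_step(X_unl, *params)            # E-step on unlabeled users
        resp_all = np.vstack([resp_lab, resp_unl])
        params = _m_step(X_all, resp_all, weights)    # M-step on all users
    return params

def predict(X, params):
    """Return 0 (genuine) or 1 (attacker) for each row of X."""
    return _e_step(X, *params).argmax(axis=1)
```

Setting `lam=0` makes the unlabeled users contribute nothing, which reduces the sketch to the supervised naïve Bayes classifier of the first stage; `lam=1` treats unlabeled users as equal in weight to labeled ones.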




Author information


Corresponding author

Correspondence to Zhiang Wu.


About this article

Cite this article

Cao, J., Wu, Z., Mao, B. et al. Shilling attack detection utilizing semi-supervised learning method for collaborative recommender system. World Wide Web 16, 729–748 (2013).
