Machine Learning, Volume 106, Issue 6, pp 771–798

Stream-based semi-supervised learning for recommender systems



To alleviate the data sparsity inherent to recommender systems, we propose a semi-supervised framework for stream-based recommendations. The framework exploits abundant unlabelled information to improve the quality of recommendations. We extend a state-of-the-art matrix factorization algorithm with the ability to add new dimensions to the matrix at runtime, and we implement two approaches to semi-supervised learning: co-training and self-learning. We also introduce a new evaluation protocol that includes statistical testing and parameter optimization. We then evaluate the framework on five real-world datasets in a stream setting. On all five datasets, our method achieves statistically significant improvements in the quality of recommendations.
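The two technical ideas named in the abstract, a factorization that gains new dimensions at runtime and co-training on unlabelled user-item pairs, can be illustrated with a minimal sketch. This is not the authors' implementation. It uses plain SGD matrix factorization in place of the state-of-the-art algorithm the paper extends. The names `GrowingMF` and `co_training_step`, the hyperparameter values, and the down-weighting of pseudo-labels are all assumptions made for illustration.

```python
import numpy as np


class GrowingMF:
    """Hypothetical incremental matrix factorization whose factor matrices grow at runtime."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.02, seed=0):
        self.k, self.lr, self.reg = n_factors, lr, reg
        self.rng = np.random.default_rng(seed)
        self.users, self.items = {}, {}      # id -> row index
        self.P = np.empty((0, n_factors))    # user latent factors
        self.Q = np.empty((0, n_factors))    # item latent factors

    def _new_row(self):
        return self.rng.normal(0.0, 0.1, size=(1, self.k))

    def _user_row(self, user):
        # An unseen user adds a row to P, i.e. a new dimension of the rating matrix.
        if user not in self.users:
            self.users[user] = len(self.users)
            self.P = np.vstack([self.P, self._new_row()])
        return self.users[user]

    def _item_row(self, item):
        # An unseen item adds a row to Q in the same way.
        if item not in self.items:
            self.items[item] = len(self.items)
            self.Q = np.vstack([self.Q, self._new_row()])
        return self.items[item]

    def predict(self, user, item):
        u, i = self._user_row(user), self._item_row(item)
        return float(self.P[u] @ self.Q[i])

    def update(self, user, item, rating, weight=1.0):
        # One regularized SGD step on a single (user, item, rating) observation.
        u, i = self._user_row(user), self._item_row(item)
        err = rating - self.P[u] @ self.Q[i]
        step = self.lr * weight
        p_old = self.P[u].copy()
        self.P[u] += step * (err * self.Q[i] - self.reg * self.P[u])
        self.Q[i] += step * (err * p_old - self.reg * self.Q[i])


def co_training_step(learner_a, learner_b, user, item, weight=0.3):
    """Each learner takes one down-weighted step on the other's prediction."""
    # Both predictions are taken before any update, so neither pseudo-label
    # is influenced by the step it triggers.
    pred_a = learner_a.predict(user, item)
    pred_b = learner_b.predict(user, item)
    learner_a.update(user, item, pred_b, weight=weight)
    learner_b.update(user, item, pred_a, weight=weight)
    return pred_a, pred_b
```

In this sketch the two learners would be fed different parts of the rating stream so that their predictions differ. A single learner trained on its own prediction would see zero error, leaving only the regularization term in the update, which is why the example shows the co-training variant rather than self-learning.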


Recommender systems, Semi-supervised learning, Matrix factorization, Collaborative filtering, Stream mining



Copyright information

© The Author(s) 2017

Authors and Affiliations

  1. Otto-von-Guericke-University Magdeburg, Magdeburg, Germany
