Abstract
In recommender systems, supervised information is usually obtained from users' historical data. For example, if a user watched a movie, the user-movie pair is marked as positive. User-movie pairs that do not appear in the historical data, on the other hand, could be either positive or negative. This motivates us to formalize the recommendation task as a positive-unlabeled (PU) learning problem. Because a model trained on biased historical data may not generalize well to future data, we propose an active learning approach that improves the model by querying further labels from the unlabeled data pool. To query as few instances as possible, we propose an active selection strategy that minimizes the expected loss and matches the distribution between labeled and unlabeled data. Experiments are performed on both classification datasets and a movie recommendation dataset. The results demonstrate that the proposed approach can significantly reduce the labeling cost while achieving superior performance on multiple criteria.
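The PU setup described in the abstract can be illustrated with a minimal sketch. It assumes users and movies are indexed by integers and that a model already provides a score in [0, 1] for each pair. The function names `build_pu_labels` and `select_queries` are hypothetical, and the selection rule shown is plain uncertainty sampling, used only as a stand-in: it is not the expected-loss and distribution-matching criterion proposed in the paper.

```python
def build_pu_labels(n_users, n_items, watched_pairs):
    """Build a PU label matrix: 1 = observed positive, 0 = unlabeled.

    A 0 does NOT mean negative; the pair may be positive or negative.
    """
    labels = [[0] * n_items for _ in range(n_users)]
    for user, item in watched_pairs:
        labels[user][item] = 1
    return labels


def select_queries(scores, labels, budget):
    """Return up to `budget` unlabeled (user, item) pairs to query.

    Stand-in rule (an assumption, not the paper's strategy): prefer the
    pairs whose predicted score is closest to 0.5, i.e. the most uncertain.
    """
    unlabeled = [
        (u, i)
        for u, row in enumerate(labels)
        for i, y in enumerate(row)
        if y == 0
    ]
    # Python's sort is stable, so ties keep their row-major order.
    unlabeled.sort(key=lambda pair: abs(scores[pair[0]][pair[1]] - 0.5))
    return unlabeled[:budget]
```

Calling `build_pu_labels(2, 3, [(0, 0), (0, 2), (1, 1)])` marks the three watched pairs as positive and leaves the rest unlabeled; `select_queries` then ranks only those unlabeled pairs, so pairs already known to be positive are never queried.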

Ethics declarations
Conflict of interest
COI affiliations: Nanjing University (nju.edu.cn); Nanjing University of Aeronautics and Astronautics (nuaa.edu.cn). There is no other conflict of interest.
About this article
Cite this article
Chen, JL., Cai, JJ., Jiang, Y. et al. PU Active Learning for Recommender Systems. Neural Process Lett 53, 3639–3652 (2021). https://doi.org/10.1007/s11063-021-10496-9
DOI: https://doi.org/10.1007/s11063-021-10496-9
Keywords
- PU learning
- Active learning
- Recommender systems
- Implicit feedback