Abstract
Guest users, single-time clients who use an online service anonymously without prior registration, are common in real-world recommendation applications. Industrial recommendation systems must therefore handle the “cold-start” problem, in which no existing interactions between new users and recommendable items are available to inform predictions. Prior work addresses this problem by learning profile-based user representations to bootstrap recommendations for new users. However, this process is often either invasive, requiring new users to submit personal data, or shallow, yielding representations too unexpressive for accurate recommendations. In this work, we propose new representations for guest users based on their “content basket”: a set of seed items that the user submits in order to use the service, which allows each user to be represented as a function of a collection of items. In parallel, we design a graph representation space in which items (nodes) are connected by edges that signify joint, written recommendations between items. We propose a graph neural network architecture that inductively learns item and inter-item (edge) representations as a combination of deep language encodings of textual content descriptions and graph embeddings learned via message passing on the edges. This scheme enables effective generalization to items unseen during training. To demonstrate the effectiveness of our model in a real-world setting in which guest users are prevalent, we present a new dataset for anime recommendations, AnimeULike, containing anonymized interactions between 13k users and 10k animes, together with an accompanying recommendation engine that can serve guest users exclusively. Our empirical results on AnimeULike and a standard recommender systems benchmark dataset demonstrate significant performance improvements over previous cold-start solutions that do not learn to dynamically represent new users.
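To make the content-basket idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes item embeddings are already available (here a hypothetical toy matrix), that the "function of a collection of items" is a simple mean, and that items are scored by dot product; the names basket_user_embedding and recommend are illustrative only.

```python
import numpy as np

def basket_user_embedding(item_emb, basket):
    """Represent a guest user as the mean of their seed items' embeddings.

    Mean pooling is an assumption made for illustration; the paper only
    states that a user is a function of a collection of items.
    """
    if len(basket) == 0:
        raise ValueError("a content basket needs at least one seed item")
    return item_emb[np.asarray(basket)].mean(axis=0)

def recommend(item_emb, basket, k=3):
    """Rank items outside the basket by dot-product score with the user vector."""
    user = basket_user_embedding(item_emb, basket)
    scores = item_emb @ user
    scores[np.asarray(basket)] = -np.inf  # never recommend the seed items back
    return np.argsort(-scores)[:k].tolist()

# Toy item embeddings (hypothetical values standing in for learned ones).
item_emb = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
    [0.8, 0.2],
])

# A guest user submits items 0 and 1 as their content basket.
top = recommend(item_emb, basket=[0, 1], k=2)
print(top)  # [4, 3]
```

No user identifier or personal data enters the computation: the guest user exists only as the pooled vector of the items they submitted.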
Availability of data and materials
Our new dataset has been deposited at https://doi.org/10.7910/DVN/PT14ML.
Code availability
Our code for data collection and the implemented experiments is available at https://github.com/shiningsunnyday/animeulike.
Notes
The name “DropoutNet” is slightly misleading if the substitutions made are \(V_v \leftarrow 0\) and \(U_u \leftarrow 0\). We keep more flexibility in the choice of the mask function.
It may be that on AnimeULike \(U_u\) was already essentially noise, so DN could not overfit to \(U\), which allowed the user transform approximation to help to some extent.
This is akin to DNN (with the GNN removed), except that the language encoder is differentiable.
Visualized using Cytoscape.js.
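The first note above contrasts zero substitution (\(U_u \leftarrow 0\)) with a more flexible mask function. The sketch below illustrates that distinction under stated assumptions: mask_preferences is a hypothetical helper, and the mean-row mask is only one possible alternative chosen for illustration, not the mask used in the paper.

```python
import numpy as np

def mask_preferences(U, drop_rows, mask_fn=None):
    """Return a copy of U with the selected rows replaced by mask_fn(U).

    With mask_fn=None the rows are zeroed, matching the substitution
    U_u <- 0 described in the note. Any other mask_fn is an assumption.
    """
    U_masked = U.copy()
    if mask_fn is None:
        mask_fn = lambda M: np.zeros(M.shape[1])
    U_masked[np.asarray(drop_rows)] = mask_fn(U)
    return U_masked

# Toy preference matrix with three users (hypothetical values).
U = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

zeroed = mask_preferences(U, [0])                                       # U_0 <- 0
averaged = mask_preferences(U, [0], mask_fn=lambda M: M.mean(axis=0))   # U_0 <- mean row
```

Passing the mask as a function keeps the training loop unchanged when the substitution rule is swapped.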
References
Ahmadian, S., Afsharchi, M., Meghdadi, M.: A novel approach based on multi-view reliability measures to alleviate data sparsity in recommender systems. Multimedia Tools Appl. 78(13), 17763–17798 (2019)
Bernardi, L., Kamps, J., Kiseleva, J., Müller, M.: The continuous cold-start problem in e-commerce recommender systems. CBRecSys@RecSys (2015)
Bobadilla, J., Ortega, F., Hernando, A., Gutiérrez, A.: Recommender systems survey. Knowl.-Based Syst. 46, 109–132 (2013). https://doi.org/10.1016/j.knosys.2013.03.012
Bogers, T., Van den Bosch, A.: Recommending scientific articles using CiteULike. In: Proceedings of the 2008 ACM conference on Recommender systems, pp. 287–290 (2008)
Chee, S.H.S., Han, J., Wang, K.: Rectree: an efficient collaborative filtering method. In: International Conference on Data Warehousing and Knowledge Discovery. Springer, pp. 141–151 (2001)
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Eksombatchai, C., Jindal, P., Liu, J.Z., Liu, Y., Sharma, R., Sugnet, C., Ulrich, M., Leskovec, J.: Pixie: a system for recommending 3+ billion items to 200+ million users in real-time. In: Proceedings of the 2018 world wide web conference, pp. 1775–1784 (2018)
Gopalan, P., Hofman, J., Blei, D.: Scalable recommendation with hierarchical poisson factorization. In: Proceedings of the thirty-first conference on uncertainty in artificial intelligence, pp. 326–335 (2015)
Hamilton, W.L., Ying, R., Leskovec, J.: Inductive representation learning on large graphs. In: NIPS (2017)
Hao, B., Zhang, J., Yin, H., Li, C., Chen, H.: Pre-training graph neural networks for cold-start users and items representation. In: Proceedings of the 14th ACM International Conference on Web Search and Data Mining, pp. 265–273 (2021)
He, X., Deng, K., Wang, X., Li, Y., Zhang, Y., Wang, M.: LightGCN: simplifying and powering graph convolution network for recommendation. In: Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval, pp. 639–648 (2020). https://doi.org/10.1145/3397271.3401063
Hofmann, T.: Latent semantic models for collaborative filtering. ACM Trans. Inf. Syst. (TOIS) 22(1), 89–115 (2004)
Hu, W., Liu, B., Gomes, J., Zitnik, M., Liang, P., Pande, V., Leskovec, J.: Strategies for pre-training graph neural networks. In: International Conference on Learning Representations (2020)
Hu, Y., Koren, Y., Volinsky, C.: Collaborative filtering for implicit feedback datasets. In: 2008 Eighth IEEE International Conference on Data Mining. IEEE, pp. 263–272 (2008)
Hu, L., Jian, S., Cao, L., Gu, Z., Chen, Q., Amirbekyan, A.: HERS: modeling influential contexts with heterogeneous relations for sparse and cold-start recommendation. Proc. AAAI Conf. Artif. Intell. 33, 3830–3837 (2019). https://doi.org/10.1609/aaai.v33i01.33013830
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations (ICLR) (2017)
Nikolakopoulos, A.N., Karypis, G.: RecWalk: Nearly uncoupled random walks for top-N recommendation. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (WSDM ’19). Association for Computing Machinery, New York, NY, USA, pp. 150–158 (2019)
Kouki, P., Schaffer, J., Pujara, J., O’Donovan, J., Getoor, L.: Generating and understanding personalized explanations in hybrid recommender systems. ACM Trans. Interact. Intell. Syst. 10 (2020). https://doi.org/10.1145/3365843
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11, 3371–3408 (2010). http://jmlr.org/papers/v11/vincent10a.html
Pi, Q., Bian, W., Zhou, G., Zhu, X., Gai, K.: Practice on long sequential user behavior modeling for click-through rate prediction. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2019)
Singh, M.: Scalability and sparsity issues in recommender datasets: a survey. Knowl. Inf. Syst. 62(1), 1–43 (2020)
Su, X., Khoshgoftaar, T.M.: Collaborative filtering for multi-class data using belief nets algorithms. In: 2006 18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI’06). IEEE, pp. 497–504 (2006)
Van Den Oord, A., Dieleman, S., Schrauwen, B.: Deep content-based music recommendation. In: Neural Information Processing Systems Conference (NIPS 2013), Vol. 26. Neural Information Processing Systems Foundation (NIPS) (2013)
Volkovs, M., Yu, G.W., Poutanen, T.: DropoutNet: addressing cold start in recommender systems. In: NIPS, pp. 4957–4966 (2017)
Wang, C., Blei, D.M.: Collaborative topic modeling for recommending scientific articles. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 448–456 (2011)
Wang, S., Hu, L., Wang, Y., He, X., Sheng, Q., Orgun, M., Cao, L., Ricci, F., Yu, P.: Graph learning based recommender systems: a review. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Survey Track, pp. 4644–4652 (2021). https://doi.org/10.24963/ijcai.2021/630
Lika, B., Kolomvatsos, K., Hadjiefthymiades, S.: Facing the cold start problem in recommender systems. Expert Syst. Appl. 41, 2065–2073 (2014). https://doi.org/10.1016/j.eswa.2013.09.005
Wang, H., Wang, N., Yeung, D.-Y.: Collaborative deep learning for recommender systems. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1235–1244 (2015)
Wang, H., Zhang, F., Zhang, M., Leskovec, J., Zhao, M., Li, W., Wang, Z.: Knowledge-aware graph neural networks with label smoothness regularization for recommender systems. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’19). Association for Computing Machinery, New York, NY, USA, pp. 968–977 (2019)
Wang, S., Hu, L., Cao, L.: Perceiving the next choice with comprehensive transaction embeddings for online recommendation. ECML/PKDD (2017)
Wang, S., Hu, L., Cao, L., Huang, X., Lian, D., Liu, W.: Attention-based transactional context embedding for next-item recommendation. In: AAAI Conference on Artificial Intelligence (2018)
Wang, X., He, X., Cao, Y., Liu, M., Chua, T.: KGAT: knowledge graph attention network for recommendation. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 950–958 (2019)
Wang, X., He, X., Wang, M., Feng, F., Chua, T.: Neural graph collaborative filtering. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 165–174 (2019). https://doi.org/10.1145/3331184.3331267
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: HuggingFace’s transformers: state-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019)
Wu, C.-Y., Ahmed, A., Beutel, A., Smola, A.J., Jing, H.: Recurrent recommender networks. In: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, pp. 495–503 (2017)
Xue, H.-J., Dai, X., Zhang, J., Huang, S., Chen, J.: Deep matrix factorization models for recommender systems. In: IJCAI, vol. 17, pp. 3203–3209. Melbourne, Australia (2017)
Ying, R., He, R., Chen, K., Eksombatchai, P., Hamilton, W.L., Leskovec, J.: Graph convolutional neural networks for web-scale recommender systems. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 974–983 (2018)
Zhang, J., Shi, X., Zhao, S., King, I.: STAR-GCN: stacked and reconstructed graph convolutional networks for recommender systems. In: International Joint Conference on Artificial Intelligence (2019)
Huang, Z., Zeng, D.: Why does collaborative filtering work? Recommendation model validation and selection by analyzing bipartite random graphs. INFORMS J. Comput. 23, 138–152 (2011)
Zhang, Z.-K., Liu, C., Zhang, Y.-C., Zhou, T.: Solving the cold-start problem in recommender systems with social tags. EPL (Europhys. Lett.) 92, 28002 (2010)
Zhang, M., Chen, Y.: Inductive matrix completion based on graph neural networks. In: International Conference on Learning Representations (2020)
Zhu, Z., Sefati, S., Saadatpanah, P., Caverlee, J.: Recommendation for new users and new items via randomized training and mixture-of-experts transformation. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (2020)
Acknowledgements
We would like to thank Stanford professors Dr. Jure Leskovec, Dr. Chris Manning and EPFL professor Dr. Antoine Bosselut for their mentoring and support for the project. We also want to acknowledge the Stanford undergraduate students who helped during the initial phases of the research when it was a course project. Finally, the experimental results would not have been possible without the computational resources of the Stanford Network Analysis Project while the authors were students in the group.
Funding
No funding was received for conducting this study.
Author information
Contributions
Michael Sun made the most significant contribution to all steps in the research process, including the topic conception, the code implementation and the results presentation. He built the data curation pipeline, made the new dataset, implemented the recommendation framework, carried out the experiments, designed the ablation study, and analyzed the findings. He wrote the first draft of the manuscript. He also built and currently maintains the live website https://otakuroll.net/ showcasing the model. The contributing author contributed to the Introduction and Relevant Work sections of the manuscript. He brought valuable insights and background knowledge to the project, and helped proofread and revise the manuscript.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file 1 (mp4 234856 KB)
About this article
Cite this article
Sun, M., Wang, A. Privacy preserving cold-start recommendation for out-of-matrix users via content baskets. Int J Data Sci Anal 16, 237–253 (2023). https://doi.org/10.1007/s41060-023-00388-7
DOI: https://doi.org/10.1007/s41060-023-00388-7