Privacy preserving cold-start recommendation for out-of-matrix users via content baskets

  • Regular Paper
International Journal of Data Science and Analytics

Abstract

Guest users, single-time clients who use an online service anonymously without prior registration, are common in real-world recommendation applications. Industrial recommendation systems must therefore handle the “cold-start” problem, in which no existing interactions between new users and recommendable items are available to draw on when making predictions. Prior work addresses this problem by learning profile-based user representations to bootstrap recommendations for new users. However, this process is often either invasive, requiring new users to submit personal data, or shallow, yielding representations that are not expressive enough for accurate recommendations. In this work, we propose new representations for guest users based on their “content basket”: a set of seed items that the user submits in order to use the service, which allows each user to be represented as a function of a collection of items. Simultaneously, we design a graph representation space in which items (nodes) are connected by edges that signify joint, written recommendations between items. We propose a graph neural network architecture that inductively learns item and inter-item (edge) representations as a combination of deep language encodings of textual content descriptions and graph embeddings learned via message passing on the edges. This scheme enables effective generalization to items unseen during training. To demonstrate the effectiveness of our model in a real-world setting in which guest users are prevalent, we present a new dataset for anime recommendations, AnimeULike, containing anonymized interactions between 13k users and 10k anime, along with an accompanying recommendation engine that can serve guest users exclusively. Our empirical results on AnimeULike and a standard recommender systems benchmark dataset demonstrate significant performance improvements over previous cold-start solutions that do not learn to dynamically represent new users.
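The content-basket idea can be illustrated with a minimal sketch. It is not the architecture proposed in the paper: the item vectors, the edge list, the single round of mean-aggregation message passing, and the mean pooling of basket items are all simplifying assumptions chosen for illustration, and edge representations are omitted. The names `content`, `edges`, `embed`, and `recommend` are hypothetical.

```python
from math import sqrt

# Hypothetical toy data: each vector stands in for a language-model encoding
# of an item's text description (all values are invented for illustration).
content = {
    "A": [1.0, 0.0],
    "B": [0.8, 0.2],
    "C": [0.0, 1.0],
    "D": [0.1, 0.9],
}
# Undirected item-item edges, e.g. "A is recommended alongside B".
edges = [("A", "B"), ("C", "D"), ("B", "D")]


def neighbors(item):
    """Return the items that share an edge with `item`."""
    return [b if a == item else a for a, b in edges if item in (a, b)]


def mean(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def embed(item):
    """One round of mean-aggregation message passing (an assumed stand-in
    for a GNN layer): average the item's own content vector with the mean
    of its neighbors' content vectors."""
    nbrs = neighbors(item)
    if not nbrs:
        return content[item]
    return mean([content[item], mean([content[n] for n in nbrs])])


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v)))


def recommend(basket, k=2):
    """Represent the guest user as the mean embedding of the basket items
    (assumed pooling), then rank the remaining items by cosine similarity."""
    user = mean([embed(i) for i in basket])
    candidates = [i for i in content if i not in basket]
    ranked = sorted(candidates, key=lambda i: cosine(user, embed(i)), reverse=True)
    return ranked[:k]
```

Calling `recommend(["A"])` ranks the items outside the basket using only the submitted seed item, so no personal data or prior interaction history is needed. Because `embed` depends only on content vectors and edges, an item added after training can be embedded the same way, which is the inductive property the abstract describes.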

Availability of data and materials

Our new dataset has been deposited at https://doi.org/10.7910/DVN/PT14ML.

Code availability

Our code for data collection and the implemented experiments is available at https://github.com/shiningsunnyday/animeulike.

Notes

  1. The name “DropoutNet” is slightly misleading if the substitutions made are \(V_v \leftarrow 0\) and \(U_u \leftarrow 0\). We keep more flexibility in the choice of the mask function.

  2. It may be that on AnimeULike, \(U_u\) was essentially noise already, so DN could not overfit to \(U\), which is why the user transform approximation helped to an extent.

  3. This is akin to DNN (with the GNN removed), except that the language encoder is differentiable.

  4. http://otakuroll.net.

  5. https://www.anirec.net/

  6. Visualized using Cytoscape.js.

Acknowledgements

We would like to thank Stanford professors Dr. Jure Leskovec, Dr. Chris Manning and EPFL professor Dr. Antoine Bosselut for their mentoring and support for the project. We also want to acknowledge the Stanford undergraduate students who helped during the initial phases of the research when it was a course project. Finally, the experimental results would not have been possible without the computational resources of the Stanford Network Analysis Project while the authors were students in the group.

Funding

No funding was received for conducting this study.

Author information

Contributions

Michael Sun made the most significant contribution to all steps in the research process, including the topic conception, the code implementation and the results presentation. He built the data curation pipeline, made the new dataset, implemented the recommendation framework, carried out the experiments, designed the ablation study, and analyzed the findings. He wrote the first draft of the manuscript. He also built and currently maintains the live website https://otakuroll.net/ showcasing the model. The contributing author contributed to the Introduction and Relevant Work sections of the manuscript. He brought valuable insights and background knowledge to the project, and helped proofread and revise the manuscript.

Corresponding author

Correspondence to Michael Sun.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 234856 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, M., Wang, A. Privacy preserving cold-start recommendation for out-of-matrix users via content baskets. Int J Data Sci Anal 16, 237–253 (2023). https://doi.org/10.1007/s41060-023-00388-7

  • DOI: https://doi.org/10.1007/s41060-023-00388-7
