Information Retrieval Journal, Volume 20, Issue 2, pp 109–131

Neural Semantic Personalized Ranking for item cold-start recommendation

  • Travis Ebesu
  • Yi Fang


Recommender systems help users cope with information overload and enjoy a personalized experience on the Web. One of the main challenges in these systems is the item cold-start problem, which is very common in practice because modern online platforms publish thousands of new items every day. Furthermore, in many real-world scenarios, item recommendation is based on users’ implicit preference feedback, such as whether a user has interacted with an item. To address these challenges, we propose a probabilistic modeling approach called Neural Semantic Personalized Ranking (NSPR) that unifies the strengths of deep neural networks and pairwise learning. Specifically, NSPR tightly couples a latent factor model with a deep neural network to learn a robust feature representation from both implicit feedback and item content, which allows the model to generalize to unseen items. We demonstrate NSPR’s versatility in integrating various pairwise probability functions and propose two variants based on the Logistic and Probit functions. We conduct a comprehensive set of experiments on two real-world public datasets and demonstrate that NSPR significantly outperforms state-of-the-art baselines.
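As a rough illustration of what a pairwise probability function looks like under the two links named above, the following Python sketch scores a user's preference for one item over another using the difference of latent-factor dot products. It is a minimal sketch under stated assumptions, not the NSPR implementation: the function names, the toy vectors, and the choice to pass a plain score difference through each link are all hypothetical, and the deep network that would produce item vectors from content is omitted.

```python
import math


def logistic(x: float) -> float:
    """Logistic (sigmoid) link, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def probit(x: float) -> float:
    """Probit link: the standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pairwise_preference(user, item_pos, item_neg, link=logistic):
    """Assumed form: P(user prefers item_pos over item_neg) = link(score difference)."""
    return link(dot(user, item_pos) - dot(user, item_neg))


def pairwise_nll(user, item_pos, item_neg, link=logistic):
    """Negative log-likelihood of one observed (user, preferred, other) triple."""
    return -math.log(pairwise_preference(user, item_pos, item_neg, link))


# Toy latent vectors (hypothetical values, for illustration only).
user = [0.5, -0.2]
item_pos = [1.0, 0.0]   # item the user interacted with
item_neg = [0.0, 1.0]   # item the user did not interact with

p_logistic = pairwise_preference(user, item_pos, item_neg, logistic)
p_probit = pairwise_preference(user, item_pos, item_neg, probit)
```

Both links map a zero score difference to a probability of 0.5 and are symmetric, so swapping the two items yields the complementary probability. For the same positive score difference the probit link gives a probability further from 0.5 than the logistic link.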


Keywords

  • Recommender systems
  • Deep neural network
  • Implicit feedback
  • Pairwise learning



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Department of Computer Engineering, Santa Clara University, Santa Clara, USA
