Bayesian Matrix Co-Factorization: Variational Algorithm and Cramér-Rao Bound

  • Jiho Yoo
  • Seungjin Choi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6913)

Abstract

Matrix factorization is a popular method for collaborative prediction, where unknown ratings are predicted from user and item factor matrices that are determined so that their product approximates the user-item rating matrix. Bayesian matrix factorization is preferred over other methods for collaborative filtering because the Bayesian approach alleviates overfitting by integrating out all model parameters, using variational inference or sampling methods. However, Bayesian matrix factorization still suffers from the cold-start problem, where ratings must be predicted for new items or for the preferences of new users. In this paper we present Bayesian matrix co-factorization as an approach to exploiting side information such as item content and demographic user data: multiple data matrices are decomposed jointly, with the individual Bayesian decompositions coupled by sharing some of their factor matrices. We derive a variational inference algorithm for Bayesian matrix co-factorization. In addition, we compute the Bayesian Cramér-Rao bound in the case of Gaussian likelihood, showing that Bayesian matrix co-factorization indeed improves reconstruction over Bayesian factorization of a single data matrix. Numerical experiments demonstrate the useful behavior of Bayesian matrix co-factorization on cold-start problems.
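
As a concrete illustration of the coupled decomposition described in the abstract, the following is a minimal sketch, not drawn from the paper itself, of a two-matrix Gaussian co-factorization: a user-item rating matrix X and an item-content side-information matrix Y share the item factor matrix V, and a fully factorized variational posterior is placed over all factor matrices. The symbols U, V, W, the observed-entry sets Ω_X, Ω_Y, and the noise precisions τ_X, τ_Y are illustrative notation, not the paper's.

% Illustrative two-matrix Gaussian co-factorization (assumed notation):
% ratings X are modeled via U^T V and item content Y via W^T V,
% coupled through the shared item factor matrix V.
\begin{align*}
  p(X \mid U, V) &= \prod_{(i,j) \in \Omega_X}
      \mathcal{N}\!\left(X_{ij} \,\middle|\, \mathbf{u}_i^{\top}\mathbf{v}_j,\ \tau_X^{-1}\right), \\
  p(Y \mid W, V) &= \prod_{(k,j) \in \Omega_Y}
      \mathcal{N}\!\left(Y_{kj} \,\middle|\, \mathbf{w}_k^{\top}\mathbf{v}_j,\ \tau_Y^{-1}\right), \\
  q(U, V, W) &= q(U)\, q(V)\, q(W) \qquad \text{(mean-field variational posterior),}
\end{align*}

with Gaussian priors placed over U, V, and W, so that all factor matrices can be integrated out approximately by variational inference, as the abstract describes.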

Keywords

Root Mean Square Error · Side Information · Fisher Information Matrix · Matrix Factorization Model · Probabilistic Matrix Factorization


Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jiho Yoo (1)
  • Seungjin Choi (1, 2)
  1. Department of Computer Science, Pohang University of Science and Technology, Pohang, Korea
  2. Division of IT Convergence Engineering, Pohang University of Science and Technology, Pohang, Korea
