Abstract
This paper proposes a hierarchical probabilistic model for ordinal matrix factorization. Unlike previous approaches, we model the ordinal nature of the data and take a principled approach to incorporating priors for the hidden variables. Two inference algorithms are presented, one based on Gibbs sampling and one based on variational Bayes. Importantly, both algorithms can be applied to the factorization of very large matrices with missing entries.
The model is evaluated on a collaborative filtering task, in which users have rated a collection of movies and the system is asked to predict their ratings for other movies. Evaluation uses the Netflix data set, which consists of around 100 million ratings. With root mean-squared error (RMSE) as the evaluation metric, the proposed model outperforms alternative factorization techniques. The results also show that Gibbs sampling outperforms variational Bayes on this task, despite the large number of ratings and model parameters. Matlab implementations of the proposed algorithms are available from cogsys.imm.dtu.dk/ordinalmatrixfactorization.
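As a reminder of what the evaluation metric measures, the following is a minimal Python sketch of RMSE computed over held-out ratings only, so that missing entries of the rating matrix never enter the score. The function name `rmse` and the example ratings are hypothetical illustrations, not taken from the paper or from its Matlab code.

```python
import math


def rmse(predicted, actual):
    """Root mean-squared error over paired predictions and held-out ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    # Sum of squared differences between each prediction and its true rating.
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


# Hypothetical held-out ratings on a 1-5 ordinal scale, with made-up predictions.
held_out = [5, 3, 4, 1]
predictions = [4.5, 3.5, 4.0, 2.0]
score = rmse(predictions, held_out)
```

Lower values are better. Because the errors are squared before averaging, a single large miss (the last pair above) contributes more to the score than several small ones.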
Cite this article
Paquet, U., Thomson, B. & Winther, O. A hierarchical model for ordinal matrix factorization. Stat Comput 22, 945–957 (2012). https://doi.org/10.1007/s11222-011-9264-x
DOI: https://doi.org/10.1007/s11222-011-9264-x