Machine Learning, Volume 74, Issue 1, pp 75–109

Latent grouping models for user preference prediction

Abstract

We tackle the problem of new users or documents in collaborative filtering. Generalization over users by grouping them into user groups is beneficial when a rating is to be predicted for a relatively new document having only a few observed ratings. Analogously, generalization over documents improves predictions in the case of new users. We show that if either users or documents, or both, are new, two-way generalization becomes necessary. We demonstrate the benefits of grouping of users, grouping of documents, and two-way grouping, with artificial data and in two case studies with real data. We introduce a probabilistic latent grouping model for predicting the relevance of a document to a user. The model assumes a latent group structure for both users and documents. We compare the model against a state-of-the-art method, the User Rating Profile model, where only the users have a latent group structure. We compute the posterior of both models by Gibbs sampling. The Two-Way Model predicts relevance more accurately when the target consists of both new documents and new users. The reason is that generalization over documents becomes beneficial for new documents and at the same time generalization over users is needed for new users.
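
To make the idea of two-way generalization concrete, the following is a minimal sketch and not the model from the article. It assumes that each user has a probability vector over latent user groups, that each document has a probability vector over latent document groups, and that relevance depends only on the pair of groups. The names `predict_relevance`, `user_membership`, `doc_membership`, and `group_relevance` are hypothetical, and the numbers are made up for illustration. In the article such quantities would be inferred by Gibbs sampling; here they are simply given.

```python
import numpy as np

def predict_relevance(user_membership, doc_membership, group_relevance):
    """Return P(relevant | user, document) for every user-document pair.

    Assumed shapes (illustrative, not from the article):
      user_membership: (n_users, n_user_groups), each row sums to 1
      doc_membership:  (n_docs, n_doc_groups), each row sums to 1
      group_relevance: (n_user_groups, n_doc_groups), entries in [0, 1]
    The prediction marginalises over both latent groupings.
    """
    return user_membership @ group_relevance @ doc_membership.T

# Made-up example: the second user is "new", so the membership is uniform.
user_membership = np.array([[0.9, 0.1],
                            [0.5, 0.5]])
doc_membership = np.array([[1.0, 0.0],
                           [0.2, 0.8]])
group_relevance = np.array([[0.8, 0.1],
                            [0.3, 0.9]])

pred = predict_relevance(user_membership, doc_membership, group_relevance)
print(pred)
```

Because the prediction for a pair uses only group-level relevance, a user or document with few observed ratings can still receive a prediction through its group memberships, which is the intuition behind generalizing in both directions.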

Keywords

Collaborative filtering, Gibbs sampling, Graphical model, Latent topic model

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. Adaptive Informatics Research Centre, Department of Information and Computer Science, Helsinki University of Technology, TKK, Finland
  2. Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Helsinki University of Technology, TKK, Finland