Two-Way Grouping by One-Way Topic Models

  • Eerika Savia
  • Kai Puolamäki
  • Samuel Kaski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5772)

Abstract

We tackle the problem of new users or documents in collaborative filtering. Generalizing over users by grouping them into user groups is beneficial when a rating is to be predicted for a relatively new document that has only a few observed ratings. The same applies to documents in the case of new users. We have shown earlier that if there are both new users and new documents, two-way generalization becomes necessary, and we introduced a probabilistic Two-Way Model for the task. Finding a two-way grouping is a non-trivial combinatorial problem, which makes it computationally difficult. We suggest approximating the Two-Way Model with two User Rating Profile (URP) models: one that groups users and one that groups documents. Their two predictions are combined using a product of experts model. This combination of two one-way models achieves even better prediction performance than the original Two-Way Model.
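
As a rough illustration of the combination step described in the abstract, the sketch below merges two predicted probabilities with a normalized product, which is one common form of a product of experts for a binary outcome. It assumes binary ratings and that each one-way model outputs a probability that the rating is positive; the function name `product_of_experts` and its arguments are hypothetical and are not taken from the paper, whose exact formulation may differ.

```python
def product_of_experts(p_user_model: float, p_doc_model: float) -> float:
    """Combine two Bernoulli predictions by a normalized product.

    Assumption (not from the paper): each argument is one model's
    probability that the rating is positive.
    """
    # Unnormalized support for "positive" and "negative" under both experts.
    positive = p_user_model * p_doc_model
    negative = (1.0 - p_user_model) * (1.0 - p_doc_model)
    total = positive + negative
    if total == 0.0:
        # The experts contradict each other with full certainty;
        # fall back to an uninformative prediction.
        return 0.5
    return positive / total


if __name__ == "__main__":
    # An uninformative expert (0.5) leaves the other prediction unchanged.
    print(product_of_experts(0.5, 0.8))
    # Two agreeing experts reinforce each other.
    print(product_of_experts(0.8, 0.8))
```

In this form an expert that predicts 0.5 has no influence on the result, while two experts that agree produce a more confident combined prediction than either one alone.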

Keywords

User Group; Latent Dirichlet Allocation; Latent Semantic Analysis; Document Cluster; Expert Model



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Eerika Savia (1)
  • Kai Puolamäki (1)
  • Samuel Kaski (1)
  1. Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Helsinki University of Technology, FI-02015, Finland
