Joint European Conference on Machine Learning and Knowledge Discovery in Databases

ECML PKDD 2015: Machine Learning and Knowledge Discovery in Databases, pp. 283–299

Gamma Process Poisson Factorization for Joint Modeling of Network and Documents

  • Ayan Acharya
  • Dean Teffer
  • Jette Henderson
  • Marcus Tyler
  • Mingyuan Zhou
  • Joydeep Ghosh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9284)

Abstract

Developing models to discover, analyze, and predict clusters within networked entities is an area of active and diverse research. However, many existing approaches do not take pertinent auxiliary information into account. This paper introduces Joint Gamma Process Poisson Factorization (J-GPPF) to jointly model a network and its side information. J-GPPF naturally fits sparse networks, accommodates separately clustered side information in a principled way, and effectively addresses the computational challenges of analyzing large networks. Evaluated on hold-out link prediction for sparse networks (both synthetic and real-world) with side information, J-GPPF is shown to clearly outperform algorithms that model only the network adjacency matrix.

Keywords

Network modeling; Poisson factorization; Gamma process

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Ayan Acharya (1)
  • Dean Teffer (2)
  • Jette Henderson (2)
  • Marcus Tyler (2)
  • Mingyuan Zhou (3)
  • Joydeep Ghosh (1)

  1. Department of ECE, University of Texas at Austin, Austin, USA
  2. Applied Research Laboratories, University of Texas at Austin, Austin, USA
  3. Department of IROM, University of Texas at Austin, Austin, USA
