Joint European Conference on Machine Learning and Knowledge Discovery in Databases

ECML PKDD 2015: Machine Learning and Knowledge Discovery in Databases, pp. 625–640

Finding Community Topics and Membership in Graphs

  • Matt Revelle
  • Carlotta Domeniconi
  • Mack Sweeney
  • Aditya Johri
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9285)

Abstract

Community detection in networks is a broad problem with many proposed solutions. Existing methods frequently make use of edge density and node attributes; however, the methods ultimately have different definitions of community and build strong assumptions about community features into their models. We propose a new method for community detection, which estimates both per-community feature distributions (topics) and per-node community membership. Communities are modeled as connected subgraphs with nodes sharing similar attributes. Nodes may join multiple communities and share common attributes with each. Communities have an associated probability distribution over attributes and node attributes are modeled as draws from a mixture distribution. We make two basic assumptions about community structure: communities are densely connected and have a small network diameter. These assumptions inform the estimation of community topics and membership assignments without being too prescriptive. We present competitive results against state-of-the-art methods for finding communities in networks constructed from NSF awards, the DBLP repository, and the Scratch online community.
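The mixture view of node attributes described above can be illustrated with a small sketch. This is not the authors' model or inference procedure, only a toy generative example under stated assumptions: the community names, attribute vocabularies, probabilities, and the function names `attribute_probability` and `sample_node_attributes` are all hypothetical. Each community carries a distribution over attributes (its topic), a node carries mixture weights over the communities it belongs to, and each attribute of the node is drawn by first picking a community and then picking an attribute from that community's topic.

```python
import random

def attribute_probability(attribute, membership, topics):
    """Marginal probability of one attribute for a node under the mixture:
    sum over communities of (membership weight) * (topic probability)."""
    return sum(weight * topics[c].get(attribute, 0.0)
               for c, weight in membership.items())

def sample_node_attributes(membership, topics, n_draws, rng):
    """Draw n_draws (community, attribute) pairs for a single node."""
    communities = list(membership)
    weights = [membership[c] for c in communities]
    draws = []
    for _ in range(n_draws):
        # Pick the community responsible for this attribute draw.
        c = rng.choices(communities, weights=weights, k=1)[0]
        attrs = list(topics[c])
        probs = [topics[c][a] for a in attrs]
        # Draw the attribute from that community's topic distribution.
        a = rng.choices(attrs, weights=probs, k=1)[0]
        draws.append((c, a))
    return draws

# Hypothetical toy data: two communities, each with a topic over attributes.
topics = {
    "ml": {"learning": 0.6, "graph": 0.3, "kernel": 0.1},
    "db": {"query": 0.5, "index": 0.4, "graph": 0.1},
}
# Hypothetical node that belongs to both communities with unequal weight.
membership = {"ml": 0.7, "db": 0.3}

p_graph = attribute_probability("graph", membership, topics)
draws = sample_node_attributes(membership, topics, 5, random.Random(0))
```

In this toy setting the attribute "graph" is shared by both communities, so its marginal probability for the node combines both topics: 0.7 × 0.3 + 0.3 × 0.1 = 0.24. The paper's actual method estimates the topics and memberships from data and additionally uses the density and diameter assumptions, which this sketch does not attempt to model.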

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Matt Revelle
    • 1
  • Carlotta Domeniconi
    • 1
  • Mack Sweeney
    • 1
  • Aditya Johri
    • 1
  1. George Mason University, Fairfax, USA