SNAKDD 2008: Advances in Social Network Mining and Analysis, pp. 114-130
Information Theoretic Criteria for Community Detection
Abstract
Many algorithms for finding community structure in graphs search for a partition that maximizes modularity. However, recent work has identified two important limitations of modularity as a community quality criterion: a resolution limit and a bias towards finding equal-sized communities. Information-theoretic approaches, which search for partitions that minimize description length, are a recent alternative to modularity. This paper shows that two information-theoretic algorithms are themselves subject to a resolution limit and identifies the component of each approach that is responsible for it. It then proposes a variant, SGE (Sparse Graph Encoding), that addresses this limitation, and demonstrates on three artificial data sets that (1) SGE does not exhibit a resolution limit on sparse graphs on which the other approaches do, and (2) modularity and the compression-based algorithms, including SGE, behave similarly on graphs not subject to the resolution limit.
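To make the resolution limit of modularity concrete, the following minimal sketch computes modularity on a ring of cliques, a construction commonly used to illustrate the effect. It is not the paper's SGE method and does not reproduce its data sets; the function names `ring_of_cliques` and `modularity` and the chosen graph sizes are assumptions made for illustration only. Modularity is taken in its standard form, the sum over communities of (internal edges / total edges) minus (community degree / twice the total edges) squared.

```python
from itertools import combinations


def ring_of_cliques(num_cliques, clique_size):
    """Hypothetical helper: edge list of cliques joined in a ring by single edges.

    Assumes num_cliques >= 3 and clique_size >= 2 so that no edge is duplicated.
    """
    edges = []
    for c in range(num_cliques):
        base = c * clique_size
        # All edges inside clique c.
        edges.extend(combinations(range(base, base + clique_size), 2))
        # One bridge from the last node of clique c to the first node of the next clique.
        next_base = ((c + 1) % num_cliques) * clique_size
        edges.append((base + clique_size - 1, next_base))
    return edges


def modularity(edges, community_of):
    """Standard modularity of a partition of an undirected, unweighted graph."""
    total = len(edges)
    internal = {}  # community -> number of edges with both ends inside it
    degree = {}    # community -> sum of the degrees of its nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / total - (d / (2 * total)) ** 2
        for c, d in degree.items()
    )


def compare(num_cliques, clique_size):
    """Return (Q with one community per clique, Q with adjacent cliques merged in pairs)."""
    edges = ring_of_cliques(num_cliques, clique_size)
    nodes = range(num_cliques * clique_size)
    per_clique = {n: n // clique_size for n in nodes}
    merged_pairs = {n: (n // clique_size) // 2 for n in nodes}
    return modularity(edges, per_clique), modularity(edges, merged_pairs)


if __name__ == "__main__":
    for k in (10, 30):
        q_single, q_pairs = compare(k, 5)
        print(f"{k} cliques of 5 nodes: per-clique Q={q_single:.4f}, merged-pairs Q={q_pairs:.4f}")
```

With 30 cliques of 5 nodes, the partition that merges adjacent cliques in pairs scores higher (about 0.888) than the partition with one community per clique (about 0.876), even though each clique is an obvious community. With only 10 cliques the per-clique partition wins, which shows that the effect depends on the size of the whole graph rather than on the local structure.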
Keywords
Community Structure, Utility Function, Resolution Limit, Community Detection, Preferential Attachment