Abstract
Local methods for detecting community structure are necessary when a graph’s size or node-expansion cost make global community detection methods infeasible. Various algorithms for local community detection have been proposed, but there has been little analysis of the circumstances under which one approach is preferable to another. This paper describes an evaluation comparing the accuracy of five alternative vertex selection policies in detecting two distinct types of community structures—vertex partitions that maximize modularity, and link partitions that maximize partition density—in a variety of graphs. In this evaluation, the vertex selection policy that most accurately identified vertex-partition community structure in a given graph depended on how closely the graph’s degree distribution approximated a power-law distribution. When the target community structure was partition-density maximization, however, an algorithm based on spreading activation generally performed best, regardless of degree distribution. These results indicate that local community detection should be context-sensitive in the sense of basing vertex selection on the graph’s degree distribution and the target community structure.
Similar content being viewed by others
Notes
The algorithm of Luo et al. (2008) considers each \(n \in N\) in ascending order of degree, adding to the community each n whose addition to C would increase M. Each element of C whose removal would increase M without disconnecting C is then removed. These two steps are repeated until no new vertices are added. The procedure described here differs from the algorithm of Luo et al. (2008) in that it selects the node that maximizes M, rather than the lowest degree node for which \(\Updelta M>O,\) and in that it is purely a node-selection policy, with no node filtering.
An alternative approach to spreading activation based on the Katz (1953) index assigns activation to node \(n \in N \hbox{ equal to} =\sum\nolimits_{l=1}^{\infty}\delta^l \cdot |\{ w_l(q,n)\}|,\) where {w l (q, n)} is the set of all walks of length l from query vertex q to vertex n and δ is an attenuation factor. This approach exhibited behavior very similar to that of MaxActivation in the evaluation set forth below but for brevity is omitted.
For a discussion of alternatives to the GN benchmarks, see Lancichinetti et al. (2008).
Clauset et al. (2009) describe a procedure for fitting degree distributions to a power-law function and provide code for this procedure at http://www.santafe.edu/~aaronc/powerlaws. Under this procedure, none of the 12 graphs has a statistically significant fit to a power-law distribution.
The highest modularity partition of a graph does not necessarily correspond to the actual community structure (Fortunato and Barthelemy 2007), and alternative metrics sometimes lead to better community structure (Rosvall and Bergstrom 2007; Branting 2010b; Koutsourelakis and Eliassi-Rad 2008). However, modularity is the best-known community-structure criterion, so for reproducibility of the results described here, the partition that globally optimizes modularity was chosen as the first target community structure.
For link-partition communities, k was the size of the largest community containing s.
References
Ahn Y-Y, Bagrow JP (2010) Link communities reveal multiscale complexity in networks. Nature 466(7307):761–764
Bagrow J (2008) Evaluating local community methods in networks. J Stat Mech 2008(5):P05001
Bagrow J, Bollt E (2005) Local method for detecting communities. Phys Rev E 72(4):046108
Branting LK (2010a) Incremental detection of local community structure. In: International conference on advances, in social network analysis and mining. IEEE Computer Society, Los Alamitos, CA, USA, pp 80–87
Branting LK (2010b) Information theoretic criteria for community detection. Adv Soc Netw Min Anal Lect Notes Comput Sci 5498:114–130
Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison-Wesley, Reading
Chakrabarti D (2004) Autopart: parameter-free graph partitioning and outlier detection. In: Proceedings of the European conference on machine learning and practice of knowledge discovery in databases, pp 112–124
Chen J, Zaiane O, Goebel R (2009) Local community identification in social networks. In: Proceedings of the international conference on advances in social networks analysis and mining (ASONAM), Athens, Greece, 20–22 July
Clauset A (2005) Finding local community structure in networks. Phys Rev E (Stat Nonlinear Soft Matter Phys) 72(2):026132 (online). http://link.aps.org/abstract/PRE/v72/e026132
Clauset A, Shalizi C, Newman M (2009) Power-law distributions in empirical data. SIAM Rev 51(4):661–703
Cohen J (2008) Barycentric graph clustering. http://www2.computer.org/cms/Computer.org/dl/mags/cs/2009/04/extras/msp2009040029s2.pdf
Collins A, Loftus F (1975) A spreading-activation theory of semantic processing. Psychol Rev 82(6):407–428
Crestani F (1997) Application of spreading activation techniques in information retrieval. Artif Intell Rev 11(6):453–482
Danon L, Díaz-Guilera A, Duch J, Arenas A (2005) Comparing community structure identification. J Stat Mech Theory Exp 2005(9):P09008 (online). http://stacks.iop.org/1742-5468/2005/i=09/a=P09008
Fortunato S, Barthelemy M (2007) Resolution limit in community detection. Proc Natl Acad Sci 104(1):36–41
Girvan M, Newman M (2002) Community structure in social and biological networks. Proc Natl Acad Sci USA 99(12):7821–7826
Gleiser PM, Danon L (2003) Community structure in jazz. Adv Complex Syst (ACS) 6(4):565–573
Jin D, He D, Liu D, Baquero C (2010) Genetic algorithm with local search for community mining in complex networks. In: ICTAI (1)’10, pp 105–112
Katz L (1953) A new status index derived from sociometric analysis. Psychometrika 18(1):39–43
Koutsourelakis P, Eliassi-Rad T (2008) Finding mixed-memberships in social networks. In: Proceedings of the 2008 AAAI spring symposium on social information processing. AAAI, Stanford, CA
Knuth D (1993) The Stanford GraphBase: a platform for combinatorial computing. ACM, New York
Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78:046110
Leskovec J, Lang K, Mahoney M (2010) Empirical comparison of algorithms for network community detection. In: Proceedings of the international conference on World Wide Web (WWW), 26–30 April. ACM, Raleigh, NC
Luo F, Wang J, Promislow E (2008) Exploring local community structures in large networks. Web Intell Agent Syst 6(4):387–400
Lusseau D, Schneider K, Boisseau O, Haase P, Slooten E, Dawson S (2003) The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations. Behav Ecol Sociobiol 54(4):396–405
Muff ACS, Rao F (2005) Local modularity measure for network clusterizations. Phys Rev E 72:056107
Newman MEJ (2004) Fast algorithm for detecting community structure in networks. Phys Rev E 69:066133 (online). doi:10.1103/PhysRevE.69.066133
Newman MEJ (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E (Stat Nonlinear Soft Matter Phys) 74(3):036104+ (online). http://dx.doi.org/10.1103/PhysRevE.74.036104
Newman M (2009) Network data. http://www-personal.umich.edu/~mejn/netdata/
Newman ME, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E Stat Nonlinear Soft Matter Phys 69(2 Pt 2):026113 (online). http://view.ncbi.nlm.nih.gov/pubmed/14995526
Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76:036106
Rosvall M, Bergstrom C (2007) An information-theoretic framework for resolving community structure in complex networks. Proc Natl Acad Sci USA 104:7327–7331
Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci USA 105(4):1118–1123
Watts D, Strogatz S (1998) Collective dynamics of ’small-world’ networks. Nature 393(6684):440–442
Zachary W (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33:452–473
Zhang H, Giles CL, Foley HC, Yen J (2007) Probabilistic community discovery using hierarchical latent gaussian mixture model. In: AAAI’07: proceedings of the 22nd national conference on artificial intelligence. AAAI Press, Menlo Park, pp 663–668
Acknowledgments
This work was funded under contract number CECOM W15P7T-09-C-F600. The MITRE Corporation is a not-for-profit Federally Funded Research and Development Center chartered in the public interest.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Branting, L.K. Context-sensitive detection of local community structure. Soc. Netw. Anal. Min. 2, 279–289 (2012). https://doi.org/10.1007/s13278-011-0035-7
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13278-011-0035-7