Abstract
In recent years, a series of algorithms have been proposed to detect community from social networks. Most of the algorithms are based on traditional spectral clustering algorithms such as k-means. One of the major limitations of such algorithms is that entire eigenvalues of the similarity matrix of the network need to be calculated in advance. In the case of a massive network, calculating entire eigenvalues is computationally expensive. This paper proposes a scalable geometric algorithm to find communities from large social networks. The major contributions of this work are: (1) We transform the network data into points by preserving the intrinsic properties and structure of the original data. (2) A novel geometric clustering is derived. And we use the data structure C-Tree and Voronoi diagram for identifying communities from the points in the Euclidean plane. (3) Since social networks grow dynamically, we further extend the algorithm to incrementally identify the community membership of newly introduced members. Experiments on both synthetic and real-world datasets show that the algorithm, in terms of objective matrices, is equally good as spectral clustering algorithm.
Similar content being viewed by others
References
Agarwal G, Kempe D (2008) Modularity-maximizing graph communities via mathematical programming. Eur Phys J B 66(3):409–418
Aggarwal CC, Wang H (2010) Managing and mining graph data. Springer, New York
Barnes ER (1982) An algorithm for partitioning the nodes of a graph. SIAM J Algebraic Discret Methods 3:541–550
Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 10:P10008
Brand M, Huang K (2003) A unifying theorem for spectral embedding and clustering. In: AISTATS
Chen M, Kuzmin K, Szymanski BK (2014) Community detection via maximization of modularity and its variants. IEEE Trans Comput Soc Syst 1(1):46–65
Chung FRK (1997) Spectral graph theory, vol 92. American Mathematical Society, Providence
Condon A, Karp RM (2001) Algorithms for graph partitioning on the planted partition model. Random Struct Algorithms 18:116–140 Citeseer
Dhanjal C, Gaudel R, Clémençon S (2014) Efficient eigen-updating for spectral graph clustering. Neurocomputing 131:440–452 Elsevier
Dhillon IS, Guan Y, Kulis B (2007) Weighted graph cuts without eigenvectors a multilevel approach. IEEE Trans Pattern Anal Mach Intell 29:1944–1957
DuBois CL (2008) UCI network data repository. University of California, Irvine, School of Information and Computer Sciences.http://networkdata.ics.uci.edu
Duchon J (1977) Splines minimizing rotation-invariant semi-norms in Sobolev spaces. In: Schempp W, Zeller K(eds) Constructive theory of functions of several variables. Springer, Berlin Heidelberg, pp 85–100
Ganti V, Ramakrishnan R, Gehrke J, Powell A, French J (1999) Clustering large datasets in arbitrary metric spaces. In: Proceedings of the 15th IEEE international conference on data engineering, 1999, pp 502–511
Hartigan JA, Wong MA (1979) Algorithm as 136: A k-means clustering algorithm. J R Stat Soc. Series C (Applied Statistics) 28(1):100–108
Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359–392
Karypis G, Kumar V (1999) Parallel multilevel series k-way partitioning scheme for irregular graphs. SIAM Rev 41(2):278–300
Karypis G, Vipin K (1997) A coarse-grain parallel formulation of multilevel k-way graph partitioning algorithm. In: PPSC
Kernighan BW, Lin S (1970) An efficient heuristic procedure for partitioning graphs. Bell Syst Tech J 49(2):291–307
Lancichinetti Andrea, Fortunato Santo, Radicchi Filippo (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78(4):46–110
Lanczos C (1950) An iteration method for the solution of the eigen- value problem of linear differential and integral operators. United States Government. Press Office
Leskovec J, Krevl A (2014) SNAP datasets: Stanford large network dataset collection. http://snap.stanford.edu/data
Massen CP, Doye JPK (2005) Identifying communities within energy landscapes. Phys Rev E 71(4):046101
Medus A, Acuna G, Dorso CO (2005) Detection of community structures in networks via global optimization. Phys A Stat Mech Appl 358(2):593–604 Elsevier
Min W, Ke L, He X (2004) Locality pursuit embedding. Pattern Recognit 37(4):781–788
Newman MEJ (2004) Fast algorithm for detecting community structure in networks. Phys Rev E 69(6):066–133
Newman MEJ (2006) Modularity and community structure in networks. Proc Natl Acad Sci 103(23):8577–8582
Newman MEJ (2013) Spectral methods for community detection and graph partitioning. Phys Rev E 88(4):042822
Ng AY, Jordan MI, Weiss Y et al (2002) On spectral clustering: Analysis and an algorithm. Adv Neural Inf Process Syst 2:849–856
Nguyen NP, Dinh TN, Xuan Y, Thai MT (2011) Adaptive algorithms for detecting community structure in dynamic social networks. In: Proceedings of the IEEE INFOCOM, pp 2282–2290
Nguyen NP, Dinh TN, Shen Y, Thai MT (2014) Dynamic social community detection and its applications. PloS One 9(4):e91431
Ning H, Xu W, Chi Y, Gong Y, Huang TS (2007) Incremental spectral clustering with application to monitoring of evolving blog communities. In: SIAM, pp 261–272
Preparata FP, Shamos M (2012) Computational geometry: an introduction. Springer, New York
Rattigan MJ, Maier M, Jensen D (2007) Graph clustering with network structure indices. In: Proceedings of the 24th international conference on machine learning, ACM, pp 783–790
Sales-Pardo M, Guimera R, Moreira AA, Nunes Amaral LA (2007) Extracting the hierarchical organization of complex systems. Proc Natl Acad Sci 104(39):15224–15229
Shen H-W, Cheng X-Q, Wang Y-Z, Chen Y (2012) A dimensionality reduction framework for detection of multiscale structure in heterogeneous networks. J Comput Sci Technol 27(2):341–357
Surendran S, Chitraprasad D, Kaimal MR (2014) Voronoi diagram-based geometric approach for social network analysis. In: Computational intelligence, cyber security and computational models, Springer, pp 359–369
van der Maaten Laurens JP, Postma Eric O, van den Herik H Jaap (2009) Dimensionality reduction: a comparative review. J Mach Learn Res 10(1–41):66–71
Whang JJ, Sui X, Dhillon IS (2012) Scalable and memory efficient clustering of large-scale social networks. In: IEEE 12th international conference on data mining (ICDM), pp 705–714
Xiang Shiming, Nie Feiping, Song Yangqiu, Zhang Changshui, Zhang Chunxia (2009) Embedding new data points for manifold learning via coordinate propagation. Knowl Inf Syst 19(2):159–184
Xie J, Chen M, Szymanski BK (2013) LabelrankT: incremental community detection in dynamic networks via label propagation. In: Proceedings of the workshop on dynamic networks management and mining, ACM, pp 25–32
Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33:452–473
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Surendran, S., Chithraprasad, D. & Kaimal, M.R. A scalable geometric algorithm for community detection from social networks with incremental update. Soc. Netw. Anal. Min. 6, 90 (2016). https://doi.org/10.1007/s13278-016-0399-9
Received:
Revised:
Accepted:
Published:
DOI: https://doi.org/10.1007/s13278-016-0399-9