Overlapping Community Detection with a Maximal Clique Enumeration Method in MapReduce

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 297)

Abstract

Overlapping community detection is becoming an increasingly important problem in social network analysis (SNA). Because clustering analysis must process massive amounts of data under limits on hardware and computation time, it struggles to reflect the latest developments or changes in complex networks. To meet these demands, this research proposes a novel distributed computation method that combines the MapReduce distributed computation framework with the TTT algorithm to speed up the discovery of all maximal cliques in large-scale social networks. Overlapping communities are then detected with the Clique Percolation Method (CPM), which incrementally merges adjacent cliques, where two k-cliques are adjacent if they share k-1 nodes. Six YouTube datasets (50K to 300K nodes, in steps of 50K) are used to evaluate the clustering quality and execution time of the proposed method.
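The two stages named in the abstract can be illustrated with a minimal single-machine sketch. It is not the paper's implementation: the MapReduce partitioning is omitted entirely, and the function names `maximal_cliques` and `percolate`, the adjacency-dictionary input, and the toy graph are all assumptions made for illustration. The first function follows the pivoting idea usually associated with the TTT algorithm (a Bron-Kerbosch search whose pivot has the most neighbours among the remaining candidates); the second merges maximal cliques of size at least k that share at least k-1 nodes, which is one common way to realise CPM.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Yield each maximal clique of an undirected graph {node: set_of_neighbours}."""
    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot: the vertex of p | x with the most neighbours inside p
        pivot = max(p | x, key=lambda u: len(p & adj[u]))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def percolate(cliques, k):
    """Merge maximal cliques (size >= k) sharing >= k-1 nodes into communities."""
    big = [c for c in cliques if len(c) >= k]
    parent = list(range(len(big)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(big)), 2):
        if len(big[i] & big[j]) >= k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, c in enumerate(big):
        groups.setdefault(find(i), set()).update(c)
    return list(groups.values())


# Hypothetical toy graph: triangles {1,2,3} and {2,3,4} share an edge,
# triangle {4,5,6} touches the others only at node 4.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

communities = percolate(maximal_cliques(adj), k=3)
# Two communities result, {1, 2, 3, 4} and {4, 5, 6}; node 4 belongs to both.
```

The pairwise comparison in `percolate` is quadratic in the number of cliques, so it only suits small inputs; a distributed setting such as the one the abstract describes would need a different merging strategy.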

Keywords

Social Network Analysis; Overlapping Community Detection; MapReduce

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Computer Science and Information Engineering, Shu-Te University, Kaohsiung City, Taiwan
