Parallel Graph Clustering Based on Minhash

  • Byoungwook Kim
  • Jaehwa Chung
  • Joon-Min Gil
  • JinGon Shon
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 536)


Graph clustering is a technique for grouping vertices with similar characteristics into the same cluster. It is widely used to analyze graph data and identify its characteristics. Recently, large-scale graph data has been generated in a variety of applications such as social network services, the World Wide Web, and telephone networks. Consequently, clustering techniques that can process large-scale graph data efficiently are becoming increasingly important. In this paper, we propose a clustering algorithm that efficiently generates clusters from large-scale graph data. The proposed method uses Min-Hash to efficiently estimate the similarity between clusters in the graph and generates clusters according to the estimated similarity. In experiments on real-world data, we compare the proposed method with existing graph clustering methods and show its efficiency.
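As a rough illustration of the Min-Hash similarity estimation mentioned in the abstract, the following Python sketch estimates the Jaccard similarity of two vertex sets from fixed-length signatures. It is not the authors' implementation: the abstract does not say how clusters are represented or hashed, so the representation of a cluster as a non-empty set of non-negative integer vertex IDs, the linear hash family, and all function names (`make_hash_params`, `minhash_signature`, `estimate_jaccard`) are assumptions made for illustration.

```python
import random

# Large Mersenne prime used as the modulus of the (assumed) linear hash family.
_PRIME = (1 << 61) - 1


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) coefficients for hash functions h(x) = (a*x + b) mod _PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, _PRIME), rng.randrange(0, _PRIME))
            for _ in range(num_hashes)]


def minhash_signature(vertex_ids, params):
    """Return the MinHash signature of a non-empty set of integer vertex IDs.

    Each signature entry is the minimum hash value over the set
    under one of the hash functions.
    """
    return [min((a * x + b) % _PRIME for x in vertex_ids) for a, b in params]


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature entries."""
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    params = make_hash_params(num_hashes=128, seed=42)
    # Two hypothetical clusters, each given as a set of vertex IDs.
    cluster_a = set(range(0, 100))
    cluster_b = set(range(50, 150))
    sig_a = minhash_signature(cluster_a, params)
    sig_b = minhash_signature(cluster_b, params)
    print("estimated:", estimate_jaccard(sig_a, sig_b))
    print("exact:    ", exact_jaccard(cluster_a, cluster_b))
```

Comparing two signatures costs time proportional to the number of hash functions rather than to the size of the sets, which is why this kind of estimate is attractive for large graphs. More hash functions reduce the variance of the estimate at the cost of longer signatures.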


Keywords: Graph clustering, Spark



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Byoungwook Kim (1)
  • Jaehwa Chung (2)
  • Joon-Min Gil (3)
  • JinGon Shon (2)
  1. Department of Computer Engineering, Dongguk University, Gyeongju, Korea
  2. Department of Computer Science, Korea National Open University, Seoul, Korea
  3. School of Information Technology Engineering, Catholic University of Daegu, Daegu, Korea
