Parallel Graph Clustering Based on Minhash
Graph clustering is a technique for grouping vertices with similar characteristics into the same cluster. It is widely used to analyze graph data and identify its characteristics. Recently, large-scale graph data have been generated in a variety of applications such as social network services, the World Wide Web, and telephone networks. Therefore, clustering techniques that process large-scale graph data efficiently are becoming increasingly important. In this paper, we propose a clustering algorithm that efficiently generates clusters from large-scale graph data. The proposed method estimates the similarity between clusters in the graph using Min-Hash and generates clusters according to the estimated similarity. In experiments on real-world data, we show the efficiency of the proposed method compared with existing graph clustering methods.
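To illustrate the kind of similarity estimation that Min-Hash provides, the following is a minimal sketch and not the algorithm proposed in this paper. It assumes that each cluster is represented by a set of integer vertex IDs (for example, its neighbor set) and that similarity means Jaccard similarity; the abstract does not specify either choice. The function names, the number of hash functions, and the use of BLAKE2b as the hash family are all hypothetical and chosen only for illustration.

```python
import hashlib


def _hash(seed, item):
    # Hypothetical hash family: one 64-bit BLAKE2b digest per (seed, item) pair.
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=256):
    # For each seeded hash function, keep the minimum hash value over the set.
    # Assumes `items` is a non-empty set of vertex IDs.
    if not items:
        raise ValueError("items must be non-empty")
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    # The fraction of positions where two signatures agree estimates
    # the Jaccard similarity of the underlying sets.
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def exact_jaccard(a, b):
    # Exact Jaccard similarity, for comparison against the estimate.
    return len(a & b) / len(a | b)
```

Under these assumptions, two clusters would be compared by computing `minhash_signature` once per cluster and then calling `estimate_jaccard` on the two signatures. Each signature has a fixed length regardless of the set size, which is what makes the comparison cheap on large graphs; a longer signature gives a more accurate estimate at a higher cost.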