Parallel Clique-Like Subgraph Counting and Listing
Cliques and clique-like subgraphs (e.g., quasi-cliques) are important dense structures whose counting or listing is essential in applications such as complex network analysis and community detection. These problems are usually solved by divide and conquer: a task over a large graph is recursively divided into subtasks over smaller subgraphs whose search spaces are disjoint. This divisible paradigm offers enormous potential for parallelism, since different subtasks can run concurrently and drastically reduce the overall running time.
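To make the divide-and-conquer idea concrete, the following is a minimal sequential sketch of k-clique counting, not the paper's actual algorithm. It assumes an adjacency-set representation and orders vertices by their integer IDs; the names `count_cliques`, `out`, and `recurse` are hypothetical. Each vertex roots one subtask over its higher-ID neighbors, so every clique is found in exactly one subtask (that of its smallest vertex) and the search spaces are disjoint.

```python
from typing import Dict, Set


def count_cliques(adj: Dict[int, Set[int]], k: int) -> int:
    """Count k-cliques (k >= 1) in an undirected graph given as adjacency sets."""
    # Orient every edge from the lower to the higher vertex ID. A clique is then
    # only discovered from its smallest vertex, so root subtasks never overlap.
    out = {v: {u for u in nbrs if u > v} for v, nbrs in adj.items()}

    def recurse(candidates: Set[int], remaining: int) -> int:
        # `candidates` are vertices adjacent to everything chosen so far;
        # `remaining` is how many more vertices the clique still needs.
        if remaining == 0:
            return 1
        total = 0
        for v in candidates:
            # Subtask over a strictly smaller subgraph: common higher-ID neighbors.
            total += recurse(candidates & out[v], remaining - 1)
        return total

    # One independent root subtask per vertex.
    return sum(recurse(out[v], k - 1) for v in out)


# Complete graph on 4 vertices: 4 triangles and a single 4-clique.
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
print(count_cliques(k4, 3))  # 4
print(count_cliques(k4, 4))  # 1
```

Because the root subtasks share no state beyond the read-only graph, they could in principle be handed to different threads without coordination, which is the source of parallelism the paragraph above describes.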
In this paper, we explore this potential by proposing a unified framework for counting and listing clique-like subgraphs. We study how to divide and distribute the counting and listing tasks while dynamically balancing the workload assigned to each thread. We study four applications under our parallel framework: triangle counting, clique counting, maximal clique listing, and quasi-clique listing. Extensive experiments demonstrate that our solution achieves an ideal speedup on various real graph datasets.
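One common way to balance workloads dynamically is a shared task queue from which idle threads pull work, with oversized subtasks split and re-enqueued instead of being solved by a single thread. The sketch below illustrates that pattern for k-clique counting. It is an assumption about how such balancing might look, not the framework's actual mechanism; `parallel_count_cliques`, `split_threshold`, and `num_threads` are hypothetical names. In CPython the global interpreter lock prevents a real speedup for this CPU-bound work, so the code only demonstrates the scheduling logic.

```python
import queue
import threading
from typing import Dict, Set


def parallel_count_cliques(adj: Dict[int, Set[int]], k: int,
                           num_threads: int = 4, split_threshold: int = 8) -> int:
    """Count k-cliques (k >= 1) using worker threads and a shared task queue."""
    # Orient edges from lower to higher vertex ID so subtasks are disjoint.
    out = {v: {u for u in nbrs if u > v} for v, nbrs in adj.items()}
    tasks: queue.Queue = queue.Queue()
    for v in out:
        tasks.put((out[v], k - 1))  # one root subtask per vertex

    total = 0
    lock = threading.Lock()

    def solve_locally(candidates: Set[int], remaining: int) -> int:
        if remaining == 0:
            return 1
        return sum(solve_locally(candidates & out[v], remaining - 1)
                   for v in candidates)

    def worker() -> None:
        nonlocal total
        while True:
            task = tasks.get()
            if task is None:  # sentinel: no more work
                tasks.task_done()
                return
            candidates, remaining = task
            if remaining > 1 and len(candidates) > split_threshold:
                # Large subtask: split it and let idle threads pick up the parts.
                for v in candidates:
                    tasks.put((candidates & out[v], remaining - 1))
                found = 0
            else:
                found = solve_locally(candidates, remaining)
            with lock:
                total += found
            # Children were enqueued before this call, so join() cannot return early.
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    tasks.join()            # wait until every task, including split ones, is done
    for _ in threads:
        tasks.put(None)     # tell each worker to exit
    for t in threads:
        t.join()
    return total
```

The threshold trades scheduling overhead against balance: a small value produces many fine-grained tasks that spread evenly across threads, while a large value keeps most work local to the thread that first picked up the root subtask.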
Keywords: Dense subgraph mining; Parallel computation; Unified framework
Yang and Zhou were supported by the National Natural Science Foundation of China (NSFC) under grant No. U1636205. Yan and Guo were partially supported by NSF OAC-1755464 and NSF DGE-1723250.