Parallel Clique-Like Subgraph Counting and Listing

  • Yi Yang
  • Da Yan
  • Shuigeng Zhou
  • Guimu Guo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11788)

Abstract

Cliques and clique-like subgraphs (e.g., quasi-cliques) are important dense structures whose counting or listing is essential in applications such as complex network analysis and community detection. These problems are usually solved by divide and conquer, where a task over a big graph is recursively divided into subtasks over smaller subgraphs whose search spaces are disjoint. This divisible algorithmic paradigm offers enormous potential for parallelism, since different subtasks can run concurrently to drastically reduce the overall running time.
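
As an illustration of the divide-and-conquer idea described above, the following minimal sketch counts k-cliques by recursively splitting the task into subtasks with disjoint search spaces. It is not the authors' algorithm: the function name `count_k_cliques`, the adjacency-dictionary input, and the choice to orient edges by vertex id are all assumptions made for this example.

```python
from typing import Dict, Set


def count_k_cliques(adj: Dict[int, Set[int]], k: int) -> int:
    """Count k-cliques in an undirected graph given as an adjacency dict.

    Hypothetical helper for illustration only. Each clique is discovered
    exactly once, in increasing order of its vertex ids.
    """
    # Keep only neighbors with a larger id. The subtask rooted at v then
    # explores only cliques whose smallest vertex is v, so the search
    # spaces of different subtasks never overlap.
    higher = {v: {u for u in nbrs if u > v} for v, nbrs in adj.items()}

    def extend(candidates: Set[int], remaining: int) -> int:
        # `candidates` holds vertices adjacent to every vertex chosen so far.
        if remaining == 0:
            return 1
        if remaining == 1:
            return len(candidates)
        total = 0
        for v in candidates:
            # Subtask: cliques that continue with v, restricted to the
            # smaller subgraph induced by v's higher-id neighbors.
            total += extend(candidates & higher[v], remaining - 1)
        return total

    return extend(set(adj), k)
```

Because the subtasks rooted at different vertices share no state, they could in principle be handed to different threads without coordination beyond summing the results.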

In this paper, we explore this potential by proposing a unified framework for counting and listing clique-like subgraphs. We study how to divide and distribute the counting and listing tasks while dynamically balancing the workload assigned to each thread. Four applications are studied under our parallel framework: triangle counting, clique counting, maximal clique listing, and quasi-clique listing. Extensive experiments demonstrate that our solution achieves an ideal speedup on various real graph datasets.
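
To make the idea of distributing subtasks across threads concrete, here is a rough sketch of triangle counting with one subtask per vertex. It is an assumption-laden illustration, not the paper's framework: `count_triangles_parallel` and `num_workers` are hypothetical names, and the shared work queue inside Python's `ThreadPoolExecutor` stands in for whatever dynamic balancing scheme the authors actually use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Set


def count_triangles_parallel(adj: Dict[int, Set[int]], num_workers: int = 4) -> int:
    """Count triangles with one independent subtask per vertex.

    Hypothetical sketch: idle workers pull the next pending subtask from
    the executor's shared queue, so threads that finish small subtasks
    early pick up more work.
    """
    # Orient each edge toward the higher vertex id so every triangle
    # belongs to exactly one subtask (the one rooted at its smallest vertex).
    higher = {v: {u for u in nbrs if u > v} for v, nbrs in adj.items()}

    def subtask(v: int) -> int:
        # Triangles (v, u, w) with v < u < w.
        return sum(len(higher[v] & higher[u]) for u in higher[v])

    # Submitting the largest subtasks first is a common heuristic that
    # tends to reduce idle time at the end of the run.
    order = sorted(higher, key=lambda v: len(higher[v]), reverse=True)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return sum(pool.map(subtask, order))
```

In CPython the global interpreter lock prevents these threads from running the set intersections truly in parallel, so this sketch shows only the task decomposition and dynamic assignment, not a real speedup.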

Keywords

Dense subgraph mining, Parallel computation, Unified framework

Notes

Acknowledgements

Yang and Zhou were supported by the National Natural Science Foundation of China (NSFC) under grant No. U1636205. Yan and Guo were partially supported by NSF OAC-1755464 and NSF DGE-1723250.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai, China
  2. Department of Computer Science, The University of Alabama at Birmingham, Birmingham, USA
