Skip to main content
Log in

Efficient and effective algorithms for densest subgraph discovery and maintenance

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

The densest subgraph problem (DSP) is of great significance due to its wide applications in different domains. Meanwhile, diverse requirements in various applications lead to different density variants for DSP. Unfortunately, existing DSP algorithms cannot be easily extended to handle those variants efficiently and accurately. To fill this gap, we first unify different density metrics into a generalized density definition. We further propose a new model, c-core, to locate the general densest subgraph and show its advantage in accelerating the search process. Extensive experiments show that our c-core-based optimization can provide up to three orders of magnitude speedup over baselines. Methods for maintenance of c-core location are designed to accelerate updates on dynamic graphs. Moreover, we study an important variant of DSP under a size constraint, namely the densest-at-least-k-subgraph (DalkS) problem. We propose an algorithm based on graph decomposition, and it is likely to give a solution that is at least 0.8 of the optimal density in our experiments, while the state-of-the-art method can only ensure a solution with a density of at least 0.5 of the optimal density. Our experiments show that our DalkS algorithm can achieve at least 0.99 of the optimal density for over one-third of all possible size constraints. In addition, we develop an approximation algorithm for the DalkS problem that can be more efficient than the state-of-the-art algorithm while keeping the same approximation ratio of \(\frac{1}{3}\).

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Algorithm 1
Algorithm 2
Fig. 3
Algorithm 3
Algorithm 4
Fig. 4
Algorithm 5
Algorithm 6
Algorithm 7
Algorithm 8
Algorithm 9
Fig. 5
Algorithm 10
Algorithm 11
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12

Similar content being viewed by others

Notes

  1. A solution with at least \(f \cdot OPT\) density (\(0 < f \le 1\)) means its density is at least f of the optimal density. For simplicity, we call this solution an \(f \cdot OPT\) solution.

  2. Empirically, the method based on [13] is slower than Greedy++ equipped with c-core location in our experiments.

  3. f-approximation method/algorithm means that for every input, the algorithm can guarantee a \(f \cdot OPT\) solution, \(0< f\le 1\). In general, all solutions output by the f-approximation algorithm are called f-approximation.

  4. We implement Algorithm 3 in our experiments. The algo has no significant influence on the time cost because the located core in Algorithm 2 is already very small.

  5. The process of adding a new vertex can be decomposed into the following: add an isoloated vertex; add its adjacent edges; increase its vertex weight.

  6. The coreness of a vertex is defined as the highest value of c for which the vertex is part of the corresponding c-core.

  7. This complexity is a version of Theorem 2.1 from [13] when their algo is equipped with our c-core location and density search strategy.

  8. Our code is available at https://github.com/Xyc-arch/Efficient-and-Effecive-algorithms-for-generalized-densest-subgraph-discovery

  9. We require \(0.909 \cdot OPT\) solution for Greedy++ because better solutions, e.g., \(0.99 \cdot OPT\) solution, cost too much time.

  10. Each vertical line symbolizes a distinct graph with overall graph size and core size compared.

  11. \(w_v = \frac{1}{2t}\) for v in left partition and \(w_v = \frac{t}{2}\) for v in right partition. We pick \(t = 2\) in our experiment.

  12. By combining techniques from Combinatorial-DalkSS it is a \(\text {max}(\frac{k}{|\tilde{K}^*|}, \frac{1}{2}) \cdot OPT\) solution.

  13. The reason we choose these datasets is that their sizes vary from small to large, and thus are representative.

References

  1. Ahuja, R., Orlin, J., Stein, C., Tarjan, R.: Improved algorithms for bipartite network flow. SIAM J. Comput. 23, 906–933 (1994)

    Article  MathSciNet  Google Scholar 

  2. Alper, B., Bach, B., Riche, N.H., Isenberg, T., Fekete, J.-D.: Weighted graph comparison techniques for brain connectivity analysis. In: CHI, pages 483–492. Association for Computing Machinery (2013)

  3. Andersen, R., Chellapilla, K.: Finding dense subgraphs with size bounds. In: Algorithms and Models for the Web-Graph, pages 25–37. Springer Berlin Heidelberg (2009)

  4. Anderson, R.J., Setubal, J.C.: On the parallel implementation of goldberg’s maximum flow algorithm. In: SPAA, page 168–177. ACM (1992)

  5. Aridhi, S., Brugnara, M., Montresor, A., Velegrakis, Y.: Distributed k-core decomposition and maintenance in large dynamic graphs. In: Proceedings of the 10th ACM International Conference on Distributed and Event-Based Systems, DEBS ’16, page 161–168, New York, NY, USA, (2016). Association for Computing Machinery

  6. Asahiro, Y., Iwama, K., Tamaki, H., Tokuyama, T.: Greedily finding a dense subgraph. J. Algorithms 34, 203–221 (2000)

    Article  MathSciNet  Google Scholar 

  7. Asahiro, Y., Hassin, R., Iwama, K.: Complexity of finding dense subgraphs. Discrete Appl. Math. 121(1–3), 15–26 (2002)

    Article  MathSciNet  Google Scholar 

  8. Asahiro, Y., Hassin, R., Iwama, K.: Complexity of finding dense subgraphs. Discret. Appl. Math. 121(1), 15–26 (2002)

    Article  MathSciNet  Google Scholar 

  9. Bahmani, B., Kumar, R., Vassilvitskii, S.: Densest subgraph in streaming and mapreduce. PVLDB 5(5), 454–465 (2012)

    Google Scholar 

  10. Bera, S.K., Bhattacharya, S., Choudhari, J., Ghosh, P.: A new dynamic algorithm for densest subhypergraphs. In: WWW, pages 1093–1103 (2022)

  11. Boob, D., Gao, Y., Peng, R., Sawlani, S., Tsourakakis, C.E., Wang, D., Wang, J.: Flowless: extracting densest subgraphs without flow computations. In: WWW, pages 573–583. ACM / IW3C2 (2020)

  12. Charikar, M.: Greedy approximation algorithms for finding dense components in a graph. In: Approximation Algorithms for Combinatorial Optimization, pages 84–95. Springer Berlin Heidelberg (2000)

  13. Chekuri, C., Quanrud, Kent, T., Manuel R.: Densest Subgraph: Supermodularity, Iterative Peeling, and Flow, pages 1531–1555 (2022)

  14. Chen, J., Saad, Y.: Dense subgraph extraction with application to community detection. TKDE 24, 1216–1230 (2012)

    Google Scholar 

  15. Chen, T., Tsourakakis, C.: Antibenford subgraphs: Unsupervised anomaly detection in financial networks. In: KDD, pages 2762–2770 (2022)

  16. Conti, E., Cao, S., Thomas, A.J.: Disruptions in the us airport network. arXiv, (2013)

  17. Danisch, M., Hubert Chan, T.-H., Sozio, M.: Large scale density-friendly graph decomposition via convex programming. In: WWW, pages 233–242. ACM (2017)

  18. Dinitz, Y: Dinitz’algorithm: The original version and even’s version. In: Theoretical Computer Science: Essays in Memory of Shimon Even, pages 218–240. Springer (2006)

  19. Du, X., Jin, R., Ding, L., Lee, V.E., Thornton Jr., J.H.: Migration motif: a spatial - temporal pattern mining approach for financial markets. In: KDD, pages 1135–1144. ACM (2009)

  20. Eidsaa, M., Almaas, E.: \(s\)-core network decomposition: a generalization of \(k\)-core analysis to weighted networks. Phys. Rev. E 88, 062819 (2013)

    Article  Google Scholar 

  21. Epasto, A., Lattanzi, S., Sozio, M.: Efficient densest subgraph computation in evolving graphs. In: WWW, pages 300–310. ACM (2015)

  22. Fang, Y., Kaiqiang, Yu., Cheng, R., Lakshmanan, L.V.S., Lin, X.: Efficient algorithms for densest subgraph discovery. PVLDB 12(11), 1719–1732 (2019)

    Google Scholar 

  23. Fang, Y., Luo, W., Ma, C.: Densest subgraph discovery on large graphs: applications, challenges, and techniques. PVLDB 15(12), 3766–3769 (2022)

    Google Scholar 

  24. Feige, U., Kortsarz, G., Peleg, D.: The dense k-subgraph problem. Algorithmica 29(3), 12 (2001)

    Article  MathSciNet  Google Scholar 

  25. Feng, W., Liu, S., Koutra, D., Shen, H., Cheng, X.: Specgreedy: unified dense subgraph detection. In: ECML/PKDD, pages 181–197. Springer-Verlag (2020)

  26. Fratkin, E., Naughton, B.T., Brutlag, D.L., Batzoglou, S.: Motifcut: regulatory motifs finding with maximum density subgraphs. Bioinformatics 22(14), e150–e7 (2006)

    Article  Google Scholar 

  27. Gibson, D., Kumar, R., Tomkins, A.: Discovering large dense subgraphs in massive graphs. In: PVLDB, pages 721–732. ACM (2005)

  28. Gionis, A., Tsourakakis, C.E.: Dense subgraph discovery: Kdd 2015 tutorial. In: KDD, pages 2313–2314. ACM (2015)

  29. Goldberg, A.V.: Finding a maximum density subgraph (1984)

  30. Goldberg, A.V., Tarjan, R.E.: Finding minimum-cost circulations by successive approximation. Math. Oper. Res. 15(3), 430–466 (1990)

    Article  MathSciNet  Google Scholar 

  31. Goldberg, A.V., Tarjan, R.E.: Finding minimum-cost circulations by successive approximation. Math. Oper. Res. 15, 430–466 (1990)

    Article  MathSciNet  Google Scholar 

  32. He, Y., Wang, K., Zhang, W., Lin, X., Zhang, Y.: Scaling up k-clique densest subgraph detection. PACMMOD, 1(1), (2023)

  33. Hooi, B., Song, H.A.H., Beutel, A., Shah, N., Shin, K., Faloutsos, C.: FRAUDAR: bounding graph fraud in the face of camouflage. In: KDD, pages 895–904. ACM (2016)

  34. Hu, S., Wu, X., Hubert Chan, T.-H.: Maintaining densest subsets efficiently in evolving hypergraphs. In: CIKM, pages 929–938. ACM (2017)

  35. Huang, X., Cheng, H., Qin, L., Tian, W., Yu, J.X.: Querying k-truss community in large and dynamic graphs. In: SIGMOD, pages 1311–1322. ACM (2014)

  36. Khuller, S., Saha, B.: On finding dense subgraphs. In: Automata, Languages and Programming, pages 597–608. Springer Berlin Heidelberg (2009)

  37. Kumar, R., Raghavan, P., Rajagopalan, S., Sivakumar, D., Tompkins, A., Upfal, E.: The web as a graph. In: PODS, pages 1–10. ACM (2000)

  38. Leskovec, J., Krevl, A.: SNAP Datasets: stanford large network dataset collection, (2014)

  39. Li, N., Zhu, H., Wenhao, L., Cui, N., Liu, W., Yin, J., Jianliang, X., Lee, W.-C.: The most tenuous group query. Front. Comp. Sci. 17(2), 172605 (2023)

    Article  Google Scholar 

  40. Lin, Z., Zhang, F., Lin, X., Zhang, W., Tian, Z.: Hierarchical core maintenance on large dynamic graphs. PVLDB 14(5), 757–770 (2021)

    Google Scholar 

  41. Ma, C., Fang, Y., Cheng, R., Lakshmanan, L.V.S., Zhang, W., Lin, X.: Efficient algorithms for densest subgraph discovery on large directed graphs. In: SIGMOD, pages 1051–1066. ACM (2020)

  42. Ma, C., Fang, Y., Cheng, R., Lakshmanan, L.V.S., Zhang, W., Lin, X.: Efficient algorithms for densest subgraph discovery on large directed graphs. In: SIGMOD, pages 1051–1066 (2020)

  43. Ma, C., Fang, Y., Cheng, R., Lakshmanan, L.V.S., Zhang, W., Lin, X.: On directed densest subgraph discovery. TODS 46(4), 1–45 (2021)

    Article  MathSciNet  Google Scholar 

  44. Ma, C., Cheng, R., Lakshmanan, L.V.S., Han, X.: Finding locally densest subgraphs: A convex programming approach. PVLDB 15(11), 2719–2732 (2022)

    Google Scholar 

  45. Ma, C., Fang, Y., Cheng, R., Lakshmanan, L.V.S., Han, X.: A convex-programming approach for efficient directed densest subgraph discovery. In: SIGMOD, pages 845–859 (2022)

  46. Manurangsi, P.: Inapproximability of maximum biclique problems, minimum k-cut and densest at-least-k-subgraph from the small set expansion hypothesis. (2018). arXiv:1705.03581

  47. Mathieu, C., de Rougemont, M.: Large very dense subgraphs in a stream of edges (2020)

  48. McGregor, A., Tench, D., Vorotnikova, S., Vu, H.T.: Densest subgraph in dynamic graph streams. In: International Symposium on Mathematical Foundations of Computer Science, volume 9235, pages 472–482. Springer (2015)

  49. Newman, M.E.J.: The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. 98(2), 404–409 (2001)

    Article  MathSciNet  Google Scholar 

  50. Opsahl, T., Agneessens, F., Skvoretz, J.: Node centrality in weighted networks: generalizing degree and shortest paths. Social Netw. 32(3), 245–251 (2010)

    Article  Google Scholar 

  51. Opsahl, T.: Triadic closure in two-mode networks: redefining the global and local clustering coefficients. Social Netw. 35(2), 159–167 (2013)

    Article  Google Scholar 

  52. Qin, L., Li, R.-H., Chang, L., Zhang, C.: Locally densest subgraph discovery. In: KDD, pages 965–974 (2015)

  53. Rossi, R.A., Ahmed, N.K.: The network data repository with interactive graph analytics and visualization. In: AAAI (2015)

  54. Saha, A., Ke, X., Khan, A., Long, C.: Most probable densest subgraphs. In: ICDE, pages 1447–1460. IEEE (2023)

  55. Sarma, A.D., Lall, A., Nanongkai, D., Trehan, A.: Dense subgraphs on dynamic networks. arXiv (2012)

  56. Sawlani, S., Wang, J.: Near-optimal fully dynamic densest subgraph. In: STOC, pages 181–193. ACM (2020)

  57. Seidman, S.B.: Network structure and minimum degree. Social Netw. 5(3), 269–287 (1983)

    Article  MathSciNet  Google Scholar 

  58. Shiloach, Y., Vishkin, U.: An o(n2log n) parallel max-flow algorithm. J. Algorithms 3(2), 128–146 (1982)

    Article  MathSciNet  Google Scholar 

  59. Tang, J.K., Leontiadis, I., Scellato, S., Nicosia, V., Mascolo, C., Musolesi, M., Latora, V.: Applications of temporal graph metrics to real-world networks. CoRR, arXiv:1305.6974 (2013)

  60. Tarjan, R.E.: A simple version of karzanov’s blocking flow algorithm. Oper. Res. Lett. 2(6), 265–268 (1984)

    Article  MathSciNet  Google Scholar 

  61. Tatti, N.: Density-friendly graph decomposition. ACM TKDD 13(5), 54:1-54:29 (2019)

    Google Scholar 

  62. Tsourakakis, C.: The k-clique densest subgraph problem. In: WWW, page 1122–1132. IW3C2, (2015)

  63. Tsourakakis, C.E.: The k-clique densest subgraph problem. In: WWW, pages 1122–1132. ACM, (2015)

  64. Tsourakakis, C.E., Bonchi, F., Gionis, A., Gullo, F., Tsiarli, M.A.: Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In: KDD, pages 104–112. ACM, (2013)

  65. Ugander, J., Karrer, B., Backstrom, L., Marlow, C.: The anatomy of the facebook social graph. arXiv, (2011)

  66. Vishveshwara, S., Brinda, K.V., Kannan, N.: Protein structure: insights from graph theory. J. Theor. Comput. Chem. 01, 187–211 (2002)

    Article  Google Scholar 

  67. Shijie, X., Fang, J., Li, X.: Weighted laplacian method and its theoretical applications. IOP Conf. Ser. Mater. Sci. Eng. 768(7), 072032 (2020)

  68. Yichen, X., Ma, C., Fang, Y., Bao, Z.: Efficient and effective algorithms for generalized densest subgraph discovery. PACMMOD 1(2), 169:1-169:27 (2023)

    Google Scholar 

  69. Zhang, Y., Yu, J.X., Zhang, Y., Qin, L.: A fast order-based approach for core maintenance. In: ICDE, pages 337–348. IEEE Computer Society, (2017)

Download references

Acknowledgements

We would like to thank Wensheng Luo for his help during our revision. This work was partially supported by NSFC under Grant 62302421 and 62102341, Basic and Applied Basic Research Fund in Guangdong Province under Grant 2023A1515011280 and 2022A1515010166, and Shenzhen Science and Technology Program under Grants JCYJ20220530143602006 and ZDSYS 20211021111415025. Zhifeng Bao is supported in part by ARC DP220101434 and DP240101211. This paper was also supported by Guangdong Key Lab of Mathematical Foundations for Artificial Intelligence.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Chenhao Ma.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Xu, Y., Ma, C., Fang, Y. et al. Efficient and effective algorithms for densest subgraph discovery and maintenance. The VLDB Journal (2024). https://doi.org/10.1007/s00778-024-00855-y

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s00778-024-00855-y

Keywords

Navigation