A survey of community search over big graphs

  • Special Issue Paper
  • Published:
The VLDB Journal

A Correction to this article was published on 11 November 2019

This article has been updated

Abstract

With the rapid development of information technologies, big graphs have become prevalent in many real applications (e.g., social media and knowledge bases). An important component of these graphs is the network community. Essentially, a community is a group of vertices that are densely connected internally. Community retrieval supports many real applications, such as event organization and friend recommendation. Consequently, how to efficiently find high-quality communities in big graphs is an important research topic in the era of big data. Recently, a large body of research, called community search, has been proposed. These works aim to provide efficient solutions for searching high-quality communities in large networks in real time. Nevertheless, they focus on different types of graphs and formulate communities in different manners, so a comprehensive review is desirable. In this survey, we conduct a thorough review of existing community search works. Moreover, we analyze and compare the quality of communities under their models and the performance of different solutions. Furthermore, we point out new research directions. This survey not only helps researchers better understand existing community search solutions, but also helps practitioners judge which solutions to choose.

Change history

  • 11 November 2019

    In the original article, the Table 1 was published with incorrect figures. The correct Table 1 is given below.

Notes

  1. Here, we only consider algorithms that assume the graph can be kept in the memory of a single machine.

  2. Email-Enron, Google, and Livejournal were downloaded from https://snap.stanford.edu/data/index.html, and Wise was downloaded from http://www.wise2012.cs.ucy.ac.cy/challenge.html.

  3. ACM CCS: http://www.acm.org/publications/class-2012.

  4. Available at http://snap.stanford.edu/data/index.html.

References

  1. Amazon mechanical turk. https://www.mturk.com/

  2. Clique (graph theory). https://en.wikipedia.org/wiki/Clique_(graph_theory)

  3. Acquisti, A., Gross, R.: Imagined communities: awareness, information sharing, and privacy on the facebook. In: International Workshop on Privacy Enhancing Technologies, pp. 36–58 (2006)

    Google Scholar 

  4. Adamcsek, B., Palla, G., Farkas, I.J., Derényi, I., Vicsek, T.: Cfinder: locating cliques and overlapping modules in biological networks. Bioinformatics 22(8), 1021–1023 (2006)

    Google Scholar 

  5. Afrati, F.N., Fotakis, D., Ullman, J.D.: Enumerating subgraph instances using map-reduce. In: ICDE, pp. 62–73. IEEE (2013)

  6. Akbas, E., Zhao, P.: Truss-based community search: a truss-equivalence based indexing approach. PVLDB 10(11), 1298–1309 (2017)

    Google Scholar 

  7. Akiba, T., Iwata, Y., Yoshida, Y.: Linear-time enumeration of maximal k-edge-connected subgraphs in large networks by random contraction. In: CIKM, pp. 909–918 (2013)

  8. Amelio, A., Pizzuti, C.: Overlapping community discovery methods: A survey. In: Social Networks: Analysis and Case Studies, pp. 105–125 (2014)

    Google Scholar 

  9. Andersen, R., Lang, K.J.: Communities from seed sets. In: WWW, pp. 223–232 (2006)

  10. Angadi, A., Varma, P.S.: Overlapping community detection in temporal networks. Indian J. Sci. Technol. 8(31), 1–6 (2015)

    Google Scholar 

  11. Archer, A., Lattanzi, S., Likarish, P., Vassilvitskii, S.: Indexing public-private graphs. In: WWW, pp. 1461–1470 (2017)

  12. Armenatzoglou, N., Papadopoulos, S., Papadias, D.: A general framework for geo-social query processing. PVLDB 6(10), 913–924 (2013)

    Google Scholar 

  13. Baeza-Yates, R., Hurtado, C., Mendoza, M. : Query recommendation using query logs in search engines. In: International Conference on Extending Database Technology, pp. 588–596. Springer (2004)

  14. Balasundaram, B., Butenko, S., Hicks, I.V.: Clique relaxations in social network analysis: the maximum k-plex problem. Oper. Res. 59(1), 133–142 (2011)

    MathSciNet  MATH  Google Scholar 

  15. Barbieri, N., Bonchi, F., Galimberti, E., Gullo, F.: Efficient and effective community search. DMKD 29(5), 1406–1433 (2015)

    MathSciNet  MATH  Google Scholar 

  16. Barthélemy, M.: Spatial networks. Phys. Rep. 499(1), 1–101 (2011)

    MathSciNet  Google Scholar 

  17. Batagelj, V., Zaversnik, M.: An o(m) algorithm for cores decomposition of networks. arXiv:cs/0310049 (2003)

  18. Batarfi, O., Shawi, R.E., Fayoumi, A.G., Nouri, R., Beheshti, S.-M.-R., Barnawi, A., Sakr, S.: Large scale graph processing systems: survey and an experimental evaluation. Clust. Comput. 18(3), 1189–1213 (2015)

    Google Scholar 

  19. Bazzi, M., Porter, M.A., Williams, S., McDonald, M., Fenn, D.J., Howison, S.D.: Community detection in temporal multilayer networks, with an application to correlation networks. Multiscale Model. Simul. 14(1), 1–41 (2016)

    MathSciNet  MATH  Google Scholar 

  20. Bhalotia, G., Hulgeri, A., Nakhe, C., Chakrabarti, S., Sudarshan, S.: Keyword searching and browsing in databases using banks. In: ICDE, pp. 431–440. IEEE (2002)

  21. Bi, F., Chang, L., Lin, X., Zhang, W.: An optimal and progressive approach to online search of top-k influential communities. PVLDB 11(9), 1056–1068 (2018)

    Google Scholar 

  22. Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A., Wiener, J.: Graph structure in the web. Comput. Netw. 33(1–6), 309–320 (2000)

    Google Scholar 

  23. Brunato, M., Hoos, H. H., Battiti, R.: On effectively finding maximal quasi-cliques in graphs. In: International Conference on Learning and Intelligent Optimization, pp. 41–55 (2007)

    Google Scholar 

  24. Cai, L., Meng, T., He, T., Chen, L., Deng, Z.: K-hop community search based on local distance dynamics. In: International Conference on Neural Information Processing, pp. 24–34 (2017)

    Google Scholar 

  25. Chang, L., Lin, X., Qin, L., Yu, J. X., Zhang, W.: Index-based optimal algorithms for computing Steiner components with maximum connectivity. In: SIGMOD, pp. 459–474 (2015)

  26. Chang, L., Yu, J. X., Qin, L., Lin, X., Liu, C., Liang, W.: Efficiently computing k-edge connected components via graph decomposition. In: SIGMOD, pp. 205–216 (2013)

  27. Charikar, M.: Greedy approximation algorithms for finding dense components in a graph. In: International Workshop on Approximation Algorithms for Combinatorial Optimization, pp. 84–95 (2000)

    MATH  Google Scholar 

  28. Chen, L., Liu, C., Zhou, R., Li, J., Yang, X., Wang, B.: Maximum co-located community search in large scale social networks. PVLDB 11(10), 1233–1246 (2018)

    Google Scholar 

  29. Chen, P.-L., Chou, C.-K., Chen, M.-S. : Distributed algorithms for k-truss decomposition. In: International Conference on Big Data, pp. 471–480 (2014)

  30. Chen, S., Wei, R., Popova, D., Thomo, A.: Efficient computation of importance based communities in web-scale networks using a single machine. In: CIKM, pp. 1553–1562 (2016)

  31. Chen, Y., Fang, Y., Cheng, R., Li, Y., Chen, X., Zhang, J.: Exploring communities in large profiled graphs. TKDE 31(8), 1624–1629 (2019)

    Google Scholar 

  32. Chen, Y., Xu, J., Xu, M.: Finding community structure in spatially constrained complex networks. Int. J. Geogr. Inf. Sci. 29(6), 889–911 (2015)

    MathSciNet  Google Scholar 

  33. Cheng, H., Zhou, Y., Huang, X., Yu, J.X.: Clustering large attributed information networks: an efficient incremental computing approach. Data Min. Knowl. Discov. 25(3), 450–477 (2012)

    MathSciNet  MATH  Google Scholar 

  34. Cheng, J., Ke, Y., Chu, S., Özsu, M.T.: Efficient core decomposition in massive networks. In: ICDE, pp. 51–62 (2011)

  35. Cheng, J., Zeng, X., Yu, J. X.: Top-k graph pattern matching over large graphs. In: ICDE, pp. 1033–1044. IEEE (2013)

  36. Cheng, J., Zhu, L., Ke, Y., Chu, S.: Fast algorithms for maximal clique enumeration with limited memory. In: SIGKDD, pp. 1240–1248 (2012)

  37. Chiba, N., Nishizeki, T.: Arboricity and subgraph listing algorithms. SIAM J. Comput. 14(1), 210–223 (1985)

    MathSciNet  MATH  Google Scholar 

  38. Chierichetti, F., Epasto, A., Kumar, R., Lattanzi, S., Mirrokni, V.: Efficient algorithms for public-private social networks. In: SIGKDD, pp. 139–148. ACM (2015)

  39. Chu, S., Cheng, J.: Triangle listing in massive networks and its applications. In: SIGKDD, pp. 672–680. ACM (2011)

  40. Clauset, A.: Finding local community structure in networks. Phys. Rev. E 72(2), 026132 (2005)

    Google Scholar 

  41. Cohen, J.: Trusses: cohesive subgraphs for social network analysis. Natl. Secur. Agency Tech. Rep. 16, 3 (2008)

    Google Scholar 

  42. Conte, A., De Matteis, T., De Sensi, D., Grossi, R., Marino, A., Versari, L.: D2k: scalable community detection in massive networks via small-diameter k-plexes. In: SIGKDD, pp. 1272–1281 (2018)

  43. Cook, S.A.: The complexity of theorem-proving procedures. In: Proceedings of the Third Annual ACM Symposium on Theory of Computing, pp. 151–158. ACM (1971)

  44. Coscia, M., Giannotti, F., Pedreschi, D.: A classification for community discovery methods in complex networks. Stat. Anal. Data Min. 4(5), 512–546 (2011)

    MathSciNet  Google Scholar 

  45. Cui, W., Xiao, Y., Wang, H., Lu, Y., Wang, W.: Online search of overlapping communities. In: SIGMOD, pp. 277–288 (2013)

  46. Cui, W., Xiao, Y., Wang, H., Wang, W.: Local search of communities in large graphs. In: SIGMOD, pp. 991–1002 (2014)

  47. Danisch et al, M.: Listing k-cliques in sparse real-world graphs. In: WWW, pp. 589–598 (2018)

  48. Danon, L., Diaz-Guilera, A., Duch, J., Arenas, A.: Comparing community structure identification. J. Stat. Mech. Theory Exp. 2005(09), P09008 (2005)

    MATH  Google Scholar 

  49. Ding, B., Yu, J. X., Wang, S., Qin, L., Zhang, X., Lin, X.: Finding top-k min-cost connected trees in databases. In: ICDE (2007)

  50. Ding, L., Xie, Y., Shan, X., Song, B.: Search of center-core community in large graphs. In: CCF Conference on Big Data, pp. 94–107 (2018)

    Google Scholar 

  51. DiTursi, D. J., Ghosh, G., Bogdanov, P.: Local community detection in dynamic networks. arXiv preprint arXiv:1709.04033 (2017)

  52. Edachery, J., Sen, A., Brandenburg, F.J.: Graph clustering using distance-k cliques. In: Proceedings of the 7th International Symposium on Graph Drawing, pp. 98–106 (1999)

    Google Scholar 

  53. Elzinga, J., Hearn, D.W.: Geometrical solutions for some minimax location problems. Transp. Sci. 6(4), 379–394 (1972)

    MathSciNet  Google Scholar 

  54. Expert, P., et al.: Uncovering space-independent communities in spatial networks. Proc. Natl. Acad. Sci. USA 108(19), 7663–7668 (2011)

    MATH  Google Scholar 

  55. Fan, W., Li, J., Ma, S., Tang, N., Wu, Y., Wu, Y.: Graph pattern matching: from intractable to polynomial time. PVLDB 3(1–2), 264–275 (2010)

    Google Scholar 

  56. Fan, W., Wang, X., Wu, Y., Xu, J.: Association rules with graph patterns. PVLDB 8(12), 1502–1513 (2015)

    Google Scholar 

  57. Fang, Y., Cheng, R.: On attributed community search. In: International Workshop on Mobility Analytics for Spatio-temporal and Social Data, PVLDB, pp. 1–21 (2017)

    Google Scholar 

  58. Fang, Y., Cheng, R., Chen, Y., Luo, S., Hu, J.: Effective and efficient attributed community search. VLDB J. 26(6), 803–828 (2017)

    Google Scholar 

  59. Fang, Y., Cheng, R., Cong, G., Mamoulis, N., Li, Y.: On spatial pattern matching. In: ICDE, pp. 293–304 (2018)

  60. Fang, Y., Cheng, R., Li, X., Luo, S., Hu, J.: Effective community search over large spatial graphs. PVLDB 10(6), 709–720 (2017)

    Google Scholar 

  61. Fang, Y., Cheng, R., Luo, S., Hu, J.: Effective community search for large attributed graphs. PVLDB 9(12), 1233–1244 (2016)

    Google Scholar 

  62. Fang, Y., Cheng, R., Luo, S., Hu, J., Huang, K.: C-explorer: browsing communities in large graphs. PVLDB 10(12), 1885–1888 (2017)

    Google Scholar 

  63. Fang, Y., Cheng, R., Tang, W., Maniu, S., Yang, X.: Scalable algorithms for nearest-neighbor joins on big trajectory data. TKDE 28(3), 785–800 (2016)

    Google Scholar 

  64. Fang, Y., Cheng, R., Wang, J., Budiman, L., Cong, G., Mamoulis, N.: Spacekey: exploring patterns in spatial databases. In: ICDE, pp. 1577–1580 (2018)

  65. Fang, Y., Wang, Z., Cheng, R., Li, X., Luo, S., Hu, J., Chen, X.: On spatial-aware community search. TKDE 31(4), 783–798 (2019)

    Google Scholar 

  66. Fang, Y., Wang, Z., Cheng, R., Wang, H., Hu, J.: Effective and efficient community search over large directed graphs. In: TKDE, p. 1 (2018)

  67. Fang, Y., Yu, K., Cheng, R., Lakshmanan, L.V., Lin, X.: Efficient algorithms for densest subgraph discovery. In: PVLDB (2019)

  68. Fang, Y., Zhang, H., Ye, Y., Li, X.: Detecting hot topics from twitter: a multiview approach. J. Inf. Sci. 40(5), 578–593 (2014)

    Google Scholar 

  69. Fei Fan, W., Wang, X., Wu, Y.: Expfinder: finding experts by graph pattern matching. In: ICDE, pp. 1316–1319. IEEE (2013)

  70. Flake, G.W., Lawrence, S., Giles, C.L. : Efficient identification of web communities. In: SIGKDD, pp. 150–160 (2000)

  71. Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3), 75–174 (2010)

    MathSciNet  Google Scholar 

  72. Gabow, H.N., Tarjan, R.E.: A linear-time algorithm for a special case of disjoint set union. In: STOC, pp. 246–251 (1983)

  73. Galbrun, E., Gionis, A., Tatti, N.: Top-k overlapping densest subgraphs. Data Min. Knowl. Discov. 30(5), 1134–1165 (2016)

    MathSciNet  MATH  Google Scholar 

  74. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, New York (1979)

    MATH  Google Scholar 

  75. Giatsidis, C., Thilikos, D. M., Vazirgiannis, M.: D-cores: measuring collaboration of directed graphs based on degeneracy. In: ICDM, pp. 201–210 (2011)

  76. Gibbons, A.: Algorithmic Graph Theory. Cambridge University Press, Cambridge (1985)

    MATH  Google Scholar 

  77. Girvan, M., Newman, M.E.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99(12), 7821–7826 (2002)

    MathSciNet  MATH  Google Scholar 

  78. Goldberg, A.V.: Finding a Maximum Density Subgraph. University of California, Berkeley (1984)

    Google Scholar 

  79. Golenberg, K., Kimelfeld, B., Sagiv, Y.: Keyword proximity search in complex data graphs. In: SIGMOD, pp. 927–940. ACM (2008)

  80. Gonzalez, J.E., Xin, R.S., Dave, A., Crankshaw, D., Franklin, M.J., Stoica, I.: Graphx: graph processing in a distributed dataflow framework. OSDI 14, 599–613 (2014)

    Google Scholar 

  81. Gregory, S.: Finding overlapping communities in networks by label propagation. New J. Phys. 12(10), 103018 (2010)

    Google Scholar 

  82. Guimera, R., Amaral, L.A.N.: Functional cartography of complex metabolic networks. Nature 433(7028), 895 (2005)

    Google Scholar 

  83. Gulbahce, N., Lehmann, S.: The art of community detection. BioEssays 30(10), 934–938 (2008)

    Google Scholar 

  84. Guo, D.: Regionalization with dynamically constrained agglomerative clustering and partitioning (REDCAP). Int. J. Geogr. Inf. Sci. 22(7), 801–823 (2008)

    Google Scholar 

  85. Guo, T., Cao, X., Cong, G.: Efficient algorithms for answering the m-closest keywords query. In: SIGMOD, pp. 405–418 (2015)

  86. Guttman, A.: R-trees: a dynamic index structure for spatial searching, volume 14 (1984)

  87. Hajibagheri, A., Alvari, H., Hamzeh, A., Hashemi, S.: Community detection in social networks using information diffusion. In: ASONAM, pp. 702–703 (2012)

  88. Harenberg, S., Bello, G., Gjeltema, L., Ranshous, S., Harlalka, J., Seay, R., Padmanabhan, K., Samatova, N.: Community detection in large-scale networks: a survey and empirical evaluation. Wiley Interdiscip. Rev. Comput. Stat. 6(6), 426–439 (2014)

    Google Scholar 

  89. Hastings, M.B.: Community detection as an inference problem. Phys. Rev. E 74(3), 035102 (2006)

    Google Scholar 

  90. He, H., Wang, H., Yang, J., Yu, P. S.: Blinks: ranked keyword searches on graphs. In: SIGMOD, pp. 305–316. ACM (2007)

  91. Henderson, K., Eliassi-Rad, T., Papadimitriou, S., Faloutsos, C.: HCDF: a hybrid community discovery framework. In: SDM, pp. 754–765 (2010)

  92. Hopcroft, J.E., Ullman, J.D.: Data Structures and Algorithms (1983)

  93. Hu, J., Cheng, R., Chang, K. C., Sankar, A., Fang, Y., Lam, B.Y.H.: Discovering maximal motif cliques in large heterogeneous information networks. In: ICDE, pp. 746–757 (2019)

  94. Hu, J., Cheng, R., Huang, Z., Fang, Y., Luo, S.: On embedding uncertain graphs. In: CIKM, pp. 157–166. ACM (2017)

  95. Hu, J., Wu, X., Cheng, R., Luo, S., Fang, Y.: Querying minimal Steiner maximum-connected subgraphs in large graphs. In: CIKM, pp. 1241–1250 (2016)

  96. Hu, J., Wu, X., Cheng, R., Luo, S., Fang, Y.: On minimal Steiner maximum-connected subgraph queries. In: TKDE, pp. 2455–2469 (2017)

  97. Hu, X., Tao, Y., Chung, C.-W.: I/o-efficient algorithms on triangle listing and counting. ACM Trans. Database Syst. (TODS) 39(4), 27 (2014)

    MathSciNet  Google Scholar 

  98. Huang, X., Cheng, H., Qin, L., Tian, W., Yu, J.X.: Querying k-truss community in large and dynamic graphs. In: SIGMOD, pp. 1311–1322 (2014)

  99. Huang, X., Cheng, H., Yu, J.X.: Attributed community analysis: global and ego-centric views. IEEE Data Eng. Bull. 39(3), 29–40 (2016)

    Google Scholar 

  100. Huang, X., Jiang, J., Choi, B., Xu, J., Zhang, Z., Song, Y.: PP-DBLP: modeling and generating attributed public-private networks with DBLP. In: IEEE International Conference on Data Mining Workshops (ICDMW), pp. 986–989 (2018)

  101. Huang, X., Lakshmanan, L.V., Yu, J.X., Cheng, H.: Approximate closest community search in networks. PVLDB 9(4), 276–287 (2015)

    Google Scholar 

  102. Huang, X., Lakshmanan, L.V.S.: Attribute-driven community search. PVLDB 10(9), 949–960 (2017)

    Google Scholar 

  103. Huang, X., Lakshmanan, L.V.S., Xu, J.: Community search over big graphs: models, algorithms, and opportunities. In: ICDE, pp. 1451–1454 (2017)

  104. Huang, X., Lu, W., Lakshmanan, L.V.: Truss decomposition of probabilistic graphs: semantics and algorithms. In: SIGMOD, pp. 77–90 (2016)

  105. Jayaram, N., Goyal, S., Li, C.: VIIQ: auto-suggestion enabled visual interface for interactive graph query formulation. PVLDB 8(12), 1940–1943 (2015)

    Google Scholar 

  106. Jiang, Y., Huang, X., Cheng, H., Yu, J. X.: VizCS: online searching and visualizing communities in dynamic graphs. In: ICDE, pp. 1585–1588 (2018)

  107. Kacholia, V., Pandit, S., Chakrabarti, S., Sudarshan, S., Desai, R., Karambelkar, H.: Bidirectional expansion for keyword search on graph databases. In: VLDB, pp. 505–516. VLDB Endowment (2005)

  108. Kargar, M., An, A.: Keyword search in graphs: finding r-cliques. PVLDB 4(10), 681–692 (2011)

    Google Scholar 

  109. Karypis, G., Kumar, V.: Metis-unstructured graph partitioning and sparse matrix ordering system, version 2.0. (1995)

  110. Khan, B.S., Niazi, M.A.: Network community detection: a review and visual survey. arXiv:1708.00977 (2017)

  111. Khaouid, W., Barsky, M., Srinivasan, V., Thomo, A.: K-core decomposition of large networks on a single PC. PVLDB 9(1), 13–23 (2015)

    Google Scholar 

  112. Kim, J., Lee, J.-G.: Community detection in multi-layer graphs: a survey. SIGMOD Rec. 44(3), 37–48 (2015)

    Google Scholar 

  113. Kim, Y., Son, S.-W., Jeong, H.: Finding communities in directed networks. Phys. Rev. E 81(1), 016103 (2010)

    Google Scholar 

  114. Kloumann, I.M., Kleinberg, J.M.: Community membership identification from small seed sets. In: SIGKDD, pp. 1366–1375 (2014)

  115. Kou, L., Markowsky, G., Berman, L.: A fast algorithm for Steiner trees. Acta Inf. 15(2), 141–145 (1981)

    MathSciNet  MATH  Google Scholar 

  116. Kuncheva, Z., Montana, G.: Multi-scale community detection in temporal networks using spectral graph wavelets. In: International Workshop on Personal Analytics and Privacy, pp. 139–154 (2017)

    Google Scholar 

  117. Lai, L., Qin, L., Lin, X., Chang, L.: Scalable subgraph enumeration in mapreduce. PVLDB 8(10), 974–985 (2015)

    Google Scholar 

  118. Lai, L., Qin, L., Lin, X., Zhang, Y., Chang, L., Yang, S.: Scalable distributed subgraph enumeration. PVLDB 10(3), 217–228 (2016)

    Google Scholar 

  119. Lancichinetti, A., Fortunato, S.: Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Phys. Rev. E 80(1), 016118 (2009)

    Google Scholar 

  120. Lee, J., Chung, C.: A query approach for influence maximization on specific users in social networks. TKDE 27(2), 340–353 (2015)

    Google Scholar 

  121. Leicht, E.A., Newman, M.E.: Community structure in directed networks. Phys. Rev. Lett. 100(11), 118703 (2008)

    Google Scholar 

  122. Leighton, T., Rao, S.: An approximate max-flow min-cut theorem for uniform multicommodity flow problems with applications to approximation algorithms. In: FOCS, pp. 422–431 (1988)

  123. Leskovec, J., Lang, K.J., Mahoney, M.: Empirical comparison of algorithms for network community detection. In: WWW, pp. 631–640 (2010)

  124. Li, G., Ooi, B.C., Feng, J., Wang, J., Zhou, L.: Ease: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data. In: SIGMOD, pp. 903–914. ACM (2008)

  125. Li, J., Wang, X., Deng, K., Yang, X., Sellis, T., Yu, J.X.: Most influential community search over large social networks. In: ICDE, pp. 871–882 (2017)

  126. Li, R.-H., Qin, L., Ye, F., Yu, J. X., Xiao, X., Xiao, N., Zheng, Z.: Skyline community search in multi-valued networks. In: SIGMOD, pp. 457–472 (2018)

  127. Li, R.-H., Qin, L., Yu, J.X., Mao, R.: Influential community search in large networks. PVLDB 8(5), 509–520 (2015)

    Google Scholar 

  128. Li, R.-H., Qin, L., Yu, J.X., Mao, R.: Finding influential communities in massive networks. VLDB J. 26(6), 751–776 (2017)

    Google Scholar 

  129. Li, R.-H., Su, J., Qin, L., Yu, J. X., Dai, Q.: Persistent community search in temporal networks. In: ICDE, pp. 797–808 (2018)

  130. Li, R.-H., Yu, J.X., Mao, R.: Efficient core maintenance in large dynamic graphs. TKDE 26(10), 2453–2465 (2014)

    Google Scholar 

  131. Li, X., Cheng, R., Fang, Y., Hu, J., Maniu, S.: Scalable evaluation of k-NN queries on large uncertain graphs. In: EDBT, pp. 181–192 (2018)

  132. Li, Y., Sha, C., Huang, X., Zhang, Y.: Community detection in attributed graphs: an embedding approach. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)

  133. Li, Z., Fang, Y., Liu, Q., Cheng, J., Cheng, R., Lui, J.: Walking in the cloud: parallel SimRank at scale. PVLDB 9(1), 24–35 (2015)

    Google Scholar 

  134. Liu, S., Wang, S., Krishnan, R.: Persistent community detection in dynamic social networks. In: PAKDD, pp. 78–89 (2014)

  135. Liu, Y., Niculescu-Mizil, A., Gryc, W.: Topic-link LDA: joint models of topic and author community. In: International Conference on Machine Learning, pp. 665–672 (2009)

  136. Luo, F., Wang, J.Z., Promislow, E.: Exploring local community structures in large networks. In: ICWI, pp. 233–239 (2006)

  137. Macropol, K., Singh, A.: Scalable discovery of best clusters on large graphs. PVLDB 3(1–2), 693–702 (2010)

    Google Scholar 

  138. Malewicz, G., Austern, M. H., Bik, A.J., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: a system for large-scale graph processing. In: SIGMOD, pp. 135–146. ACM (2010)

  139. Malliaros, F.D., Vazirgiannis, M.: Clustering and community detection in directed networks: a survey. Phys. Rep. 533(4), 95–142 (2013)

    MathSciNet  MATH  Google Scholar 

  140. Marcel, P., Negre, E.: A survey of query recommendation techniques for data warehouse exploration. In: EDA, pp. 119–134 (2011)

  141. Matsuda, H., Ishihara, T., Hashimoto, A.: Classifying molecular sequences using a linkage graph with their pairwise similarities. Theor. Comput. Sci. 210(2), 305–325 (1999)

    MathSciNet  MATH  Google Scholar 

  142. Mehler, A., Skiena, S.: Expanding network communities from representative examples. TKDD 3(2), 7 (2009)

    Google Scholar 

  143. Mehlhorn, K.: A faster approximation algorithm for the steiner problem in graphs. Inf. Process. Lett. 27, 125–128 (1988)

    MathSciNet  MATH  Google Scholar 

  144. Meng, T., Cai, L., He, T., Chen, L., Deng, Z.: K-hop community search based on local distance dynamics. KSII Trans. Internet Inf. Syst. 12(7) (2018)

  145. Montresor, A., De Pellegrini, F., Miorandi, D.: Distributed k-core decomposition. IEEE Trans. Parallel Distrib. Syst. 24(2), 288–300 (2013)

    Google Scholar 

  146. Moradi, F., Olovsson, T., Tsigas, P.: A local seed selection algorithm for overlapping community detection. In: ASONAM, pp. 1–8 (2014)

  147. Nallapati, R.M., Ahmed, A., Xing, E.P., Cohen, W.W.: Joint latent topic models for text and citations. In: SIGKDD, pp. 542–550 (2008)

  148. Newman, M.E.: Fast algorithm for detecting community structure in networks. Phys. Rev. E 69(6), 066133 (2004)

    Google Scholar 

  149. Newman, M.E., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004)

    Google Scholar 

  150. Ning, X., Liu, Z., Zhang, S.: Local community extraction in directed networks. Phys. A Stat. Mech. Appl. 452, 258–265 (2016)

    Google Scholar 

  151. Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005)

    Google Scholar 

  152. Papadopoulos, S., Kompatsiaris, Y., Vakali, A., Spyridonos, P.: Community detection in social media. DMKD 24(3), 515–554 (2012)

    Google Scholar 

  153. Park, H.-M., Myaeng, S.-H., Kang, U.: Pte: enumerating trillion triangles on distributed systems. In: SIGKDD, pp. 1115–1124. ACM (2016)

  154. Parthasarathy, S., Ruan, Y., Satuluri, V.: Community discovery in social networks: applications, methods and emerging trends. In: Social Network Data Analytics, pp. 79–113 (2011)

    Google Scholar 

  155. Perozzi, B., Al-Rfou, R., Skiena, S.: Deepwalk: online learning of social representations. In: SIGKDD, pp. 701–710 (2014)

  156. Plantié, M., Crampes, M.: Survey on social community detection. In: Social Media Retrieval, pp. 65–85 (2013)

    Google Scholar 

  157. Pons, P., Latapy, M.: Computing communities in large networks using random walks. In: International Symposium on Computer and Information Sciences, pp. 284–293 (2005)

    Google Scholar 

  158. Porter, M.A., Onnela, J.-P., Mucha, P.J.: Communities in networks. Not. AMS 56(9), 1082–1097 (2009)

    MathSciNet  MATH  Google Scholar 

  159. Qi, G.-J., Aggarwal, C.C., Huang, T.S.: Online community detection in social sensing. In: WSDM, pp. 617–626 (2013)

  160. Qiao, M., Zhang, H., Cheng, H.: Subgraph matching: on compression and computation. Proc. VLDB Endow. 11(2), 176–188 (2017)

    Google Scholar 

  161. Qin, L., Li, R.-H., Chang, L., Zhang, C.: Locally densest subgraph discovery. In: SIGKDD, pp. 965–974 (2015)

  162. Qin, L., Yu, J. X., Chang, L., Tao, Y.: Querying communities in relational databases. In: ICDE (2009)

  163. Rossetti, G., Cazabet, R.: Community discovery in dynamic networks: a survey. ACM Comput. Surv. 51(2), 35:1–35:37 (2018)

    Google Scholar 

  164. Ruan, Y., Fuhry, D., Parthasarathy, S.: Efficient community detection in large networks using content and links. In: WWW, pp. 1089–1098 (2013)

  165. Sachan, M., Contractor, D., Faruquie, T.A., Subramaniam, L.V.: Using content and interactions for discovering communities in social networks. In: WWW, pp. 331–340 (2012)

  166. Saito, K., Yamada, T., Kazama, K.: Extracting communities from complex networks by the k-dense method. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 91(11), 3304–3311 (2008)

    Google Scholar 

  167. Sarıyüce, A.E., Gedik, B., Jacques-Silva, G., Wu, K.-L., Çatalyürek, Ü.V.: Incremental k-core decomposition: algorithms and evaluation. VLDB J. 25(3), 425–447 (2016)

    Google Scholar 

  168. Sariyüce, A.E., Pinar, A.: Fast hierarchy construction for dense subgraphs. PVLDB 10(3), 97–108 (2016)

    Google Scholar 

  169. Sariyuce, A.E., Seshadhri, C., Pinar, A., Catalyurek, U.V.: Finding the hierarchy of dense subgraphs using nucleus decompositions. In: WWW, pp. 927–937 (2015)

  170. Seidman, S.B.: Network structure and minimum degree. Soc. Netw. 5(3), 269–287 (1983)

    MathSciNet  Google Scholar 

  171. Seidman, S.B., Foster, B.L.: A graph-theoretic generalization of the clique concept. J. Math. Sociol. 6(1), 139–154 (1978)

    MathSciNet  MATH  Google Scholar 

  172. Shakarian, P., Roos, P., Callahan, D., Kirk, C.: Mining for geographically disperse communities in social networks by leveraging distance modularity. In: SIGKDD, pp. 1402–1409 (2013)

  173. Shang, J., Wang, C., Wang, C., Guo, G., Qian, J.: An attribute-based community search method with graph refining. J. Supercomput. 1–28 (2017)

  174. Shi, C., Li, Y., Zhang, J., Sun, Y., Philip, S.Y.: A survey of heterogeneous information network analysis. IEEE Trans. Knowl. Data Eng. 29(1), 17–37 (2017)

    Google Scholar 

  175. Sozio, M., Gionis, A.: The community-search problem and how to plan a successful cocktail party. In: SIGKDD, pp. 939–948 (2010)

  176. Subbian, K., Aggarwal, C.C., Srivastava, J., Yu, P.S.: Community detection with prior knowledge. In: SDM, pp. 405–413 (2013)

  177. Tamimi, I., El Kamili, M.: Literature survey on dynamic community detection and models of social networks. In: International Conference on Wireless Networks and Mobile Communications, pp. 1–5 (2015)

  178. Tang, L., Liu, H.: Scalable learning of collective behavior based on sparse social dimensions. In: CIKM, pp. 1107–1116 (2009)

  179. Tong, H., Faloutsos, C., Gallagher, B., Eliassi-Rad, T.: Fast best-effort pattern matching in large attributed graphs. In: KDD, pp. 737–746. ACM (2007)

  180. Tsourakakis, C., Bonchi, F., Gionis, A., Gullo, F., Tsiarli, M.: Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In: SIGKDD, pp. 104–112 (2013)

  181. Ullmann, J.R.: An algorithm for subgraph isomorphism. J. ACM (JACM) 23(1), 31–42 (1976)

    MathSciNet  Google Scholar 

  182. Von Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17(4), 395–416 (2007)

    MathSciNet  Google Scholar 

  183. Wang, H., Aggarwal, C.C.: A survey of algorithms for keyword search on graph data. In: Managing and Mining Graph Data, pp. 249–273. Springer (2010)

  184. Wang, J., Cheng, J.: Truss decomposition in massive networks. PVLDB 5(9), 812–823 (2012)

    Google Scholar 

  185. Wang, K., Cao, X., Lin, X., Zhang, W., Qin, L.: Efficient computing of radius-bounded k-cores. In: ICDE, pp. 233–244 (2018)

  186. Wang, N., Zhang, J., Tan, K.-L., Tung, A.K.: On triangulation-based dense neighborhood graph discovery. PVLDB 4(2), 58–68 (2010)

    Google Scholar 

  187. Wang, Y., Jian, X., Yang, Z., Li, J.: Query optimal k-plex based community in graphs. Data Sci. Eng. 2(4), 257–273 (2017)

    Google Scholar 

  188. Wen, D., Qin, L., Zhang, Y., Lin, X., Yu, J.X.: I/o efficient core graph decomposition: application to degeneracy ordering. IEEE Trans. Data Eng. 31(1), 75–90 (2019)

  189. Wu, F.-Y.: The Potts model. Rev. Mod. Phys. 54(1), 235 (1982)

  190. Wu, Y., Jin, R., Li, J., Zhang, X.: Robust local community detection: on free rider effect and its elimination. PVLDB 8(7), 798–809 (2015)

  191. Wu, Y., Jin, R., Zhu, X., Zhang, X.: Finding dense and connected subgraphs in dual networks. In: ICDE, pp. 915–926 (2015)

  192. Xu, Z., Ke, Y., Wang, Y., Cheng, H., Cheng, J.: A model-based approach to attributed graph clustering. In: SIGMOD, pp. 505–516 (2012)

  193. Yang, B., Cheung, W., Liu, J.: Community mining from signed social networks. IEEE Trans. Knowl. Data Eng. 19(10), 1333–1348 (2007)

  194. Yang, B., Liu, D., Liu, J.: Discovering communities from social networks: methodologies and applications, pp. 331–346 (2010)

  195. Yang, D.-N., Chen, Y.-L., Lee, W.-C., Chen, M.-S.: On social–temporal group query with acquaintance constraint. PVLDB 4(6), 397–408 (2011)

  196. Yang, D.-N., Shen, C.-Y., Lee, W.-C., Chen, M.-S.: On socio-spatial group query for location-based social networks. In: SIGKDD, pp. 949–957 (2012)

  197. Yang, J., Leskovec, J.: Defining and evaluating network communities based on ground-truth. Knowl. Inf. Syst. 42(1), 181–213 (2015)

  198. Yang, J., McAuley, J., Leskovec, J.: Community detection in networks with node attributes. In: ICDM, pp. 1151–1156 (2013)

  199. Yang, J., McAuley, J., Leskovec, J.: Detecting cohesive and 2-mode communities in directed and undirected networks. In: WSDM, pp. 323–332 (2014)

  200. Yang, L., Cao, X., He, D., Wang, C., Wang, X., Zhang, W.: Modularity based community detection with deep learning. In: IJCAI, pp. 2252–2258 (2016)

  201. Yang, T., Chi, Y., Zhu, S., Gong, Y., Jin, R.: Directed network community detection: a popularity and productivity link model. In: SDM, pp. 742–753 (2010)

  202. Yang, T., Jin, R., Chi, Y., Zhu, S.: Combining link and content for community detection: a discriminative approach. In: SIGKDD, pp. 927–936 (2009)

  203. Yi, P., Choi, B., Bhowmick, S.S., Xu, J.: AutoG: a visual query autocompletion framework for graph databases. VLDB J. 26(3), 347–372 (2017)

  204. Yu, J.X., Qin, L., Chang, L.: Keyword Search in Databases. Synthesis Lectures on Data Management (2009)

  205. Yuan, L., Qin, L., Zhang, W., Chang, L., Yang, J.: Index-based densest clique percolation community search in networks. IEEE Trans. Knowl. Data Eng. 30(5), 922–935 (2018)

  206. Yuan, Y., Lian, X., Chen, L., Yu, J.X., Wang, G., Sun, Y.: Keyword search over distributed graphs with compressed signature. IEEE Trans. Knowl. Data Eng. 29(6), 1212–1225 (2017)

  207. Yuan, Y., Wang, G., Chen, L., Wang, H.: Efficient subgraph similarity search on large probabilistic graph databases. PVLDB 5(9), 800–811 (2012)

  208. Yuan, Y., Wang, G., Chen, L., Wang, H.: Efficient keyword search on uncertain graph data. IEEE Trans. Knowl. Data Eng. 25(12), 2767–2779 (2013)

  209. Yuan, Y., Wang, G., Wang, H., Chen, L.: Efficient subgraph search over large uncertain graphs. PVLDB 4(11), 876–886 (2011)

  210. Zhang, F., Yuan, L., Zhang, Y., Qin, L., Lin, X., Zhou, A.: Discovering strong communities with user engagement and tie strength. In: DASFAA, pp. 425–441 (2018)

  211. Zhang, F., Zhang, Y., Qin, L., Zhang, W., Lin, X.: When engagement meets similarity: efficient (k, r)-core computation on social networks. PVLDB 10(10), 998–1009 (2017)

  212. Zhang, Y., Parthasarathy, S.: Extracting analyzing and visualizing triangle k-core motifs within networks. In: ICDE, pp. 1049–1060 (2012)

  213. Zhang, Y., Yu, J.X., Zhang, Y., Qin, L.: A fast order-based approach for core maintenance. In: ICDE, pp. 337–348 (2017)

  214. Zhao, F., Tung, A.K.: Large scale cohesive subgraphs discovery for social network visual analysis. PVLDB 6, 85–96 (2012)

  215. Zheng, D., Liu, J., Li, R.-H., Aslay, C., Chen, Y.-C., Huang, X.: Querying intimate-core groups in weighted graphs. In: IEEE International Conference on Semantic Computing, pp. 156–163. IEEE (2017)

  216. Zheng, Z., Ye, F., Li, R.-H., Ling, G., Jin, T.: Finding weighted k-truss communities in large networks. Inf. Sci. 417(C), 344–360 (2017)

  217. Zhou, D., Councill, I., Zha, H., Giles, C.L.: Discovering temporal communities from social network documents. In: ICDM, pp. 745–750 (2007)

  218. Zhou, R., Liu, C., Yu, J.X., Liang, W., Chen, B., Li, J.: Finding maximal k-edge-connected subgraphs from a large graph. In: EDBT, pp. 480–491 (2012)

  219. Zhou, R., Liu, C., Yu, J.X., Liang, W., Zhang, Y.: Efficient truss maintenance in evolving networks. arXiv preprint arXiv:1402.2807 (2014)

  220. Zhou, Y., Cheng, H., Yu, J.X.: Graph clustering based on structural/attribute similarities. PVLDB 2(1), 718–729 (2009)

  221. Zhu, Q., Hu, H., Xu, C., Xu, J., Lee, W.-C.: Geo-social group queries with minimum acquaintance constraints. VLDB J. 26(5), 709–727 (2017)

  222. Zhu, R., Zou, Z., Li, J.: Diversified coherent core search on multi-layer graphs. In: ICDE, pp. 701–712. IEEE (2018)

  223. Zou, L., Chen, L., Özsu, M.T.: Distance-join: pattern match query in a large graph database. PVLDB 2(1), 886–897 (2009)

Acknowledgements

We would like to thank Jiafeng Hu and Kai Wang for their helpful discussions, Dan Yin for proofreading, and Jinbin Huang for conducting experimental comparisons. Xin Huang is supported by NSFC Project No. 61702435 and Hong Kong General Research Fund (GRF) Project No. HKBU 12200917. Lu Qin is supported by DP160101513. Ying Zhang is supported by FT170100128 and DP180103096. Wenjie Zhang is supported by DP180103096. Reynold Cheng is supported by the Research Grants Council of Hong Kong (RGC Projects HKU 17229116 and 17205115) and HKU (Projects 102009508 and 104004129). Xuemin Lin is supported by 2019DH0ZX01, 2018YFB1003504, NSFC61232006, DP180103096, and DP170101628.

Author information

Corresponding author

Correspondence to Yixiang Fang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Fang, Y., Huang, X., Qin, L. et al. A survey of community search over big graphs. The VLDB Journal 29, 353–392 (2020). https://doi.org/10.1007/s00778-019-00556-x

  • DOI: https://doi.org/10.1007/s00778-019-00556-x
