Efficient and effective community search

Data Mining and Knowledge Discovery

Abstract

Community search is the problem of finding a good community for a given set of query vertices. One of the most studied formulations of community search asks for a connected subgraph that contains all query vertices and maximizes the minimum degree. All existing approaches to min-degree-based community search are limited in efficiency, because they need to visit a large part of (or all of) the input graph, and in accuracy, because the communities they output are quite large and not really cohesive. Moreover, some existing methods lack generality: they handle only single-vertex queries, find communities that are not optimal in terms of minimum degree, and/or require input parameters. In this work we advance the state of the art on community search by proposing a novel method that overcomes all these limitations: it is in general more efficient and effective (by one to two orders of magnitude on average), it can handle multiple query vertices, it yields optimal communities, and it is parameter-free. These properties are confirmed by an extensive experimental analysis performed on various real-world graphs.
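To make the min-degree formulation concrete, the following Python sketch implements a plain greedy peeling strategy: repeatedly delete a minimum-degree vertex and remember the best connected subgraph that still contains every query vertex. It is an illustration of the problem statement only, not the method proposed in this article, and it deliberately scans the whole graph, which is exactly the inefficiency the abstract criticizes. The function names and the adjacency-dictionary input format are assumptions made for this example.

```python
from collections import deque


def component_of(adj, alive, source):
    """Return the alive vertices reachable from `source` (breadth-first search)."""
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in alive and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def greedy_min_degree_community(adj, query):
    """Greedy peeling for min-degree community search (illustrative sketch).

    adj   : dict mapping each vertex to the set of its neighbours (undirected).
    query : non-empty iterable of vertices, all assumed to be keys of `adj`.
    Returns (community, min_degree), or (None, -1) if the query vertices
    are not connected in the input graph.
    """
    query = set(query)
    alive = set(adj)
    degree = {u: len(adj[u]) for u in adj}  # degree among alive vertices
    root = next(iter(query))
    best, best_score = None, -1
    while True:
        comp = component_of(adj, alive, root)
        if not query <= comp:
            break  # the query vertices are no longer connected
        # Vertices outside the query's component cannot be in the answer.
        alive = comp
        u = min(comp, key=degree.__getitem__)
        if degree[u] > best_score:
            best, best_score = set(comp), degree[u]
        if u in query:
            break  # removing a query vertex is not allowed
        alive.remove(u)
        for v in adj[u]:
            if v in alive:
                degree[v] -= 1
    return best, best_score
```

For instance, on a 4-clique {1, 2, 3, 4} with a pendant path 4-5-6 attached, querying vertex 1 returns the clique with minimum degree 3, whereas querying {1, 6} forces the whole graph to be returned with minimum degree 1. This naive version recomputes the connected component at every step, so it runs in time proportional to n(n + m) in the worst case.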

Notes

  1. To be precise, the GS algorithm runs in \(\mathcal {O}(m\times \alpha (n))\) time, where \(\alpha (\cdot )\) denotes the inverse Ackermann function.

  2. In the implementation all levels corresponding to non-distinct cores can be omitted; thus, the actual number of levels of the tree is h.

  3. Flickr is available at http://socialnetworks.mpi-sws.org/datasets, while all other graphs are available at https://snap.stanford.edu/data.
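The inverse Ackermann factor in note 1 is the amortized cost usually associated with disjoint-set (union-find) structures that combine union by rank with path compression. Assuming that this is where the \(\alpha (n)\) term comes from (the notes do not say so explicitly), a minimal Python sketch of such a structure is given below; the class and method names are illustrative.

```python
class DisjointSet:
    """Union-find over the integers 0..n-1 with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, x):
        """Return the representative of the set containing x."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every vertex on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True
```

Processing the m edges of a graph with one `union` call each takes \(\mathcal {O}(m\times \alpha (n))\) amortized time overall, which matches the form of the bound stated in note 1.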

Conflict of interest

The authors declare that they have no conflict of interest.

Corresponding author

Correspondence to Francesco Gullo.

Additional information

Responsible editors: Joao Gama, Indre Zliobaite, Alipio Jorge, Concha Bielza.

About this article

Cite this article

Barbieri, N., Bonchi, F., Galimberti, E. et al. Efficient and effective community search. Data Min Knowl Disc 29, 1406–1433 (2015). https://doi.org/10.1007/s10618-015-0422-1

  • DOI: https://doi.org/10.1007/s10618-015-0422-1
