Finding lasting dense subgraphs

Data Mining and Knowledge Discovery

Abstract

Graphs form a natural model for relationships and interactions between entities, for example, between people in social and cooperation networks, servers in computer networks, or tags and words in documents and tweets. But which of these relationships or interactions are the most lasting? In this paper, we study the following problem: given a set of graph snapshots, which may correspond to the state of an evolving graph at different time instances, identify the set of nodes that are the most densely connected in all snapshots. We call this problem the Best Friends Forever (\(\text{BFF}\)) problem. We provide definitions of density over multiple graph snapshots that capture different semantics of connectedness over time, and we study the corresponding variants of the \(\text{BFF}\) problem. We then look at the On–Off \(\text{BFF}\) (\(\textsc{O}^{\textsc{2}}\text{BFF}\)) problem, which relaxes the requirement that nodes be connected in all snapshots and asks for the densest set of nodes in at least k of a given set of graph snapshots. We show that this problem is NP-complete for all definitions of density, and we propose a set of efficient algorithms. Finally, we present experiments with synthetic and real datasets that show both the efficiency of our algorithms and the usefulness of the \(\text{BFF}\) and the \(\textsc{O}^{\textsc{2}}\text{BFF}\) problems.
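
To make the problem statement concrete, the following minimal Python sketch illustrates one possible reading of it. The abstract does not spell out the density definitions, so everything here is an assumption made for illustration: density of a node set in one snapshot is taken to be its average degree, the aggregate over snapshots is taken to be the minimum, and the search is a simple greedy peeling heuristic. The function names (`avg_degree`, `min_density`, `kth_best_density`, `greedy_peel`) and the toy snapshots are hypothetical and are not taken from the paper or its code.

```python
def avg_degree(edges, nodes):
    """Average degree of the subgraph induced by `nodes`: 2|E(S)| / |S|."""
    if not nodes:
        return 0.0
    inside = sum(1 for u, v in edges if u in nodes and v in nodes)
    return 2.0 * inside / len(nodes)


def min_density(snapshots, nodes):
    """Assumed aggregate: the lowest average degree over all snapshots."""
    return min(avg_degree(edges, nodes) for edges in snapshots)


def kth_best_density(snapshots, nodes, k):
    """On-Off flavour (assumed): density reached in the best k snapshots,
    i.e. the k-th largest per-snapshot average degree."""
    scores = sorted((avg_degree(e, nodes) for e in snapshots), reverse=True)
    return scores[k - 1]


def greedy_peel(snapshots, all_nodes):
    """Peeling heuristic: repeatedly remove the node whose minimum degree
    across snapshots is smallest, and keep the best set seen so far."""
    current = set(all_nodes)
    best, best_score = set(current), min_density(snapshots, current)

    def worst_degree(n):
        # Smallest degree of n inside `current`, taken over all snapshots.
        return min(
            sum(1 for u, v in edges
                if (u == n and v in current) or (v == n and u in current))
            for edges in snapshots
        )

    while len(current) > 1:
        victim = min(sorted(current), key=worst_degree)
        current.remove(victim)
        score = min_density(snapshots, current)
        if score > best_score:
            best, best_score = set(current), score
    return best, best_score


# Toy data: the triangle a-b-c appears in every snapshot, other edges come and go.
snapshots = [
    [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")],
    [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")],
    [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")],
]
best, score = greedy_peel(snapshots, {"a", "b", "c", "d", "e"})
print(sorted(best), score)
```

On this toy input the peeling keeps the triangle that persists in all three snapshots. The helper `kth_best_density` only scores a fixed node set against its best k snapshots; choosing the node set and the k snapshots jointly is the part the abstract states is NP-complete, and this sketch does not attempt it.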

Notes

  1. http://dblp.uni-trier.de/.

  2. https://snap.stanford.edu/data/oregon1.html.

  3. https://snap.stanford.edu/data/oregon2.html.

  4. http://www.caida.org/data/as-relationships/.

  5. https://snap.stanford.edu/data/as.html.

  6. https://github.com/ksemer/BestFriendsForever-BFF-.

Author information

Correspondence to Konstantinos Semertzidis.

Additional information

Responsible editor: Evimaria Terzi.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Semertzidis, K., Pitoura, E., Terzi, E. et al. Finding lasting dense subgraphs. Data Min Knowl Disc 33, 1417–1445 (2019). https://doi.org/10.1007/s10618-018-0602-x

  • DOI: https://doi.org/10.1007/s10618-018-0602-x
