Data Mining and Knowledge Discovery, Volume 33, Issue 5, pp 1417–1445

Finding lasting dense subgraphs

  • Konstantinos Semertzidis
  • Evaggelia Pitoura
  • Evimaria Terzi
  • Panayiotis Tsaparas
Article
Part of the following topical collections:
  1. Journal Track of ECML PKDD 2019

Abstract

Graphs form a natural model for relationships and interactions between entities, for example, between people in social and cooperation networks, servers in computer networks, or tags and words in documents and tweets. But which of these relationships or interactions are the most lasting ones? In this paper, we study the following problem: given a set of graph snapshots, which may correspond to the state of an evolving graph at different time instances, identify the set of nodes that are the most densely connected in all snapshots. We call this problem the Best Friends Forever (BFF) problem. We provide definitions of density over multiple graph snapshots that capture different semantics of connectedness over time, and we study the corresponding variants of the BFF problem. We then look at the On–Off BFF (O²BFF) problem, which relaxes the requirement of nodes being connected in all snapshots and asks for the densest set of nodes in at least k of a given set of graph snapshots. We show that this problem is NP-complete for all definitions of density, and we propose a set of efficient algorithms. Finally, we present experiments with synthetic and real datasets that show both the efficiency of our algorithms and the usefulness of the BFF and the O²BFF problems.
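To make the problem statement concrete, the following is a minimal sketch of one way to score and search for a node set that stays dense across all snapshots. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not spell out the density definitions, so the sketch assumes density is the minimum over snapshots of the average degree of the induced subgraph, and it uses a simple greedy peeling heuristic with no quality guarantee claimed. The names `avg_degree`, `lasting_density`, and `greedy_lasting_dense` are hypothetical.

```python
from typing import Hashable, List, Sequence, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def avg_degree(nodes: Set[Hashable], edges: Sequence[Edge]) -> float:
    """Average degree of the subgraph induced by `nodes` in one snapshot.

    Assumes each undirected edge is listed once and there are no self-loops.
    """
    if not nodes:
        return 0.0
    inside = sum(1 for u, v in edges if u in nodes and v in nodes)
    return 2.0 * inside / len(nodes)


def lasting_density(nodes: Set[Hashable], snapshots: Sequence[Sequence[Edge]]) -> float:
    """ASSUMED aggregate density: the worst (minimum) average degree over all snapshots."""
    return min(avg_degree(nodes, edges) for edges in snapshots)


def greedy_lasting_dense(snapshots: Sequence[Sequence[Edge]]) -> Tuple[Set[Hashable], float]:
    """Greedy peeling heuristic (hypothetical, for illustration only).

    Repeatedly removes the node whose degree inside the current set is
    smallest in its worst snapshot, and remembers the best-scoring set seen.
    Requires a non-empty list of snapshots and mutually comparable node labels.
    """
    current = {x for edges in snapshots for edge in edges for x in edge}
    best, best_score = set(current), lasting_density(current, snapshots)

    def worst_degree(n: Hashable) -> int:
        # Degree of n within `current`, minimized over snapshots.
        return min(
            sum(1 for u, v in edges
                if (u == n and v in current) or (v == n and u in current))
            for edges in snapshots
        )

    while len(current) > 1:
        # Ties are broken by node label so the result is deterministic.
        victim = min(current, key=lambda n: (worst_degree(n), n))
        current.remove(victim)
        score = lasting_density(current, snapshots)
        if score > best_score:
            best, best_score = set(current), score
    return best, best_score


if __name__ == "__main__":
    # Three toy snapshots: the triangle a-b-c persists, other edges come and go.
    snapshots: List[List[Edge]] = [
        [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")],
        [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")],
        [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")],
    ]
    print(greedy_lasting_dense(snapshots))
```

On the toy input, the triangle `a`, `b`, `c` is the only structure present in every snapshot, so it is the set the sketch returns, with a score of 2.0. The recomputation of degrees on every step keeps the code short at the cost of speed; it is meant to show the shape of the problem, not an efficient implementation.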

Keywords

Dense subgraphs · Graph history · Social networks

Copyright information

© The Author(s) 2018

Authors and Affiliations

  • Konstantinos Semertzidis (1, corresponding author)
  • Evaggelia Pitoura (1)
  • Evimaria Terzi (2)
  • Panayiotis Tsaparas (1)
  1. Department of Computer Science and Engineering, University of Ioannina, Ioannina, Greece
  2. Department of Computer Science, Boston University, Boston, USA