D-cores: measuring collaboration of directed graphs based on degeneracy

Abstract

Community detection and evaluation are important tasks in graph mining. In many cases, a community is defined as a subgraph characterized by dense connections or interactions between its nodes. A variety of measures have been proposed to evaluate different quality aspects of such communities, in most cases ignoring the directed nature of edges. In this paper, we introduce novel metrics for evaluating the collaborative nature of directed graphs, a property not captured by single-node metrics or by other established community evaluation metrics. To accomplish this objective, we capitalize on the concept of graph degeneracy and define a novel D-core framework, extending the classic graph-theoretic notion of \(k\)-cores for undirected graphs to directed ones. Based on the D-core, which can essentially be seen as a measure of the robustness of a community under degeneracy, we devise a set of novel metrics for evaluating the collaboration features of directed graphs. We apply the D-core approach to large synthetic and real-world graphs such as Wikipedia, DBLP, and ArXiv and report interesting results at both the graph and the node level.
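To make the idea of extending \(k\)-cores to directed graphs concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a \((k, l)\)-D-core is the maximal subgraph in which every node has in-degree at least \(k\) and out-degree at least \(l\); the abstract does not spell out this definition. The function name `d_core` and the edge-list input format are hypothetical choices made for illustration.

```python
from collections import defaultdict, deque


def d_core(edges, k, l):
    """Return the nodes of the maximal subgraph in which every node has
    in-degree >= k and out-degree >= l.

    Assumption: this is the (k, l)-D-core definition; `d_core` and the
    (source, target) edge-list input are hypothetical, for illustration only.
    """
    succ = defaultdict(set)  # node -> set of out-neighbours
    pred = defaultdict(set)  # node -> set of in-neighbours
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        if u != v:  # self-loops are ignored in this sketch
            succ[u].add(v)
            pred[v].add(u)

    alive = set(nodes)
    indeg = {n: len(pred[n]) for n in nodes}
    outdeg = {n: len(succ[n]) for n in nodes}

    # Start with every node that already violates one of the two thresholds.
    queue = deque(n for n in nodes if indeg[n] < k or outdeg[n] < l)
    queued = set(queue)

    # Peel violating nodes; each removal may push neighbours below a threshold.
    while queue:
        n = queue.popleft()
        alive.discard(n)
        for m in succ[n]:  # m loses one incoming edge
            if m in alive:
                indeg[m] -= 1
                if m not in queued and indeg[m] < k:
                    queued.add(m)
                    queue.append(m)
        for m in pred[n]:  # m loses one outgoing edge
            if m in alive:
                outdeg[m] -= 1
                if m not in queued and outdeg[m] < l:
                    queued.add(m)
                    queue.append(m)
    return alive


# Toy graph: a, b, c link to each other in both directions; d only links to a.
toy_edges = [
    ("a", "b"), ("b", "a"),
    ("b", "c"), ("c", "b"),
    ("c", "a"), ("a", "c"),
    ("d", "a"),
]
core_2_2 = d_core(toy_edges, 2, 2)
```

On the toy graph, node `d` has no incoming edges, so it is peeled away for any \(k \ge 1\), while the mutually linked nodes `a`, `b`, `c` survive in `core_2_2`. Setting either threshold to 0 relaxes the corresponding direction.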

Author information

Corresponding author

Correspondence to Michalis Vazirgiannis.

About this article

Cite this article

Giatsidis, C., Thilikos, D.M. & Vazirgiannis, M. D-cores: measuring collaboration of directed graphs based on degeneracy. Knowl Inf Syst 35, 311–343 (2013). https://doi.org/10.1007/s10115-012-0539-0

Keywords

  • Graph mining
  • Community evaluation metrics
  • Degeneracy
  • Directed cores