Knowledge and Information Systems, Volume 35, Issue 2, pp 311–343

D-cores: measuring collaboration of directed graphs based on degeneracy

  • Christos Giatsidis
  • Dimitrios M. Thilikos
  • Michalis Vazirgiannis
Regular Paper

Abstract

Community detection and evaluation is an important task in graph mining. In many cases, a community is defined as a subgraph characterized by dense connections or interactions between its nodes. A variety of measures have been proposed to evaluate different quality aspects of such communities, in most cases ignoring the directed nature of edges. In this paper, we introduce novel metrics for evaluating the collaborative nature of directed graphs, a property not captured by single-node metrics or by other established community evaluation metrics. To accomplish this objective, we capitalize on the concept of graph degeneracy and define a novel D-core framework, extending the classic graph-theoretic notion of \(k\)-cores for undirected graphs to directed ones. Based on the D-core, which can essentially be seen as a measure of the robustness of a community under degeneracy, we devise a wealth of novel metrics used to evaluate the collaboration features of directed graphs. We applied the D-core approach to large synthetic and real-world graphs such as Wikipedia, DBLP, and ArXiv and report interesting results at the graph level as well as at the node level.
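
To make the step from \(k\)-cores to directed graphs concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a \((k, l)\)-D-core is the maximal subgraph in which every node has in-degree at least \(k\) and out-degree at least \(l\); the abstract does not spell out this definition. The function name `d_core` and the toy edge list are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def d_core(edges, k, l):
    """Return the node set of the (k, l)-D-core of a directed graph.

    edges: iterable of (u, v) pairs, each meaning an edge u -> v.
    Assumed definition: the maximal subgraph in which every node has
    in-degree >= k and out-degree >= l.
    """
    succ = defaultdict(set)  # out-neighbours of each node
    pred = defaultdict(set)  # in-neighbours of each node
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        if u != v:  # self-loops are ignored in this sketch
            succ[u].add(v)
            pred[v].add(u)

    alive = set(nodes)
    # Start with every node that already violates one of the thresholds.
    stack = [n for n in alive if len(pred[n]) < k or len(succ[n]) < l]
    while stack:
        n = stack.pop()
        if n not in alive:
            continue
        alive.discard(n)
        # Removing n lowers the in-degree of its successors ...
        for m in succ[n]:
            pred[m].discard(n)
            if m in alive and len(pred[m]) < k:
                stack.append(m)
        # ... and the out-degree of its predecessors.
        for m in pred[n]:
            succ[m].discard(n)
            if m in alive and len(succ[m]) < l:
                stack.append(m)
    return alive


# Toy graph: a directed 3-cycle a -> b -> c -> a plus one extra edge d -> a.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
print(sorted(d_core(edges, 1, 1)))
```

In the toy graph, node `d` has no incoming edge, so it is peeled away for \(k = 1\), while the cycle survives. Raising \(k\) to 2 removes `b` and `c` first and then, by cascade, every remaining node.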

Keywords

  • Graph mining
  • Community evaluation metrics
  • Degeneracy
  • Directed cores

Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  • Christos Giatsidis (1)
  • Dimitrios M. Thilikos (2)
  • Michalis Vazirgiannis (1, 3, 4)

  1. LIX, École Polytechnique, Palaiseau, France
  2. Department of Mathematics, National and Kapodistrian University of Athens, Athens, Greece
  3. Department of Informatics, Athens University of Economics, Athens, Greece
  4. Télécom ParisTech, LTCI, Paris, France
