Coloring large complex networks

Original Article

Abstract

Given a large social or information network, how can we partition the vertices into sets (i.e., colors) such that no two vertices linked by an edge are in the same set, while minimizing the number of sets used? Despite the practical importance of graph coloring, existing work has not systematically investigated or designed methods for large complex networks. In this work, we develop a unified framework for coloring large complex networks that consists of two main coloring variants, which effectively balance the tradeoff between accuracy and efficiency. Using this framework as a basis, we propose coloring methods designed for the scale and structure of complex networks. In particular, the methods leverage triangles, triangle-cores, and other egonet properties, as well as combinations of these. We systematically compare the proposed methods across a wide range of networks (e.g., social, web, and biological networks) and find a significant improvement over previous approaches in nearly all cases. Additionally, the solutions obtained are nearly optimal, and sometimes provably optimal, for certain classes of graphs (e.g., collaboration networks). We also propose a parallel algorithm for the problem of coloring neighborhood subgraphs and make several key observations. Overall, the coloring methods are shown to be (1) accurate, with solutions close to optimal, (2) fast and scalable for large networks, and (3) flexible for use in a variety of applications.
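
To make the problem statement concrete, the following is a minimal sketch of greedy coloring, in which vertices are visited in a fixed order and each receives the smallest color not already used by a colored neighbor. The ordering shown (triangle count, then degree) is only an illustrative assumption inspired by the triangle-based properties mentioned above; it is not the article's framework, and the function names and toy graph are hypothetical.

```python
from itertools import combinations


def greedy_coloring(adj, order):
    """Give each vertex, in the given order, the smallest color
    not used by any of its already-colored neighbors."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def triangle_counts(adj):
    """Count the triangles each vertex participates in."""
    return {
        v: sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
        for v in adj
    }


def is_proper(adj, color):
    """Check that no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in adj for v in adj[u])


# Hypothetical toy graph: a 4-clique {0, 1, 2, 3} with a path 3-4-5 attached.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# Assumed ordering: most triangles first, ties broken by degree, then by id.
tri = triangle_counts(adj)
order = sorted(adj, key=lambda v: (-tri[v], -len(adj[v]), v))

coloring = greedy_coloring(adj, order)
num_colors = len(set(coloring.values()))
```

On this toy graph the 4-clique forces at least four colors, so a proper coloring with four colors is optimal here; in general, greedy coloring gives only an upper bound on the chromatic number, and the bound depends on the vertex ordering.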

Keywords

Network coloring, Unified framework, Greedy methods, Neighborhood coloring, Triangle-core ordering, Social networks

Copyright information

© Springer-Verlag Wien 2014

Authors and Affiliations

  1. Department of Computer Science, Purdue University, West Lafayette, USA
