Abstract

Finding communities in graphs is one of the most well-studied problems in data mining and social-network analysis. In many real applications, however, the underlying graph does not have a clear community structure. In those cases, selecting a single community is an ill-posed problem, because the optimization criterion must make a difficult choice between a tight but small community and a more inclusive but sparser one.

To avoid the problem of selecting only a single community, we propose discovering a sequence of nested communities. More formally, given a graph and a starting set, our goal is to discover a sequence of communities that all contain the starting set, with each community forming a denser subgraph than the next. Discovering an optimal sequence of communities is a complex optimization problem, and hence we divide it into two subproblems: 1) discover the optimal sequence for a fixed order of graph vertices, a subproblem that we can solve efficiently, and 2) find a good order. We employ a simple heuristic for discovering an order, and we provide empirical and theoretical evidence that the resulting order is good.
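
The first subproblem can be illustrated with a small sketch. The abstract does not specify the density measure or the segmentation algorithm, so everything below is an assumption made for illustration, not the authors' method: each vertex in the fixed order is scored by its number of edges to earlier vertices (a hypothetical "back-degree"), and the classic pool-adjacent-violators procedure fits a decreasing step function to those scores. Each step boundary then gives one nested prefix of the order. The function names `pava_decreasing` and `nested_prefixes` are made up for this example.

```python
def pava_decreasing(values):
    """Pool adjacent violators: least-squares decreasing step fit.

    Returns a list of (block_length, block_mean) pairs whose means
    are strictly decreasing from left to right.
    """
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge while the previous block is not strictly denser than the last.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] <= blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return [(c, s / c) for s, c in blocks]


def nested_prefixes(adj, order):
    """Split a fixed vertex order into nested prefixes.

    adj   : dict mapping each vertex to the set of its neighbours
    order : list of vertices, starting set first (assumed given)

    Returns (prefix, mean_back_degree_of_last_block) pairs.
    """
    seen = set()
    back_degree = []
    for v in order:
        # Illustrative score: edges from v to vertices placed before it.
        back_degree.append(sum(1 for u in adj[v] if u in seen))
        seen.add(v)

    result = []
    end = 0
    for count, mean in pava_decreasing(back_degree):
        end += count
        result.append((order[:end], mean))
    return result
```

For a 4-clique a, b, c, d with a path d-e-f attached and the order a, b, c, d, e, f, the back-degrees are 0, 1, 2, 3, 1, 1, and the sketch returns two nested sets: the clique (block mean 1.5) and the whole graph (block mean 1.0). Finding a good order, the second subproblem, is not covered by this sketch.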

Keywords

community discovery, monotonic segmentation, graph mining, nested communities

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Nikolaj Tatti (1)
  • Aristides Gionis (1)
  1. Helsinki Institute for Information Technology, Department of Information and Computer Science, Aalto University, Finland