Finding Dense Subgraphs with Size Bounds

  • Reid Andersen
  • Kumar Chellapilla
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5427)

Abstract

We consider the problem of finding dense subgraphs with specified upper or lower bounds on the number of vertices. We introduce two optimization problems: the densest at-least-k-subgraph problem (dalks), which is to find an induced subgraph of highest average degree among all subgraphs with at least k vertices, and the densest at-most-k-subgraph problem (damks), which is defined similarly. These problems are relaxed versions of the well-known densest k-subgraph problem (dks), which is to find the densest subgraph with exactly k vertices. Our main result is that dalks can be approximated efficiently, even for web-scale graphs. We give a (1/3)-approximation algorithm for dalks that is based on the core decomposition of a graph, and that runs in time O(m + n), where n is the number of nodes and m is the number of edges. In contrast, we show that damks is nearly as hard to approximate as the densest k-subgraph problem, for which no good approximation algorithm is known. In particular, we show that if there exists a polynomial time approximation algorithm for damks with approximation ratio γ, then there is a polynomial time approximation algorithm for dks with approximation ratio γ²/8. In the experimental section, we test the algorithm for dalks on large publicly available web graphs. We observe that, in addition to producing near-optimal solutions for dalks, the algorithm also produces near-optimal solutions for dks for nearly all values of k.
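
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way a core-decomposition-style heuristic for dalks might look. It assumes the graph is an undirected adjacency dictionary, repeatedly removes a vertex of minimum remaining degree, and returns the intermediate subgraph of highest average degree among those with at least k vertices. The function name `densest_at_least_k` and its parameters are hypothetical. The sketch uses a binary heap, so it runs in O(m log n) rather than the O(m + n) stated above, and it makes no claim about the (1/3) guarantee.

```python
import heapq


def densest_at_least_k(adj, k):
    """Peel minimum-degree vertices and keep the best subgraph with >= k vertices.

    adj: dict mapping each vertex to the set of its neighbours
         (undirected, no self-loops; vertices must be mutually comparable,
         e.g. all ints, so that heap ties can be broken).
    Returns (vertex_set, average_degree).
    """
    n = len(adj)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of vertices")

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    edges = sum(degree.values()) // 2

    # Heap of (degree, vertex); outdated entries are skipped lazily.
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)

    removed = set()
    order = []                      # vertices in the order they were peeled
    best_avg = 2 * edges / n        # the whole graph is the first candidate
    best_removed = 0

    while n - len(order) > k:
        d, v = heapq.heappop(heap)
        if v in removed or d != degree[v]:
            continue                # stale heap entry
        removed.add(v)
        order.append(v)
        edges -= degree[v]          # v's remaining edges disappear with it
        for u in adj[v]:
            if u not in removed:
                degree[u] -= 1
                heapq.heappush(heap, (degree[u], u))
        avg = 2 * edges / (n - len(order))
        if avg > best_avg:
            best_avg = avg
            best_removed = len(order)

    keep = set(adj) - set(order[:best_removed])
    return keep, best_avg


# A 4-clique {0, 1, 2, 3} with a path 3-4-5 attached.
graph = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3},
    3: {0, 1, 2, 4}, 4: {3, 5}, 5: {4},
}
print(densest_at_least_k(graph, 4))  # ({0, 1, 2, 3}, 3.0)
```

Replacing the heap with an array of degree buckets is the usual way to bring a peeling procedure like this down to linear time.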

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Reid Andersen (Microsoft Live Labs, Redmond, USA)
  • Kumar Chellapilla (Microsoft Live Labs, Redmond, USA)