Finding Dense Subgraphs with Size Bounds
We consider the problem of finding dense subgraphs with specified upper or lower bounds on the number of vertices. We introduce two optimization problems: the densest at-least-k-subgraph problem (dalks), which is to find an induced subgraph of highest average degree among all subgraphs with at least k vertices, and the densest at-most-k-subgraph problem (damks), which is defined similarly. These problems are relaxed versions of the well-known densest k-subgraph problem (dks), which is to find the densest subgraph with exactly k vertices. Our main result is that dalks can be approximated efficiently, even for web-scale graphs. We give a (1/3)-approximation algorithm for dalks that is based on the core decomposition of a graph, and that runs in time O(m + n), where n is the number of nodes and m is the number of edges. In contrast, we show that damks is nearly as hard to approximate as the densest k-subgraph problem, for which no good approximation algorithm is known. In particular, we show that if there exists a polynomial time approximation algorithm for damks with approximation ratio γ, then there is a polynomial time approximation algorithm for dks with approximation ratio γ²/8. In the experimental section, we test the algorithm for dalks on large publicly available web graphs. We observe that, in addition to producing near-optimal solutions for dalks, the algorithm also produces near-optimal solutions for dks for nearly all values of k.
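To make the core-decomposition idea concrete, the following is a minimal Python sketch of a minimum-degree peeling heuristic for dalks. It is an illustration under stated assumptions, not necessarily the exact procedure analyzed in the paper, and it makes no claim to reproduce the (1/3) guarantee. The function name `densest_at_least_k` and the edge-list input format are hypothetical choices made for this example; vertices that appear in no edge are ignored.

```python
from collections import defaultdict


def densest_at_least_k(edges, k):
    """Peel minimum-degree vertices one at a time and return the induced
    subgraph of highest average degree among the intermediate subgraphs
    that still have at least k vertices.

    Hypothetical helper for illustration only. Returns a pair
    (vertex_set, average_degree).
    """
    # Build an undirected simple graph, skipping self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    n = len(adj)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of vertices")

    deg = {v: len(adj[v]) for v in adj}
    m = sum(deg.values()) // 2

    # Bucket queue indexed by current degree, as in standard
    # linear-time core decomposition.
    buckets = [set() for _ in range(max(deg.values()) + 1)]
    for v, d in deg.items():
        buckets[d].add(v)

    removed = set()
    order = []                    # vertices in the order they are peeled
    best_avg_deg = 2 * m / n      # the whole graph is always feasible
    best_size = n
    remaining, remaining_edges = n, m
    d = 0

    while remaining > k:
        # Removing a vertex lowers the minimum degree by at most one.
        d = max(d - 1, 0)
        while not buckets[d]:
            d += 1
        v = buckets[d].pop()
        removed.add(v)
        order.append(v)
        remaining_edges -= deg[v]  # deg[v] counts only surviving neighbors
        remaining -= 1

        # Move each surviving neighbor down one bucket.
        for w in adj[v]:
            if w not in removed:
                buckets[deg[w]].remove(w)
                deg[w] -= 1
                buckets[deg[w]].add(w)

        avg_deg = 2 * remaining_edges / remaining
        if avg_deg > best_avg_deg:
            best_avg_deg, best_size = avg_deg, remaining

    # The best subgraph is everything except the first n - best_size peeled.
    peeled = set(order[: n - best_size])
    return set(adj) - peeled, best_avg_deg


# Example: a 4-clique {0, 1, 2, 3} with a path 3-4-5 attached.
example_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
subgraph, avg_degree = densest_at_least_k(example_edges, 4)
```

On this small example the sketch peels vertices 5 and 4 and keeps the 4-clique, whose average degree is 3. Because a vertex's removal lowers the minimum degree by at most one, the degree pointer `d` moves up at most n plus the maximum degree times over the whole run, which keeps the peeling loop in line with the O(m + n) bound quoted above.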