Algorithmica, Volume 56, Issue 1, pp 3–16

An Efficient Algorithm for Solving Pseudo Clique Enumeration Problem

  • Takeaki Uno
Abstract

The problem of finding dense structures in a given graph is fundamental in informatics, including data mining and data engineering. The clique is a popular model for representing dense structures and is widely used because of its simplicity and ease of handling. Pseudo cliques are a natural extension of cliques: they are subgraphs obtained by removing a small number of edges from cliques. Here we define a pseudo clique as a subgraph in which the ratio of the number of its edges to the number of edges of a clique with the same number of vertices is no less than a given threshold value. In this paper, we address the problem of enumerating all pseudo cliques for a given graph and a threshold value. We first show that obtaining polynomial time algorithms using straightforward divide and conquer approaches seems to be difficult. We then propose a polynomial time algorithm (more precisely, a polynomial delay algorithm) based on reverse search. The time complexity for each pseudo clique is O(Δ log |V| + min{Δ^2, |V| + |E|}). Computational experiments show the efficiency of our algorithm on both randomly generated graphs and practical graphs.
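
The density condition in the abstract can be made concrete with a small sketch. The code below is not the reverse search algorithm of the paper; it is a brute-force illustration, exponential in the number of vertices, that only shows which vertex subsets satisfy the threshold. The function names `density` and `pseudo_cliques_bruteforce`, the parameter name `theta`, and the convention that subsets with fewer than two vertices have density 1 are all assumptions made for this example.

```python
from itertools import combinations


def density(edges, vertices):
    """Edges inside `vertices` divided by the edge count of a clique of the same size.

    `edges` is assumed to list each undirected edge exactly once as a pair (u, v).
    """
    k = len(vertices)
    if k < 2:
        # Assumed convention for this sketch: 0 or 1 vertex counts as fully dense.
        return 1.0
    vs = set(vertices)
    inside = sum(1 for u, v in edges if u in vs and v in vs)
    return inside / (k * (k - 1) / 2)


def pseudo_cliques_bruteforce(n, edges, theta):
    """Yield every non-empty subset of vertices 0..n-1 with density >= theta.

    Brute force over all subsets: suitable for tiny graphs only.
    """
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            if density(edges, subset) >= theta:
                yield subset


# Hypothetical input: a 4-cycle 0-1-2-3 with the chord (0, 2).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
result = list(pseudo_cliques_bruteforce(4, edges, 0.8))
```

With a threshold of 0.8, the whole vertex set qualifies (5 of the 6 possible edges are present), while the triple (0, 1, 3) does not (2 of 3 edges). A polynomial delay method such as the one proposed in the paper avoids scanning all subsets in this way.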

Keywords

Dense subgraph, Maximum subgraph, Pseudo clique, Quasi clique, Dense structure, Clustering, Community discovering, Enumeration, Mining, Algorithm

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. National Institute of Informatics, Tokyo, Japan