An Efficient Algorithm for Enumerating Pseudo Cliques

  • Takeaki Uno
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4835)

Abstract

The problem of finding dense structures in a given graph is fundamental in informatics, including data mining and data engineering. The clique is a popular model for representing dense structures and is widely used because of its simplicity and ease of handling. Pseudo cliques are a natural extension of cliques: they are subgraphs obtained by removing a small number of edges from cliques. Here we define a pseudo clique as a subgraph in which the ratio of the number of its edges to that of a clique with the same number of vertices is no less than a given threshold value. In this paper, we address the problem of enumerating all pseudo cliques for a given graph and threshold value. We first show that polynomial time algorithms seem difficult to obtain using straightforward divide and conquer approaches. Then, we propose a polynomial time (more precisely, polynomial delay) algorithm based on reverse search. We show the efficiency of our algorithm in practice through computational experiments.
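
The density condition in the definition above can be illustrated with a minimal Python sketch. It is not the reverse search algorithm proposed in the paper: the enumeration is a brute-force scan over all vertex subsets, exponential in the number of vertices, and is included only to make the definition concrete. The function names, the adjacency-dictionary representation, the symbol `theta` for the threshold, and the convention that subsets with fewer than two vertices have density 1 are all assumptions made for illustration.

```python
from itertools import combinations


def density(adj, vertices):
    """Ratio of edges induced by `vertices` to the edges of a clique of the same size."""
    vs = list(vertices)
    k = len(vs)
    if k < 2:
        # Assumed convention: an empty or single-vertex subgraph counts as a clique.
        return 1.0
    # Count each unordered pair of chosen vertices that is joined by an edge.
    edges = sum(1 for u, v in combinations(vs, 2) if v in adj[u])
    return edges / (k * (k - 1) / 2)


def is_pseudo_clique(adj, vertices, theta):
    """True if the induced subgraph has density no less than the threshold `theta`."""
    return density(adj, vertices) >= theta


def brute_force_pseudo_cliques(adj, theta):
    """Naive enumeration over all non-empty vertex subsets (exponential time)."""
    nodes = sorted(adj)
    return [
        set(subset)
        for k in range(1, len(nodes) + 1)
        for subset in combinations(nodes, k)
        if is_pseudo_clique(adj, subset, theta)
    ]
```

For example, in a graph consisting of a triangle on a, b, c plus one extra edge c-d, the vertex set {a, b, c, d} induces 4 of the 6 possible edges, so it is a pseudo clique for a threshold of 0.6 but not for 0.7.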

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Takeaki Uno, National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
