Efficient Mining of Large Maximal Bicliques

Liu, Guimei; Sim, Kelvin; Li, Jinyan

doi:10.1007/11823728_42

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4081)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Many real-world applications rely on the discovery of maximal biclique subgraphs (complete bipartite subgraphs). However, existing algorithms for enumerating maximal bicliques are not very efficient in practice. In this paper, we propose an efficient algorithm to mine large maximal biclique subgraphs from undirected graphs. The algorithm takes a divide-and-conquer approach and uses the size constraints on both vertex sets to prune unpromising bicliques and to iteratively reduce the search space during mining. Its time complexity is O(ndN), where n is the number of vertices, d is the maximum vertex degree, and N is the number of maximal bicliques. Our performance study shows that the proposed algorithm significantly outperforms previous work.
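
To make the notion of a size-constrained maximal biclique concrete, the following Python sketch enumerates the maximal bicliques of a small undirected graph and prunes any branch in which one vertex set falls below a minimum size. It is a naive illustration only, not the algorithm proposed in the paper and without its O(ndN) bound. The function names, the adjacency-dictionary input, and the single `min_size` threshold applied to both vertex sets are assumptions made for this example, and the graph is assumed to have no self-loops.

```python
def common_neighbors(adj, vertices):
    """Return the vertices adjacent to every vertex in `vertices` (non-empty)."""
    it = iter(vertices)
    result = set(adj[next(it)])
    for v in it:
        result &= adj[v]
    return result


def maximal_bicliques(adj, min_size=1):
    """Enumerate maximal bicliques whose two vertex sets each have at least
    `min_size` vertices. `adj` maps each vertex to the set of its neighbours
    (hypothetical input format; undirected, no self-loops)."""
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    order = sorted(adj)
    found = set()

    def extend(x, y, start):
        # Invariant: y is the set of common neighbours of x, and |y| >= min_size.
        closed_x = common_neighbors(adj, y)  # largest set fully connected to y
        if len(closed_x) >= min_size:
            # Store the pair unordered so (X, Y) and (Y, X) count once.
            found.add(frozenset((frozenset(closed_x), frozenset(y))))
        for i in range(start, len(order)):
            new_y = y & adj[order[i]]
            # Adding vertices to x can only shrink y, so a branch whose
            # common-neighbour set is already too small can be discarded.
            if len(new_y) >= min_size:
                extend(x | {order[i]}, new_y, i + 1)

    for i, v in enumerate(order):
        if len(adj[v]) >= min_size:
            extend({v}, set(adj[v]), i + 1)
    return found


# Toy graph (made up for illustration): a and b link to u and v; c links to v.
graph = {
    "a": {"u", "v"},
    "b": {"u", "v"},
    "c": {"v"},
    "u": {"a", "b"},
    "v": {"a", "b", "c"},
}

for biclique in maximal_bicliques(graph, min_size=2):
    left, right = sorted(sorted(side) for side in biclique)
    print(left, right)
```

The pruning test relies on the fact that the common-neighbour set can only shrink as the other vertex set grows, which is the same kind of size-based reasoning the abstract describes. The sketch still revisits the same biclique from several starting points, which a practical algorithm would avoid.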

Author information

Authors and Affiliations

Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Guimei Liu, Kelvin Sim & Jinyan Li

Editor information

Editors and Affiliations

Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstr. 9-11/188, A-1040, Wien, Austria
A Min Tjoa
Department of Software and Computing Systems, University of Alicante, Spain
Juan Trujillo

About this paper

Cite this paper

Liu, G., Sim, K., Li, J. (2006). Efficient Mining of Large Maximal Bicliques. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_42

DOI: https://doi.org/10.1007/11823728_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37736-8
Online ISBN: 978-3-540-37737-5
eBook Packages: Computer Science, Computer Science (R0)