Skip to main content
Log in

A branch-and-bound approach for maximum quasi-cliques

  • Published:
Annals of Operations Research Aims and scope Submit manuscript

Abstract

Detecting quasi-cliques in graphs is a useful tool for detecting dense clusters in graph-based data mining. Particularly in large-scale data sets that are error-prone, cliques are overly restrictive and impractical. Quasi-clique detection has been accomplished using heuristic approaches in various applications of graph-based data mining in protein interaction networks, gene co-expression networks, and telecommunication networks. Quasi-cliques are not hereditary, in the sense that every subset of a quasi-clique need not be a quasi-clique. This lack of heredity introduces interesting challenges in the development of exact algorithms to detect maximum cardinality quasi-cliques. The only exact approaches for this problem are limited to two mixed integer programming formulations that were recently proposed in the literature. The main contribution of this article is a new combinatorial branch-and-bound algorithm for the maximum quasi-clique problem.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Algorithm 1

Similar content being viewed by others

References

  • Abello, J., Pardalos, P. M., & Resende, M. G. C. (1999). On maximum clique problems in very large graphs. In J. Abello & J. Vitter (Eds.), DIMACS series on discrete mathematics and theoretical computer science: Vol. 50. External memory algorithms and visualization (pp. 119–130). Providence: American Mathematical Society.

    Google Scholar 

  • Abello, J., Resende, M. G. C., & Sudarsky, S. (2002). Massive quasi-clique detection. In S. Rajsbaum (Ed.), LATIN 2002: proceedings of the 5th Latin American symposium on theoretical informatics (pp. 598–612). London: Springer.

    Chapter  Google Scholar 

  • Adamic, L., & Huberman, B. (2000). Power-law distribution of the World Wide Web. Science, 287, 2115a.

    Article  Google Scholar 

  • Alba, R. D. (1973). A graph-theoretic definition of a sociometric clique. The Journal of Mathematical Sociology, 3(1), 113–126.

    Article  Google Scholar 

  • Almaas, E., & Barabási, A. L. (2006). Power laws in biological networks. In E. Koonin, Y. I. Wolf, & G. P. Karev (Eds.), Power laws, scale-free networks and genome biology (pp. 1–11). New York: Springer.

    Chapter  Google Scholar 

  • Balasundaram, B., Butenko, S., & Trukhanov, S. (2005). Novel approaches for analyzing biological networks. Journal of Combinatorial Optimization, 10(1), 23–39.

    Article  Google Scholar 

  • Balasundaram, B., Butenko, S., & Hicks, I. V. (2011). Clique relaxations in social network analysis: the maximum k-plex problem. Operations Research, 59(1), 133–142.

    Article  Google Scholar 

  • Barabási, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509–512.

    Article  Google Scholar 

  • Barabási, A. L., Albert, R., & Jeong, H. (2000). Scale-free characteristics of random networks: the topology of the World Wide Web. Physica. A, 281(1–4), 69–77.

    Article  Google Scholar 

  • Batagelj, V., & Mrvar, A. (2006). Pajek datasets: Reuters terror news network. Online: http://vlado.fmf.uni-lj.si/pub/networks/data/CRA/terror.htm. Accessed March 2008.

  • Boginski, V., Butenko, S., & Pardalos, P. M. (2003). On structural properties of the market graph. In A. Nagurney (Ed.), Innovation in financial and economic networks. London: Edward Elgar.

    Google Scholar 

  • Boginski, V., Butenko, S., & Pardalos, P. (2006). Mining market data: a network approach. Computers & Operations Research, 33, 3171–3184.

    Article  Google Scholar 

  • Bomze, I. M., Budinich, M., Pardalos, P. M., & Pelillo, M. (1999). The maximum clique problem. In D. Z. Du & P. M. Pardalos (Eds.), Handbook of combinatorial optimization (pp. 1–74). Dordrecht: Kluwer Academic.

    Chapter  Google Scholar 

  • Broido, A., & Claffy, K. C. (2001). Internet topology: connectivity of IP graphs. In S. Fahmy & K. Park (Eds.), Scalability and traffic control in IP networks (pp. 172–187). Bellingham: SPIE.

    Chapter  Google Scholar 

  • Brunato, M., Hoos, H., & Battiti, R. (2008). On effectively finding maximal quasi-cliques in graphs. In V. Maniezzo, R. Battiti, & J. P. Watson (Eds.), Lecture notes in computer science: Vol. 5313. Learning and intelligent optimization (pp. 41–55). Berlin: Springer.

    Chapter  Google Scholar 

  • Bu, D., Zhao, Y., Cai, L., Xue, H., Zhu, X., Lu, H., Zhang, J., Sun, S., Ling, L., Zhang, N., Li, G., & Chen, R. (2003). Topological structure analysis of the protein-protein interaction network in budding yeast. Nucleic Acids Research, 31(9), 2443–2450.

    Article  Google Scholar 

  • Carlson, J. M., & Doyle, J. (1999). Highly optimized tolerance: a mechanism for power laws in designed systems. Physical Review E, 60(2), 1412–1427.

    Article  Google Scholar 

  • Carraghan, R., & Pardalos, P. (1990). An exact algorithm for the maximum clique problem. Operations Research Letters, 9, 375–382.

    Article  Google Scholar 

  • Chung, F., & Lu, L. (2006). CBMS lecture series. Complex graphs and networks. Providence: American Mathematical Society.

    Google Scholar 

  • Cook, D. J., & Holder, L. B. (2000). Graph-based data mining. IEEE Intelligent Systems, 15(2), 32–41.

    Article  Google Scholar 

  • Corman, S., Kuhn, T., McPhee, R., & Dooley, K. (2002). Studying complex discursive systems: centering resonance analysis of organizational communication. Human Communication Research, 28(2), 157–206.

    Google Scholar 

  • Corneil, D., & Perl, Y. (1984). Clustering and domination in perfect graphs. Discrete Applied Mathematics, 9, 27–39.

    Article  Google Scholar 

  • Dimacs (1995). Cliques, coloring, and satisfiability: second Dimacs implementation challenge. Online: http://dimacs.rutgers.edu/Challenges/. Accessed March 2007.

  • Erdös, P., & Rényi, A. (1959). On random graphs. Publicationes Mathematicae, 6, 290–297.

    Google Scholar 

  • Faloutsos, M., Faloutsos, P., & Faloutsos, C. (1999). On power-law relationships of the Internet topology. In Proceedings of the ACM-SIGCOMM conference on applications, technologies, architectures, and protocols for computer communication, Cambridge (pp. 251–262).

    Chapter  Google Scholar 

  • Feige, U., Kortsarz, G., & Peleg, D. (2001). The dense k-subgraph problem. Algorithmica, 29, 410–421.

    Article  Google Scholar 

  • Gagneur, J., Krause, R., Bouwmeester, T., & Casari, G. (2004). Modular decomposition of protein-protein interaction networks. Genome Biology, 5(8), R57.

    Article  Google Scholar 

  • Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the United States of America, 99(12), 7821–7826.

    Article  Google Scholar 

  • Grossman, J., Ion, P., & Castro, R. D. (1995). The Erdös number project. Online: http://www.oakland.edu/enp/. Accessed March 2007.

  • IBM Corporation (2010). IBM ILOG CPLEX Optimizer 12.2. http://www.ibm.com/software/integration/optimization/cplex-optimizer/. IBM Academic Initiative. Accessed June 2011.

  • Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., & Sakaki, Y. (2001). A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proceedings of the National Academy of Sciences of the United States of America, 98(8), 4569–4574.

    Article  Google Scholar 

  • Jiang, D., & Pei, J. (2009). Mining frequent cross-graph quasi-cliques. ACM Transactions on Knowledge Discovery from Data, 2(4), 16.

    Article  Google Scholar 

  • Kortsarz, G., & Peleg, D. (1993). On choosing a dense subgraph. In Proceedings of the 34th annual IEEE symposium on foundations of computer science (pp. 692–701). Piscataway: IEEE Comput. Soc.

    Google Scholar 

  • Kreher, D. L., & Stinson, D. R. (1998). Combinatorial algorithms: generation, enumeration, and search (1st ed.). Boca Raton: CRC Press.

    Google Scholar 

  • Leskovec, J., & Horvitz, E. (2008). Planetary-scale views on a large instant-messaging network. In Proceeding of the 17th international conference on World Wide Web. WWW ’08 (pp. 915–924). New York: ACM.

    Chapter  Google Scholar 

  • Lu, H., Zhu, X., Liu, H., Skogerb, G., Zhang, J., Zhang, Y., Cai, L., Zhao, Y., Sun, S., Xu, J., Bu, D., & Chen, R. (2004). The interactome as a tree—an attempt to visualize the protein-protein interaction network in yeast. Nucleic Acids Research, 32(16), 4804–4811.

    Article  Google Scholar 

  • Luce, R. D. (1950). Connectivity and generalized cliques in sociometric group structure. Psychometrika, 15(2), 169–190.

    Article  Google Scholar 

  • Mokken, R. J. (1979). Cliques, clubs and clans. Quality and Quantity, 13(2), 161–173.

    Article  Google Scholar 

  • Östergård, P. R. J. (2002). A fast algorithm for the maximum clique problem. Discrete Applied Mathematics, 120, 197–207.

    Article  Google Scholar 

  • Patillo, J., Veremyev, A., Butenko, S., & Boginski, V. (2012). On the maximum quasi-clique problem. Discrete Applied Mathematics. doi:10.1016/j.dam.2012.07.019.

    Google Scholar 

  • Pei, J., Jiang, D., & Zhang, A. (2005a). Mining cross-graph quasi-cliques in gene expression and protein interaction data. In Proceedings of the 21st international conference on data engineering. ICDE 2005 (pp. 353–356).

    Google Scholar 

  • Pei, J., Jiang, D., & Zhang, A. (2005b). On mining cross-graph quasi-cliques. In Proceedings of the eleventh ACM SIGKDD international conference on knowledge discovery in data mining. KDD ’05 (pp. 228–238). New York: ACM.

    Chapter  Google Scholar 

  • Peng, X., Langston, M. A., Saxton, A. M., Baldwin, N. E., & Snoddy, J. R. (2007). Detecting network motifs in gene co-expression networks through integration of protein domain information. In P. McConnell, S. M. Lin, & P. Hurban (Eds.), Methods of microarray data analysis V (pp. 89–102). New York: Springer.

    Chapter  Google Scholar 

  • Seidman, S. B., & Foster, B. L. (1978). A graph theoretic generalization of the clique concept. The Journal of Mathematical Sociology, 6, 139–154.

    Article  Google Scholar 

  • Simonite, T. (2011). Bracing for the data deluge. http://www.technologyreview.com/business/37506/. Accessed May 2011.

  • Spirin, V., & Mirny, L. A. (2003). Protein complexes and functional modules in molecular networks. Proceedings of the National Academy of Sciences of the United States of America, 100(21), 12123–12128.

    Article  Google Scholar 

  • Washio, T., & Motoda, H. (2003). State of the art of graph-based data mining. ACM SIGKDD Explorations Newsletter, 5(1), 59–68.

    Article  Google Scholar 

  • West, D. (2001). Introduction to graph theory. Upper Saddle River: Prentice-Hall.

    Google Scholar 

  • Zeng, Z., Wang, J., Zhou, L., & Karypis, G. (2006). Coherent closed quasi-clique discovery from large dense graph databases. In Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining. KDD ’06 (pp. 797–802). New York: ACM.

    Chapter  Google Scholar 

  • Zeng, Z., Wang, J., Zhou, L., & Karypis, G. (2007). Out-of-core coherent closed quasi-clique mining from large dense graph databases. ACM Transactions on Database Systems, 32, 13.

    Article  Google Scholar 

Download references

Acknowledgements

The authors would like to thank the two anonymous referees for their comments that improved the content and presentation of this article, and would also like to thank Dr. Sergiy Butenko for sharing the manuscript of Patillo et al. (2012) with us. The computational experiments reported in this article were performed at the Oklahoma State University High Performance Computing Center (OSUHPCC). The authors are grateful to Dr. Dana Brunson for her support in conducting these experiments at OSUHPCC. This research was supported by the US Department of Energy Grant DE-SC0002051 and the Air Force Office of Scientific Research Grant FA9550-12-1-0103.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Balabhaskar Balasundaram.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Mahdavi Pajouh, F., Miao, Z. & Balasundaram, B. A branch-and-bound approach for maximum quasi-cliques. Ann Oper Res 216, 145–161 (2014). https://doi.org/10.1007/s10479-012-1242-y

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10479-012-1242-y

Keywords

Navigation