Solving the maximum edge-weight clique problem in sparse graphs with compact formulations

Abstract

This paper studies the behavior of compact formulations for solving the maximum edge-weight clique (MEWC) problem in sparse graphs. The MEWC problem has long been discussed in the literature, but mostly for complete graphs, with or without a cardinality constraint on the clique. Yet several real-world applications are defined on sparse graphs, where edges are missing either because of a thresholding process or because they are not supposed to be in the graph at all. Such situations often arise in cellular metabolic networks, where the amount of metabolites shared among reactions is important for understanding the cell’s prevalent elements. We propose new node-discretized formulations for the problem, which are more compact than other models known from the literature. Computational experiments on benchmark and real-world instances are conducted to discuss and compare the models. These tests indicate that the node-discretized formulations are more efficient for solving the problem on large sparse graphs. We also address a new variant of the MEWC problem in which the objective to be maximized includes the neighboring edges of the clique.

References

  1. Akutsu T, Hayashida M, Tomita E, Suzuki JI, Horimoto K (2004) Protein threading with profiles and constraints. In: Proceedings of the fourth IEEE symposium on bioinformatics and bioengineering (BIBE 2004), pp 537–544

  2. Batagelj V, Mrvar A. http://vlado.fmf.uni-lj.si/pub/networks/pajek/S. Accessed Aug 2009

  3. Bomze IM, Budinich M, Pardalos PM, Pelillo M (1999) The maximum clique problem. In: Du DZ, Pardalos PM (eds) Handbook of combinatorial optimization (suppl. Vol. A). Kluwer, Dordrecht, pp 1–74

  4. Brélaz D (1979) New methods to color the vertices of a graph. Commun ACM 22(4):251–256

  5. Brown JB, Dukka Bahadur KC, Tomita E, Akutsu T (2006) Multiple methods for protein side chain packing using maximum weight cliques. Genome Inf Ser 17(1):3–12

  6. Carlson MRJ, Zhang B, Fang Z, Mischel PS, Horvath S, Nelson SF (2006) Gene connectivity, function, and sequence conservation: predictions from modular yeast co-expression networks. BMC Genomics 7(1):40

  7. Cavique L (2007) A scalable algorithm for the market basket analysis. J Retail Consum Serv 14:400–407

  8. Corman S, Kuhn T, McPhee R, Dooney K (2002) Studying complex discursive systems: centering resonance analysis of organizational communication. Hum Commun Res 28(2):157–206

  9. De Amorim SG, Barthélemy JP, Ribeiro CC (1992) Clustering and clique partitioning: simulated annealing and tabu search approaches. J Classif 9(1):17–41

  10. Della Croce F, Tadei R (1994) A multi-KP modeling for the maximum-clique problem. Eur J Oper Res 73:555–561

  11. Dijkhuizen G, Faigle U (1993) A cutting-plane approach to the edge-weighted maximal clique problem. Eur J Oper Res 69:121–130

  12. Dukka Bahadur KC, Tomita E, Suzuki JI, Horimoto K, Akutsu T (2005) Clique based algorithms for protein threading with profiles and constraints. In: Proceedings of the 3rd Asia Pacific bioinformatics conference (APBC2005), pp 51–64

  13. Förster J, Famili I, Fu P, Palsson BØ, Nielsen J (2003) Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res 13:244–253

  14. Gendron B, Hertz A, St-Louis P (2008) A sequential elimination algorithm for computing bounds on the clique number of a graph. Discret Optim 5(3):615–628

  15. Gouveia L (1995) A 2n-constraint formulation for the capacitated minimal spanning tree problem. Oper Res 43:130–141

  16. Gouveia L, Saldanha da Gama F (2006) On the capacitated concentrator location problem: a reformulation by discretization. Comput Oper Res 33:1242–1258

  17. Gouveia L, Moura P (2012) Enhancing discretized formulations: the knapsack reformulation and the star reformulation. Top 20(1):52–74

  18. Han JD, Bertin N, Hao T et al (2004) Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature 430:88–93

  19. Hunting M, Faigle U, Kern W (2001) A Lagrangian relaxation approach to the edge-weighted clique problem. Eur J Oper Res 131(1):119–131

  20. Klotz E, Newman M (2013) Practical guidelines for solving difficult mixed integer linear programs. Surv Oper Res Manag Sci 18:18–32

  21. Korcsmáros T, Farkas I, Szalay MS et al (2010) Uniformly curated signaling pathways reveal tissue-specific cross-talks and support drug target discovery. Bioinformatics 26(16):2042–2050

  22. Lancia G (2008) Mathematical programming in computational biology: an annotated bibliography. Algorithms 1:100–129

  23. Lim J, Hao T, Shaw C et al (2006) A protein-protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration. Cell 125:801–814

  24. Macambira EM, de Souza CC (2000) The edge-weighted clique problem: valid inequalities, facets and polyhedral computations. Eur J Oper Res 123(2):346–371

  25. Martins P (2010) Extended and discretized formulations for the maximum clique problem. Comput Oper Res 37:1348–1358

  26. Martins P (2012) Cliques with maximum/minimum edge neighborhood and neighborhood density. Comput Oper Res 39:594–608

  27. Mascia F, Cilia E, Brunato M, Passerini A (2010) Predicting structural and functional sites in proteins by searching for maximum-weight cliques. In: Proceedings of the twenty-fourth AAAI conference on artificial intelligence (AAAI-10), pp 1274–1279

  28. Mazurie A, Bonchev D, Schwikowski B, Buck A (2008) Phylogenetic distances are encoded in networks of interacting pathways. Bioinformatics 24(22):2579–2585

  29. Mehrotra A, Trick MA (1998) Cliques and clustering: a combinatorial approach. Oper Res Lett 22(1):1–12

  30. Padberg M (1989) The Boolean quadric polytope: some characteristics, facets and relatives. Math Program 45(1):139–172

  31. Park K, Lee K, Park S (1996) An extended formulation approach to the edge-weighted maximal clique problem. Eur J Oper Res 95:671–682

  32. Pirim H, Ekşioğlu B, Perkins AD, Yüceer Ç (2012) Clustering of high throughput gene expression data. Comput Oper Res 39(12):3046–3061

  33. Pullan W (2008) Approximating the maximum vertex/edge weighted clique using local search. J Heuristics 14(2):117–134

  34. Sørensen MM (2004) New facets and a branch-and-cut algorithm for the weighted clique problem. Eur J Oper Res 154(1):57–70

  35. Spirin V, Mirny LA (2003) Protein complexes and functional modules in molecular networks. PNAS 100(21):12123–12128

Acknowledgments

The authors would like to thank the referees for their comments and suggestions, which led to a significantly improved version of the paper. Thanks are also due to the Editor for the suggested observations. This work has been partially supported by Portuguese national funding through FCT (project PEst-OE/MAT/UI0152).

Author information

Corresponding author

Correspondence to Pedro Martins.

Appendix

Table 2 summarizes the characteristics of the instances used in Sect. 5. It indicates the number of nodes, the number of edges and the density of each graph. It also includes heuristic solution values for the MEWC problem (denoted by \(W_{e}(G)\)), reported in Pullan (2008). The solutions for the RTN and SC instances were taken from the branch-and-bound executions. The \(W_{e}(G)\) values are lower bounds for the MEWC optima.

Table 2 Instance characteristics and lower bounds for the optima, reported in Pullan (2008)
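
To make the quantities in Table 2 concrete, the following minimal Python sketch computes the density of a graph and the weight of a maximum edge-weight clique by exhaustive enumeration. It assumes the usual density definition \(2|E|/(|V|(|V|-1))\), which the appendix does not state explicitly. The function names and the five-node instance are hypothetical illustrations, not the models or instances of the paper, and exhaustive search is only viable for very small graphs.

```python
from itertools import combinations


def density(num_nodes, num_edges):
    """Edge density of a simple undirected graph (assumed definition)."""
    if num_nodes < 2:
        return 0.0
    return 2.0 * num_edges / (num_nodes * (num_nodes - 1))


def clique_weight(nodes, weights):
    """Sum of edge weights inside `nodes`, or None if some pair is not an edge."""
    total = 0
    for u, v in combinations(sorted(nodes), 2):
        if (u, v) not in weights:
            return None  # a missing edge means `nodes` is not a clique
        total += weights[(u, v)]
    return total


def brute_force_mewc(num_nodes, weights):
    """Enumerate all node subsets of size >= 2 and keep the heaviest clique."""
    best_weight, best_clique = 0, ()
    for size in range(2, num_nodes + 1):
        for nodes in combinations(range(num_nodes), size):
            w = clique_weight(nodes, weights)
            if w is not None and w > best_weight:
                best_weight, best_clique = w, nodes
    return best_weight, best_clique


# Hypothetical sparse instance: keys are edges (u, v) with u < v.
example_weights = {(0, 1): 4, (0, 2): 3, (1, 2): 5, (2, 3): 9, (3, 4): 2}

print(density(5, len(example_weights)))      # 0.5
print(brute_force_mewc(5, example_weights))  # (12, (0, 1, 2))
```

On this toy instance the heaviest single edge (weight 9) is beaten by the triangle on nodes 0, 1 and 2, whose weight is 12. Any heuristic value, such as those listed as \(W_{e}(G)\), can be no larger than the value an exact search returns.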

Results for setting parameter \(q_{max}\), defining an upper bound for \(\omega(G)\), are shown in Table 3. As mentioned before, these upper bounds were calculated using the sequential elimination algorithm described in Gendron et al. (2008) with the DSATUR greedy procedure proposed by Brélaz (1979). These bounds were used in the tests conducted in Sect. 5. The clique number of the graph, \(\omega(G)\), is also shown. CPU times are reported in seconds.

Table 3 Upper bounds for the clique number \(\omega (G)\), generated by the sequential elimination algorithm with the DSATUR procedure, described in Gendron et al. (2008)
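
The reason a coloring procedure yields such a bound is that the nodes of a clique must all receive different colors, so the number of colors used by any proper coloring is at least \(\omega(G)\). The sketch below illustrates this with a plain DSATUR-style greedy coloring. It is not the sequential elimination algorithm of Gendron et al. (2008), which the paper wraps around DSATUR; the function names, the adjacency-dictionary input and the tie-breaking by degree in the full graph are assumptions made for illustration.

```python
def dsatur_coloring(adj):
    """Greedy DSATUR-style coloring; `adj` maps each node to its neighbour set."""
    color = {}
    neighbour_colors = {v: set() for v in adj}
    uncolored = set(adj)
    while uncolored:
        # Choose the node seeing the most distinct colors; ties go to higher degree.
        v = max(sorted(uncolored),
                key=lambda u: (len(neighbour_colors[u]), len(adj[u])))
        c = 0
        while c in neighbour_colors[v]:
            c += 1  # smallest color not used by a neighbour
        color[v] = c
        uncolored.remove(v)
        for w in adj[v]:
            neighbour_colors[w].add(c)
    return color


def clique_number_upper_bound(adj):
    """Number of colors used, which is an upper bound on the clique number."""
    coloring = dsatur_coloring(adj)
    return len(set(coloring.values()))


# Hypothetical graphs: a triangle with a pendant path, and a 5-cycle.
triangle_path = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
five_cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}

print(clique_number_upper_bound(triangle_path))  # 3, equal to the clique number
print(clique_number_upper_bound(five_cycle))     # 3, although the clique number is 2
```

The 5-cycle shows that the bound need not be tight: every proper coloring of an odd cycle needs three colors, while its largest clique is a single edge.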

Table 4 gives the linear programming relaxation percent gaps. It reports the results taken over the DIMACS, RTN and SC instances, using the F1, F2, F5 and F6 based models described in Sects. 2 and 3, and discussed in Sect. 5. Lowest gaps are given in bold.

Table 4 LP relaxation percent gaps using the models under discussion
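
The appendix does not restate how the percent gap is computed. One common convention for a maximization problem, given here only as an assumption, compares the LP relaxation value \(z_{LP}\) with a reference value \(z^{*}\) (the optimum, or the best known lower bound when the optimum is unavailable):

\[
\text{gap}\,(\%) = 100 \times \frac{z_{LP} - z^{*}}{z^{*}}, \qquad z_{LP} \ge z^{*} > 0 .
\]

Under this convention a gap of 0 means that the LP relaxation already attains the reference value, and a smaller gap indicates a tighter formulation.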

Table 5 indicates the CPU times (in seconds) for running the linear programming relaxation tests reported in Table 4. Lowest times are given in bold.

Table 5 CPU times (in seconds) associated with the LP relaxation tests reported in Table 4

Table 6 reports the branch-and-bound execution times (in seconds) for the models under discussion. Column 2, with heading “Opt.”, represents the optimum solution value returned by branch and bound. The tests for the RTN instances with more than 10,000 nodes were omitted. Smallest CPU time values are given in bold.

Table 6 Execution times (in seconds) of the branch and bound using the models under discussion

Table 7 provides the same information as Table 6, but considers “Strong branching” variable selection within branch and bound, and the automatic generation of global cuts provided by CPLEX for strengthening at the root node of the branch-and-bound tree. The table focuses only on the RTN and NIP classes, for which improvements were observed.

Table 7 Branch-and-bound execution times (in seconds) using branch on variable with maximum infeasibility strategy and the automatic generation of global cuts, for the RTN and NIP classes

Table 8 presents MEWC problem optimum solutions attained for the RTN class instances.

Table 8 MEWC problem optimum solutions for the RTN class instances

Table 9 presents optimum solutions for the MEWNC problem for the RTN class instances.

Table 9 MEWNC problem optimum solutions for the RTN class instances

Cite this article

Gouveia, L., Martins, P. Solving the maximum edge-weight clique problem in sparse graphs with compact formulations. EURO J Comput Optim 3, 1–30 (2015). https://doi.org/10.1007/s13675-014-0028-1

Keywords

  • Maximum edge-weight clique problem
  • Clique’s edge neighborhood
  • Integer formulations
  • Sparse graphs

Mathematics Subject Classification

  • 90C10
  • 90C35
  • 90C90