Abstract
Full strong-branching (henceforth referred to as strong-branching) is a well-known variable selection rule that is known experimentally to produce significantly smaller branch-and-bound trees than all other known variable selection rules. In this paper, we analyze the performance of the strong-branching rule from both a theoretical and a computational perspective. On the positive side for strong-branching, we identify vertex cover as a class of instances where this rule provably works well. In particular, for vertex cover we (1) present an upper bound on the size of the branch-and-bound tree using strong-branching as a function of the additive integrality gap; (2) show how the Nemhauser-Trotter property of persistency, which can be used as a pre-solve technique for vertex cover, is recursively and consistently used throughout the strong-branching tree; and (3) provide an example of a vertex cover instance where not using strong-branching leads to a tree that has at least exponentially more nodes than the branch-and-bound tree based on strong-branching. On the negative side for strong-branching, we identify another class of instances where the strong-branching tree is exponentially larger than another branch-and-bound tree for solving these instances. On the computational side, we conduct experiments on various types of instances, such as the lot-sizing problem and its variants, packing integer programs (IPs), covering IPs, chance-constrained IPs, vertex cover, and others, to understand how much larger the strong-branching based branch-and-bound tree is than the optimal branch-and-bound tree. The main take-away from these experiments (on small instances) is that for all these instances the size of the strong-branching tree is within a factor of two of the size of the optimal branch-and-bound tree.
Notes
We write “optimal” branch-and-bound tree with quotes since the size of the optimal branch-and-bound tree depends not only on the branching decisions taken at each node but also on the properties of the LP solver. We discuss this issue in detail in Sect. 4.
More precisely, let \(V_0\) be the set of vertices u of G corresponding to the variables \(x_u\) that have been fixed to 0 by the branchings that lead to N; define \(V_1\) analogously. Also let \(N(V_0)\) be the set of neighbors of the vertices of \(V_0\) in the graph. Then the IP sub-problem of node N is just (an embedding of) vertex cover on the graph where all vertices in \(V_0 \cup N(V_0) \cup V_1\) are removed (along with their incident edges).
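The construction in this note can be sketched in Python as follows. This is a minimal illustration only: the function name `subproblem_graph` and the representation of the graph as a vertex set plus a list of edge pairs are assumptions made for the example.

```python
def subproblem_graph(vertices, edges, fixed_zero, fixed_one):
    """Return (vertices, edges) of the residual vertex cover instance.

    fixed_zero / fixed_one are the sets V_0 / V_1 of vertices whose
    variables were fixed to 0 / 1 along the path to the current node.
    """
    fixed_zero, fixed_one = set(fixed_zero), set(fixed_one)

    # N(V_0): every neighbor of a vertex fixed to 0 is forced into the cover.
    forced = set()
    for a, b in edges:
        if a in fixed_zero:
            forced.add(b)
        if b in fixed_zero:
            forced.add(a)

    # Remove V_0, N(V_0) and V_1 together with their incident edges.
    removed = fixed_zero | forced | fixed_one
    kept_vertices = set(vertices) - removed
    kept_edges = [(a, b) for a, b in edges
                  if a not in removed and b not in removed]
    return kept_vertices, kept_edges
```

For example, on the path 1-2-3-4-5 with \(V_0 = \{1\}\) and \(V_1 = \{5\}\), vertex 2 is forced into the cover and only the edge between 3 and 4 remains.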
References
Achterberg, T.: Constraint integer programming. PhD Thesis (2007)
Achterberg, T., Koch, T., Martin, A.: Branching rules revisited. Oper. Res. Lett. 33(1), 42–54 (2005)
Marcos Alvarez, A., Louveaux, Q., Wehenkel, L.: A machine learning-based approximation of strong branching. INFORMS J. Comput. 29(1), 185–195 (2017)
Applegate, D., Bixby, R., Chvátal, V., Cook, W.: Finding cuts in the TSP (a preliminary report). Technical report, Citeseer (1995)
Balcan, M.-F., Dick, T., Sandholm, T., Vitercik, E.: Learning to branch. In International conference on machine learning, pp. 344–353. PMLR (2018)
Basu, A., Conforti, M., Di Summa, M., Jiang, H.: Complexity of cutting planes and branch-and-bound in mixed-integer optimization. arXiv preprint arXiv:2003.05023 (2020)
Basu, A., Conforti, M., Di Summa, M., Jiang, H.: Complexity of branch-and-bound and cutting planes in mixed-integer optimization-II. In: Integer Programming and Combinatorial Optimization: 22nd International Conference, IPCO 2021, Atlanta, GA, USA, May 19–21, 2021, Proceedings 22, pp. 383–398. Springer (2021)
Bénichou, M., Gauthier, J.-M., Girodet, P., Hentges, G., Ribière, G., Vincent, O.: Experiments in mixed-integer linear programming. Math. Program. 1(1), 76–94 (1971)
Bertsimas, D., Tsitsiklis, J.N.: Introduction to linear optimization, volume 6. Athena Scientific, Belmont, MA (1997)
Bodur, M., Dash, S., Günlük, O.: Cutting planes from extended LP formulations. Math. Program. 161(1–2), 159–192 (2017)
Borst, S., Dadush, D., Huiberts, S., Tiwari, S.: On the integrality gap of binary integer programs with Gaussian data. In: IPCO, pp. 427–442 (2021)
Breu, R., Burdet, C.-A.: Branch and bound experiments in zero-one programming. In: Approaches to Integer Programming, pp. 1–50. Springer (1974)
Chvátal, V.: Hard knapsack problems. Oper. Res. 28(6), 1402–1411 (1980)
Conforti, M., Cornuéjols, G., Zambelli, G.: Integer programming, volume 271. Springer (2014)
Cygan, M., Fomin, F.V., Kowalik, Ł., Lokshtanov, D., Marx, D., Pilipczuk, M., Pilipczuk, M., Saurabh, S.: Parameterized algorithms, volume 5. Springer (2015)
Dadush, D., Tiwari, S.: On the complexity of branching proofs. arXiv preprint arXiv:2006.04124 (2020)
Dakin, R.J.: A tree-search algorithm for mixed integer programming problems. Comput. J. 8(3), 250–255 (1965)
Dey, S.S., Dubey, Y., Molinaro, M.: Branch-and-bound solves random binary IPs in polytime. In: Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 579–591. SIAM (2021)
Dey, S.S., Dubey, Y., Molinaro, M.: Lower bounds on the size of general branch-and-bound trees. arXiv preprint arXiv:2103.09807 (2021)
Driebeek, N.J.: An algorithm for the solution of mixed integer programming problems. Manage. Sci. 12(7), 576–587 (1966)
Eckstein, J.: Parallel branch-and-bound algorithms for general mixed integer programming on the CM-5. SIAM J. Optim. 4(4), 794–814 (1994)
Forrest, J.J.H., Hirst, J.P.H., Tomlin, J.A.: Practical solution of large mixed integer programming problems with UMPIRE. Manage. Sci. 20(5), 736–773 (1974)
Gasse, M., Chételat, D., Ferroni, N., Charlin, L., Lodi, A.: Exact combinatorial optimization with graph convolutional neural networks. arXiv preprint arXiv:1906.01629 (2019)
Gauthier, J.-M., Ribière, G.: Experiments in mixed-integer linear programming using pseudo-costs. Math. Program. 12(1), 26–47 (1977)
Gupta, P., Gasse, M., Khalil, E.B., Kumar, M.P., Lodi, A., Bengio, Y.: Hybrid models for learning to branch. arXiv preprint arXiv:2006.15212 (2020)
Healy, W.C., Jr.: Multiple choice programming (a procedure for linear programming with zero-one variables). Oper. Res. 12(1), 122–138 (1964)
Jeroslow, R.G.: Trivial integer programs unsolvable by branch-and-bound. Math. Program. 6(1), 105–109 (1974)
Khalil, E., Le Bodic, P., Song, L., Nemhauser, G., Dilkina, B.: Learning to branch in mixed integer programming. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30 (2016)
Land, A.H., Doig, A.G.: An automatic method of solving discrete programming problems. Econometrica 28, 497–520 (1960)
Land, A.H., Powell, S.: Computer codes for problems of integer programming. In Annals of Discrete Mathematics. Elsevier, 5, 221–269 (1979)
Le Bodic, P., Nemhauser, G.: An abstract model for branching and its application to mixed integer programming. Math. Program. 166(1), 369–405 (2017)
Linderoth, J.T., Savelsbergh, M.W.P.: A computational study of search strategies for mixed integer programming. INFORMS J. Comput. 11(2), 173–187 (1999)
Lodi, A., Zarpellon, G.: On learning and branching: a survey. TOP 25(2), 207–236 (2017)
Mitra, G.: Investigation of some branch and bound strategies for the solution of mixed integer linear programs. Math. Program. 4(1), 155–170 (1973)
Nair, V., Bartunov, S., Gimeno, F., von Glehn, I., Lichocki, P., Lobov, I., O’Donoghue, B., Sonnerat, N., Tjandraatmadja, C., Wang, P., et al.: Solving mixed integer programs using neural networks. arXiv preprint arXiv:2012.13349 (2020)
Nemhauser, G.L., Trotter, L.E.: Properties of vertex packing and independence system polyhedra. Math. Program. 6(1), 48–61 (1974)
Nemhauser, G.L., Trotter, L.E.: Vertex packings: structural properties and algorithms. Math. Program. 8(1), 232–248 (1975)
Nemhauser, G.L., Wolsey, L.A.: Integer and combinatorial optimization, vol. 55. Wiley, Hoboken (1999)
Pagnoncelli, B.K., Ahmed, S., Shapiro, A.: Computational study of a chance constrained portfolio selection problem. J. Optim. Theory Appl. 142(2), 399–416 (2009)
Picard, J.-C., Queyranne, M.: On the integer-valued variables in the linear vertex packing problem. Math. Program. 12(1), 97–101 (1977)
Qiu, F., Ahmed, S., Dey, S.S., Wolsey, L.A.: Covering linear programming with violations. INFORMS J. Comput. 26(3), 531–546 (2014)
Quadt, D., Kuhn, H.: Capacitated lot-sizing with extensions. A review. 4OR 6(1), 61–83 (2008)
Roughgarden, T.: Beyond worst-case analysis. Commun. ACM 62(3), 88–96 (2019)
Acknowledgements
We would like to thank the reviewers for their comments, which significantly helped improve the quality of the presentation. Santanu S. Dey would like to acknowledge the support of the Air Force Office of Scientific Research, award number FA9550-22-1-0052. Marco Molinaro was supported in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES, Brasil) - Finance Code 001, by Bolsa de Produtividade em Pesquisa #312751/2021-4 from CNPq, and by FAPERJ grant “Jovem Cientista do Nosso Estado”.
A: Instance generation
In this section, we discuss the problems considered for computational experiments and the details of generating randomized instances.
A.1 General packing and covering IPs
We consider packing problems, covering problems and general problems with multiple covering and packing inequalities. Let \(I_p\) and \(I_c\) denote the sets of indices corresponding to packing and covering constraints, respectively. The general formulation is as follows:
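A formulation consistent with this description is sketched here. It assumes a maximization objective and binary variables; both are assumptions, inferred from the sign conventions of the objective coefficients described for the individual problem classes, and the column index j is used for the objective coefficients and variables.

\[
\begin{aligned}
\max \ & \sum_{j=1}^{n} c_j x_j \\
\text{s.t.} \ & \sum_{j=1}^{n} a_{ij} x_j \le b_i && \forall i \in I_p, \\
& \sum_{j=1}^{n} a_{ij} x_j \ge b_i && \forall i \in I_c, \\
& x \in \{0, 1\}^n .
\end{aligned}
\]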
For the experiments in this paper, we take \(n = 20\). The constraint matrix is generated randomly while incorporating sparsity: each element \(a_{ij}\) is 0 with probability \(p = 0.25\), and otherwise is a random integer selected uniformly from the set \(\{1, 2, \dots , 200\}\). The capacity parameter \(b_i\) is \(50\%\) of the sum of the weights \(a_{ij}\) of constraint i, rounded down to an integer value. The objective function depends on the number of packing and covering constraints as follows:
- P5 is a purely packing type problem with 5 constraints (\(|I_p| = 5, |I_c| = 0\)). We thus choose a nonnegative vector for the objective function, and each component \(c_i\) is independently selected from the set \(\{1, 2, \dots , 200\}\) with uniform probability.
- C5 is a purely covering type problem with 5 constraints (\(|I_p| = 0, |I_c| = 5\)). Each component of the objective, \(c_i\), is independently selected from the set \(\{-200, \dots , -2, -1\}\) with uniform probability.
- G22 is a general MILP with 2 packing-type and 2 covering-type constraints (\(|I_p|~=~|I_c|~=~2\)). Each component of the objective, \(c_i\), is independently selected from the set \(\{-100, \ldots , 100\}\) with uniform probability.
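The sampling scheme for P5, C5 and G22 can be sketched in Python as follows. This is a minimal sketch rather than the code used for the experiments: the function name `generate_instance`, the convention that packing rows come first, and the use of the G22 objective range for every mixed instance are assumptions.

```python
import random


def generate_instance(num_packing, num_covering, n=20, p_zero=0.25, seed=None):
    """Sample (A, b, c) for a random packing/covering instance.

    Rows 0..num_packing-1 are packing rows, the rest are covering rows
    (an ordering chosen for this sketch only).
    """
    rng = random.Random(seed)
    m = num_packing + num_covering

    # Sparse matrix: entry is 0 with probability p_zero,
    # otherwise uniform on {1, ..., 200}.
    A = [[0 if rng.random() < p_zero else rng.randint(1, 200)
          for _ in range(n)]
         for _ in range(m)]

    # Right-hand side: 50% of the row sum, rounded down.
    b = [sum(row) // 2 for row in A]

    # The objective range depends on the mix of constraint types.
    if num_covering == 0:        # pure packing, e.g. P5
        low, high = 1, 200
    elif num_packing == 0:       # pure covering, e.g. C5
        low, high = -200, -1
    else:                        # mixed, e.g. G22 (assumed for all mixes)
        low, high = -100, 100
    c = [rng.randint(low, high) for _ in range(n)]
    return A, b, c
```

Under these assumptions, `generate_instance(5, 0)`, `generate_instance(0, 5)` and `generate_instance(2, 2)` correspond to P5, C5 and G22, respectively.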
A.2 Big-bucket Lot-sizing
We consider a variant of the lot-sizing problem referred to in the literature as the big-bucket lot-sizing problem, in which multiple products are produced using shared resources [42]. We do not consider a unit cost of production here. Instead, the set-up time and processing time used in each period are limited by the time available in that period. The parameters of the corresponding MILP model are as follows:
P | Number of products, \({\mathcal {P}} = \{1, \ldots , P\}\) |
T | Number of time periods, \({\mathcal {T}} = \{1, \ldots , T\}\) |
\(f_i^p\) | Fixed cost of producing product p in period i |
\(h_i^p\) | Inventory holding cost of product p in period i |
\(t_i^{s, p}\) | Set up time of product p in period i |
\(t_i^{u, p}\) | Processing time per unit of product p in period i |
\(C_i\) | Time available in period i |
\(z^p\) | Initial inventory of product p at the beginning of planning horizon. |
The variables used to model the problem are as follows:
\(x_i^p\) | Quantity of product p produced in period i |
\(s_i^p\) | Quantity of product p stored as inventory at the end of period i |
\(y_i^p\) | Binary variable indicating if product p was produced in period i (\(y_i^p = 1\) if \(x_i^p > 0\)) |
The MILP model used for the big-bucket lot-sizing problem is described below.
In our experiments, we consider problems with 9 time periods (\(T=9\)) and 2 products (\(P=2\)). Parameters corresponding to demand, fixed cost of production and inventory holding cost are generated as described in the context of previous variants. The set-up time of a product in each period, \(t^{s, p}_i\), is independently sampled from \(\{200, \ldots , 500\}\) with equal probability, the unit processing time \(t^{u, p}_i\) from \(\{1, \ldots , 10\}\), and the available time \(C_i\) from \(\{1000 P, \ldots , 2000 P\}\). The initial inventory of each product, \(z^p\), is similarly sampled from \(\{0, \ldots , 200\}\).
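The time and inventory parameters described here can be sampled as in the following Python sketch. The function name `sample_lot_sizing_times` and the dictionary layout keyed by (period, product) are assumptions; demand, fixed costs and holding costs are omitted because their distributions are not specified in this section.

```python
import random


def sample_lot_sizing_times(T=9, P=2, seed=None):
    """Sample set-up times, unit processing times, capacities and
    initial inventories for a big-bucket lot-sizing instance."""
    rng = random.Random(seed)
    periods = range(1, T + 1)
    products = range(1, P + 1)

    # t^{s,p}_i: set-up time, uniform on {200, ..., 500}.
    setup = {(i, p): rng.randint(200, 500) for i in periods for p in products}
    # t^{u,p}_i: processing time per unit, uniform on {1, ..., 10}.
    unit = {(i, p): rng.randint(1, 10) for i in periods for p in products}
    # C_i: time available in period i, uniform on {1000 P, ..., 2000 P}.
    capacity = {i: rng.randint(1000 * P, 2000 * P) for i in periods}
    # z^p: initial inventory, uniform on {0, ..., 200}.
    initial_inventory = {p: rng.randint(0, 200) for p in products}
    return setup, unit, capacity, initial_inventory
```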
A.3 Minimum vertex cover
The vertex cover problem on a graph \(G = (V, E)\) asks for the smallest possible subset \(V'\) of V such that every edge in E has at least one of its endpoints in \(V'\). It is formulated as follows:
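For reference, the standard integer programming formulation of minimum vertex cover, with a binary variable \(x_v\) for each vertex v, is given here. It is assumed, not confirmed, that this is the formulation intended in this section.

\[
\begin{aligned}
\min \ & \sum_{v \in V} x_v \\
\text{s.t.} \ & x_u + x_v \ge 1 && \forall \{u, v\} \in E, \\
& x_v \in \{0, 1\} && \forall v \in V .
\end{aligned}
\]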
We generated random graphs for the vertex cover problem using the Erdős-Rényi model, i.e., the graph \(G = (V, E)\) is constructed using two parameters: N, the number of nodes, and p, the probability with which each edge of the complete graph on V is independently included in the set E. For our computational experiments, we consider graphs with \(N=20\) and \(p=0.75\).
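A minimal Python sketch of this Erdős-Rényi construction is given here. The function name `erdos_renyi_edges` and the labeling of the nodes as 0, ..., N-1 are assumptions made for the example.

```python
import itertools
import random


def erdos_renyi_edges(num_nodes=20, p=0.75, seed=None):
    """Return the edge list of a G(N, p) random graph on nodes 0..N-1."""
    rng = random.Random(seed)
    # Each of the N(N-1)/2 edges of the complete graph is kept
    # independently with probability p.
    return [(u, v)
            for u, v in itertools.combinations(range(num_nodes), 2)
            if rng.random() < p]
```

With \(N=20\) there are 190 candidate edges, so a graph sampled with \(p=0.75\) has about 142 edges in expectation.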
A.4 Chance constraint programming: multiperiod power planning
We consider the problem of expanding the electric power capacity [9] of a state by constructing new coal and nuclear power plants to meet the electricity demand of the state over a time horizon of T periods. Once constructed, coal plants are operational for \(T_c\) time periods and nuclear plants for \(T_n\) time periods. Legal restrictions mandate that nuclear capacity be at most a fraction f of the total capacity. The capital costs incurred for the construction of coal and nuclear power plants operational from the beginning of time period t are \(c_t\) and \(n_t\), respectively, per megawatt of power capacity. The objective is to minimize the total capital cost of construction. Further, the demand is stochastic and defined on a probability space \((\Omega ,{\mathcal {F}}, {\mathbb {P}})\). Following the sample approximation approach, \(\Omega = \{\omega _1, \ldots , \omega _N\}\) is assumed to be a finite sample space. It is required that the probability of the event where the demand is not satisfied be at most \(\epsilon \). The deterministic formulation of the problem as an MILP is as follows:
Parameters
T | Number of time periods, \({\mathcal {T}} = \{1, \ldots , T\}\) |
N | Size of sample space \(\Omega \), \({\mathcal {N}} = \{1, \ldots , N\}\) |
\(c_t\) | Capital cost per MW for coal plant operational from period t |
\(n_t\) | Capital cost per MW for nuclear plant operational from period t |
\(T_c\) | Lifespan of a coal power plant |
\(T_n\) | Lifespan of a nuclear power plant |
f | Upper bound on nuclear capacity as a fraction of total capacity |
\(e_t\) | Electric capacity from existing resources in period t |
\(d^i_t\) | Electricity demand (in MW) in period t corresponding to outcome \(\omega _i\) |
\(p_i\) | Probability of outcome \(\omega _i\) |
\(\epsilon \) | Upper bound on the probability that the demand is not satisfied |
Variables
\(x_t\) | Power capacity (in MW) of coal plants operational starting at period t |
\(y_t\) | Power capacity (in MW) of nuclear plants operational starting at period t |
\(u_t\) | Total coal power capacity (in MW) in period t |
\(v_t\) | Total nuclear power capacity (in MW) in period t |
\(z_i\) | Binary variable indicating if demand is not satisfied for outcome \(\omega _i\) |
Model: CCP power planning
The objective function (1) minimizes the total capital expenditure of constructing power plants. Equations (2) and (3) compute the total coal and nuclear power capacity in a given time period from the active power plants, based on their lifespans. Equation (4) enforces the regulatory limit on nuclear capacity. Equations (5) and (6) ensure that the outcomes for which the demand is not satisfied have total probability at most \(\epsilon \).
For the experiments in Sect. 5, we generate instances with \(T = 30\) and \(N = 20\), which corresponds to 30 time periods and 20 outcomes in the sample space. The parameters \(d^i_t\) are independent random integers uniformly distributed in \(\{300, \ldots , 700\}\). Similarly, \(c_t\) are uniformly distributed in \(\{100, \ldots , 300\}\) and \(n_t\) in \(\{100, \ldots , 200\}\). The electric capacity from existing resources in the first period, \(e_1\), is a random integer in \(\{100, \ldots , 500 \}\). Capacity from existing resources is then modelled to decline by a factor of r in every subsequent period, where r is uniformly distributed in [0.7, 1), i.e., \(e_t = e_1\,r^{t-1}\). The lifespans of coal and nuclear power plants are 15 and 10 periods, respectively. Nuclear capacity is constrained to be at most 20% of total capacity. All outcomes in \(\Omega \) are equally probable, with \(p_i = 0.05\), and the demand may go unsatisfied with probability at most 0.2.
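The parameter generation for the power planning instances can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for the experiments: the function name `sample_power_planning`, the dictionary keys and the 0-based indexing of periods and outcomes are all choices made for the example.

```python
import random


def sample_power_planning(T=30, N=20, seed=None):
    """Sample the parameters of one chance-constrained power planning instance."""
    rng = random.Random(seed)

    # demand[i][t]: demand in period t under outcome i, uniform on {300, ..., 700}.
    demand = [[rng.randint(300, 700) for _ in range(T)] for _ in range(N)]
    coal_cost = [rng.randint(100, 300) for _ in range(T)]      # c_t
    nuclear_cost = [rng.randint(100, 200) for _ in range(T)]   # n_t

    # Existing capacity decays geometrically from e_1 with rate r in [0.7, 1).
    e1 = rng.randint(100, 500)
    r = 0.7 + 0.3 * rng.random()
    existing = [e1 * r ** t for t in range(T)]  # existing[0] is e_1

    return {
        "demand": demand,
        "coal_cost": coal_cost,
        "nuclear_cost": nuclear_cost,
        "existing": existing,
        "decay": r,
        "coal_life": 15,            # T_c
        "nuclear_life": 10,         # T_n
        "nuclear_fraction": 0.2,    # f
        "prob": [1.0 / N] * N,      # p_i, equal to 0.05 when N = 20
        "epsilon": 0.2,
    }
```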
A.5 Chance constraint programming: portfolio optimization
We consider the probabilistically-constrained portfolio optimization problem for n asset types, approximated by the sample approximation approach [39], where the constraint on overall return may be violated for at most k out of the m samples. The MILP formulation of this problem is as follows:
We sample scenarios from the distribution presented in [41], which is shown there to yield instances that are computationally difficult to solve. Each component of the constraint matrix, \(a_{ij}\), is independently sampled from a uniform distribution on [0.8, 1.5], and r is equal to 1.1. For our experiments, we set \(n=30\), \(m=20\) and \(k=4\).
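The scenario sampling can be sketched in Python as follows. The function names are hypothetical, and two assumptions are made that this section does not state explicitly: rows of the matrix index the m samples and columns index the n assets, and a sample counts as violated when its return falls below r.

```python
import random


def sample_return_scenarios(n=30, m=20, seed=None):
    """Return an m-by-n matrix with entries uniform on [0.8, 1.5]."""
    rng = random.Random(seed)
    return [[rng.uniform(0.8, 1.5) for _ in range(n)] for _ in range(m)]


def count_violations(A, x, r=1.1):
    """Count the samples whose portfolio return sum_j a_ij x_j is below r."""
    return sum(1 for row in A
               if sum(a * xj for a, xj in zip(row, x)) < r)
```

Under these assumptions, a portfolio x would satisfy the sampled chance constraint when `count_violations(A, x) <= k`, with \(k=4\) in the experiments.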
A.6 Stable set polytope on bipartite graph with knapsack constraint
The stable set polytope of a bipartite graph is known to have a totally unimodular constraint matrix and thus integral vertices. We consider a maximization problem over the stable set polytope of a bipartite graph where the optimal extreme point is cut off by a knapsack constraint with the same coefficients as the objective function. The details of the model are explained below. A bipartite graph \(G = (N, E)\) with n nodes and m edges is generated as follows. The partition \(N = N_1 \cup N_2\) is generated by setting \(N_1\) to a randomly selected subset of \(\lfloor f n \rfloor \) nodes and \(N_2\) to its complement, where f is sampled from a uniform distribution over [0.3, 0.5]. From the \(|N_1| \cdot |N_2|\) possible edges between \(N_1\) and \(N_2\), m are then randomly selected to form the set E. Lastly, each component \(c_i\) of the objective function is a randomly selected integer from 1 to 50. In our experiments, we consider instances with 20 nodes and 30 edges.
Let \(\delta ^*\) be the optimal objective function value of the corresponding maximum weight stable set problem,
The following constraint is then added to (9–11),
where r is uniformly distributed in [0.75, 0.9].
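The generation of the bipartite graph, the objective weights and the factor r can be sketched in Python as follows. The function name `generate_bipartite_instance` and the node labels 0, ..., n-1 are assumptions. The sketch only samples r; it does not compute \(\delta ^*\) or build the knapsack constraint, whose exact right-hand side is not reproduced in this section.

```python
import random


def generate_bipartite_instance(n=20, m=30, seed=None):
    """Sample a random bipartite graph, node weights and the factor r."""
    rng = random.Random(seed)
    nodes = list(range(n))

    # N_1 is a random subset of floor(f * n) nodes, with f uniform on [0.3, 0.5].
    f = rng.uniform(0.3, 0.5)
    n1 = set(rng.sample(nodes, int(f * n)))
    n2 = set(nodes) - n1

    # Choose m distinct edges among all pairs between N_1 and N_2.
    candidates = [(u, v) for u in sorted(n1) for v in sorted(n2)]
    edges = rng.sample(candidates, m)

    # Objective coefficients c_i: integers uniform on {1, ..., 50}.
    weights = [rng.randint(1, 50) for _ in nodes]

    # Factor applied to delta* in the knapsack constraint.
    r = rng.uniform(0.75, 0.9)
    return n1, n2, edges, weights, r
```

With \(n=20\), the set \(N_1\) has between 6 and 10 nodes, so at least 84 candidate edges are available and selecting 30 of them is always possible.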
About this article
Cite this article
Dey, S.S., Dubey, Y., Molinaro, M. et al. A theoretical and computational analysis of full strong-branching. Math. Program. 205, 303–336 (2024). https://doi.org/10.1007/s10107-023-01977-x
DOI: https://doi.org/10.1007/s10107-023-01977-x