1 Introduction

The currently dominant strategy to solve mixed-integer linear programs is the branch-and-cut method. Besides the addition of cutting planes, the core is formed by branch-and-bound, based on linear programming (LP) relaxations. The nodes are usually created by variable branching, i.e., a disjunction on the variable bounds. Many branching rules, i.e., methods to choose the branching variable at each node, are known, see, e.g., [2, 3, 10] to mention a few.

This raises the desire to evaluate the performance of branching rules in comparison to the best possible, i.e., to the smallest size of a tree. In this article, we investigate properties of binary programs that allow us to deduce lower bounds on the size of a smallest branch-and-bound tree. Furthermore, we investigate the computational complexity of computing the size of a smallest branch-and-bound tree based on variable branching. We concentrate on proving optimality by assuming that an objective cut using the optimal value is integrated into the system. Then the goal is to estimate the size of a smallest tree proving infeasibility of a linear inequality system with integrality requirements on the variables. Moreover, we restrict attention to pure branch-and-bound algorithms, i.e., no cutting plane separation, domain reduction, or presolving is used.

In more detail, the results of this article are the following. In Sect. 3.1 we use the notion of hiding sets as introduced by Kaibel and Weltge [22] to give a general technique for obtaining lower bounds on the size of a smallest branch-and-bound tree using general disjunctions. We employ our method to analyze examples investigated by Dadush and Tiwari [12] and Dey et al. [14], which require large branch-and-bound trees.

Hiding sets yield slightly stronger bounds than the technique of Dey et al. (Lemma 7 in [14]) obtained by generalizing an argument by Dadush and Tiwari [12]. We then highlight a limitation of using hiding sets (Proposition 3): the lower bound for the number of leaves of any branch-and-bound tree that can be derived by hiding sets is bounded from above by the number of facets of the underlying LP-relaxation. Thus, this does not provide a general method to give superpolynomial lower bounds on the size of a branch-and-bound tree using arbitrary disjunctions for formulations with a polynomial number of constraints. Beame et al. [8] discuss the barriers which prevent such bounds from being derived from the communication complexity of certain functions.

To produce lower bounds on the size of branch-and-bound trees using variable disjunctions, in Sect. 3.2 we introduce VB-hiding sets, which do not have the above-mentioned limitation, i.e., they are also capable of producing bounds exceeding the number of facets of the LP-relaxation. We then prove a characterization for the size of branch-and-bound trees using variable disjunctions for a formulation of the SubsetSum problem. In particular, this gives an easy and tight analysis of a well-known example by Jeroslow [21].

Moreover, we show in Sect. 4 that certain binary programs arising as disjoint composition of smaller programs require exponentially large (in the number of variables) branch-and-bound trees; these results cannot in general be proven using VB-hiding sets. Examples of combinatorial optimization problems for which these bounds hold are the maximum weight matching, the minimum weight edge cover, the Steiner tree and Hamiltonian path/cycle problems.

In Sect. 5, we turn towards the complexity of computing small branch-and-bound trees. We show that it is not possible to approximate the size of a smallest branch-and-bound tree using variable branching within a factor of \(2^{\frac{1}{5}n}\) in time \(O(2^{\delta n})\) with \(\delta <\frac{1}{5}\), where n is the number of variables, unless the strong exponential time hypothesis fails (Theorem 25). The same argument can be used to show that unless \(\text {P} = \text {NP} \), no polynomial time algorithm can approximate the size of a smallest tree within \(2^{(\frac{1}{2} - \varepsilon )n}\) for any \(\varepsilon > 0\) (Theorem 24). This result significantly strengthens the hardness of approximation within a factor of 2 in Hendel et al. [19].

In Sect. 6, we prove that computing the size of a smallest branch-and-bound tree exactly is weakly \({\#P} \)-hard (Theorem 30), even when restricting to binary programs based on the fractional matching polytope. To the best of our knowledge, such a hardness result is novel, even across all commonly considered proof systems. Furthermore, for a binary program it is weakly \({\#P} \)-hard to compute the size of the tree produced by many branching rules from practice (e.g., the most-infeasible branching rule or full strong branching), given suitable tie-breaking. Since one can construct a SAT-formula for which satisfying assignments correspond to leaves of the resulting tree [23, 26], this task is actually in \({\#P} \) (when solving LPs with a polynomial time algorithm). Due to the famous theorem of Toda [27], neither of the two aforementioned estimation problems can be solved in polynomial time, even when given access to an oracle from the polynomial hierarchy, unless the polynomial hierarchy collapses. We note that Le Bodic and Nemhauser [25] consider an abstract model in which branching on a particular variable improves the value of the LP-bound for the two children by a fixed amount and show that computing the size of a smallest tree in their model is (weakly) \({\#P} \)-hard. Our proof is based on the fact that for every instance of the problem of finding a smallest proof in their abstract setting, there is a (non-abstract) integer program which behaves similarly. In the conference version of this paper [17], we have shown that it is actually strongly \({\#P} \)-hard to compute the size of a smallest branch-and-bound tree exactly via a different proof, which does not rely on the results of Le Bodic and Nemhauser. However, the corresponding binary programs are more general.

Despite the two previous results, can the size of a smallest tree be approximated, if polynomial time in the size of the instance is spent for every node of the smallest tree? This is captured by the notion of automatizability of proof systems, i.e., a proof can be produced in polynomial time in the size of the instance and of a shortest proof. In Sect. 7, we transfer hardness results from the literature to show that branch-and-bound with variable branching is not automatizable under reasonable hardness assumptions. However, it is quasi-automatizable, i.e., a proof can be produced in quasi-polynomial time in the size of the instance and the size of a smallest proof. Moreover, for the special case of subset sum instances, branch-and-bound with variable branching is automatizable.

Branch-and-bound algorithms in practice almost always cut off an optimal LP solution of the currently considered node by branching on a variable with fractional value in that solution, producing an LP-based branch-and-bound tree. We note that our results also hold for LP-based branch-and-bound trees, even though we show in Sect. 8 that arbitrary trees using variable disjunctions can be smaller by an exponential factor.

Note that pure branch-and-bound as described above produces mixed results: On the one hand, it solves random binary programs (with a fixed number of constraints) in polynomial time [13]; on the other hand, it takes exponential time to prove infeasibility of matching polytopes with an objective cut (see [7, Theorem 2.2] and Theorem 22), even though matching is solvable in polynomial time.

Since we are usually giving hardness results, we will formulate them for the narrowest possible class of binary programs, which will often be formulations to find a maximum weight matching or minimum weight edge cover. Since matchings (edge covers) correspond to stable sets (vertex covers) in the line graph, hardness results will also hold for the corresponding formulations in this case. Of course, hardness results will then also transfer to a formulation for the packing (covering) problem.

A preliminary version of this article has appeared in the IPCO conference proceedings [17]. However, the material has been revised and the results in Sects. 3, 4, 8 and the proof in Sect. 6 are new.

Throughout the article, for a positive integer \(n \in \mathbb {N}= \{1, 2, \dots \}\), we use \([n] {:}{=}\{1, \dots , n\}\) and \(2^{[n]}\) for the power set of [n].

2 Preliminaries

2.1 Branch-and-bound trees for binary programs and polytopes

One important application of branch-and-bound is the solution of (mixed) integer linear programs. For the sake of simplicity we will only consider binary programs of the form

$$\begin{aligned} \max \quad&c^\top x \\ \text {s.t.}\quad&Ax \le b, \\&x \in \{0,1\}^n, \end{aligned}$$
(P)

where \(A \in \mathbb {Q}^{m\times n}\), \(b\in \mathbb {Q}^m\) and \(c\in \mathbb {Q}^n\). By scaling, we can assume that A, b and c actually have integral entries and that the greatest common divisor of the coefficients of A and the right-hand side b in every row is 1.

To concentrate the investigation on the ability of branch-and-bound to prove optimality (or infeasibility), we assume that we know the optimal values of feasible instances and these are integrated by an objective cut into the problem: Let \(\nu ^\star {:}{=}\max \,\{c^\top x \,:\,Ax \le b,\; x\in \{0,1\}^n\}\) be the value of the binary program (P) and define the verification polytope for (P) to be

$$\begin{aligned} P_{\text {v}} {:}{=}\{ x \in [0,1]^n \,:\,Ax \le b,\; c^\top x \ge \nu ^\star +1\}, \end{aligned}$$
(1)

if \(\nu ^\star \) is finite. We define \(P_{\text {v}} {:}{=}\{x \in [0,1]^n \,:\,Ax \le b\}\) if (P) is infeasible. Note that (P) is always bounded. The goal is then to prove that \(P_{\text {v}}\) is integer-free, i.e., \(P_{\text {v}} \cap \mathbb {Z}^n = \emptyset \), by branch-and-bound. The integer-freeness of the verification polytope is equivalent to (P) having value at most \(\nu ^\star \).
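To make the construction concrete, the following minimal sketch builds the inequality system of the verification polytope for a tiny instance. It assumes integral data and finds \(\nu ^\star \) by brute-force enumeration, which is only sensible for very small n; the function name `verification_system` is ours and purely illustrative.

```python
from itertools import product

def verification_system(A, b, c):
    """Return (A_v, b_v) with P_v = {x in [0,1]^n : A_v x <= b_v}.

    Illustrative only: nu* is computed by enumerating {0,1}^n.
    """
    n = len(c)

    def satisfies(x):
        return all(sum(a * v for a, v in zip(row, x)) <= rhs
                   for row, rhs in zip(A, b))

    values = [sum(cj * v for cj, v in zip(c, x))
              for x in product((0, 1), repeat=n) if satisfies(x)]
    rows, rhs = [list(r) for r in A], list(b)
    if not values:
        return rows, rhs          # (P) infeasible: no objective cut is added
    nu_star = max(values)
    # Objective cut c^T x >= nu* + 1, written as -c^T x <= -(nu* + 1).
    rows.append([-cj for cj in c])
    rhs.append(-(nu_star + 1))
    return rows, rhs
```

For example, for \(\max \{x_1 + x_2 \,:\,x_1 + x_2 \le 1\}\) we have \(\nu ^\star = 1\), and the sketch appends the cut \(-x_1 - x_2 \le -2\), so that no binary point remains.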

Note that a slightly different definition of \(P_{\text {v}}\) is also natural: Instead of the objective cut \(c^\top x \ge \nu ^\star + 1\), we could add the cut \(c^\top x \ge \nu ^\star + \varepsilon \) with small \(\varepsilon > 0\).

2.2 Branch-and-bound trees for polytopes

To fix notation, we formalize branch-and-bound trees. A disjunction is a pair of linear inequalities of the form \(a^\top x\le b \vee a^\top x\ge b +1\), where \(a\in \mathbb {Z}^n\) and \(b\in \mathbb {Z}\). In particular, every integer point satisfies exactly one of them. A branch-and-bound proof (of integer-freeness) of a polytope P or branch-and-bound tree for P is a rooted binary directed tree T with the following properties:

  • For every non-leaf node N there is a disjunction \(a^\top x\le b \vee a^\top x\ge b +1\), such that the left outgoing edge of N is labeled with the inequality \(a^\top x\le b\) and the right edge is labeled with \(a^\top x\ge b +1\). The neighbor of N incident to the left edge is called the \((a^\top x\le b)\)-child of N, whereas the neighbor incident to the right edge is the \((a^\top x\ge b +1)\)-child.

  • For each node N, let \(B_N\) denote the intersection of the halfspaces defined by the inequalities which occur as labels on the path from the root to N, i.e., the set of points compatible with the branching decisions leading to N. To every node N we attach a corresponding polytope \(P_N {:}{=}P \cap B_N\). We require \(P_N\) to be empty if and only if N is a leaf. We say N is feasible if \(P_N\) is non-empty and infeasible otherwise.

The \((a^\top x\le b)\)-branch at a node N is the directed subtree rooted at the \((a^\top x\le b)\)-child of N and the \((a^\top x\ge b +1)\)-branch is defined analogously.

In the case where all disjunctions in T are variable disjunctions, i.e., of the form \(x_i \le 0 \vee x_i \ge 1\) for some variable \(x_i\), T is called a variable disjunction branch-and-bound tree. Since we are considering binary problems, we can then assume that the outgoing edges of N are instead labeled by the variable fixings \(x_i = 0\) and \(x_i =1\). We then say that N branches on the variable \(x_i\). A variable \(x_i\) has been fixed at a node N, if a constraint \(x_i=a\) for some \(a\in \{0,1\}\) occurs as a label on the path from the root to N. Without loss of generality, we assume every variable occurs at most once in a label of an edge on any root-leaf path in T.

The size of a tree is the number of its nodes and a tree with minimal size is called a shortest branch-and-bound proof or a smallest branch-and-bound tree. We are then interested in \({{\mathcal {T}}} (P)\), which for an integer-free polytope P is the size of a smallest branch-and-bound tree for P.

A branch-and-bound tree for a binary program (P) is a branch-and-bound tree for its verification polytope \(P_\text {v}\), and we say a class of (not necessarily infeasible) binary programs \({{\mathcal {P}}}\) is verifiable by branch-and-bound trees of size s(n), if for every \((P)\in {{\mathcal {P}}}\) with n variables, \(P_{\text {v}}\) admits a branch-and-bound tree of size at most s(n).

A leaf L of a tree T rules out a point \(x\in \{0,1\}^n\), if \(x\in B_L\). It is easy to see that a complete branch-and-bound tree T rules out all points in \(\{0,1\}^n\). For a node N of T, the LP-relaxation at N is the linear program \(\max \, \{c^\top x \,:\,x\in P \cap B_N\}\), where P is the LP-relaxation of (P).
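The definitions above can be mirrored by a small data structure. The sketch below represents a variable disjunction tree explicitly and checks the requirement that \(P_N\) is empty exactly at the leaves. The emptiness test is delegated to a user-supplied oracle `feasible`; the class `Node`, the helper names and the toy oracle for \(\{x \in [0,1]^2 \,:\,2x_1 + 2x_2 = 1\}\) are our own assumptions for illustration (indices are 0-based in the code).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class Node:
    var: Optional[int] = None        # branching variable; None marks a leaf
    zero: Optional["Node"] = None    # (x_i = 0)-child
    one: Optional["Node"] = None     # (x_i = 1)-child

def size(t: Node) -> int:
    """Number of nodes of the tree."""
    return 1 if t.var is None else 1 + size(t.zero) + size(t.one)

def num_leaves(t: Node) -> int:
    return 1 if t.var is None else num_leaves(t.zero) + num_leaves(t.one)

def is_proof(t: Node, feasible: Callable[[Dict[int, int]], bool],
             fixed: Optional[Dict[int, int]] = None) -> bool:
    """Check that P_N is empty if and only if N is a leaf."""
    fixed = fixed or {}
    if t.var is None:
        return not feasible(fixed)
    if t.var in fixed or not feasible(fixed):
        return False
    return (is_proof(t.zero, feasible, {**fixed, t.var: 0})
            and is_proof(t.one, feasible, {**fixed, t.var: 1}))

def toy_feasible(fixed: Dict[int, int]) -> bool:
    """Emptiness oracle for {x in [0,1]^2 : 2 x_0 + 2 x_1 = 1} under fixings."""
    lo = sum(2 * v for v in fixed.values())
    hi = lo + 2 * (2 - len(fixed))
    return lo <= 1 <= hi

# Branch on x_0 at the root and on x_1 in the (x_0 = 0)-branch.
toy_tree = Node(0, zero=Node(1, zero=Node(), one=Node()), one=Node())
```

Here `toy_tree` has five nodes and three leaves, and `is_proof(toy_tree, toy_feasible)` confirms that it is a branch-and-bound proof for the toy polytope.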

2.3 Branching rules

Besides quantifying the hardness of computing the size of a smallest tree, we also want to obtain results on the hardness of predicting the size of the tree resulting from a particular branch-and-bound algorithm using variable disjunctions.

To obtain a concrete algorithm from the general framework of branch-and-bound, several choices need to be made, e.g., about when to add cutting planes, which heuristics to employ, how to select the next node to be processed and how to choose a branching variable at every node (and thus potentially about how to solve linear programs). However, our model involves neither cutting planes nor heuristics. Furthermore, since no information (e.g., primal bounds) is exchanged between the nodes in our model, the sequence in which nodes are processed does not influence the resulting branch-and-bound tree. Thus, the tree resulting from branch-and-bound in our model depends solely on the branching rule, i.e., on how to select the branching variable at every node.

To apply most-infeasible branching, we compute an optimal basic solution x of the LP-relaxation at the current node and choose a variable \(x_i\) maximizing \(\min \,\{x_i - \lfloor x_i \rfloor ,\lceil x_i \rceil - x_i\}\) as branching variable.
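As a small illustration of this rule, the following sketch selects the branching variable from a given LP solution. It assumes the solution is passed as a list of exact rationals and that ties are broken in favor of the smallest index; both choices, as well as the function name, are ours.

```python
from fractions import Fraction
from math import ceil, floor

def most_infeasible_variable(x):
    """Index of the entry of x farthest from integrality, or None if x is integral."""
    def fractionality(v):
        return min(v - floor(v), ceil(v) - v)

    # Ties are resolved towards the smallest index via the second key component.
    best = max(range(len(x)), key=lambda i: (fractionality(x[i]), -i))
    return best if fractionality(x[best]) > 0 else None
```

For the solution \((\tfrac{1}{4}, \tfrac{1}{2}, 1, \tfrac{1}{2})\), the sketch returns index 1 (0-based), the first of the two entries with fractionality \(\tfrac{1}{2}\).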

To apply full strong branching (e.g., [1, 3]) to a binary program with objective function, we compute an optimal basic solution x to the LP-relaxation and a score for every variable with fractional value in x. Let \(\delta \) denote the value of the LP-relaxation at the current node and \(\delta ^a_i\) the value of the linear program obtained by strengthening the current LP-relaxation by the fixing \(x_i = a\) for \(a \in \{0,1\}\). The gains resulting from branching on \(x_i\) are the improvements between these dual bounds, i.e., \(\delta -\delta ^1_i\) and \(\delta -\delta ^0_i\). The score for variable i is then \(\nu (g_i^+, g_i^-)\), where \(g_i^+ = \max \, \{\delta -\delta ^1_i,\;\delta -\delta ^0_i\}\), \(g_i^- = \min \, \{\delta -\delta ^1_i,\;\delta -\delta ^0_i\}\) and \(\nu \,:\,\mathbb {R}^+_0 \times \mathbb {R}^+_0 \rightarrow \mathbb {R}\) is a scoring function. Typical scoring functions from practice are product functions of the form

$$\begin{aligned} \max \,\{\varepsilon ,\; g_i^-\}\cdot \max \,\{\varepsilon ,\; g_i^+\} \end{aligned}$$

for a small value \(\varepsilon >0\) or linear functions of the form

$$\begin{aligned} (1-\mu )\, g^+_i + \mu \, g^-_i \end{aligned}$$

with a parameter \(\mu \in [0,1]\) (cf. e.g., [3, 25]). We then choose a (fractional) variable with the highest score as a branching variable. In case there is a variable \(x_i\) such that fixing \(x_i = a\) with \(a\in \{0,1\}\) results in an infeasible LP-relaxation, i.e., \(g^+_i = \infty \), we assume we branch on such a variable when applying strong branching. Note that this increases the size of the resulting tree by at most a factor of n compared to introducing the fixing \(x_i=1-a\) without creating new nodes.

A scoring function is reasonable, if it is monotone with respect to the partial order \(\le \times \le \) on \(\mathbb {R}_+ \times \mathbb {R}_+\) where \(\le \) denotes the usual order on \(\mathbb {R}\). Both linear and product scoring functions are reasonable. We will show that our results for full strong branching hold for reasonable scoring functions. Note that the LPs with the tentatively fixed variables are solved to optimality in our setting, even though this is too slow in practice.
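The selection step of full strong branching can be sketched as follows. The code assumes that the LP values of the two tentative children are already available (no LP solver is invoked), takes the linear score to be the weighted sum \((1-\mu )\,g^+ + \mu \,g^-\), and breaks ties by smallest index; the function names and the default value of `eps` are hypothetical choices of ours.

```python
def product_score(g_plus, g_minus, eps=1e-6):
    """Product rule max{eps, g^-} * max{eps, g^+}; the default eps is an assumption."""
    return max(eps, g_minus) * max(eps, g_plus)

def linear_score(g_plus, g_minus, mu):
    """Weighted sum (1 - mu) * g^+ + mu * g^- with mu in [0, 1]."""
    return (1 - mu) * g_plus + mu * g_minus

def strong_branching_choice(delta, child_bounds, score):
    """Pick a branching variable from the LP values of the tentative children.

    child_bounds maps a fractional variable i to (delta_i^0, delta_i^1); an
    infeasible child is encoded as float("-inf"). Ties go to the smallest
    index, standing in for a fixed total order on the variables.
    """
    best_i, best_score = None, None
    for i in sorted(child_bounds):
        gains = [delta - bound for bound in child_bounds[i]]
        if max(gains) == float("inf"):
            return i  # a child is infeasible: branch on this variable
        s = score(max(gains), min(gains))
        if best_score is None or s > best_score:
            best_i, best_score = i, s
    return best_i
```

With \(\delta = 10\) and child values \((9, 5)\) for variable 0 and \((7, 7)\) for variable 1, the product rule prefers the balanced variable 1, whereas the linear rule with \(\mu = \tfrac{1}{4}\) prefers variable 0. Both rules are monotone in each argument, in line with the notion of a reasonable scoring function.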

Both full strong branching and most-infeasible branching may assign the same score to two variables, in which case we will resort to some fixed total order on the variables in order to resolve ties. Furthermore, both branching rules depend on which solution x of the LP-relaxation at the current node is computed. We will get around this dependency by noting that our arguments hold even for adversarial choices of x.

Branch-and-bound with variable disjunctions as considered in the literature usually allows branching on any variable, even if it has an integral value in every optimal solution of the LP-relaxation at the current node. Algorithms in practice, however, most often only branch on variables with fractional value in some computed solution. Thus, they will only produce LP-based branch-and-bound trees, i.e., branch-and-bound trees that only branch on variables with fractional value in some optimal solution of the current LP-relaxation. As will be discussed below, our results also hold for LP-based branch-and-bound trees.

3 Hiding sets

3.1 Hiding sets for branch-and-bound with arbitrary disjunctions

Kaibel and Weltge [22] show that the existence of so-called hiding sets implies lower bounds on the number of inequalities required to describe the integer hull of a polytope. We show that hiding sets also give lower bounds on the size of branch-and-bound trees using general disjunctions. The following definition is a slight generalization of the one introduced by Kaibel and Weltge.

Fig. 1 A hiding set \(H_1\) (left) and a VB-hiding set \(H_2\) (right), each represented with black dots, for polytopes \(X_1\) and \(X_2\), respectively, shown in gray

Definition 1

([22]) A hiding set for \(X \subseteq \mathbb {R}^n\) is a set \(H\subseteq \mathbb {Z}^n{\setminus } \text {conv}(X)\) such that for any a, \(b \in H\) with \(a\ne b\) we have \(\text {conv}\{a,b\} \cap \text {conv}(X) \ne \emptyset \).

Figure 1a shows an example. Note that Kaibel and Weltge were interested only in integral polyhedra (for which any branch-and-bound tree has size 1), whereas the above definition allows more general sets X.

Our first observation is that hiding sets also provide a lower bound on the size of branch-and-bound trees for integer-free polytopes.

Proposition 2

Let \(P \subset \mathbb {R}^n\) be an integer-free polyhedron and H be a hiding set for P. Then any branch-and-bound tree for P using arbitrary disjunctions has at least \(|{H}|\) leaves.

Proof

Recall that a leaf L of a branch-and-bound tree T rules out \(x\in \mathbb {Z}^n\), if \(x\in B_L\), where \(B_L\) is the set of points satisfying the branching decisions leading to L. Assume a, \(b \in H\), \(a\ne b\), are ruled out by the same leaf L of T. Since a, \(b \in B_L\), it follows that \(\text {conv}\{a,b\} \subseteq B_L\). But then there is \(c \in \text {conv}\{a,b\} \cap P \subseteq B_L \cap P = P_L\), contradicting the fact that \(P_L = \emptyset \), since L is a leaf. \(\square \)

However, the size of a hiding set is trivially limited:

Proposition 3

Let \(\{Ax\le d\}\) be an irredundant system of K inequalities. Then no hiding set H for \(P= \{Ax\le d\} \subseteq \mathbb {R}^n\) can have more than K elements.

Proof

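Since every point of H lies outside P, it violates at least one of the K inequalities of \(Ax\le d\). If \(|{H}| > K\), then by the pigeonhole principle there is an inequality of \(Ax\le d\) that is violated by two distinct points a and b from H. Then every point of \(\text {conv}\{a,b\}\) violates the same inequality and therefore \(\{x \,:\,Ax\le d\} \cap \text {conv}\{a,b\} = \emptyset \), a contradiction. \(\square \)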

Unfortunately, the (exponential) lower bounds on the number of inequalities required to describe a polytope given by Kaibel and Weltge [22] do not transfer to lower bounds on the size of a branch-and-bound tree. The reason is that a branch-and-bound tree is required to bound the polytope under consideration only in the direction of some (possibly arbitrary) objective function, whereas a description by inequalities is required to bound the polytope in all directions.

Nevertheless, we recover a strengthening of an observation from Dadush and Tiwari [12], which has already been shown by Dey et al. [14] using a hiding set:

Proposition 4

Any branch-and-bound tree using arbitrary disjunctions for \(P^n {:}{=}\{x \in [0,1]^n \,:\,\sum _{i\in R}x_i + \sum _{i\not \in R}(1-x_i)\ge \tfrac{1}{2}~ \forall R\subseteq [n]\}\) has at least \(2^n\) leaves.

Proof

Note that \(\{0,1\}^n\) is a hiding set for \(P^n\). Indeed, for every two distinct points a, \(b\in \{0,1\}^n\), their convex hull contains a point \(x\in [0,1]^n\), such that \(x_i = \frac{1}{2}\) for some \(i\in [n]\), and it is easy to see that we have \(x\in P^n\) for any such x. \(\square \)
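The hiding set property can be checked mechanically for small n. The sketch below tests whether a segment \(\text {conv}\{a,b\}\) meets a polyhedron \(\{x \,:\,Ax \le d\}\) by intersecting the parameter intervals induced by the individual inequalities, using exact rational arithmetic. The helper names are ours, and the system for \(P^n\) is scaled by 2 to keep all data integral, which is an implementation choice rather than part of the definition.

```python
from fractions import Fraction
from itertools import combinations, product

def segment_meets_polyhedron(a, b, A, d):
    """True if conv{a, b} intersects {x : A x <= d}."""
    lo, hi = Fraction(0), Fraction(1)      # admissible t in a + t (b - a)
    for row, rhs in zip(A, d):
        base = sum(r * ai for r, ai in zip(row, a))
        slope = sum(r * (bi - ai) for r, ai, bi in zip(row, a, b))
        if slope == 0:
            if base > rhs:
                return False
        elif slope > 0:
            hi = min(hi, Fraction(rhs - base, slope))
        else:
            lo = max(lo, Fraction(rhs - base, slope))
    return lo <= hi

def is_hiding_set(H, A, d):
    """Check Definition 1 for integral points H and P = {x : A x <= d}."""
    def violated(pt):
        return any(sum(r * v for r, v in zip(row, pt)) > rhs
                   for row, rhs in zip(A, d))
    return (all(violated(pt) for pt in H)
            and all(segment_meets_polyhedron(a, b, A, d)
                    for a, b in combinations(H, 2)))

def p_n_system(n):
    """Inequalities of P^n, each scaled by 2, plus the box constraints."""
    A, d = [], []
    for R in product((0, 1), repeat=n):    # R given by its indicator vector
        A.append([-2 if r else 2 for r in R])
        d.append(2 * (n - sum(R)) - 1)
    for i in range(n):
        unit = [1 if j == i else 0 for j in range(n)]
        A.append(unit)
        d.append(1)
        A.append([-u for u in unit])
        d.append(0)
    return A, d
```

For instance, with `A, d = p_n_system(3)` the call `is_hiding_set(list(product((0, 1), repeat=3)), A, d)` confirms the claim of the proof for \(n = 3\).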

Furthermore, Dey et al. [14] investigated the following packing polytope using a different technique by Dadush and Tiwari [12].

$$\begin{aligned} P^{\le }_{n,k} {:}{=}\left\{ x\in [0,1]^n\,:\,\mathbb {1}^\top x \ge k \text { and }\sum _{i\in S}x_i \le k-1 \quad \forall S\subseteq [n] \text { with } |{S}| =k \right\} . \end{aligned}$$

We recover their result (Lemma 8 in [14]) using hiding sets:

Proposition 5

Any branch-and-bound tree for \(P^{\le }_{n,k}\) using arbitrary disjunctions has at least \(\frac{1}{n} {n\atopwithdelims ()k} + 1\) leaves.

Proof

We begin by proving the following two claims:

Claim 1

Let \({{\mathcal {S}}}\subset 2^{[n]}\), such that \(|{S}| = k\) for all \(S\in {{\mathcal {S}}}\) and \(|{S_1{\setminus } S_2}|\ge 2\) for distinct \(S_1\), \(S_2\in {{\mathcal {S}}}\), and let \(X_{{\mathcal {S}}}{:}{=}\{{\chi }_S \,:\,S\in {{\mathcal {S}}}\}\) be its set of characteristic vectors. Then \(X_{{\mathcal {S}}}\) is a hiding set for \(P^{\le }_{n,k}\).

Proof of claim

Let \(S_1\), \(S_2\in {{\mathcal {S}}}\) with \(S_1\ne S_2\). Since \(|{S_1{\setminus } S_2}|\ge 2\), the point \(x {:}{=}\frac{1}{2}({\chi }_{S_1} + {\chi }_{S_2})\) has \(\ell \ge 4\) entries equal to \(\frac{1}{2}\), \(k - \frac{\ell }{2}\) entries equal to 1 and the remaining entries equal to 0. Thus, any set \(T \subseteq [n]\) with \({\chi }_T^\top x > k - 1\) has at least \(k+1\) elements and thus x satisfies all inequalities defining \(P^{\le }_{n,k}\). Since \(x\in P^{\le }_{n,k} \cap \text {conv}\{{\chi }_{S_1},{\chi }_{S_2}\}\) and \(S_1\) and \(S_2\) are arbitrary, \(X_{{\mathcal {S}}}\) is a hiding set for \(\smash {P^{\le }_{n,k}}\). \(\square \)

Claim 2

There exists a set \({{\mathcal {S}}}\subset 2^{[n]}\), such that \(|{S}| = k\) for all \(S\in {{\mathcal {S}}}\) and \(|{S_1{\setminus } S_2}|\ge 2\) for distinct \(S_1\), \(S_2\in {{\mathcal {S}}}\) with \(|{{{\mathcal {S}}}}| \ge {n\atopwithdelims ()k}/{n}\).

Proof of claim

Color every set \(S \subseteq [n]\) with \(|{S}| = k\) with the color

$$\begin{aligned} \Big (\sum _{s\in S} s\Big ) \bmod n. \end{aligned}$$

Consider two k-element subsets \(S_1\), \(S_2\) of [n] with \(|{S_1\setminus S_2}| = 1\). Then there exist i, \(j\in [n]\) with \(S_1{\setminus } S_2 = \{i\}\) and \(S_2{\setminus } S_1 = \{j\}\). Thus, \(\sum _{s\in {S_1}}s-\sum _{s\in {S_2}}s =i-j \in \pm \{1,\dots ,n-1\}\), which is not divisible by n, so \(S_1\) and \(S_2\) receive different colors. Therefore, distinct elements \(S_1\) and \(S_2\) in the same color class satisfy \(|{S_1\setminus S_2}|\ge 2\). Hence, taking the largest resulting color class suffices by the pigeonhole principle. \(\square \)

The set \({{\mathcal {S}}}\) given by Claim 2 is a hiding set by Claim 1. It is easy to verify \({{\mathcal {S}}}\cup \{0\}\) is still a hiding set and thus the proposition follows from Proposition 2. \(\square \)
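The two claims can be checked numerically for small parameters. The sketch below colors all k-subsets of [n] by their element sum modulo n, extracts a largest color class, and tests for a pair of sets whether the midpoint of their characteristic vectors lies in \(P^{\le }_{n,k}\). The function names are ours; the membership test uses the observation that the most restrictive k-subset consists of the k largest entries.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def largest_color_class(n, k):
    """Largest class of k-subsets of {1,...,n} under the coloring (sum of S) mod n."""
    classes = {}
    for S in combinations(range(1, n + 1), k):
        classes.setdefault(sum(S) % n, []).append(frozenset(S))
    return max(classes.values(), key=len)

def midpoint_in_packing_polytope(S1, S2, n, k):
    """Check that (chi_S1 + chi_S2) / 2 satisfies all inequalities of the packing polytope."""
    x = [Fraction((i in S1) + (i in S2), 2) for i in range(1, n + 1)]
    # sum over any k-subset <= sum of the k largest entries of x
    return sum(x) >= k and sum(sorted(x, reverse=True)[:k]) <= k - 1

def check_claims(n, k):
    cls = largest_color_class(n, k)
    separated = all(len(S1 - S2) >= 2 for S1, S2 in combinations(cls, 2))
    hidden = all(midpoint_in_packing_polytope(S1, S2, n, k)
                 for S1, S2 in combinations(cls, 2))
    large = len(cls) * n >= comb(n, k)
    return separated and hidden and large
```

Calling `check_claims(6, 3)` or `check_claims(7, 3)` exercises both claims on all pairs within a largest color class.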

Remark 6

In the case \(d{:}{=}\gcd (k,n) >1\), we can strengthen the above bound by 1. Note that \(S= \{ i\in [n] \,:\,(i \bmod \frac{n}{d}) \in \{1,\dots , \frac{k}{d}\}\}\) has k elements and could also be colored with \(\big ((\sum _{s\in S} s) + \frac{n}{d}\big ) \bmod n\) while maintaining the desired property. In the case \(\gcd (n,k) = 1\), however, there are no other valid color choices for any set S. This is because a set S of color \(\alpha \) which could also be colored with \(\beta \ne \alpha \) is required to be a union of cosets of \(\langle \beta - \alpha \rangle \) in \(\mathbb {Z}/n \mathbb {Z}\). But then \(d = |{\langle \beta -\alpha \rangle }| \ne 1\) is a common divisor of k and n, a contradiction. Furthermore, in the case \(\gcd (n,k) = 1\) all color classes have the same size: Let \(\varphi :S \mapsto \{s+1 \bmod n \,:\,s\in S\}\). Then, if S has color a, \(\varphi (S)\) has color \(a+k \bmod n\). Thus, the sets S, \(\varphi (S)\), \(\dots \), \(\varphi ^{n-1}(S)\) have distinct colors and S and \(\varphi ^n(S)\) have the same color. Hence, if \({{\mathcal {S}}}\) is an arbitrary color class then \(|{{{\mathcal {S}}}}|\ge |{\varphi ({{\mathcal {S}}})}| \ge \dots \ge |{\varphi ^{n-1}({{\mathcal {S}}})}| \ge |{{{\mathcal {S}}}}|\). In particular, \({{\mathcal {S}}}\) is a smallest color class, since subsets of every color class appear in the above chain of inequalities.

Dey et al. [14] have shown that affine integral maps between polytopes allow one to transfer lower bounds on the size of branch-and-bound trees: An affine integral function is a function \(f\,:\,\mathbb {R}^n \rightarrow \mathbb {R}^m\) with \(f(x) = Cx+d\) for some \(C\in \mathbb {Z}^{m\times n}\) and \(d\in \mathbb {Z}^m\).

Lemma 7

(follows from Lemma 5 in [14]) Let \(P \subseteq \mathbb {R}^n\), \(Q \subseteq \mathbb {R}^m\) and \(f:\mathbb {R}^n \rightarrow \mathbb {R}^m\) be an affine integral map with \(f(P) \subseteq Q\). Then we have \({{\mathcal {T}}} (Q) \ge {{\mathcal {T}}} (P)\).

The following lemma implies that there is no hope to obtain an exponential lower bound on the size of a branch-and-bound tree for a polytope with a polynomial number of facets by transferring the bound obtained by hiding sets via the above lemma (cf. Corollary 9 below).

Lemma 8

Hiding sets are preserved by affine integral functions, i.e., if H is a hiding set for a polytope \(P \subset \mathbb {R}^n{\setminus } \mathbb {Z}^n\), \(f:\mathbb {R}^n \rightarrow \mathbb {R}^m\) is an affine integral function and \(Q \subset \mathbb {R}^m\setminus \mathbb {Z}^m\) is a polytope with \(f(P)\subseteq Q\), then f(H) is a hiding set for Q of cardinality \(|{H}|\).

Proof

Since f is affine integral,

  1. f(h) is integral for any \(h\in H\), since h is integral, and

  2. \(\text {conv}\{f(h_1),f(h_2)\} \cap Q \supseteq \text {conv}\{f(h_1),f(h_2)\} \cap f(P) \supseteq f(\text {conv}\{h_1,h_2\}\cap P)\ne \emptyset \) for any \(h_1\), \(h_2 \in H\), since \(\text {conv}\{h_1,h_2\} \cap P \ne \emptyset \).

Thus, f(H) is a hiding set. Moreover, f(a) and f(b) are distinct for distinct a, \(b\in H\): If \(f(a) = f(b)\), consider \(c \in \text {conv}\{a,b\}\cap P\). On the one hand, \(f(c) =f(a) \in \mathbb {Z}^m\), on the other hand, we have \(f(c) \in f(P) \subseteq Q \subset \mathbb {R}^m{\setminus } \mathbb {Z}^m\), a contradiction.\(\square \)

Corollary 9

If \(P \subseteq \mathbb {R}^n\setminus \mathbb {Z}^n\) has a hiding set of size K and \(Q\subseteq \mathbb {R}^m{\setminus } \mathbb {Z}^m\) has strictly less than K facets, then there is no affine integral map \(f:\mathbb {R}^n \rightarrow \mathbb {R}^m\) with \(f(P) \subseteq Q\).

Proof

This is a consequence of Lemma 8 and Proposition 3.\(\square \)

Dey et al. [14] give lower-bounds for the size of general branch-and-bound trees for natural formulations of TSP and SetCover by providing affine integral maps from the packing polytope to those polytopes. By Lemma 8, we also obtain hiding sets of exponential size for these two polytopes. Then Corollary 9 implies that applying Lemma 7 using these polytopes cannot produce exponential lower bounds for the size of branch-and-bound trees of polytopes with a polynomial number of facets.

3.2 Hiding sets for branch-and-bound using variable disjunctions

There is a weaker notion of hiding sets which implies lower bounds on the size of a branch-and-bound tree using variable disjunctions. However, this notion is applicable only to binary programs.

Definition 10

A variable branching hiding set (VB-hiding set) for \(X \subseteq [0,1]^n\) is a set \(H\subseteq \{0,1\}^n{\setminus } \text {conv}(X) = \{0,1\}^n{\setminus } X\) such that for any a, \(b \in H\) with \(a\ne b\) we have \(Z_{a,b} \cap \text {conv}(X) \ne \emptyset \) for

$$\begin{aligned} Z_{a,b} {:}{=}\{x\in [0,1]^n \,:\,x_i = a_i \text { whenever } a_i = b_i\}. \end{aligned}$$

See Fig. 1b for an example.

Proposition 11

Let \(P \subset [0,1]^n\) be an integer-free polyhedron and H be a VB-hiding set for P. Then any branch-and-bound tree for P using variable disjunctions has at least \(|{H}|\) leaves.

Proof

Recall again that a leaf L of a branch-and-bound tree T of P rules out \(y\in \{0,1\}^n\), if \(y\in B_L\), where \(B_L\) is the set of points satisfying the branching decisions leading to L. Assume a, \(b \in H\) with \(a\ne b\) are ruled out by the same leaf of T, i.e., a, \(b \in B_L\). We then have \(Z_{a,b} \subset B_L\): Otherwise, there is \(y \in Z_{a,b} \setminus B_L\). Since \(y \not \in B_L\), there is a branching decision \(x_i = v\) leading to L with \(v\in \{0,1\}\) and \(i\in [n]\) such that \(y_i \ne v\). Because \(y \in Z_{a,b}\), we either have \(a_i \ne b_i\) or \(y_i =a_i = b_i\). In any case, we have \(a_i \ne v\) or \(b_i\ne v\), which contradicts either \(a \in B_L\) or \(b \in B_L\).

Since H is a VB-hiding set, we now have \(\emptyset \ne Z_{a,b} \cap P \subseteq B_L \cap P = P_L\) contradicting the fact that \(P_L = \emptyset \), since L is a leaf. \(\square \)

Note that we assumed that our branching decisions are variable fixings; hence, we had to restrict the definition of VB-hiding sets to the case of binary programs.

Example 12

This yields an analysis of a well-known example by Jeroslow [21]. For \(J^n {:}{=}\{x\in [0,1]^{2n+1} \,:\,\sum _{i=1}^{2n+1} 2x_i = 2n+1\}\), the set \(\{x\in \{0,1\}^{2n+1}\,:\,x \text { contains exactly}~n \text { or}~n+1 \text { entries equal to}~1\}\) is a VB-hiding set of cardinality \({2n+1 \atopwithdelims ()n}+ {2n+1 \atopwithdelims (){n+1}}\).

Note that the resulting bound via Proposition 11 is tight: Since all variables are symmetric, even after fixing any subset to arbitrary integral values, we can assume we branch on variable i at level \(i-1\). The first summand then counts the number of leaves where we branched \(n+1\) variables to 1 and the second summand counts the number of leaves where we branched \(n+1\) variables to 0. Moreover, note that any general hiding set contains at most two points, since \(J^n\) has a branch-and-bound tree with two leaves using the disjunction \((\sum _{i=1}^{2n+1} x_i \le n) \vee (\sum _{i=1}^{2n+1} x_i \ge n+1)\).
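The tightness argument can be replayed computationally. The following sketch assumes that \(J^n\) has \(2n+1\) variables (matching the disjunction above), counts the leaves of the tree obtained by branching in index order, and verifies the VB-hiding property for the points with exactly n or \(n+1\) ones. LP feasibility is decided in closed form from the number of fixed ones and zeros; all function names are ours.

```python
from functools import lru_cache
from itertools import combinations, product
from math import comb

def jeroslow_leaves(n):
    """Leaves of the index-order tree for sum_{i=1}^{2n+1} 2 x_i = 2n + 1."""
    @lru_cache(maxsize=None)
    def leaves(ones, zeros):
        # The node LP is feasible iff 2*ones <= 2n+1 <= 2*(2n+1-zeros),
        # i.e., iff ones <= n and zeros <= n.
        if ones > n or zeros > n:
            return 1
        return leaves(ones + 1, zeros) + leaves(ones, zeros + 1)
    return leaves(0, 0)

def jeroslow_candidates(n):
    """Binary points with exactly n or n+1 entries equal to 1."""
    return [x for x in product((0, 1), repeat=2 * n + 1) if sum(x) in (n, n + 1)]

def is_vb_hiding_for_jeroslow(H, n):
    for a, b in combinations(H, 2):
        common_ones = sum(1 for ai, bi in zip(a, b) if ai == bi == 1)
        differing = sum(1 for ai, bi in zip(a, b) if ai != bi)
        # Z_{a,b} meets J^n iff n + 1/2 lies between the smallest and the
        # largest coordinate sum attainable on Z_{a,b}.
        if not 2 * common_ones <= 2 * n + 1 <= 2 * (common_ones + differing):
            return False
    return True
```

For \(n = 2\), the candidate set has 20 points, it passes the VB-hiding check, and `jeroslow_leaves(2)` also evaluates to 20, so the lower bound and the tree size coincide in this case.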

It turns out that VB-hiding sets can characterize the minimal size of a branch-and-bound tree of a SubsetSum-instance \({{\mathcal {S}}}= (c_1x_1 + \dots + c_nx_n = \beta , x \in \{0,1\}^n )\), where \(c_1, \dots , c_n \in \mathbb {N}\). A set \(A \subseteq [n]\) is a cover of \({{\mathcal {S}}}\), if \(c(A) {:}{=}\sum _{i\in A} c_i > \beta \), a solution, if \(c(A) = \beta \) and a packing, if \(c(A) < \beta \).

Proposition 13

For a binary program \({{\mathcal {S}}}\) for an infeasible SubsetSum instance, the set of characteristic vectors \(\{{\chi }_h \,:\,h\in H\}\) for

$$\begin{aligned} H = \{\text {minimal covers of }{{\mathcal {S}}}\} \cup \{\text {maximal packings of } {{\mathcal {S}}}\} \end{aligned}$$

is a VB-hiding set for the LP-relaxation of \({{\mathcal {S}}}\).

Proof

Let \(A_1\), \(A_2 \in H\) with \(A_1 \ne A_2\). First, let \(A_1\) be a maximal packing and \(A_2\) be either a minimal cover or a maximal packing. Either due to \(c(A_2) > c(A_1)\) or since \(A_1\) and \(A_2\) are distinct maximal packings, we have \(A_2 \setminus A_1 \ne \emptyset \). Let \(i \in A_2 {\setminus } A_1\). By maximality of \(A_1\) and infeasibility of \({{\mathcal {S}}}\), \(A_1 \cup \{i\}\) is a cover. Therefore, there is a solution x to the LP-relaxation of \({{\mathcal {S}}}\), which agrees with \({\chi }_{A_1}\) on \([n] \setminus \{i\}\) and has some fractional value for \(x_i\). In particular, x agrees with \({\chi }_{A_1}\) and \({\chi }_{A_2}\), whenever they agree.

The case where both \(A_1\) and \(A_2\) are minimal covers is analogous. \(\square \)

The bound resulting from Proposition 13 via Proposition 11 is sharp:

Proposition 14

For every infeasible SubsetSum instance \({{\mathcal {S}}}\), there exists a branch-and-bound tree with \(|{H}|\) leaves and such a tree is produced by branching the variables in weight-descending order.

Proof

Assume that \(c_1 \ge \dots \ge c_n\) and let T be the branch-and-bound tree produced by branching on the variables in order of increasing index. For a leaf L of T, it suffices to show that L rules out a point from H. At L, variables \(x_1,\dots ,x_k\) have been fixed to binary values \(a_1,\dots , a_k\). Assume \(a_k = 0\), i.e., the last variable fixed has been fixed to 0. The infeasibility of L implies either \(\sum _{i = 1}^k c_i a_i + \sum _{i = k+1}^n c_i < \beta \) or \(\smash {\sum _{i = 1}^k c_i a_i} > \beta \). Since the parent of L is feasible, the second case is impossible. Hence, \(A = \{i\in [k] \,:\,a_i = 1\} \cup \{k+1,\dots ,n\}\) is a packing. For any \(A' \supseteq A\), we have \(\sum _{i\in A'}c_i \ge \sum _{i = 1}^{k-1} a_ic_i +\sum _{i = k}^n c_i \ge \beta \), where the first inequality is because the \(c_i\) are sorted and the second due to the fact that the parent of L is feasible. Hence, A is a maximal packing and therefore L rules out an element from H. Similarly, if \(a_k = 1\), then \(A = \{i\in [k] \,:\,a_i = 1\}\) is a minimal cover. \(\square \)
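The statement of Proposition 14 can be verified on small instances. The sketch below is illustrative only and assumes an infeasible SubsetSum instance with non-negative weights; the function names are hypothetical. It counts the minimal covers and maximal packings by enumeration and compares this number with the number of leaves of the tree that branches in weight-descending order, using that the LP-relaxation at a node is feasible if and only if \(\beta \) lies between the fixed weight and the fixed weight plus the total weight of the unfixed items.

```python
from itertools import combinations

def count_hiding_set(c, beta):
    """Number of minimal covers plus maximal packings of sum(c_i x_i) = beta."""
    n, count = len(c), 0
    for r in range(n + 1):
        for A in combinations(range(n), r):
            weight = sum(c[i] for i in A)
            if weight > beta:
                # minimal cover: dropping any single item leaves no cover
                count += all(weight - c[i] <= beta for i in A)
            elif weight < beta:
                # maximal packing: adding any single item leaves no packing
                rest = set(range(n)) - set(A)
                count += all(weight + c[i] >= beta for i in rest)
            else:
                raise ValueError("instance is feasible")
    return count

def count_leaves_descending(c, beta):
    """Leaves of the tree branching in weight-descending order (infeasible instances only)."""
    c = sorted(c, reverse=True)

    def leaves(k, fixed):
        # With x_1..x_k fixed, the LP is feasible iff beta lies in this interval.
        if not fixed <= beta <= fixed + sum(c[k:]):
            return 1
        return leaves(k + 1, fixed) + leaves(k + 1, fixed + c[k])

    return leaves(0, 0)

assert count_hiding_set([7, 5, 4, 3], 13) == count_leaves_descending([7, 5, 4, 3], 13)
```

For Jeroslow's instance with four variables, i.e., weights (2, 2, 2, 2) and \(\beta = 5\), both functions count the same number of objects.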

Of course one can prove that the branching rule described in Proposition 14 is optimal without considering hiding sets. However, it is interesting that we can give a combinatorial characterization of the smallest branch-and-bound tree.

Remark 15

We note that Propositions 13 and 14 also hold for Knapsack instances where all items have the same density.

Our next result shows that there are polytopes for which there exists no branch-and-bound tree using variable disjunctions of subexponential size, but the size of any VB-hiding set is polynomially bounded.

For this, given a graph \(G = (V,E)\), we consider the fractional matching polytope \({{\mathcal {M}}}(G) {:}{=}\{ x\in [0,1]^E \,:\,\sum _{e\in \delta (v)} x_e \le 1\quad \forall v\in V\}\), where \(\delta (v)\) denotes the set of edges incident to v.

Proposition 16

Let \(G_k\) be the graph which is formed by the disjoint union of k triangles \(C_3\) and let \({{\mathcal {M}}}(G_k,k+\tfrac{1}{2})\) denote the fractional matching polytope strengthened by the objective cut \(\mathbb {1}^\top x \ge k+\tfrac{1}{2}\). Then \({{\mathcal {M}}}(G_k,k+\tfrac{1}{2})\) requires branch-and-bound trees using variable disjunctions of size at least \(2\cdot 2^k -1\), but every VB-hiding set for \({{\mathcal {M}}}(G_k,k+\tfrac{1}{2})\) contains at most \(3k+1\) elements.

Proof

The claim on the size of a smallest tree can be shown similarly to the proof of Theorem 22, yielding the slight strengthening of the bound here.

It remains to show that every VB-hiding set H contains at most \(3k+1\) elements. For this we identify elements \(h\in H\) of the hiding set with their support \(\{e\in E \,:\,h_e =1 \}\). First consider elements in H which violate a degree constraint on some node. There are only 3k minimal edge sets violating a degree constraint and no two distinct a, \(b \in H\) contain the same violating edge set: otherwise the intersection of \(Z_{a,b} = \{x\in \mathbb {R}^E \,:\,x_e = a_e \text { whenever } a_e = b_e\}\) and \({{\mathcal {M}}}(G_k,k+\tfrac{1}{2})\) would be empty, contradicting the definition of a VB-hiding set. Thus, H contains at most 3k such elements (in fact for any graph a VB-hiding set for the fractional matching polytope can contain at most \(O({|E| \atopwithdelims ()2}) = O(n^4)\) elements violating degree constraints). Any other element \(h \in H\) can contain at most one edge from every triangle. Then, given two such elements h, \(g \in H\) with \(h\ne g\), in every triangle there must be an edge which is contained in neither h nor g. This edge is fixed to 0 in \(Z_{h,g}\). Once an edge in a triangle is fixed to 0, a fractional matching in that triangle has size at most 1. Summing over all k components, we have \(\mathbb {1}^\top x \le k\) for all \(x \in Z_{h,g}\cap {{\mathcal {M}}}(G_k,k+\tfrac{1}{2})\), hence \(Z_{h,g}\cap {{\mathcal {M}}}(G_k,k + \tfrac{1}{2}) = \emptyset \), contradicting the fact that h and g are distinct elements from the same hiding set. Therefore, H contains at most 3k elements violating a degree constraint and at most one additional element not violating a degree constraint. \(\square \)
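For very small k, the size of a smallest tree for \({{\mathcal {M}}}(G_k,k+\tfrac{1}{2})\) can be computed exactly by dynamic programming over partial variable fixings. The following sketch is an illustration under stated assumptions, not the method of the proof: it uses that the vertices of the fractional matching polytope of a triangle are the zero vector, the three unit vectors and \((\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2})\), that fixing variables to 0 or 1 selects a face, and that the LP bound is additive over the triangles. The names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

HALF = Fraction(1, 2)
# Vertices of the fractional matching polytope of a single triangle.
TRIANGLE_VERTICES = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (HALF, HALF, HALF)]

def triangle_bound(fix):
    """Max of x1+x2+x3 on the face given by fix (entries 0, 1 or None); None if empty."""
    values = [sum(v) for v in TRIANGLE_VERTICES
              if all(f is None or v[j] == f for j, f in enumerate(fix))]
    return max(values) if values else None

def smallest_tree_size(k):
    """Number of nodes of a smallest variable-branching tree for M(G_k, k + 1/2)."""
    @lru_cache(maxsize=None)
    def size(state):
        bounds = [triangle_bound(state[3 * t:3 * t + 3]) for t in range(k)]
        if None in bounds or sum(bounds) < k + HALF:
            return 1  # LP-relaxation infeasible: this node is a leaf
        # Otherwise try every unfixed variable and keep the cheapest branching.
        return 1 + min(
            size(state[:j] + (0,) + state[j + 1:]) + size(state[:j] + (1,) + state[j + 1:])
            for j in range(3 * k) if state[j] is None
        )
    return size((None,) * (3 * k))
```

The state space has \(3^{3k}\) elements, so this is only practical for tiny k; it is meant as a sanity check of the bound \(2\cdot 2^k-1\), which the sketch should reproduce for k = 1, 2, 3.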

A straightforward modification yields analogous results for a formulation to find a minimum weight edge cover.

4 A lower bound for decomposable instances

In this section, we generalize the instance from Proposition 16 in order to observe that many combinatorial problems on graphs require exponentially sized branch-and-bound trees using variable disjunctions. The instances for which we show lower bounds in this section arise by the following construction:

Definition 17

(Sum of BPs) Given two binary programs (P) and (Q)

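$$\begin{aligned} (P)\quad \max \,\{c^\top x \,:\,Ax \le a,\; x \in \{0,1\}^n\} \qquad \text {and}\qquad (Q)\quad \max \,\{d^\top y \,:\,By \le b,\; y \in \{0,1\}^m\} \end{aligned}$$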

with rational data, we define their disjoint sum \((P) \oplus (Q)\) as

$$\begin{aligned} \max \quad&\sum _{i=1}^n c_i x_i+\sum _{j=1}^m d_j y_j \\ \text {s.t.} \quad&Ax \le a,\quad By \le b, \\&(x,y) \in \{0,1\}^{n+m}. \end{aligned}$$

We say a BP (Q) is a component of a BP (P), if there exists a BP (R) with \((P) = (Q)\oplus (R)\), such that both (Q) and (R) contain at least one variable. In this case, (P) is called reducible, otherwise (P) is irreducible. It is easy to see that the irreducible components of a BP (P) are unique, computable in polynomial time and (P) is their sum. We identify a component with the set of variables which are used by its objective functions and constraints.

Observation 18

Let (P) and (Q) be BPs with n and m variables, respectively. A vector \((x,y) \in \mathbb {Q}^{n+m}\) is an optimal feasible solution of \((P)\oplus (Q)\) if and only if x and y are optimal feasible solutions to (P) and (Q).

Similarly, for polytopes \(P = \{x\,:\,Ax \le a\}\subseteq \mathbb {R}^n\) and \(Q = \{y\,:\,By \le b\}\subseteq \mathbb {R}^m\), we define \(P \oplus Q = \{(x,y)\,:\,Ax \le a,\; By \le b\}\subseteq \mathbb {R}^{n+m}\).
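The disjoint sum and Observation 18 can be illustrated on a toy example. The sketch below is a minimal illustration; the representation of a BP as a triple (c, A, a) of Python lists and the function names are assumptions made here. It builds the block-diagonal constraint system of \((P)\oplus (Q)\) and computes optimal values by enumeration.

```python
from itertools import product

def disjoint_sum(P, Q):
    """(P) + (Q) for BPs given as (c, A, a): objective, constraint rows, right-hand sides."""
    (c, A, a), (d, B, b) = P, Q
    n, m = len(c), len(d)
    # Block-diagonal system: rows of A act on x only, rows of B act on y only.
    rows = [row + [0] * m for row in A] + [[0] * n + row for row in B]
    return c + d, rows, a + b

def optimal_value(bp):
    """Brute-force optimum of max{c^T x : Ax <= a, x binary}; None if infeasible."""
    c, A, a = bp
    values = [sum(ci * xi for ci, xi in zip(c, x))
              for x in product((0, 1), repeat=len(c))
              if all(sum(r * xi for r, xi in zip(row, x)) <= rhs for row, rhs in zip(A, a))]
    return max(values) if values else None

# Hypothetical data: matching in a triangle and a two-item knapsack.
P = ([1, 1, 1], [[1, 1, 0], [0, 1, 1], [1, 0, 1]], [1, 1, 1])
Q = ([3, 2], [[2, 2]], [3])
assert optimal_value(disjoint_sum(P, Q)) == optimal_value(P) + optimal_value(Q)
```

The final assertion is the value version of Observation 18: the optimum of the sum is the sum of the optima.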

An easy, but very useful observation is that smallest branch-and-bound trees using variable disjunctions for a BP (P) must respect its decomposition into components to some extent. Before stating the lemma formalizing this, we would like to remind the reader that our notion of branch-and-bound trees is only concerned with showing infeasibility of the formulation with an appropriate objective cut added (cf. Sect. 2).

Lemma 19

Let \((P) = \bigoplus _{i=1}^k (P_i)\) be a decomposition of a BP into components and let T be a smallest branch-and-bound tree using variable disjunctions for (P). If, for some node N of T, the LP-relaxation \((P)_N\) associated to N has an optimal solution such that all variables of component \((P_i)\) attain integral values, then no descendant of N can branch on a variable from \((P_i)\).

Proof

Let \(x^i_j\), \(j \in [n_i]\), denote the variables from component \((P_i)\), \(i \in [k]\), and let y be an optimal solution of the LP-relaxation \((P)_N\) at a node N, such that \(y^i_j\) for all \(j\in [n_i]\) is integral. Assume for a contradiction that N has a descendant M branching on \(x^i_\ell \) for some \(\ell \in [n_i]\). Then we can assume w.l.o.g. that M

  1. does not have a descendant in its \((x^i_\ell = y^i_\ell )\)-branch branching on some \(x^i_j\) with \(j\in [n_i]\) and

  2. any variable fixing for a variable \(x^i_j\) for \(j\in [n_i]\) on the path from the root to M agrees with y.

Clearly, such a node M can be reached from N by iteratively descending to a node branching on some \(x_{j'}^i\) in the \((x^i_j= y^i_j)\)-branch of the current node, until this is no longer possible. Let L denote a leaf in the \((x^i_j = y^i_j)\)-branch at M, let \((P)_L\) denote the LP-relaxation at L, and let \(({\tilde{P}})_L\) result from further removing the fixing \(x^i_j=y^i_j\). By Observation 18, the maximal objective value in \(({\tilde{P}})_L\) is attained by a vector \(z^i\) with \(z^i_j = y^i_j\) for \(j \in [n_i]\). This means that \((P)_L\) and \(({\tilde{P}})_L\) have the same value and thus, if we replace the subtree rooted at M in T by the \(x_j^i = y_j^i\)-branch at M as indicated in Fig. 2, we obtain a branch-and-bound tree smaller than T, a contradiction.\(\square \)

Fig. 2 The modifications to T to prove Lemma 19. Fixing \(x_j^i\) to \(y_j^i\) does not contribute to the infeasibility of the leaves in the respective branch and therefore the branching can be removed.

Definition 20

For a BP or polytope (P) and \(n\in \mathbb {N}\), the n-fold multiple of (P) is defined by

  • \(1\cdot (P) {:}{=}(P)\) and

  • \(n\cdot (P) {:}{=}(n-1)\cdot (P)\oplus (P)\) for \(n>1\).

Let \({{\mathcal {P}}}\) be a subclass of the class of binary programs. We say \({{\mathcal {P}}}\) is additively closed if we have \((P) \oplus (Q) \in {{\mathcal {P}}}\) for any (P), \((Q)\in {{\mathcal {P}}}\). In particular, an additively closed class \({{\mathcal {P}}}\) of BPs contains arbitrary multiples \(n\cdot (P)\) of each \((P) \in {{\mathcal {P}}}\).

We will show that a class of BPs which contains arbitrary multiples of a BP, whose LP-relaxation has only fractional optimal solutions and for which there exists no variable that can be fixed to a binary value to achieve infeasibility of the relaxation, requires branch-and-bound trees of exponential size (Theorem 22). Before proving this, we briefly review some examples:

Example 21

  1. The SteinerTree problem asks, given a graph \(G = (V,E)\) with edge weights \(w \in \mathbb {R}^E_+\) and a set of terminals \(T \subseteq V\), to find a minimal weight subtree of G spanning T. The following BP formulation is natural:

    where \(\delta (S)\) denotes the set of edges with exactly one endpoint in S. This class of BPs (over all Steiner tree instances) is additively closed: If \((G_1,T_1,w_1)\), \((G_2,T_2,w_2)\) are Steiner tree instances, then the sum of the BP formulations \((P_{\mathrm {St}})\) for these instances is \((P_{\mathrm {St}})\) for \((\widetilde{G_1\cup G_2},T_1 \cup T_2, w)\), where \(w = (w_1, w_2)\) and \(\widetilde{G_1 \cup G_2}\) is \(G_1 \cup G_2\) with some terminal in \(T_1\) identified with some terminal in \(T_2\). Note, however, that the latter binary program technically contains some additional, but redundant constraints, namely the constraints corresponding to cuts induced by node sets \(S = S_1{{\dot{\cup }}} S_2\), where \(S_i\) is a set of nodes from \(G_i\). However, such a constraint is clearly the sum of the constraints corresponding to \(S_1\) and \(S_2\), which are contained in the former binary program. Since the set of branch-and-bound trees for a binary program depends only on its feasible region and objective function, we ignore this technicality.

  2. The traveling salesman problem (TSP) asks, given a graph \(G = (V,E)\) with edge weights \(w \in \mathbb {R}^E\), to find a minimal weight Hamiltonian tour in G. The most common BP formulation for TSP is via the subtour elimination polytope:

    Very similarly, the Hamiltonian s-t-path problem (s-t-HamPath) asks, given G, w and two nodes s, \(t \in V\) with \(s\ne t\), to find a minimal weight Hamiltonian s-t-path in G. A BP formulation for s-t-HamPath is

    and is additively closed.

    The class of BPs \((P_{\mathrm {SEP}})\) (over all TSP instances) is not additively closed. However, if \((P_{s\text {-}t}^1), \dots ,(P_{s\text {-}t}^k)\) with \(k \ge 3\) are BP formulations for s-t-HamPath instances \((G_i,s_i,t_i,w_i)\) as above, then \((P_{s\text {-}t}^1) \oplus \dots \oplus (P_{s\text {-}t}^k)\) is a BP formulation of type \((P_{\mathrm {SEP}})\) for the TSP instance (G, w), where \(w = (w_1, \dots , w_k)\) and G is \(\bigcup _{i=1}^kG_i\) in which every \(t_i\) is identified with \(s_{i+1}\) (with \(s_{k+1} {:}{=}s_1\)). The restriction to \(k \ge 3\) avoids multi-graphs.

  3. The usual BP formulations to find a maximum weight matching or a minimal weight edge cover are additively closed.

Theorem 22

Let \({{\mathcal {P}}}\) be a class of BPs. Assume there is a binary program \((P) = (\max \, \{c^\top x \,:\,Ax \le b,\; x\in \{0,1\}^m\})\), such that

  1. \(\max \,\{c^\top x \,:\,Ax \le b\}\) is not attained by an integral vector x,

  2. for any variable \(x_i\) in (P) and any \(a\in \{0,1\}\), \(\{x \in \{0,1\}^m \,:\,Ax \le b,\; x_i=a\}\) is non-empty and

  3. infinitely many BPs \(n\cdot (P)\) are in \({{\mathcal {P}}}\).

Then \({{\mathcal {P}}}\) requires branch-and-bound trees using variable disjunctions of size \(\Omega (2^{c\, s})\) for some constant \(c>0\), where \(s=n\,m\) is the number of variables of \(n\cdot (P)\).

Proof

Assume that A and b are integral. Let

$$\begin{aligned} \nu _\text {LP}&\,{:}{=}\,\max \, \{c^\top x\,:\,Ax \le b,\; x\in [0,1]^m\}, \\ \nu _\text {IP}&\,{:}{=}\, \max \, \{c^\top x \,:\,Ax \le b,\; x\in \{0,1\}^m\} \qquad \text { and} \\ \nu _{(x_i = a)}&\,{:}{=}\, \max \, \{c^\top x \,:\,Ax \le b,\; x_i = a,\; x\in \{0,1\}^m\}. \end{aligned}$$

For each of the 2m possible ways to fix a variable, \(x_i = a\), choose an (integral) solution \(y_{(x_i=a)}\in \{x\in \{0,1\}^m \,:\,Ax \le b,\; x_i = a\}\) attaining \(\nu _{(x_i=a)}\). Let \(\nu _- {:}{=}\min \, \{ \nu _{(x_i=a)} \,:\,i\in [m],\; a\in \{0,1\}\}\). We claim that any branch-and-bound tree T of \(n \cdot (P)\) using variable disjunctions has at least \(2^\ell \) leaves, where \(\ell {:}{=}\Big \lceil \frac{n\,(\nu _\text {LP} - \nu _\text {IP})-1}{\nu _\text {LP} - \nu _-} \Big \rceil \).

We describe \(2^\ell \) distinct leaves: Starting at the root, descend T according to the following rule:

  1. If the current node branches on a variable from a component of \(n\cdot (P)\), for which we have not branched a variable in an earlier node, descend to an arbitrary child.

  2. If we have already branched a variable belonging to the same component in an earlier node, say we have fixed \(x_i=a\) and are now considering a variable \(x_j\), descend to the \(x_j = (y_{(x_i=a)})_j\)-child.

The nodes of T reachable via the above descent rule form a subtree \(T'\) of T, such that leaves of \(T'\) are also leaves of T. We claim that every root-leaf path in \(T'\) contains at least \(\ell \) nodes with out-degree 2 in \(T'\), i.e., nodes corresponding to the first case of the above procedure. This suffices, as T then contains a full binary tree of depth \(\ell \) as a topological embedding and thus has at least \(2^\ell \) leaves itself.

Let p be a path in \(T'\) connecting the root to a leaf L. Recall that \(B_L\) denotes the set of points satisfying the constraints occurring on p. We observe that \(B_L \cap n \cdot (\{x \,:\,Ax\le b \}) \ne \emptyset \), since the second rule in the above procedure guarantees that in every component there is a point which agrees with the branching decisions on p. Furthermore, the value of the LP-relaxation of \(n\cdot (P)\) at L is bounded from below by \((n-k)\cdot \nu _\text {LP} + k \cdot \nu _- = n \cdot \nu _\text {LP} - k\cdot (\nu _\text {LP}-\nu _-)\), where k is the number of components for which a variable has been fixed at L. However, in order for L to be infeasible, the value of the LP-relaxation at L must be at most \(n \cdot \nu _\text {IP} +1\). Solving \(n\cdot \nu _\text {LP} - k\cdot (\nu _\text {LP}-\nu _-) \le n\cdot \nu _\text {IP} +1\) for k yields \(k \ge \frac{n\,(\nu _\text {LP} - \nu _\text {IP})-1}{\nu _\text {LP} - \nu _-}\) and hence \(k \ge \ell \), since k is an integer.

Therefore, T has indeed at least \(2^\ell \in \Omega (2^{c'n})\) leaves, where \(c' = \frac{(\nu _\text {LP} - \nu _\text {IP})}{\nu _\text {LP} - \nu _-}\). This lower bound is also exponential in the number of variables \(s = n\,m\) of \(n \cdot (P)\), since m is constant. More precisely, we have \(\Omega (2^{c'n})=\Omega (2^{cs})\) with \(c = \frac{c'}{m}\). \(\square \)

Example 23

The family of binary programs maximizing an objective function over the fractional matching polytope requires exponentially sized branch-and-bound trees using variable disjunctions: Consider \((P) = (\max \,\{\mathbb {1}^\top x\,:\,x \in {{\mathcal {M}}}(C_3)\})\), i.e., the binary problem of finding a maximum cardinality fractional matching in the cycle of length 3 and apply Theorem 22. A slight modification yields the same result for a similar formulation to find a minimum weight edge cover. Similarly, the formulations to find minimum weight Hamiltonian cycles and Steiner trees from Example 21 require exponentially sized branch-and-bound trees using variable disjunctions.
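For the triangle, the quantities \(\nu _\text {LP}\), \(\nu _\text {IP}\) and \(\nu _-\) from the proof of Theorem 22 can be computed by enumeration. The sketch below is illustrative and makes one assumption explicit: it evaluates the LP optimum on the half-integral grid \(\{0,\tfrac{1}{2},1\}^3\), which is valid because the fractional matching polytope is half-integral. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import ceil

HALF = Fraction(1, 2)

def in_triangle_polytope(x):
    """Degree constraints of the fractional matching polytope of C_3."""
    return x[0] + x[1] <= 1 and x[1] + x[2] <= 1 and x[2] + x[0] <= 1

def best_value(points, fixing=None):
    """Max of 1^T x over feasible points, optionally with x_i = a for fixing = (i, a)."""
    return max(sum(x) for x in points
               if in_triangle_polytope(x) and (fixing is None or x[fixing[0]] == fixing[1]))

binary_points = list(product((0, 1), repeat=3))
# Half-integral grid: contains all vertices of the fractional matching polytope.
nu_lp = best_value(list(product((0, HALF, 1), repeat=3)))
nu_ip = best_value(binary_points)
nu_minus = min(best_value(binary_points, (i, a)) for i in range(3) for a in (0, 1))

def ell(n):
    """Exponent from the proof of Theorem 22 for n disjoint triangles."""
    return ceil((n * (nu_lp - nu_ip) - 1) / (nu_lp - nu_minus))
```

With these values the formula for \(\ell \) simplifies to \(n-2\), so the generic argument guarantees \(2^{n-2}\) leaves for n disjoint triangles.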

5 Hardness of approximating the size of a smallest branch-and-bound tree using variable disjunctions

5.1 Boolean formulas and the exponential time hypothesis

In this section, we prove that the size \({{\mathcal {T}}} \) of a smallest branch-and-bound tree using variable disjunctions cannot be approximated within some exponential factor in subexponential time, unless the (strong) exponential time hypothesis fails.

To this end, we briefly review the statement of the (strong) exponential time hypothesis. A literal is a Boolean variable or its negation. A clause C is a disjunction of literals. A Boolean formula \(\varphi \) in conjunctive normal form (CNF) is a conjunction of clauses. Let \({{\mathcal {C}}}\) denote the set of clauses in \(\varphi \). For a clause \(C\in {{\mathcal {C}}}\) denote by \({{\mathcal {P}}}(C)\) the set of unnegated variables in C and by \({{\mathcal {N}}}(C)\) the set of negated variables in C. A CNF \(\varphi \) is a k-CNF, if every clause of \(\varphi \) contains at most k literals. Without loss of generality, every k-CNF contains at most \({n \atopwithdelims ()k} 2^k\) clauses. Let (k-)SAT denote the problem of deciding whether a (k-)CNF is satisfiable and

$$\begin{aligned} s_k {:}{=}\inf \,\{s\in \mathbb {R}_+ \,:\,k\text {-}{\textsc {SAT}} \text { can be decided in time}~O(2^{s\,n})\}. \end{aligned}$$

The exponential time hypothesis (ETH) formulated by Impagliazzo and Paturi [20] states \(s_3 > 0\) and the strong exponential time hypothesis (SETH) states \(s_\infty {:}{=}\lim _{k \rightarrow \infty } s_k = 1\). Note that \(\text {ETH}\) not only rules out polynomial-time algorithms for NP-hard problems, but even subexponential-time algorithms and therefore is a strengthening of the \(\text {P} \ne \text {NP} \)-conjecture.

5.2 Hardness of approximation

Consider the integer-free polytope \(J^n = \{x\in [0,1]^{2n} \,:\,\sum _{i=1}^{2n} 2x_i = 2n+1\}\) given by Jeroslow [21]. In Sect. 3.2 we saw that every branch-and-bound tree for \(J^n\) actually has \(2({{2n}\atopwithdelims ()n} +{{2n} \atopwithdelims (){n+1}})-1 \ge 2^{n+1}-1\) nodes.

Furthermore, for a given CNF \(\varphi \) with variables \(x_1, \dots , x_n\) and clauses \({\mathcal {C}}\), consider the polytope \(Q_\varphi \) defined by:

$$\begin{aligned} \sum _{x_i\in {{\mathcal {P}}}(C)} x_i +\sum _{x_i\in {{\mathcal {N}}}(C)}(1-x_i) \ge \tfrac{1}{2} \quad \forall C \in {{\mathcal {C}}},\qquad x_1,\dots ,x_n \in [0,1]. \end{aligned}$$
(2)

Note that we identify Boolean variables (true/false) with binary variables (0/1). Satisfying assignments of \(\varphi \) then correspond to integer points in \(Q_\varphi \).
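The correspondence between satisfying assignments and integer points of \(Q_\varphi \) can be made concrete. The following sketch is a minimal illustration; the clause encoding (a literal \(+i\) stands for \(x_i\), \(-i\) for its negation) and the example formula are assumptions chosen here and do not appear in the text.

```python
from fractions import Fraction
from itertools import product

def clause_lhs(clause, x):
    """Left-hand side of (2) for one clause; literals are 1-based signed indices."""
    return sum(x[l - 1] if l > 0 else 1 - x[-l - 1] for l in clause)

def in_Q(cnf, x):
    """Membership of x in Q_phi (the bounds 0 <= x_i <= 1 are assumed to hold)."""
    return all(clause_lhs(clause, x) >= Fraction(1, 2) for clause in cnf)

def satisfies(cnf, x):
    """Boolean evaluation of the CNF on a 0/1 vector."""
    return all(any((x[abs(l) - 1] == 1) == (l > 0) for l in clause) for clause in cnf)

# Hypothetical example formula: (x1 or not x2) and (x2 or x3).
phi = [[1, -2], [2, 3]]
for x in product((0, 1), repeat=3):
    assert in_Q(phi, x) == satisfies(phi, x)
```

For a 0/1 vector the left-hand side of a clause constraint is an integer, so it is at least \(\tfrac{1}{2}\) exactly when at least one literal is true. The all-\(\tfrac{1}{2}\) vector lies in \(Q_\varphi \) whenever every clause is non-empty, which is why \(Q_\varphi \) itself is never empty in that case.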

Given \(\alpha \ge 1\), an \(\alpha \)-approximation algorithm for the function \({{\mathcal {T}}} \) computing the size of a smallest branch-and-bound tree is an algorithm A returning a natural number A(X) on input polytope X, which satisfies \({{\mathcal {T}}} (X) \le A(X) \le \alpha \cdot {{\mathcal {T}}} (X)\).

Our first result concerning the hardness of approximation is:

Theorem 24

There is no polynomial time \(\alpha \)-approximation algorithm for \({{\mathcal {T}}} \) with \(\alpha < 2^{(\frac{1}{2}-\varepsilon )n}\) for any \(\varepsilon >0\), unless \(\text {P} = \text {NP} \).

The proof is analogous to that of the next result:

Theorem 25

For any parameter \(\lambda \in \mathbb {N}\) with \(\lambda \ge 2\), there is no \(\alpha \)-approximation algorithm for \({{\mathcal {T}}} \) with \(\alpha < 2^{(\frac{1}{2}-\frac{1.5}{2\lambda +1})n}\) and running time \(O(2^{\delta n})\) with \(\delta < \frac{1}{1+2\lambda }\), unless SETH fails. Furthermore, for every \(\varepsilon >0\), there is some \(\delta >0\) such that there is no \(\alpha \)-approximation algorithm for \({{\mathcal {T}}} \) with \(\alpha < 2^{(\frac{1}{2}-\varepsilon )n}\) running in time \(O(2^{\delta n})\), unless ETH fails.

In particular, for \(\lambda = 2\), we obtain that it is not possible to approximate \({{\mathcal {T}}} \) within a factor of \(2^{\frac{1}{5}n}\) in time \(2^{\delta n}\) with \(\delta <\frac{1}{5}\), unless SETH fails. Moreover, Theorem 24 can be seen as the \(\lambda \rightarrow \infty \) case of Theorem 25.

Proof

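We prove the theorem by contraposition: Assume such an algorithm exists for some \(\lambda \). We will show that we can then give a family of algorithms for SAT contradicting (S)ETH. Let a CNF \(\varphi \) with variables \(y_1, \dots , y_m\) and clauses \({\mathcal {C}}\) be given, and let \(\ell {:}{=}\lambda m\). Consider the integer-free polytope \(J \subset \mathbb {R}^n\) with \(n = 2 \ell + m=(2\lambda +1)m\) defined by

$$\begin{aligned} \sum _{i=1}^{2\ell } 2x_i&= 2\ell +1, \\ \sum _{y_i\in {{\mathcal {P}}}(C)} y_i +\sum _{y_i\in {{\mathcal {N}}}(C)}(1-y_i)&\ge \tfrac{1}{2} \quad \forall C \in {{\mathcal {C}}}, \qquad \text {(Jb)}\\ x \in [0,1]^{2\ell },\quad y&\in [0,1]^m. \end{aligned}$$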

Assume that \(\varphi \) has no satisfying assignment. Then exhaustively branching on all \(y_i\) yields a branch-and-bound tree T for J due to (Jb) and T has size at most \(2^{m+1}-1\).

Conversely, assume there is an assignment \(\nu \) satisfying \(\varphi \) and we are given a branch-and-bound tree T of J. We can then obtain a branch-and-bound tree \(T'\) showing the integer-freeness of Jeroslow's example \(J^{\ell }\) with \(|{T'}|\le |{T}|\) from T: This can be achieved by iteratively removing a subtree of T with root node N, branching on a variable \(y_i\) and replacing it by the subtree rooted at the \(y_i = \nu _i\)-child of N, until no such nodes branching on a variable \(y_i\) remain.

Thus, we must have \(|{T}| \ge 2^{\ell + 1}-1\). Since \(\ell \ge m\), we have

$$\begin{aligned} \frac{2^{\ell +1}-1}{2^{m+1}-1} \ge \frac{2^{\ell +1}}{2^{m+1}} =2^{\ell -m} = 2^{(\lambda -1)m} = 2^{\frac{\lambda -1}{2\lambda +1} (2\ell +m)}= 2^{\frac{\lambda -1}{2\lambda +1} n}. \end{aligned}$$

Therefore, approximating the size of a smallest branch-and-bound tree for (J) within factor \(2^{\frac{\lambda -1}{2\lambda +1}n} = 2^{(\frac{1}{2}-\frac{1.5}{2\lambda +1})n}\) decides whether \(\varphi \) is satisfiable. Recalling that \(n=(2\lambda +1)m\), we see that an \(O(2^{\delta n})\)-time algorithm with \(\delta < \frac{1}{2\lambda +1}\) for this task is an \(O(2^{(2\lambda +1)\delta m})\)-time algorithm for k-SAT where \((2\lambda +1)\delta < 1\), and thus contradicts SETH, which proves the first part of the statement. Note that this works for any k.
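The arithmetic behind the gap can be checked with exact rational numbers. The sketch below is only a sanity check of the identities used in the proof; the function name is hypothetical. It verifies that \(\frac{\lambda -1}{2\lambda +1} = \frac{1}{2}-\frac{1.5}{2\lambda +1}\), that this exponent times n equals \(\ell - m\), and that the ratio of the two tree-size bounds is at least \(2^{\ell -m}\).

```python
from fractions import Fraction

def reduction_parameters(lam, m):
    """Dimension n of J, the exponent of the gap, and the exact ratio of the bounds."""
    ell = lam * m
    n = 2 * ell + m
    exponent = Fraction(lam - 1, 2 * lam + 1)  # the gap is 2^(exponent * n)
    assert exponent == Fraction(1, 2) - Fraction(3, 2) / (2 * lam + 1)
    assert exponent * n == ell - m
    # Lower bound in the satisfiable case divided by upper bound in the unsatisfiable case.
    ratio = Fraction(2 ** (ell + 1) - 1, 2 ** (m + 1) - 1)
    assert ratio >= 2 ** (ell - m)
    return n, exponent, ratio
```

For instance, \(\lambda = 2\) gives the exponent \(\tfrac{1}{5}\) mentioned after Theorem 25.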

Furthermore, choose \(\lambda \) with \(\smash {\tfrac{1.5}{2\lambda +1}}<\varepsilon \) for some fixed \(\varepsilon > 0\). Then an \(O(2^{\delta n})\)-time algorithm approximating \({{\mathcal {T}}} \) within factor \(2^{(\frac{1}{2}-\varepsilon )n}\), yields an \(O(2^{\delta ' m})\)-time algorithm for 3-SAT for \(\delta ' {:}{=}(2\lambda +1)\delta \). If such an algorithm exists for every \(\delta >0\) (and therefore every \(\delta '\)), this violates ETH, which yields the second part of the statement. \(\square \)

Remark 26

Note that an approximation algorithm is not allowed to underestimate the size of a smallest tree. This is natural for producing feasible solutions of optimization problems, which cannot be better than the optimum. In the context of this paper, however, small trees do not necessarily need to be output and we might be interested in approximations which have two-sided error, i.e., we require only \(A(X) / \alpha \le {{\mathcal {T}}} (X) \le A(X) \cdot \alpha \) for the output of our algorithm.

In this case, instead of an r-approximation algorithm with \(r {:}{=}2^{(\frac{1}{2}-\frac{1.5}{2\lambda +1})n}\), we require a \(\sqrt{r}\)-approximation algorithm to distinguish between the two cases in the above proof. Thus, the hardness results in this section hold for algorithms with two-sided errors as well, but only when the required approximation ratio is replaced by its square root (or equivalently the exponent is halved).

We note that in the conference version of this paper [17], the results of this section have been stated for two-sided errors, but are actually correct only for the one-sided definition above.

Remark 27

It is easy to see that any vertex of J has value \(\tfrac{1}{2}\) for some y-variable, if \(\varphi \) is not satisfiable, even if some other y-variables already have been fixed. Thus, if \(\varphi \) is not satisfiable and we employ most-infeasible branching (with respect to any basic solution) with the tie-breaking rule \(y_1>\dots> y_m>x_1>\dots >x_{2\ell }\), we branch only on y-variables and thus still produce a branch-and-bound tree of size at most \(2^{m+1}-1\). Similarly, if we consider the objective \(\min \sum _i y_i\), and employ strong branching with a reasonable scoring function (see Sect. 2.3) and with the same tie-breaking, we still branch only on y-variables, since branching on an x-variable results in gain 0 in both resulting children. Therefore, we produce a branch-and-bound tree of size at most \(2^{m+1}-1\) in case \(\varphi \) is not satisfiable. Thus, if \({{\mathcal {M}}} (I)\) denotes the size of the branch-and-bound tree produced by variable branching using strong branching or most-infeasible branching with the above tie-breaking rule, Theorems 24 and 25 also hold with \({{\mathcal {T}}} \) replaced by \({{\mathcal {M}}}\). Moreover, Theorems 24 and 25 hold for LP-based branch-and-bound trees.

Remark 28

Note that if we allow polytopes with an exponential number of facets, we can replace \(J^n\) by \(P^n {:}{=}\{x \in [0,1]^n \,:\,\sum _{i\in R}x_i + \sum _{i\not \in R}(1-x_i)\ge \tfrac{1}{2}~\forall R \subseteq [n]\}\), for which any branch-and-bound tree with variable disjunctions must be a complete binary tree of depth n (Proposition 4). We can then strengthen the factor \(\frac{1}{2}-\frac{1.5}{2\lambda +1}\) in the exponent of the approximation ratio in the first part of Theorem 25 to \(\smash {1-\frac{2}{\lambda +1}}\) and the factor of \(\smash {\frac{1}{2}-\varepsilon }\) in the second part to \(1-\varepsilon \). Note that \(P^n\) is separable in polynomial time: Given a point \({\hat{x}} \in [0,1]^{n}\), the set \({\hat{R}} {:}{=}\{i \in [n] \,:\,{\hat{x}}_i\le \tfrac{1}{2}\}\) minimizes the left-hand side over all constraints. Thus, \({\hat{x}} \in P^n\) if and only if \({\hat{x}}\) satisfies \(\sum _{i\in {\hat{R}}}{\hat{x}}_i + \sum _{i\not \in {\hat{R}}}(1-{\hat{x}}_i) \ge \tfrac{1}{2}\).
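The separation routine described in Remark 28 is short enough to write down. The sketch below is illustrative, with hypothetical function names and 0-based indices; it returns the most violated set \({\hat{R}}\) or reports membership, and it includes a brute-force check over all \(2^n\) constraints for comparison on small n.

```python
from fractions import Fraction
from itertools import combinations

HALF = Fraction(1, 2)

def lhs(x, R):
    """Left-hand side of the constraint of P^n indexed by the set R."""
    return sum(x[i] if i in R else 1 - x[i] for i in range(len(x)))

def separate(x):
    """Return a set R whose inequality is violated by x, or None if x lies in P^n."""
    # Each coordinate contributes min(x_i, 1 - x_i) for this choice of R.
    R_hat = {i for i, xi in enumerate(x) if xi <= HALF}
    return R_hat if lhs(x, R_hat) < HALF else None

def in_P_bruteforce(x):
    """Check all 2^n constraints explicitly (only sensible for small n)."""
    n = len(x)
    return all(lhs(x, set(R)) >= HALF
               for r in range(n + 1) for R in combinations(range(n), r))
```

The routine needs a single pass over the coordinates, whereas the brute-force check is exponential in n.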

6 #P-hardness of exactly computing the size of a smallest tree

We first review #P-hardness as introduced by Valiant [28, 29], which we will use to bound the hardness of finding a smallest branch-and-bound tree using variable disjunctions from below:

Definition 29

A function \(g:\{0,1\}^*\rightarrow \mathbb {N}\) is in #P if there exists a polynomial \(p:\mathbb {N}\rightarrow \mathbb {N}\) and a polynomial time Turing machine M such that for every \(x\in \{0,1\}^*\) we have

$$\begin{aligned} g(x) = |{\{y\in \{0,1\}^{p(|{x}|)} \,:\,M \text { accepts on input}~(x,y)\}}|. \end{aligned}$$

So #P is the class of functions which count certificates of some NP-problem. For example, #SAT is the function that on input of a Boolean formula returns the number of satisfying assignments.

We will identify natural numbers with their binary representation and let \(\text {FP} ^f\) denote the class of functions \(g:\{0,1\}^* \rightarrow \mathbb {N}\) which are computable in polynomial time by a Turing machine with access to an oracle computing f. A function f is (weakly) #P-hard, if \({\#P} \subseteq \text {FP} ^f\). Because the Cook-Levin reduction preserves the number of certificates, #SAT is #P-complete, see, e.g., [5, Theorem 17.10].

Our main result is:

Theorem 30

Computing \({{\mathcal {T}}} \) for 0/1-polytopes is (weakly) #P-hard, even when restricted to (weighted) fractional matching polytopes with an extra objective cut.

The following proof is simpler than the one given in the conference version of this paper [17] and shows that computing \({{\mathcal {T}}} \) is hard already for the matching polytope. However, the original proof showed strong #P-hardness, i.e., #P-hardness even if the size of all occurring numbers is bounded by a polynomial.

The proof begins by showing that there are binary programs using the fractional matching polytope for which the gain, i.e., the improvement in the LP-bound, when branching on a variable is mostly independent of previous branching decisions, i.e., we show that there are binary programs for which the abstract branching model by Le Bodic and Nemhauser [25] is essentially accurate. We could then rely on the fact that computing the size of a smallest branch-and-bound tree in their model is (weakly) #P-hard. To avoid repeating their definitions, we instead rely directly on the #P-hardness of #Knapsack, which we define below. Haase and Kiefer [18] have shown (weak) #P-hardness of the equivalent decision version of #Knapsack, which they call KthLargestSubset for historical reasons. We note, however, that finding the size of a smallest branch-and-bound tree in the abstract model in fact reduces to finding the size of a smallest (non-abstract) branch-and-bound tree (cf. Remark 35).

Definition 31

#Knapsack

Input: item weights \(a_1,\dots ,a_{n}\in \mathbb {N}\) and a capacity \(C\in \mathbb {N}\).

Output: \(|{\{B \subseteq [n] \,:\,\sum _{i\in B} a_i\le C\}}|\)

We will assume that \(C \le a_1+\dots +a_n\), since the problem is trivial otherwise.
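For intuition, #Knapsack can be solved by a standard dynamic program whose running time is polynomial in n and the numerical value of C, but not in the encoding length of C; this is consistent with the weak hardness notion used here. The sketch below is a minimal illustration with a hypothetical function name.

```python
from itertools import combinations

def count_knapsack(weights, capacity):
    """Number of subsets B with total weight at most capacity (pseudo-polynomial DP)."""
    ways = [1] + [0] * capacity  # ways[s] = number of subsets of total weight exactly s
    for a in weights:
        # Iterate downwards so that every item is used at most once.
        for s in range(capacity, a - 1, -1):
            ways[s] += ways[s - a]
    return sum(ways)

def count_knapsack_bruteforce(weights, capacity):
    """Reference implementation by enumerating all subsets."""
    n = len(weights)
    return sum(1 for r in range(n + 1) for B in combinations(range(n), r)
               if sum(weights[i] for i in B) <= capacity)

assert count_knapsack([1, 2, 3], 3) == count_knapsack_bruteforce([1, 2, 3], 3)
```

Items of weight 0, such as the additional element \(a_{n+1}\) introduced below, simply double the count.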

It is easy to see that \(\#\textsc {SubsetSum} \in \text {FP} ^{\#\textsc {Knapsack}}\) and the former problem is (weakly) #P-hard, since the usual chain of reductions from SAT to SubsetSum preserves the number of certificates. Thus, #Knapsack is (weakly) #P-hard; for a formal proof, see [18].

For a #Knapsack-instance given by \(a_1,\dots ,a_{n} \in \mathbb {N}\) and \(C\in \mathbb {N}\), we add an \((n+1)\)st element with weight \(a_{n+1}=0\), let \(K{:}{=}\sum _{i=1}^{n+1} a_i +1\), and consider a graph \(G =(V,E)\) with edge weights \(w: E \rightarrow \mathbb {R}\), such that:

  1. G is the disjoint union of \(n+1\) triangles \(C_3\). Every triangle corresponds to an element \(a_i\), \(i\in [n+1]\), and is denoted \(C^i\). The edges in \(C^i\) will be denoted by \(e^{1}_i\), \(e^{2}_i\), and \(e^{3}_i\).

  2. The weights w are defined by

    $$\begin{aligned}w_i^j \,{:}{=}\, w(e_i^j) \,{:}{=}\, {\left\{ \begin{array}{ll} 2K+a_i, &{}\text {if}~j \in \{1,2\},\\ 2K+2a_i, &{}\text {if}~j = 3, \end{array}\right. }\end{aligned}$$

    for \(i\in [n+1]\).

Let \({{\mathcal {M}}}(G)\) denote the fractional matching polytope associated to G:

$$\begin{aligned}&{{\mathcal {M}}}(G) \,{:}{=}\, \{x \,:\,x_i^1+x_i^2 \le 1,\; x_i^2+x_i^3 \le 1, \; x_i^3+x_i^1\\ {}&\le 1,\; x_i^1,\; x_i^2,\; x_i^3 \ge 0 \; \forall i \in [n+1]\}, \end{aligned}$$

where variable \(x^j_i\) corresponds to edge \(e_i^j\).

Lemma 32

Let T be a smallest branch-and-bound tree proving integer-freeness of \({{\mathcal {M}}}(G,W) {:}{=}{{\mathcal {M}}}(G) \cap \{w^\top x \ge W\}\) for some \(W \in \mathbb {R}\). Then for any i, on every root-leaf path of T there is at most one node branching on \(x_i^1\), \(x_i^2\), or \(x_i^3\).

Proof

We note that \(\{x_i^1,x_i^2,x_i^3\}\) is a component of \({{\mathcal {M}}}(G,W)\) which is the fractional matching polytope \({{\mathcal {M}}}(C_3)\) for a triangle \(C_3\). The statement then follows from Lemma 19 and the fact that \({{\mathcal {M}}}(C_3)\) becomes integral after fixing some edge to an arbitrary value, since the remaining instance is bipartite.\(\square \)

Lemma 33

Let \({{\mathcal {M}}}(G)_N\) be obtained from \({{\mathcal {M}}}(G)\) by fixing some variables and assume that for every i at most one of \(x_i^1\), \(x_i^2\), and \(x_i^3\) has been fixed. If we define

$$\begin{aligned} I^+&\,{:}{=}\,\{i \in [n+1] \,:\,(x^3_i \,\text {has been fixed to}~0) \text { or }(x^1_i \text { or}~x^2_i \text { has been fixed to}~1) \}\\ I^-&\,{:}{=}\,\{i \in [n+1] \,:\,(x^3_i \text { has been fixed to}~1) \text { or }(x^1_i \text { or}~x^2_i \text { has been fixed to}~0) \}, \end{aligned}$$

then the LP-gain achieved by fixing the variables is

$$\begin{aligned} {\text {gain}}_w({{\mathcal {M}}}(G)_N)&{:}{=}\max \,\{w^\top x \,:\,x\in {{\mathcal {M}}}(G)\} - \max \,\{w^\top x \,\,:\,\, x\in {{\mathcal {M}}}(G)_N\} \\&= \sum _{i\in I^+} (K+a_i) + \sum _{i\in I^-}K. \end{aligned}$$

Proof

We again note that any fractional matching in G is the sum of fractional matchings in the connected components of G.

We consider the fractional matching polytope for \(C_3\) with edges \(e_i^1\), \(e_i^2\), \(e_i^3\) and edge weights \(w_i^1 = w_i ^2 = 2K + a_i\) and \(w_i^3 = 2K+2a_i\). It is easy to see that the weight of a maximum weight fractional matching is \(\frac{1}{2}(6K +4a_i)\) (note \(w_i^1+ w_i^2 = 4K+2a_i \ge 2K+2a_i = w_i^3\)). If we fix \(x^j_i\) to some binary value, then the weight of the maximum weight fractional matching reduces to

$$\begin{aligned} \begin{array}{ll} 2K+2a_i = w^3_i, &{}\qquad \qquad \text {if we fixed}~x_i^1~\text {or}~x_i^2~\text {to}~0,\\ 2K+a_i= w^1_i = w^2_i, &{}\qquad \qquad \text {if we fixed}~x_i^1~\text {or}~x_i^2~\text {to}~1,\\ 2K+a_i = w^1_i = w^2_i, &{}\qquad \qquad \text {if we fixed}~x_i^3~\text {to}~0,~\text {and} \\ 2K+2a_i = w^3_i, &{}\qquad \qquad \text {if we fixed}~x_i^3~\text {to}~1. \end{array} \end{aligned}$$

Thus, fixing \(x^3_i\) to 1 or \(x^j_i\) with \(j\in \{1,2\}\) to 0, reduces the weight of a maximum weight fractional matching by \(\frac{1}{2}(6K+4a_i) - (2K+2a_i) =K\) and fixing \(x^3_i\) to 0 or \(x^j_i\) with \(j\in \{1,2\}\) to 1, reduces it by \(\frac{1}{2}(6K+4a_i) - (2K+a_i) =K+a_i\). Taking the sum over all connected components gives the claim. \(\square \)
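The four reductions can be checked numerically on a single triangle. The sketch below relies on two facts that are assumptions of this illustration rather than statements of the proof: the fractional matching polytope of a triangle has the vertices 0, the three unit vectors and \((\frac{1}{2},\frac{1}{2},\frac{1}{2})\), and fixing a variable to 0 or 1 selects a face, so the restricted LP optimum is attained at one of these vertices. The names `lp_max` and `gain` are hypothetical, and the edge index `j` is 0-based.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
# Assumed vertex list of the fractional matching polytope of a triangle,
# in the coordinate order (x^1, x^2, x^3).
TRIANGLE_VERTICES = [
    (0, 0, 0),
    (1, 0, 0),
    (0, 1, 0),
    (0, 0, 1),
    (HALF, HALF, HALF),
]


def lp_max(weights, fixing=None):
    """Maximize w^T x over the triangle polytope, optionally with x^j = value.

    fixing is None or a pair (j, value) with j in {0, 1, 2}, value in {0, 1}.
    """
    best = None
    for vertex in TRIANGLE_VERTICES:
        if fixing is not None and vertex[fixing[0]] != fixing[1]:
            continue  # vertex does not lie on the face selected by the fixing
        value = sum(w * x for w, x in zip(weights, vertex))
        best = value if best is None else max(best, value)
    return best


def gain(a, K, j, value):
    """LP-gain of fixing edge j (0-based) of a triangle with weight parameter a."""
    weights = (2 * K + a, 2 * K + a, 2 * K + 2 * a)
    return lp_max(weights) - lp_max(weights, (j, value))


if __name__ == "__main__":
    a, K = 3, 10
    for j in range(3):
        for value in (0, 1):
            print(f"fix x^{j + 1} = {value}: gain {gain(a, K, j, value)}")
```

For \(a_i = 3\) and \(K = 10\) this gives a gain of \(K = 10\) when \(x_i^3\) is fixed to 1 or \(x_i^1\), \(x_i^2\) to 0, and \(K + a_i = 13\) in the remaining cases.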

The following proof imitates a proof of Le Bodic and Nemhauser [25] for their abstract setting.

Proof

(of Theorem 30) For an instance of #Knapsack given by \(a_1,\dots ,a_n \in \mathbb {N}\) and \(C\in \mathbb {N}\), we consider \({{\mathcal {M}}}(G,W) {:}{=}{{\mathcal {M}}}(G) \cap \{w^\top x \ge W\}\), where

$$\begin{aligned} W = \max \,\{w^\top x \,:\,x\in {{\mathcal {M}}}(G)\} - nK- C = \sum _{i=1}^{n+1} (3K +2a_i) -nK - C. \end{aligned}$$

Let T be a smallest branch-and-bound tree for \({{\mathcal {M}}}(G,W)\). To show integer-freeness of \({{\mathcal {M}}}(G,W)\), the tree T must show that every integral matching in G has weight less than the maximum weight of a fractional matching minus \(nK+C\). By Lemma 32, at most one variable of every triangle is branched on along every root-leaf path in T. Furthermore, if we assume \(a_1 \ge a_2 \ge \dots \ge a_{n+1} = 0\), by Lemma 33, we can assume that at every node we branch on a variable from \(X_i {:}{=}\{x_i^1, x_i^2,x_i^3\}\) for the minimal value of i such that we have not branched on a variable from \(X_i\) before. Then every node at depth i which is not a leaf branches on a variable from \(X_{i+1}\). In particular, the variables in \(X_{n+1}\) are only branched on at depth n. By Lemma 33 and \(K > \sum _{i} a_i\), the total gain at any node of depth less than n is smaller than nK, so no such node is infeasible. Furthermore, the map \(N \mapsto I_N^+\) from nodes at depth n to \(2^{[n]}\) is a bijection, where

$$\begin{aligned}I_N^+ {:}{=}\{i \in [n] \,:\,(x^3_i \text { has been fixed to}~0) \text { or }(x^1_i \text { or}~x^2_i \text { to}~1) \text { in}~{{\mathcal {M}}}(G,W)_N \} \end{aligned}$$

and \({{\mathcal {M}}}(G,W)_N\) is the problem associated to N. If \(I_N^- {:}{=}[n]{\setminus } I_N^+\) and \({{\mathcal {M}}}(G)_N\) is obtained from \({{\mathcal {M}}}(G,W)_N\) by removing the objective cut \(w^\top x \ge W\), then we have by Lemma 33:

$$\begin{aligned} \max \,\{w^\top x \,:\,x\in {{\mathcal {M}}}(G)_N\}&= \max \,\{w^\top x \,:\,x\in {{\mathcal {M}}}(G)\} - \sum _{i\in I_N^+} (K+a_i) - \sum _{i\in I_N^-} K \\&= \sum _{i=1}^{n+1} (3K +2a_i) - nK - \sum _{i\in I_N^+} a_i = W + C - \sum _{i\in I_N^+}a_i. \end{aligned}$$

Thus, N is feasible if and only if \(\sum _{i\in I_N^+}a_i\le C\). Nodes at depth \(n+1\) are always infeasible, since \((n+1)K > nK+C\). Each of the \(2^n\) nodes at depth n is therefore either an infeasible leaf or a feasible node with two infeasible children. Therefore, T has \(2^n + |{ \{B \subseteq [n] \,:\,\sum _{i\in B} a_i \le C\}}|\) leaves, hence \({{\mathcal {T}}} ({{\mathcal {M}}}(G,W)) = 2(2^n + |{ \{B \subseteq [n] \,:\,\sum _{i\in B} a_i\le C \}}|)-1\). This shows that #Knapsack is Turing-reducible to \({{\mathcal {T}}} \). By the (weak) #P-hardness of #Knapsack, \({{\mathcal {T}}} \) is (weakly) #P-hard as well. \(\square \)
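The node count in this proof can be replayed on a small instance. The sketch below does not solve any LP: it assumes the gains \(K+a_i\) and \(K\) of Lemma 33 for the two children of a node branching on triangle i, and it treats a node as infeasible exactly when its accumulated gain exceeds \(nK+C\). The function name `proof_tree_size` is hypothetical, and the code only simulates the particular tree described in the proof; it does not certify that this tree is smallest.

```python
def proof_tree_size(a, C):
    """Count the nodes of the tree from the proof of Theorem 30.

    a: item weights a_1, ..., a_n; C: capacity with C <= sum(a).
    """
    assert 0 <= C <= sum(a), "the construction assumes C <= a_1 + ... + a_n"
    n = len(a)
    # Sort decreasingly and append the extra element a_{n+1} = 0.
    items = sorted(a, reverse=True) + [0]
    K = sum(items) + 1
    budget = n * K + C  # a node is infeasible iff its total gain exceeds this

    def size(depth, total_gain):
        if total_gain > budget:
            return 1  # infeasible node, i.e., a leaf
        # Feasible node: branch on the next triangle. Since (n + 1) K > budget,
        # every node at depth n + 1 is infeasible and depth <= n holds here.
        a_i = items[depth]
        return (1
                + size(depth + 1, total_gain + K + a_i)
                + size(depth + 1, total_gain + K))

    return size(0, 0)


if __name__ == "__main__":
    # For weights (1, 2, 3) and C = 3 there are 5 feasible subsets,
    # so the formula 2 * (2^n + 5) - 1 with n = 3 predicts 25 nodes.
    print(proof_tree_size([1, 2, 3], 3))
```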

Remark 34

The branching rule described in the proof of Theorem 30 coincides with most-infeasible branching with the tie-breaking \(x_1^1> x_1^2> x_1^3> x_2^1> \dots > x_{n}^3\), since every optimal solution of the linear relaxation has value \(\frac{1}{2}\) for variables \(x_i^1\), \(x_i^2\), \(x_i^3\), where i is the smallest index such that none of these variables has been branched on before. Also, most-infeasible branching only branches on a single variable from every component on every root-leaf path, since the subproblem remaining in a component becomes integral once a variable of this component has been branched on. Thus, computing the size of the branch-and-bound tree produced by most-infeasible branching is #P-hard. Moreover, Theorem 30 also holds for LP-based branch-and-bound trees. Furthermore, the branching rule described in the proof of Theorem 30 coincides with strong branching with some reasonable evaluation rule, since any such rule prefers the gains \((K+a_i, K)\) from branching on \(x_i^j\) with \(j\in [3]\) over the gains \((K+a_k,K)\) of branching on \(x_k^j\), if \(a_i > a_k\). Also, strong branching branches only on a single variable from each component on every root-leaf path. Thus, computing the size of the branch-and-bound tree produced by strong branching is #P-hard as well.

Remark 35

We note that more generally, finding a smallest branch-and-bound tree using variable disjunctions generalizes the problem of finding a smallest branch-and-bound tree in the abstract model by Le Bodic and Nemhauser [25] in the following sense: For every instance I of the GeneralVariableBranching problem (as defined in [25]), given by abstract variables encoded via their gains \((l_i,r_i)\) with multiplicities \(m_i\) for \(i \in [n]\) and objective K, there is a polytope \(P_I \subset \mathbb {R}^{3\,m}\) with \(m= \sum _{i=1}^nm_i\), such that a smallest branch-and-bound tree for \(P_I\) has the same size as a smallest branch-and-bound tree for I closing a gap of K. It suffices to consider \(P_I = {{\mathcal {M}}}(G,W)\), where G is a disjoint union of m triangles, such that G contains \(m_i\) triangles for every variable i with edge weights \(r_i+l_i\), \(r_i+l_i\), \(2r_i\) and \(W {:}{=}\sum _i m_i\cdot \frac{1}{2}(4r_i+2l_i)-K\). Then the LP-gains in the children resulting from branching on an edge are \(l_i\) and \(r_i\). The proof is then analogous to that of Theorem 30.
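As a sanity check of the gains claimed in Remark 35, the following computation repeats the argument of Lemma 33 for a single triangle with edge weights \(r_i+l_i\), \(r_i+l_i\), \(2r_i\). It assumes \(0 \le l_i \le r_i\), which is an assumption of this illustration and is not stated explicitly in the remark.

$$\begin{aligned} \text {LP optimum:}\quad&\tfrac{1}{2}\big ((r_i+l_i)+(r_i+l_i)+2r_i\big ) = 2r_i+l_i,\\ \text {third edge fixed to}~1~\text {or another edge fixed to}~0:\quad&(2r_i+l_i) - 2r_i = l_i,\\ \text {third edge fixed to}~0~\text {or another edge fixed to}~1:\quad&(2r_i+l_i) - (r_i+l_i) = r_i. \end{aligned}$$

Summing the LP optimum over all m triangles gives \(\sum _i m_i\cdot \frac{1}{2}(4r_i+2l_i)\), which matches the first term in the definition of W.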

Remark 36

Note that the above proof of Theorem 30 falls slightly short of showing that it is #P-hard to compute the size of branch-and-bound trees for BPs maximizing some non-negative objective function over the fractional matching polytope. This is because according to (1) verifying (P) only requires us to prove integer-freeness of \(\{x \in [0,1]^n \,:\,w^\top x\ge W, \, Ax\le b\}\) with \(W=\nu ^\star +1\) for every (P) of the form \(\max \,\{w^\top x \,:\,Ax\le b,\, x\in \{0,1\}^n\}\) and we chose a different value for W in our construction. However, this can easily be remedied by introducing another triangle into our graph, such that a fractional matching on this triangle has value \(3L + W - (\nu ^\star +1)\) and branching on any edge of this triangle results in an LP-gain of \(L+W-(\nu ^\star +1)\) in both children, where \(L \gg M \) is a constant.

To complement Theorem 30, we observe that for 0/1-polytopes P we can compute \({{\mathcal {T}}} (P)\) in polynomial space, see Algorithm 1. Note that the depth of recursive calls in Algorithm 1 never exceeds n and all occurring numbers are bounded by \(2^n\), so it is indeed a polynomial space algorithm. This shows:

Proposition 37

When restricted to 0/1-polytopes, \({{\mathcal {T}}} \) can be computed in polynomial space.

[Algorithm 1: recursive computation of \({{\mathcal {T}}} (P)\) for 0/1-polytopes P]
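One natural way to realize such a polynomial-space recursion is sketched below; it is not claimed to be identical to Algorithm 1. The sketch assumes that the polytope is accessed through a hypothetical emptiness oracle `is_empty(fixing)`, which reports whether the polytope becomes empty after the given variable fixings. As an example oracle it uses a single copy of Jeroslow's instance \(\{x \in [0,1]^{2m} \,:\,\sum _j 2x_j = 2m+1\}\), for which emptiness can be decided by counting fixed ones and free variables.

```python
def smallest_tree_size(n, is_empty, fixing=None):
    """Size of a smallest variable-branching tree, by exhaustive recursion.

    Only the current fixing and one counter per recursion level are stored,
    so the memory use is polynomial, while the running time is exponential.
    """
    fixing = {} if fixing is None else fixing
    if is_empty(fixing):
        return 1  # infeasible node: a single leaf
    free = [i for i in range(n) if i not in fixing]
    if not free:
        raise ValueError("the polytope contains a 0/1 point; no tree exists")
    best = None
    for i in free:
        total = 1  # the node branching on x_i
        for value in (0, 1):
            fixing[i] = value
            total += smallest_tree_size(n, is_empty, fixing)
        del fixing[i]  # undo the fixing before trying the next variable
        best = total if best is None else min(best, total)
    return best


def jeroslow_is_empty(m):
    """Emptiness oracle for {x in [0,1]^(2m) : sum_j 2 x_j = 2m + 1}."""
    def is_empty(fixing):
        ones = sum(fixing.values())
        free = 2 * m - len(fixing)
        # Empty iff 2m + 1 lies outside the range [2*ones, 2*(ones + free)].
        return 2 * ones > 2 * m + 1 or 2 * (ones + free) < 2 * m + 1
    return is_empty


if __name__ == "__main__":
    print(smallest_tree_size(4, jeroslow_is_empty(2)))
```

For \(m = 2\) the recursion returns 19 nodes, i.e., 10 leaves, in line with the count \(\binom{4}{2}+\binom{4}{3}\) used for Jeroslow's example in Sect. 8.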

7 (Non-)automatizability of branch-and-bound

In this section we sketch how results about automatizing the search for shortest treelike resolution refutations transfer to the case of branch-and-bound proofs.

Treelike resolution is a proof system which certifies the infeasibility of Boolean formulas in CNF. Treelike resolution can be seen as a branch-and-bound procedure, where we branch by fixing a variable to either 1 (true) or 0 (false) and prune only nodes in which the already fixed variables falsify a clause, i.e., we prune a node N, if there is a clause C, such that we have fixed \(x = 0\) for all \(x \in {{\mathcal {P}}}(C)\) and \(x=1\) for all \(x \in {{\mathcal {N}}}(C)\) in the subproblem associated with N [24, Section 5.2]. A treelike resolution proof or refutation is a tree produced by this procedure.

Automatizability  It is easy to see that not all infeasible instances of an NP-complete problem can have proofs of size bounded by some polynomial, unless \(\text {NP} = \text {coNP} \). Therefore, the reason for the inability of polynomial-time algorithms to find shortest proofs might be the size of these proofs instead of the computational difficulty of finding them. To investigate this possibility, Bonet et al. [11] introduced the concept of automatization. A proof system is called automatizable (quasi-automatizable), if given an infeasible instance I, an infeasibility proof in this system can be produced in time polynomial (quasipolynomial) in the encoding length of I and the size of a shortest proof of I.

Beame and Pitassi [9] proved the positive result that branch-and-bound using variable disjunctions is quasi-automatizable for binary programs, for which we reproduce the proof for completeness:

Consider the following recursive procedure \(R(S,n)\), which generates a branch-and-bound tree for a given integer-free polytope P with n variables, if there is a tree of size at most S. Among the \(2n\) possible fixings of a variable in P, find by trial and error a fixing \(x_i = a \in \{0,1\}\) such that \(R(S/2,n-1)\) succeeds on \(P \cap \{x_i=a\}\). This works if there exists a tree T of size at most S, because one of the branches at the root node of T has size at most S/2. Then apply \(R(S,n-1)\) to \(P \cap \{x_i = 1-a\}\), which succeeds if there exists a tree of size at most S, because \({{\mathcal {T}}} \) is non-increasing with respect to fixing variables. The resulting running time recursion \(t(S,n) = 2n \cdot t(S/2,n-1) + t(S,n-1)\) solves to roughly \(n^{\log S}\). We can then try \(R(S,n)\) for increasing values of S until it succeeds. \(\square \)
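The procedure can be sketched in a few lines. The sketch makes several assumptions that are not part of the argument above: the polytope is given by a hypothetical emptiness oracle `is_empty(fixing)`, S/2 is rounded down, S is doubled in the outer loop, and the search stops at \(S = 2^{n+1}\). As an example oracle it uses the pruning rule of treelike resolution for a CNF formula, where each clause is a pair (positive variables, negated variables) and a node is infeasible if and only if the fixed variables falsify a clause.

```python
LEAF = ("leaf",)


def R(S, n, is_empty, fixing):
    """Return some branch-and-bound tree, or None; succeeds whenever a tree
    of size at most S exists for the polytope restricted by fixing."""
    if is_empty(fixing):
        return LEAF
    if S < 3:
        return None  # a tree with a branching node has at least 3 nodes
    for i in (j for j in range(n) if j not in fixing):
        for a in (0, 1):
            small = R(S // 2, n, is_empty, {**fixing, i: a})
            if small is None:
                continue  # trial and error: try the next fixing
            other = R(S, n, is_empty, {**fixing, i: 1 - a})
            if other is None:
                return None  # no tree of size <= S for a restriction of P
            zero, one = (small, other) if a == 0 else (other, small)
            return (i, zero, one)
    return None


def find_tree(n, is_empty):
    """Try R for S = 1, 2, 4, ... (doubling is an assumption of this sketch)."""
    S = 1
    while S <= 2 ** (n + 1):
        tree = R(S, n, is_empty, {})
        if tree is not None:
            return tree
        S *= 2
    return None


def tree_size(tree):
    return 1 if tree == LEAF else 1 + tree_size(tree[1]) + tree_size(tree[2])


def cnf_oracle(clauses):
    """A node is infeasible iff the fixed variables falsify some clause."""
    def is_empty(fixing):
        return any(all(fixing.get(x) == 0 for x in pos)
                   and all(fixing.get(x) == 1 for x in neg)
                   for pos, neg in clauses)
    return is_empty


if __name__ == "__main__":
    # (x0 or x1), (not x0 or x1), (x0 or not x1), (not x0 or not x1)
    clauses = [({0, 1}, set()), ({1}, {0}), ({0}, {1}), (set(), {0, 1})]
    print(tree_size(find_tree(2, cnf_oracle(clauses))))
```

On this unsatisfiable formula with two variables, the search fails for \(S \in \{1,2,4\}\) and returns a tree with 7 nodes for \(S = 8\).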

We now observe that the treelike resolution refutations of an infeasible Boolean formula \(\varphi \) are the branch-and-bound trees using variable disjunctions of \(Q_\varphi \), defined in (2). Indeed, in both cases, a node is infeasible if and only if the already fixed variables falsify a clause.

Thus, we can replace “treelike resolution” by “branch-and-bound using variable disjunctions” in a result by Alekhnovich and Razborov [4], which has been strengthened by Eickmeyer et al. [16] to obtain: Branch-and-bound using variable disjunctions is not automatizable unless \(\text {FPT} = \text {W[P]} \). Here, the question whether \(\text {W[P]} \subseteq \text {FPT} \) may be seen as an analogue of the question whether \(\text {NP} \subseteq \text {P} \) in the world of parameterized problems.

Notably, the recent breakthrough of Atserias and Müller [6], showing non-automatizability of general resolution under the weaker and optimal assumption \(\text {P} \ne \text {NP} \), likely does not transfer to branch-and-bound using variable disjunctions (or treelike resolution). Their approach is to show that it is NP-hard to distinguish between instances with a refutation of size bounded by some polynomial and instances with refutations of at least exponential size. If this approach transferred, quasi-automatizability of branch-and-bound using variable disjunctions would cause the exponential time hypothesis to fail [4, 6].

However, we might be able to give automatizability results for interesting families of binary problems. For example, Proposition 14 yields:

Proposition 38

Branch-and-bound using variable disjunctions is automatizable when restricted to SubsetSum-instances.

We are currently unable to give a similar result for any other family of binary programs describing some combinatorial optimization problem. However, we consider this as evidence that automatizability might indeed be a useful notion formalizing our ability to make good branching decisions.

8 LP-based versus arbitrary trees using variable disjunctions

We show that there can be an exponential gap (in the number of nodes) between the smallest LP-based branch-and-bound tree using variable disjunctions and the smallest branch-and-bound tree using any variable disjunctions.

We note that Dey et al. [15] have independently shown a stronger statement, although in a less direct way: They show that for any ILP, there exists an extended formulation which exhibits this property.

The instance we will analyze consists of an even number m of copies of the example by Jeroslow \(J^m = \{x\,:\,\sum _{j=1}^{2\,m}2x_i^j=2\,m+1,\; x\in [0,1]^{2\,m}\}\), \(m \ge 2\), joined together by a set of inequalities which require at least one of the copies to be feasible. More precisely, consider the polytope P defined by the inequalities

$$\begin{aligned} \begin{array}{ll} \displaystyle \sum _{i=1}^m 2y_i + \sum _{i=1}^m z_i \ge (m+1), &{}\qquad \qquad \\ y_i +z_i \le 1 &{}\qquad \qquad \forall i \in [m],\\ \displaystyle \sum _{j=1}^{2m} 2x_i^j = y_i(2m+1) &{}\qquad \qquad \forall i\in [m],\\ \displaystyle x_i^j,y_i,z_i \in [0,1] &{}\qquad \qquad \forall i \in [m],\, j\in [2m]. \end{array} \end{aligned}$$

Lemma 39

There exists a branch-and-bound tree for P with at most \(m\cdot 2^{2m}\) leaves that possibly branches on variables with integral value in the LP-relaxation.

Proof

Let T be a smallest (i.e., any) branch-and-bound tree showing integer-freeness of \(J^m\) and let \(K = {{2\,m}\atopwithdelims ()m}+ {{2\,m}\atopwithdelims (){m+1}} \le 2^{2\,m}\) be the number of its leaves. Consider the branch-and-bound tree for P obtained by employing the following strategy:

  1. Branch on some \(y_i\). The \((y_i=1)\)-branch can be exhausted by branching on the \(x_i^j\) according to T. Repeat this procedure in the \((y_i=0)\)-branch for another index i.

  2. Once we have \(y_i=0\) for all \(i\in [m]\), we have proved infeasibility of the whole instance.

This yields a branch-and-bound tree for P with at most \(m\cdot 2^{2m}\) leaves. \(\square \)
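The leaf count of this strategy can be checked for small m. The sketch below is an illustration under stated assumptions: it counts the leaves of a tree for one copy of \(J^m\) by recursing on the number of variables fixed to 1 and to 0 (which is all that emptiness of a node depends on), and it assumes that the strategy above produces exactly \(m\cdot K + 1\) leaves, namely K per \((y_i=1)\)-branch plus the final node with all \(y_i = 0\). The function names are hypothetical.

```python
from functools import lru_cache
from math import comb


def jeroslow_leaves(m):
    """Leaves of a variable-branching tree for {sum_j 2 x_j = 2m + 1} in [0,1]^(2m)."""
    @lru_cache(maxsize=None)
    def leaves(ones, zeros):
        # The node is empty iff 2m + 1 is out of reach: too many ones,
        # or too few variables left that could still be set to 1.
        if 2 * ones > 2 * m + 1 or 2 * (2 * m - zeros) < 2 * m + 1:
            return 1
        return leaves(ones + 1, zeros) + leaves(ones, zeros + 1)

    return leaves(0, 0)


def lemma39_leaves(m):
    """Leaves of the strategy: m exhausted (y_i = 1)-branches plus one final leaf."""
    return m * jeroslow_leaves(m) + 1


if __name__ == "__main__":
    for m in range(2, 6):
        K = jeroslow_leaves(m)
        print(m, K, comb(2 * m, m) + comb(2 * m, m + 1),
              lemma39_leaves(m), m * 2 ** (2 * m))
```

For each m in the loop, the second and third printed columns agree, and the fourth column stays below the bound \(m\cdot 2^{2m}\) in the last column.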

Lemma 40

The number of leaves of any LP-based branch-and-bound tree for P is at least \(2^{m\cdot (m+1)}\).

Proof

Assume we fix some subset of the variables \(x_i^j\), \(y_i\), \(z_i\) to binary values and assume \(B_N \subseteq \mathbb {R}^{2m^2+2m}\) is the set of points satisfying these fixings. Let \(({{\tilde{x}}},{{\tilde{y}}},{{\tilde{z}}})\) denote an optimal solution of \(\max \,\{ \sum _{i=1}^m 2y_i + \sum _{i=1}^m z_i \,:\,(x,y,z) \in P \cap B_N\} {=}{:}\nu \). It is easy to see that if for some \(i \in [m]\) neither \(y_i\) nor \(z_i\) has been fixed and at most \(m-1\) of the \(x_i^j\) for \(j \in [2\,m]\) have been fixed, then we have \({{{\tilde{y}}}}_i = 1\) and \({\tilde{z}}_i=0\). Thus, in any LP-based tree, we only branch on \(y_i\) or \(z_i\) if at least m of the variables in \(\{x_i^j\,:\,j\in [2\,m]\}\) have been previously branched on. Furthermore, along every root-leaf path, we branch on only one of \(z_i\) and \(y_i\) for every i in a smallest branch-and-bound tree. Moreover, we have \(\nu < m+1\) (i.e., \(P\cap B_N = \emptyset \)) only if, for each \(i\in [m]\), either \(y_i\) or \(z_i\) has been fixed. Thus every leaf in an LP-based branch-and-bound tree using variable disjunctions for P has depth at least \(m\cdot (m+1)\), and hence such a tree has at least \(2^{m\cdot (m+1)}\) leaves. \(\square \)

Let \(n = 2m^2+2m = 2m(m+1)\) denote the number of variables in P. The ratio between the number of nodes of a smallest LP-based branch-and-bound tree and a smallest branch-and-bound tree is at least \(\frac{2^{m^2+m}}{m 2^{2m}} = 2^{\frac{n}{2} - \Omega (\sqrt{n})}\). If we allow integer programs with an exponential number of constraints, we can again replace \(J^n\) by \(P^n\) (see Remark 28) and improve this bound to \(2^{n - \Omega (\sqrt{n})}\).
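To make the exponent explicit, the following short computation rewrites the quotient of the two leaf bounds from Lemmas 39 and 40 in terms of n. It only uses \(n/2 = m^2+m\), and the final estimate uses \(\log _2 m \le m \le \sqrt{n/2}\).

$$\begin{aligned} \frac{2^{m^2+m}}{m\cdot 2^{2m}} = 2^{m^2+m-2m-\log _2 m} = 2^{\frac{n}{2} - (2m + \log _2 m)}, \qquad 2m+\log _2 m \le 3m \le 3\sqrt{n/2}. \end{aligned}$$

For example, \(m = 4\) gives \(n = 40\) and a quotient of \(2^{20}/(4\cdot 2^{8}) = 2^{10}\).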

9 Outlook

Firstly, we currently know \(P^{\#SAT} \subseteq P^{{\mathcal {T}}} \subseteq \text {PSPACE} \). It remains to determine the exact complexity of \({{\mathcal {T}}} \). Secondly, we are interested in giving a criterion on the cases in which hiding sets give good bounds on the size of a smallest branch-and-bound tree. Thirdly, we ask whether there are more classes of polytopes for which branch-and-bound is automatizable. This may yield branching rules which are either heuristically or provably ‘good’ for certain classes of problems.