41 Counterexamples to property (B) of the discrete time bomber problem

The discrete time “bomber problem” has been one of the longest standing open problems in operations research. In particular, the validity of one of the natural monotonicity conjectures—known as property (B)—has been an unresolved issue since 1968. In this paper we report 41 counterexamples to property (B) of this problem. We have found them by computing the exact solutions for nearly one million pairs of parameter values utilizing the GNU multiple precision arithmetic library. All our counterexamples can readily be verified using a simple Mathematica program included in this paper.


Introduction
On Professor Richard Weber's home page, the discrete time bomber problem appears at the top of his list of unsolved problems in operations research. In this problem, a bomber with n ∈ N anti-aircraft missiles must survive t ∈ N hours before reaching its destination. In each hour, it encounters an enemy plane with probability r. The bomber survives for sure if it encounters no enemy plane. In the event of encountering an enemy plane, it survives with probability 1 − q^k if it fires k missiles at the enemy plane. The objective is to maximize the probability of reaching the destination.
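This problem can easily be solved numerically by dynamic programming, or backward induction. For this purpose, let N ∈ N and T ∈ N be the largest numbers of missiles n and hours t to be considered. Define
$$p(n, 0) = 1, \quad \forall n \in \{0, \ldots, N\}. \tag{1}$$
Let p(n, t) be the optimal survival probability when the bomber has n missiles with t hours to go. Then for all n = 0, …, N and t = 1, …, T, p(n, t) satisfies
$$p(n, t) = (1 - r)\, p(n, t-1) + r\, v(n, t), \tag{2}$$
where
$$v(n, t) = \max_{k \in \{0, \ldots, n\}} (1 - q^k)\, p(n-k, t-1). \tag{3}$$
Let k(n, t) denote the optimal number of missiles to fire when an enemy plane is encountered with n missiles and t hours to go, that is, a maximizer in (3):
$$k(n, t) \in \operatorname*{arg\,max}_{k \in \{0, \ldots, n\}} (1 - q^k)\, p(n-k, t-1). \tag{4}$$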
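The backward induction described above can be sketched in Python as follows. This is only an illustration, not the C implementation used in this paper. It assumes the recursion p(n, t) = (1 − r) p(n, t − 1) + r v(n, t) implied by the problem description, uses the standard `fractions.Fraction` type for exact rational arithmetic, and breaks ties in favor of the smallest maximizer. The function name `solve_bomber` and the tie-breaking rule are assumptions made for this sketch.

```python
from fractions import Fraction


def solve_bomber(q, r, N, T):
    """Return tables (p, k) indexed as p[n][t] and k[n][t].

    p[n][t] is the optimal survival probability with n missiles and t hours
    to go; k[n][t] is the smallest maximizer of (1 - q**j) * p[n - j][t - 1]
    (the tie-breaking rule is an assumption of this sketch).
    """
    q, r = Fraction(q), Fraction(r)
    p = [[Fraction(1)] * (T + 1) for _ in range(N + 1)]  # p(n, 0) = 1
    k = [[0] * (T + 1) for _ in range(N + 1)]            # column t = 0 is unused
    for t in range(1, T + 1):
        for n in range(N + 1):
            best_val, best_k = Fraction(-1), 0
            for j in range(n + 1):
                val = (1 - q**j) * p[n - j][t - 1]
                if val > best_val:  # strict inequality keeps the smallest maximizer
                    best_val, best_k = val, j
            k[n][t] = best_k
            p[n][t] = (1 - r) * p[n][t - 1] + r * best_val
    return p, k


if __name__ == "__main__":
    p, k = solve_bomber(Fraction(1, 2), Fraction(3, 4), N=5, T=3)
    print(k[5][1], p[0][3])  # prints: 5 1/64
```

With no missiles the bomber survives t hours only if it never meets an enemy plane, so p(0, t) = (1 − r)^t, which gives the 1/64 printed above for r = 3/4 and t = 3.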
The following three monotonicity properties have been extensively studied in the literature:
(A) k(n, t) is nonincreasing in t.
(B) k(n, t) is nondecreasing in n.
(C) n − k(n, t) is nondecreasing in n.
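Given a policy table, the three properties can be tested mechanically. The sketch below is a hypothetical helper (the name `violations` and the table layout k[n][t] are assumptions) that lists the (n, t) pairs at which each property fails. The small table in the usage example is made-up toy data chosen to exercise the check, not a solution of the bomber problem.

```python
def violations(k, N, T):
    """Collect the (n, t) pairs at which properties (A), (B), (C) fail.

    k[n][t] is a policy table for n = 0..N and t = 0..T; column t = 0 is unused.
    """
    out = {"A": [], "B": [], "C": []}
    for t in range(1, T + 1):
        for n in range(N + 1):
            # (A): k(n, t) is nonincreasing in t, so k(n, t + 1) <= k(n, t)
            if t < T and k[n][t + 1] > k[n][t]:
                out["A"].append((n, t + 1))
            if n >= 1:
                # (B): k(n, t) is nondecreasing in n
                if k[n][t] < k[n - 1][t]:
                    out["B"].append((n, t))
                # (C): n - k(n, t) is nondecreasing in n
                if n - k[n][t] < (n - 1) - k[n - 1][t]:
                    out["C"].append((n, t))
    return out


if __name__ == "__main__":
    # Toy table with N = 3, T = 2 (made-up numbers): rows are n, columns are t.
    toy = [[0, 0, 0],
           [0, 1, 1],
           [0, 2, 1],
           [0, 3, 0]]
    print(violations(toy, N=3, T=2))
```

In the toy table, k(3, 2) = 0 is smaller than k(2, 2) = 1, so the only reported failure is a violation of (B) at (n, t) = (3, 2).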
The above problem was originally formulated in continuous time by Klinger and Brown (1968), who proved property (C) for the original continuous time model. They proved (A) assuming (B), and left (B) as an unsolved problem:
"It seems intuitively obvious that k(n, t) ≥ k(n − 1, t); that is, with a larger supply one is always willing to make at least as generous an allocation. The extensive tables we computed have confirmed this conjecture. However, determined efforts by a number of people at RAND have failed to yield a rigorous proof that this is indeed the case." (Klinger and Brown 1968, p. 182, instead of k in the original)
Subsequently, Samuel (1970) proved (A) without assuming (B), but "found no proof of (B)." Simons and Yao (1990) formulated the problem in discrete time, proving (A) and (C) for the discrete time case (Simons and Yao 1990, Lemma 1, Corollary 1). They noted that a proof of (B) was "elusive," but their numerical work supported the validity of (B):
"Already, we have numerically 'verified' Conjecture B for tens of thousands of randomly generated pairs (q, r). Mostly, these were checked for t ≤ 12 and n ≤ 20, but some larger values of t and n were checked when q is not too small. The truth of Conjecture B was always supported, except for a very few instances when unavoidable difficulties with round-off errors were clearly indicated, because of an extreme value of q or r."
Weber (2013, p. 199) also noted that no counterexample to (B) had been found "despite a truly enormous amount of computational experimentation." He summarized the status of (B) as follows:
"Open problem for the bomber. Despite 40 years of research, it is still not known if (B) is true for the bomber problem. So far as I know, the best we can say about (B) is that k(n+1, t) ≥ k(n, t) if either n ≤ 3 or t ≤ 3, and also that k(n, t) = 1 ⇒ k(n−1, t) = 1 for all t." (Weber (2013, p. 192), italics in the original)
In this paper we close this open problem by reporting 41 counterexamples to (B).
In the next section, we briefly discuss "unavoidable difficulties with round-off errors" associated with floating point numbers. In Sect. 3 we introduce an error-free algorithm consisting only of integer addition, subtraction, multiplication, and comparison. Implementing this algorithm in C with the GNU Multiple Precision (GMP) Arithmetic Library to solve the problem for all q, r ∈ {0.001, 0.002, …, 0.999}, we have found 41 counterexamples to property (B). We have also obtained identical results by solving the problem with rational numbers for the same set of (q, r) values using the GMP library. All our counterexamples can readily be verified using a simple Mathematica program provided in this paper. In Sect. 4 we discuss the robustness of our examples.
In closing the introduction, we should mention that various problems related to the bomber problem are still actively studied (e.g., Bartroff et al. 2010; Bartroff and Samuel-Cahn 2011; Elguedria et al. 2013; Krieger and Samuel-Cahn 2013). We refer the reader to Weber (2013) for an excellent survey of the literature surrounding the bomber problem.

Difficulties
Algorithm 1 shows pseudocode for the dynamic programming procedure specified by (1)-(4). Throughout the paper we fix N and T as follows:
$$N = T = 100. \tag{5}$$
We have implemented Algorithm 1 in C with 64 bit "long double" precision for all
$$q, r \in \{0.01, 0.02, \ldots, 0.99\}. \tag{6}$$
In the solutions obtained, there are many numerical violations of properties (A), (B), and (C) even though (A) and (C) are known to be true. More specifically, there are 25,802 quadruples (q, r, n, t) violating (A), 29,584 quadruples violating (B), and 2,381 quadruples violating (C). These numbers are unstable, depending on the system and software used to implement the algorithm.
In fact, even a very elementary property of k(n, t) is violated in Fig. 1. To see this, note from (1) and (3) that
$$k(n, 1) = n, \quad \forall n \in \{0, \ldots, N\}. \tag{7}$$
This simply means that if the bomber encounters an enemy plane in the last hour, it should fire all the available missiles. This obvious property is clearly violated in panel (a). In this example, we have max_{n,t∈{1,…,100}} k(n, t) = 9 even though (7) requires that k(100, 1) = 100. This is because 1 − 0.01^k is rounded to 1 for all k ≥ 9 in C with long double precision, which implies that the strict inequality in line 2 of Algorithm 1 is never satisfied for any k > 9.
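The rounding effect described above is easy to reproduce. The sketch below uses Python's built-in `float`, which is IEEE 754 double precision on common platforms and therefore not necessarily the same format as the C type used in our computations. The function names are hypothetical. The second function mimics a strict-inequality maximization for the last hour (t = 1), where p(n, 0) = 1.

```python
from fractions import Fraction


def first_k_rounded_to_one(q):
    """Smallest k >= 1 for which the float expression 1 - q**k equals exactly 1.0."""
    k = 1
    while 1.0 - q**k != 1.0:
        k += 1
    return k


def last_hour_choice(q, n):
    """Smallest float maximizer of 1 - q**j over j = 0..n (the case t = 1)."""
    best_val, best_k = -1.0, 0
    for j in range(n + 1):
        val = 1.0 - q**j
        if val > best_val:  # never true once val has been rounded to 1.0
            best_val, best_k = val, j
    return best_k


if __name__ == "__main__":
    print(first_k_rounded_to_one(0.01))    # 9 with IEEE 754 doubles
    print(last_hour_choice(0.01, 100))     # 9 with IEEE 754 doubles, not 100
    print(1 - Fraction(1, 100) ** 9 < 1)   # True: the exact value is below 1
```

The last line shows that the exact rational value of 1 − 0.01^9 is strictly less than 1, so the plateau seen by the floating point maximization is purely an artifact of rounding.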

Error-free methods
Numerical errors are unavoidable as long as floating point numbers are used. However, there are several ways to implement Algorithm 1 without introducing numerical errors. For example, it is possible to compute k(n, t) by using only integers, provided that both q and r are rational numbers. To be more specific, suppose that there are integers Q, R, B ∈ N such that
$$q = Q/B, \qquad r = R/B. \tag{8}$$
As in the original problem, define
$$P(n, 0) = 1, \quad \forall n \in \{0, \ldots, N\}. \tag{9}$$
For n ∈ {0, …, N} and t ∈ N, define P(n, t) recursively as follows:
$$P(n, t) = B^N (B - R)\, P(n, t-1) + R\, V(n, t), \tag{10}$$
where
$$V(n, t) = \max_{k \in \{0, \ldots, n\}} \left(B^N - Q^k B^{N-k}\right) P(n-k, t-1). \tag{11}$$
Equation (10) can be obtained by multiplying both sides of (2) by B^{(N+1)t}. Thus P(n, t) and V(n, t) satisfy
$$P(n, t) = B^{(N+1)t}\, p(n, t), \tag{12}$$
$$V(n, t) = B^{(N+1)t - 1}\, v(n, t). \tag{13}$$
Hence the solution of (11) is identical to that of (3). A useful feature of this equivalent formulation is that as long as P(n, t − 1) is an integer for each n = 0, …, N, so is P(n, t).
Algorithm 2: An error-free algorithm for the bomber problem
Algorithm 2 shows pseudocode for the procedure given by (9)-(11). All variables remain integers throughout the algorithm; the problem is that they can be extremely large. Fortunately, arbitrarily large integers can be handled using the GMP library. Figure 2 shows the exact optimal policy computed by implementing Algorithm 2 in C with this library. This policy corresponds to that in Fig. 1. In sharp contrast to Fig. 1, panel (b) in Fig. 2 shows that both (A) and (B) are clearly satisfied; panel (a) shows that (7) is also satisfied.
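The integer-only idea can be illustrated in Python, whose built-in `int` type is already arbitrary precision. The sketch below is not our C/GMP implementation. It assumes q = Q/B and r = R/B, and it assumes one particular scaling consistent with multiplying the recursion for p(n, t) by B^{(N+1)t}, namely P(n, t) = B^N (B − R) P(n, t − 1) + R V(n, t) with V(n, t) = max_k (B^N − Q^k B^{N−k}) P(n − k, t − 1). The function names are hypothetical, and the rational solver is included only as a reference for comparison.

```python
from fractions import Fraction


def solve_integer(Q, R, B, N, T):
    """Integer-only recursion (assumed scaling: P(n, t) = B**((N+1)*t) * p(n, t))."""
    P = [[1] * (T + 1) for _ in range(N + 1)]  # P(n, 0) = 1
    k = [[0] * (T + 1) for _ in range(N + 1)]
    for t in range(1, T + 1):
        for n in range(N + 1):
            best_val, best_k = -1, 0
            for j in range(n + 1):
                val = (B**N - Q**j * B**(N - j)) * P[n - j][t - 1]
                if val > best_val:  # smallest maximizer (tie-breaking is an assumption)
                    best_val, best_k = val, j
            k[n][t] = best_k
            P[n][t] = B**N * (B - R) * P[n][t - 1] + R * best_val
    return P, k


def solve_rational(q, r, N, T):
    """Reference solver with exact fractions, used only to cross-check the policy."""
    p = [[Fraction(1)] * (T + 1) for _ in range(N + 1)]
    k = [[0] * (T + 1) for _ in range(N + 1)]
    for t in range(1, T + 1):
        for n in range(N + 1):
            best_val, best_k = Fraction(-1), 0
            for j in range(n + 1):
                val = (1 - q**j) * p[n - j][t - 1]
                if val > best_val:
                    best_val, best_k = val, j
            k[n][t] = best_k
            p[n][t] = (1 - r) * p[n][t - 1] + r * best_val
    return p, k


if __name__ == "__main__":
    Q, R, B, N, T = 3, 7, 10, 6, 5  # arbitrary small illustration, not a counterexample
    P, k_int = solve_integer(Q, R, B, N, T)
    p, k_rat = solve_rational(Fraction(Q, B), Fraction(R, B), N, T)
    print(k_int == k_rat)
    print(all(P[n][t] == B**((N + 1) * t) * p[n][t]
              for n in range(N + 1) for t in range(T + 1)))
```

Both checks compare exact quantities, so under the stated scaling the two policies coincide and P(n, t) equals B^{(N+1)t} p(n, t) for every n and t. The integers grow quickly with t, which is why an arbitrary precision type is needed.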
To investigate the validity of (B), we have implemented Algorithm 2 in the same way on a grid finer than (6), namely for all
$$q, r \in \{0.001, 0.002, \ldots, 0.999\}. \tag{14}$$
The violations of (B) found are listed in Table 1, which shows all the quadruples (Q, R, n, t) for which k(n, t) < k(n − 1, t). These k values are also reported in the table. Figure 3 shows combinations of parameter values for which (B) is violated. Note from this figure and Table 1 that all the (q, r) pairs lie in the region [0.4, 1] × [0.8, 1], and that the values of t are restricted to 4, 5, and 6. Since the smallest values of n, t, and k(n, t) in Table 1 are 31, 6, and 12, respectively, our counterexamples are consistent with Weber's (2013) results quoted in the introduction. Observe that for each (Q, R) in Table 1, there is exactly one violation of (B). Hence a violation of (B) is an exception even for the (Q, R) pairs in the table, for each of which there are 100² − 1 pairs of (n, t) values satisfying (B). It is worth noting that there are only 41 violations of (B) out of 999² × 100² quadruples of (Q, R, n, t) values. It took approximately 33 h to test all the quadruples against (A), (B), and (C) using Algorithm 2 on a dedicated Linux workstation with dual Intel Xeon E5-2699v3 2.30 GHz CPUs (72 threads in total).
Since the GMP library can handle arbitrarily large rational numbers, as well as integers, without numerical errors, we have also implemented Algorithm 1 in C with this library for all (q, r) given by (14). The results were identical to those obtained from Algorithm 2. It took approximately 15 h to test all the quadruples (Q, R, n, t) against (A), (B), and (C). Hence, at least in our case, it is considerably more efficient to let the GMP library directly handle rational numbers than to transform the problem so that all variables remain integers.
We have computed the optimal policies for all (Q, R) in Table 1 using the above two methods, which generated identical results. As an example, Table 2 shows the optimal policy k(n, t) for Example #19, which has the smallest value of n in Table 1. One can see that (B) is indeed violated at (n, t) = (31, 6).
So far we have discussed only our error-free C implementations of Algorithms 1 and 2. However, there are other ways to implement Algorithm 1 without numerical errors. An example is given by Algorithm 3, which shows a simple Mathematica program that generates the optimal policy in Table 2; the program is essentially identical to Algorithm 1. This Mathematica program is sufficiently efficient for verifying a relatively small number of examples. Using a modified version of this program, we have verified the optimal policies corresponding to all (Q, R) reported in Table 1. We have further confirmed all the optimal policies using Python as well. Thus for each (Q, R) in Table 1, we have cross-checked the optimal policy using the four different methods.