Maximum Box Problem on Stochastic Points

Given a finite set of weighted points in R^d (where weights may be negative), the maximum box problem asks for an axis-aligned rectangle (i.e., box) such that the sum of the weights of the points that it contains is maximized. We consider that each point of the input has a probability of being present in the final random point set, and that these events are mutually independent; the total weight of a maximum box is then a random variable. We aim to compute both the probability that this variable is at least a given parameter, and its expectation. We show that even for d = 1 these computations are #P-hard, and give pseudo-polynomial time algorithms in the case where the weights are integers in a bounded interval.
For d = 2, we consider that each point is colored red or blue, where red points have weight +1 and blue points weight −∞. The random variable is the maximum number of red points that can be covered with a box not containing any blue point. We prove that the above two computations are also #P-hard, and give a polynomial-time algorithm for computing the probability that there is a box containing exactly two red points, no blue point, and a given point of the plane.


Introduction
The maximum box problem receives as input a finite point set in R^d, where each point is associated with a positive or negative weight, and outputs an axis-aligned rectangle (i.e., box) such that the sum of the weights of the points that it contains is maximized [3]. We consider this problem in a recent uncertainty model in which each element of the input is assigned a probability. In particular, each point is assigned its own independent probability of being present in the final (hence random) point set. Then, one can ask the following questions: What is the probability that for the final point set there exists a box that covers a weight sum greater than or equal to a given parameter? What is the expectation of the maximum weight sum that can be covered with a box? Uncertainty models arise from real scenarios in which large amounts of data, arriving from many sources, have inherent uncertainty. In computational geometry, we can find several recent works on uncertain point sets, such as: the expected total length of the minimum Euclidean spanning tree [5]; the probability that the distance between the closest pair of points is at least a given parameter [11]; the computation of the most-likely convex hull [16]; the probability that the area or perimeter of the convex hull is at least a given parameter [15]; the smallest enclosing ball [9]; the probability that a 2-colored point set is linearly separable [10]; the area of the minimum enclosing rectangle [17]; and Klee's measure of random rectangles [20]. We deal with the maximum box problem in the above-mentioned random model. The maximum box problem is a geometric combinatorial optimization problem, different from most of the problems considered in this random setting, which compute some measure or structure of the extent of the geometric objects.
For d = 1, the maximum box problem asks for an interval of the line. If the points are uncertain as described above, then it is equivalent to consider as input a sequence of random numbers, where each number has two possible outcomes: zero if the number is not present, and the actual value of the number otherwise. The output is the subsequence of consecutive numbers with maximum sum. We consider the simpler case when the subsequence is a partial sum, that is, it contains the first (or leftmost) number of the sequence. More formally: we say that a random variable X is zero-value if X = v with probability ρ, and X = 0 with probability 1 − ρ, for an integer v = v(X) ≠ 0 and a probability ρ. We refer to v as the value of X and to ρ as the probability of X. In any sequence of zero-value variables, all variables are assumed to be mutually independent. Let X = X_1, X_2, . . . , X_n be a sequence of n mutually independent zero-value variables, whose values are a_1, a_2, . . . , a_n, respectively. We study the random variable S(X) = max{0, X_1, X_1 + X_2, . . . , X_1 + · · · + X_n}, which is the maximum partial sum of the random sequence X. The fact that E[max{X, Y}] is not necessarily max{E[X], E[Y]}, even if X and Y are independent random variables, makes the computation of the expectation E[S(X)] hard.
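As a concrete reference point, the maximum partial sum S(X) and its expectation can be computed by brute force for small n. A Python sketch (function names are ours; `exact_expectation` enumerates all 2^n presence outcomes and is meant only to illustrate the definitions, not to be efficient):

```python
from itertools import product

def max_partial_sum(xs):
    """S = max{0, x_1, x_1 + x_2, ..., x_1 + ... + x_n}: the maximum partial sum."""
    best = s = 0
    for x in xs:
        s += x
        best = max(best, s)
    return best

def exact_expectation(values, probs):
    """Brute-force E[S(X)]: enumerate all 2^n presence outcomes,
    where each variable takes its value with its probability and 0 otherwise."""
    exp = 0.0
    for mask in product([0, 1], repeat=len(values)):
        p = 1.0
        for present, rho in zip(mask, probs):
            p *= rho if present else 1.0 - rho
        exp += p * max_partial_sum(v if m else 0 for v, m in zip(values, mask))
    return exp
```

For instance, `max_partial_sum([1, -2, 3])` is 2, and with all probabilities equal to 1 the expectation coincides with this deterministic value.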
Kleinberg et al. [13] proved that the problem of computing Pr[X_1 + · · · + X_n > 1] is #P-complete in the case where the values of the variables X_1, X_2, . . . , X_n are all positive. The proof can be straightforwardly adapted to show that computing Pr[X_1 + · · · + X_n > z] is also #P-complete, where the values of X_1, X_2, . . . , X_n are all positive, for any fixed z > 0. This last fact implies that computing Pr[S(X) ≥ z] for any fixed z ≥ 1 is #P-hard. We show hardness results when the probabilities of X_1, X_2, . . . , X_n are all equal and their values are not necessarily positive. Namely, we prove (Sect. 2.1) that computing Pr[S(X) ≥ z] for any fixed z ≥ 1, and computing the expectation E[S(X)], are both #P-hard problems, even if all variables of X have the same less-than-one probability. When a_1, a_2, . . . , a_n ∈ [−a..b] for bounded a, b ∈ N, we show (Sect. 2.2) that both Pr[S(X) ≥ z] and E[S(X)] can be computed in time polynomial in n, a, and b. For two integers u < v, we use [u..v] to denote the set {u, u + 1, . . . , v}.
For d = 2, we consider the maximum box problem in the context of red and blue points, where red points have weight +1 and blue points weight −∞. Let R and B be disjoint finite point sets in the plane with a total of n points, where the elements of R are colored red and the elements of B are colored blue. The maximum box problem asks for a box H such that |H ∩ R| is maximized subject to H ∩ B = ∅. This problem has been well studied, with algorithms whose running times go from O(n^2 log n) [6] and O(n^2) [3] down to O(n log^3 n) [2]. Let S ⊆ R ∪ B be the random point set where every point p ∈ R ∪ B is included in S independently at random with probability π(p) ∈ [0, 1]. Let box(S) denote the random variable equal to the maximum number of red points in S that can be covered with a box not covering any blue point of S.
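For intuition, box(S) can be evaluated on a fixed outcome S by brute force, trying all candidate boxes whose sides pass through red coordinates. A Python sketch for small inputs (names are ours):

```python
def max_box(red, blue):
    """Value of box(.) for a fixed outcome: the maximum number of red points
    in an axis-aligned box containing no blue point.  Brute force over boxes
    whose sides pass through red coordinates."""
    xs = sorted({x for x, _ in red})
    ys = sorted({y for _, y in red})
    best = 0
    for x1 in xs:
        for x2 in (x for x in xs if x >= x1):
            for y1 in ys:
                for y2 in (y for y in ys if y >= y1):
                    def inside(p):
                        return x1 <= p[0] <= x2 and y1 <= p[1] <= y2
                    if any(inside(b) for b in blue):
                        continue  # box covers a blue point: infeasible
                    best = max(best, sum(1 for r in red if inside(r)))
    return best
```

For example, with red points (0, 0) and (2, 2) and a blue point at (1, 1), the best feasible box covers one red point.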
We prove (Sect. 3.1) that computing the probability Pr[box(S) ≥ k] for any given k ≥ 2, and computing the expectation E[box(S)], are both #P-hard problems. We further show (Sect. 3.2) that, given a point o of the plane, the probability that there exists a box containing exactly two red points of S, no blue point of S, and the point o can be computed in polynomial time. If we remove the restriction of containing o, the problem is again #P-hard; this is a direct consequence of the previous #P-hardness proofs.
In all running time upper bounds in this paper, in both algorithms and reductions, we assume a real RAM model of computation where each arithmetic operation on large-precision numbers takes constant time. Otherwise, the running times should be multiplied by a factor proportional to the bit complexity of the numbers, which is polynomial in n and the bit complexity of the input probability values [5,11].
Let J = { j ∈ [1..n] : X_j ≠ 0 } and, for any s, let N_s be the corresponding count of outcomes. Then, #SubsetSum asks for N_t − N_{t+1}. Call A(X) to compute Pr[S(X) ≥ z]; from this probability, N_t can be computed in polynomial time. Consider now the random sequence X′ = X_0, X_1, X_2, . . . , X_n, where X_0 has value −km − (t + 1) + z. Using arguments similar to those above, by calling A(X′) to compute Pr[S(X′) ≥ z], we can compute N_{t+1} in polynomial time from this probability. Then, N_t − N_{t+1} can be computed in polynomial time, plus the time of calling the oracle A twice. This implies the theorem.
Proof Let X = X_1, X_2, . . . , X_n be a sequence of zero-value random variables, each with probability ρ, and consider the sequence X′ = X_0, X_1, . . . , X_n, where X_0 is a zero-value random variable with value −1 and probability ρ. Let w be the sum of the positive values among the values of X_1, . . . , X_n. A direct computation then expresses Pr[S(X) ≥ 1] in terms of E[S(X)], E[S(X′)], w, and ρ. Since computing Pr[S(X) ≥ 1] is #P-hard (Theorem 1), computing E[S(X)] is also #P-hard via a Turing reduction.

Pseudo-Polynomial Time Algorithms
Let X = X_1, X_2, . . . , X_n be a sequence of n zero-value random variables, with values a_1, a_2, . . . , a_n ∈ [−a..b] ⊂ Z and probabilities ρ_1, ρ_2, . . . , ρ_n, respectively, for some a, b ∈ N. We show that both Pr[S(X) ≥ z] and E[S(X)] can be computed in time polynomial in n, a, and b. For every t ∈ [1..n], let S_t = X_1 + · · · + X_t, let M_t = max{0, X_1, X_1 + X_2, . . . , X_1 + · · · + X_t}, and let L_t denote the table of joint probabilities Pr[M_t = k, S_t = s] over all feasible pairs (k, s). Note that L_1 can be trivially computed. Using the dynamic programming algorithm design paradigm, we next show how to compute the values of L_t, t ≥ 2, assuming that all values of L_{t−1} have been computed, by conditioning on whether X_t = a_t or X_t = 0. When k = s, we have for a_t < 0 that Pr[M_t = k, S_t = s | X_t = a_t] = 0, since this event indicates that S_t = X_1 + · · · + X_t is a maximum partial sum of X_1, . . . , X_t, but this cannot happen because any maximum partial sum ends in a positive element. When k > s, M_t does not count the element a_t, hence M_{t−1} = M_t. Modeling each set L_t as a 2-dimensional table (or array), each value of L_t can be computed in time proportional to the number of transitions considered, and hence all values of L_t can be computed in time polynomial in n, a, and b. As a consequence, we get the following result.
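One way to realize this dynamic program is to carry, for each prefix, the joint distribution of the running maximum and the running sum; with integer values in [−a..b] there are only pseudo-polynomially many reachable pairs. A Python sketch under these assumptions (the dictionaries play the role of the tables L_t; names are ours):

```python
def prefix_max_distribution(values, probs):
    """Joint distribution of (M_n, S_n), where S_t is the running sum of the
    present values and M_t = max{0, S_1, ..., S_t}.  Returns {(M, S): prob}."""
    dist = {(0, 0): 1.0}  # empty prefix: M_0 = S_0 = 0
    for v, rho in zip(values, probs):
        nxt = {}
        for (m, s), p in dist.items():
            for delta, q in ((v, rho), (0, 1.0 - rho)):  # X_t = a_t or X_t = 0
                s2 = s + delta
                key = (max(m, s2), s2)
                nxt[key] = nxt.get(key, 0.0) + p * q
        dist = nxt
    return dist

def prob_at_least(values, probs, z):
    """Pr[S(X) >= z]."""
    return sum(p for (m, _), p in prefix_max_distribution(values, probs).items() if m >= z)

def expectation(values, probs):
    """E[S(X)]."""
    return sum(m * p for (m, _), p in prefix_max_distribution(values, probs).items())
```

With values in [−a..b], each table has O(nb · n(a + b)) entries, matching the pseudo-polynomial bound.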

Hardness
Given a graph G = (V, E), a subset V′ ⊆ V is an independent set of G if no pair of vertices of V′ defines an edge in E. Let N(G) denote the number of independent sets of G. The problem #IndSet of counting the number of independent sets in a graph is #P-complete, even if the graph is planar, bipartite, and with maximum degree 4 [18]. We show in what follows a one-to-many Turing reduction from #IndSet to the problem of computing Pr[box(S) ≥ k], for any given k ≥ 2. The proof uses techniques similar to those of Kamousi et al. [11], who proved that counting vertex covers in weighted unit-disk graphs is #P-hard, and of Vadhan [18], who proved that counting weighted matchings in planar bipartite graphs is #P-hard. For any subset V′ ⊆ V and any edge e = {u, v} ∈ E, we say that V′ 1-covers e if exactly one of u and v belongs to V′, and that V′ 2-covers e if both u and v belong to V′. Let C_{i,j} denote the number of subsets of V that 1-cover exactly i edges and 2-cover exactly j edges. For s ≥ 1, let G_s denote the graph obtained from G by adding s intermediate vertices to each edge. The next lemma relates the number N(G_s) of independent sets of G_s to the values C_{i,j} in G.

Lemma 4 We have N(G_s) = Σ_{i,j} C_{i,j} · f_{s+2}^{m−i−j} · f_{s+1}^{i} · f_{s}^{j}, where m = |E| and f_ℓ denotes the ℓ-th Fibonacci number.
Proof A subset V′ ⊆ V is not necessarily an independent set of G, because it may 2-cover some edges. Let V′ ⊆ V be any subset of V that 1-covers i edges and 2-covers j edges. For any edge e ∈ E, let p_e denote the path induced by the s vertices added to e when constructing G_s from G. An independent set of G_s inducing V′ can be obtained by starting with V′ and adding vertices in the following way. For every edge e = {u, v} ∈ E: (1) if V′ neither 1-covers nor 2-covers e, then add any independent set of p_e.
(2) if V′ 1-covers e, say u ∈ V′, then add any independent set of p_e not containing the extreme vertex of p_e adjacent to u in G_s. (3) if V′ 2-covers e, then add any independent set of p_e with no extreme vertex.
It is well known that the number of independent sets of a path of length ℓ is exactly f_{ℓ+3}, where f_ℓ denotes the ℓ-th Fibonacci number [18]. Since p_e has length s − 1 for every e, the numbers of choices for cases (1), (2), and (3) are f_{s+2}, f_{s+1}, and f_s, respectively. Therefore, the number of independent sets of G_s inducing a subset of V that 1-covers i edges and 2-covers j edges is precisely f_{s+2}^{m−i−j} · f_{s+1}^{i} · f_{s}^{j}, which completes the proof.
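The Fibonacci count used above is easy to check by brute force. In the sketch below (Python, with the convention f_1 = f_2 = 1), a path of length ℓ has ℓ + 1 vertices and f_{ℓ+3} independent sets:

```python
def fib(n):
    """Fibonacci numbers with f_1 = f_2 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def count_path_independent_sets(num_vertices):
    """Independent sets (the empty set included) of a path, by brute force:
    a bitmask is independent iff no two adjacent vertices are both chosen."""
    return sum(1 for mask in range(1 << num_vertices) if not (mask & (mask >> 1)))
```

For ℓ = 2 (three vertices) there are f_5 = 5 independent sets: the empty set, the three singletons, and the pair of endpoints.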
This yields a polynomial P(x) of degree m whose coefficients a_0, a_1, . . . , a_m are linear combinations of the terms C_{i,j}. By Lemma 4, and using the known values of b_s and α_s for every s ∈ T, we have m + 1 evaluations of P(x) of the form b_s = P(α_s), each corresponding to the linear equation b_s = a_0 + a_1 α_s + a_2 α_s^2 + · · · + a_m α_s^m with the coefficients a_0, a_1, . . . , a_m as variables. The main matrix of this system of m + 1 linear equations is a Vandermonde matrix with parameters α_s, s ∈ T. All α_s are distinct (refer to [18] or Appendix A for completeness), so the determinant of the main matrix is non-zero, and the system has a unique solution a_0, a_1, . . . , a_m, which can be computed in time polynomial in n. Finally, observe that for j = 0 the coefficients of the polynomial sum up to zero; indeed, it suffices to note that P(1) = 0. Hence, we obtain N(G), which shows that N(G) can be computed in time polynomial in n.
In polynomial time, the graph G = (V, E) can be embedded in the plane using O(n^2) area in such a way that its vertices are at integer coordinates, and its edges are drawn as polylines made up of horizontal and vertical segments lying on lines of the form x = i or y = j, for integers i and j [19] (see Fig. 2a). Let h = O(n) be the maximum number of bends of the polylines corresponding to the edges.
For s = h, h + 1, . . . , h + m, we embed the graph G_s in the following way. We embed the graph G as above; scale the embedding by a factor of 2(s + 1); and for each edge of G, add the s intermediate vertices to the polyline of the edge so that they have even integer coordinates and cover all bends of the polyline (see Fig. 2b). Then, each edge of G_s is represented in the embedding by a vertical or horizontal segment. Let the point set R_0 = R_0(s) ⊂ Z^2 denote the vertex set of the embedding, and color these points red. By translation if necessary, we can assume R_0 ⊂ [0..N]^2 for some N = O(n^2). Let B_0 = B_0(s) be the following set of blue points: for each horizontal or vertical line through a point of R_0, and each two consecutive points p, q ∈ R_0 on that line such that the vertices p and q are not adjacent in G_s, we add a blue point in the segment pq connecting p and q, in order to "block" this segment, so that the blue point has one odd coordinate. In this way, blue points blocking horizontal segments have odd x-coordinates and even y-coordinates, and blue points blocking vertical segments have even x-coordinates and odd y-coordinates. Hence, a blue point cannot simultaneously block a horizontal and a vertical segment defined by two red points. Note that |B_0| = O(n^2). Now, a horizontal or vertical segment connecting two points p and q of R_0 ∪ B_0 represents an edge of G_s if and only if p, q ∈ R_0 and the segment does not contain any other point of R_0 ∪ B_0 in its interior (see Fig. 4).
The points of R_0 ∪ B_0 ⊂ [0..N]^2 are then perturbed to obtain a point set with rational coordinates by applying an injective function λ : [0..N]^2 → Q^2 to every p ∈ R_0 ∪ B_0, where x(p) and y(p) denote the x- and y-coordinates of p, respectively. Similar perturbations can be found in [1,4]; refer to Fig. 3. Since λ is injective [4], let λ^{−1} denote the inverse of λ.
Let R_s = λ(R_0) and B_s = λ(B_0) and, to simplify the notation, let R = R_s and B = B_s. Note that |R| = O(n^2) and |B| = O(n^2). For two points a and b, let D(a, b) be the box with the segment ab as a diagonal. The proof of the next technical lemma is deferred to Appendix B.
Then, for each s ∈ T we can compute N(G_s) by calling A once. By Lemma 5, we can compute N(G) from the m + 1 computed values N(G_s), s ∈ T. Hence, it is #P-hard to compute Pr[box(S) ≥ 2] via a Turing reduction from #IndSet. To show that computing E[box(S)] is also #P-hard, for each s ∈ T consider the above point set R ∪ B and relate E[box(S)] to Pr[box(S) ≥ 2]. Let now k ≥ 3. For each s ∈ T, the graph G_s can be colored with two colors, 0 and 1, because it is also a bipartite graph. Each red point in R corresponds to a vertex in G_s. Then, for each red point p ∈ R with color 0 we add ⌈k/2⌉ − 1 new red points close enough to p (say, at distance much smaller than δ), and for each red point q ∈ R with color 1 we add ⌊k/2⌋ − 1 new red points close enough to q. Let R′ = R′(s) be the set of all new red points, and assign π(u) = 1 for every u ∈ R′. In this new colored point set R ∪ R′ ∪ B, there is no box containing more than k red points and no blue point. Furthermore, every box containing exactly k red points and no blue point contains two points p, q ∈ R such that λ^{−1}(p) and λ^{−1}(q) are adjacent in G_s; and for every p, q ∈ R such that λ^{−1}(p) and λ^{−1}(q) are adjacent in G_s, such a box containing p and q exists. Then, when taking S ⊆ R ∪ R′ ∪ B at random, Pr[box(S) ≥ k] equals the probability computed above. Hence, computing Pr[box(S) ≥ k] is also #P-hard for any k ≥ 3.

Two-Point Boxes
From the proof of Theorem 7, note that it is also #P-hard to compute the probability that in S ⊆ R ∪ B there exists a box that contains exactly two red points p, q and no blue point, even when the box is restricted to be the minimum box D(p, q) having p and q as opposite vertices. In this section, we present a polynomial-time algorithm to compute such a probability when the box is further restricted to contain a given point o ∉ R ∪ B of the plane in its interior. We assume general position, that is, no two points of R ∪ B ∪ {o} have the same x- or y-coordinate. We further assume w.l.o.g. that o is the origin of coordinates.
Given a fixed X ⊆ R ∪ B, and S ⊆ R ∪ B chosen at random, let E(X) = E(X, S) be the event that there exist two red points p, q ∈ S ∩ X such that the box D(p, q) contains the origin o, no other red point of S ∩ X, and no blue point of S ∩ X. Then, our goal is to compute Pr[E(R ∪ B)].

Theorem 8 Given R ∪ B, Pr[E(R ∪ B)] can be computed in polynomial time.
Proof Let X ⊆ R ∪ B, and define X^+ = {p ∈ X | y(p) > 0} and X^− = {p ∈ X | y(p) < 0}. Given points q ∈ X^+ and r ∈ X^−, define the events U_q(X) = U_q(X, S) and W_r(X) = W_r(X, S). Using the formula of total probability, we can expand Pr[E(X)] over these events. To compute Pr[E(X) | U_q(X)], we assume x(q) > 0; the case where x(q) < 0 is symmetric. If q ∈ B, then observe that, when restricted to the event U_q(X), any box D(p′, q′) defined by two red points p′, q′ ∈ S ∩ X, containing the origin o and no other red point of S ∩ X, where one of p′ and q′ is to the right of q, will contain q. Hence, we must "discard" all points to the right of q, all points between the horizontal lines through q and o (because they are not present), and q itself. We are then left with the subset X_q ⊂ X containing the points p ∈ X such that x(p) < x(q) and either y(p) > y(q) or y(p) < 0. If q ∈ R, we expand Pr[E(X) | U_q(X)] by further conditioning on the events W_r(X). There are then three cases according to the relative positions of q and r. Case 1: x(r) < 0 < x(q). Let Y_{q,r} ⊂ X contain the points p ∈ X (including q) such that x(r) < x(p) ≤ x(q) and either y(p) < y(r) or y(p) ≥ y(q). If r ∈ R, then Pr[E(X) | U_q(X), W_r(X)] = 1. Otherwise, if r ∈ B, given that U_q(X) and W_r(X) hold, any box D(p′, q′) defined by two red points p′, q′ of S ∩ X, containing the origin o and no other red point of S ∩ X, where one of p′ and q′ is not in Y_{q,r}, will contain q or r in its interior. Similar arguments are given in the next two cases.
Here Z_{q,r} ⊂ X contains the points p ∈ X such that x(p) < x(r) and either y(p) < y(r) or y(p) > y(q). If r ∈ R, note that the event [E(Z_{q,r} ∪ {r}) | W_r(Z_{q,r} ∪ {r})] is symmetric to the event [E(X) | U_q(X)], thus its probability can be computed similarly. Otherwise, if r ∈ B, an analogous expansion applies. Note that in the above recursive computation of Pr[E(X)], for X = R ∪ B, there is a polynomial number of subsets X_q, Y_{q,r}, and Z_{q,r}; each such subset can be encoded in constant space (i.e., by using a constant number of coordinates). Then, we can use dynamic programming, with a polynomial-size table, to compute Pr[E(R ∪ B)] in time polynomial in n.
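For small instances, Pr[E(R ∪ B)] can be cross-checked against the recursion by direct enumeration of all presence outcomes. A Python sketch (names are ours; containment is tested non-strictly, which is harmless under the general-position assumption):

```python
from itertools import product

def in_box(p, q, r):
    """r lies in the box D(p, q) having the segment pq as a diagonal."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))

def event_E(red, blue):
    """True iff some D(p, q), with p and q red, contains the origin,
    no other red point, and no blue point."""
    for i, p in enumerate(red):
        for q in red[i + 1:]:
            if not in_box(p, q, (0.0, 0.0)):
                continue
            if any(in_box(p, q, r) for r in red if r != p and r != q):
                continue
            if any(in_box(p, q, b) for b in blue):
                continue
            return True
    return False

def prob_E(red, blue, prob):
    """Exact Pr[E(R ∪ B)] by enumerating all presence outcomes;
    prob maps each point to its presence probability."""
    pts = red + blue
    total = 0.0
    for mask in product([0, 1], repeat=len(pts)):
        w = 1.0
        for m, p in zip(mask, pts):
            w *= prob[p] if m else 1.0 - prob[p]
        present = [p for m, p in zip(mask, pts) if m]
        if event_E([p for p in present if p in red],
                   [p for p in present if p in blue]):
            total += w
    return total
```

For example, with red points (−1, −1) and (1, 1), each present with probability 1/2 and no blue points, the event holds exactly when both are present, giving probability 1/4.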

Discussion and Open Problems
For fixed d ≥ 1, the maximum box problem for non-probabilistic points can be solved in O(n^d) time [3]. This fact, combined with the Monte Carlo method and known techniques, can be used to approximate, in polynomial time and with high probability of success, the expectation of the total weight of the maximum box on probabilistic points. That is, we provide an FPRAS, which is explained in Appendix C. Approximating the probability that the total weight of a maximum box is at least a given parameter remains an open question. We give in Appendix D an FPRAS for approximating this probability, but only in the case where the points are colored red or blue, each present with probability 1/2, and we look for the box covering the maximum number of red points and no blue point (i.e., red points have weight +1 and blue points weight −∞).
For d = 2 and red and blue points, there are several open problems: for example, computing Pr[box(S) ≥ k] (even for k = 3) when the box is restricted to contain a fixed point. Other open questions are the cases in which the box is restricted to contain a given point as a vertex, or to have some side contained in a given axis-parallel line. These two latter variants can be solved in n^{O(k)} time (see Appendix E), which means that they are polynomial-time solvable for fixed k. This contrasts with the original question, which is #P-hard for every k ≥ 2.
For red and blue points in d = 1, both Pr[box(S) ≥ k] and E[box(S)] can be computed in polynomial time by using standard dynamic programming techniques. This implies that for d = 2 and a fixed direction, computing the probability that there exists a strip (i.e., the space between two parallel lines) perpendicular to that direction and covering at least k red points and no blue point can be done in polynomial time. If the orientation of the strip is not restricted, then such a computation is #P-hard for every k ≥ 3 (see Appendix F).
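In d = 1 the quantity box(S) itself is simple to evaluate on a fixed outcome: the blue points split the line into gaps, and a best interval picks the densest gap. A Python sketch (names are ours):

```python
def box1d(red, blue):
    """d = 1: maximum number of red values in an interval containing no blue
    value.  The blue values split the line into gaps; scan the red values in
    order and restart the count whenever a blue value is passed."""
    cuts = sorted(blue)
    best = count = i = 0
    for r in sorted(red):
        while i < len(cuts) and cuts[i] < r:
            i += 1
            count = 0  # a blue point ends the current gap
        count += 1
        best = max(best, count)
    return best
```

For example, with red values {1, 2, 4} and a blue value at 3, the densest blue-free gap contains the two red values 1 and 2.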
Then, the pair (i − 1, j − 1) satisfies the property, which is a contradiction because (i, j) is such that i is minimum. Hence, the lemma follows. Proof (⇒) Let p, q ∈ R be red points such that the vertices p_0 = λ^{−1}(p) and q_0 = λ^{−1}(q) are adjacent in G_s. We have either x(p_0) = x(q_0) or y(p_0) = y(q_0). We will prove that D(p, q) ∩ B is empty assuming x(p_0) = x(q_0) = a; the case where y(p_0) = y(q_0) is similar. Further assume w.l.o.g. that y(p_0) < y(q_0). Since the segment p_0q_0 contains no other point of R_0 ∪ B_0, D(p, q) does not contain points of λ(R_0) ∪ λ(B_0) different from p and q (refer to Lemma 7 in [4]). Then, we need to prove that D(p, q) does not contain any blue point of the form λ(u_0). The left inequality implies a − 1 < x(u_0), and the right one x(u_0) < a. Since both a and x(u_0) are integers, a − 1 < x(u_0) < a is a contradiction. Hence, such a point u_0 does not exist. Assume now that there exists such a point u_0 of the second form. The left inequality again implies a − 1 < x(u_0), and the right one implies x(u_0) < a + 1/2 − δ < a + 1. Then, we must have x(u_0) = a, and Eq. (2) simplifies accordingly. This implies y(u_0) < y(q_0), and then y(p_0) = y(u_0) (i.e., u_0 = p_0) because the segment p_0q_0 is empty of points of R_0 ∪ B_0 in its interior. Then, we have y(u) = y(p) − δ < y(p) < y(q), which contradicts u ∈ D(p, q). Hence, such a point u_0 does not exist. (⇐) Let p, q ∈ R be red points such that the vertices p_0 = λ^{−1}(p) and q_0 = λ^{−1}(q) are not adjacent in G_s. We will prove that D(p, q) ∩ B is not empty. We have several cases: (a) x(p_0) = x(q_0) or y(p_0) = y(q_0), and the segment p_0q_0 contains a point u_0 ∈ R_0 ∪ B_0. Consider x(p_0) = x(q_0); if y(p_0) = y(q_0) the proof is similar. Assume y(p_0) < y(q_0) w.l.o.g., let u = λ(u_0), and note that x(p_0) = x(u_0) = x(q_0) and y(p_0) < y(u_0) < y(q_0). If u_0 ∈ B_0, then a blue point v is obtained for which x(p) < x(v) < x(q) and y(p) < y(v) < y(q), which imply v ∈ D(p, q) ∩ B.

C Random Approximation of the Expectation
In this section, let P ⊂ R^d be an n-point set, for fixed d ≥ 1, with weights w : P → R \ {0} and probabilities π : P → (0, 1]. Let S ⊆ P be the random sample where each p ∈ P is included in S with probability π(p). We assume a model in which deciding whether a given p is in the sample S takes O(1) time; then any random sample of J ⊆ P can be generated in O(|J|) time.
Let box(S) denote the total weight of the maximum box of S. We show, given ε, δ ∈ (0, 1), how to compute in time polynomial in n, ε^{−1}, and δ^{−1} an estimate μ̂ of μ = E[box(S)]. Let Q = {p ∈ P | w(p) > 0}, m = |Q|, and let q_1, q_2, . . . , q_m denote the elements of Q with w(q_1) ≤ w(q_2) ≤ · · · ≤ w(q_m). For j = 1, . . . , m, let w_j = w(q_j), Q_j = {q_1, . . . , q_j}, and let μ_j be the expectation of box(S) conditioned on q_j being the element of heaviest weight in S. By the formula of total probability, μ can be written as a weighted sum of the μ_j. Then, to approximate μ we compute an approximation μ̂_j of each μ_j, for j = 1, . . . , m. We do this using the Monte Carlo method, within the proof of the next lemma: let j ∈ [1..m] and N = ⌈(4j/ε^2) ln(2/δ)⌉. Let S_1, S_2, . . . , S_N be N random samples of Q_j ∪ (P \ Q), each containing q_j, and let μ̂_j be the average of box(S_1), box(S_2), . . . , box(S_N). Then, μ̂_j is within the desired relative error of μ_j with probability at least 1 − δ. Furthermore, μ̂_j can be computed in N · (O(j + n − m) + O(n^d)) = O((n^{d+1}/ε^2) log(1/δ)) time.
Proof A standard Chernoff bound [14] asserts that if X_1, . . . , X_N are i.i.d. random variables over a bounded domain [0, U] with expectation α = E[X_i], then their average concentrates around α. By letting X_i = box(S_i) for i = 1, . . . , N, we have that α = μ_j, X̄ = μ̂_j, and U = w_1 + w_2 + · · · + w_j ≤ j · w_j. Furthermore, since q_j ∈ S_i for all i, we must have w_j ≤ X_i ≤ U (because there is a small enough box containing only the point q_j, and X_i is the weight of a maximum box of S_i), which implies w_j ≤ μ_j ≤ U. For N ≥ (4/ε^2)(U/α) ln(2/δ) = (4/ε^2)(U/μ_j) ln(2/δ), Eqs. (5) and (4) hold. This is ensured by the definition of N, because w_j ≤ μ_j and U ≤ j · w_j. Since for any d ≥ 1 the maximum box problem on n points in R^d can be solved in O(n^d) time [3] (after sorting the points), the running time follows.
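The sample size in the lemma can be sketched as follows (Python; the parameter `upper_ratio` is our name for an a priori bound on U/α, which in the lemma is j, since w_j ≤ μ_j and U ≤ j · w_j):

```python
import math
import random

def monte_carlo_mean(sampler, eps, delta, upper_ratio):
    """Average of N i.i.d. samples, with N chosen as in the lemma's Chernoff
    bound: N >= (4 / eps^2) * (U / alpha) * ln(2 / delta)."""
    N = math.ceil((4.0 / eps**2) * upper_ratio * math.log(2.0 / delta))
    return sum(sampler() for _ in range(N)) / N
```

Here `sampler` stands for one evaluation of box(S_i) on a fresh random sample; the resulting average is, with probability at least 1 − δ, within relative error ε of the true mean.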

D Random Approximation of the Probability
In this section, let P ⊂ R^d, d ≥ 1, be an n-point set of two-colored points. Let P = R ∪ B, where R is the set of red points and B the set of blue points. We assume that P is in general position, which means that no two points of P belong to the same axis-parallel hyperplane. Let S ⊆ P be a random sample in which each element of P is included in S independently with probability 1/2, and let box(S) denote the maximum number of red points in S that can be covered with a box without covering any blue point.
Given an integer z ≥ 1, we show how to approximate the probability f = Pr[box(S) ≥ z]. Namely, we show how to compute in O(n^{O(min{z,n−z})} · ε^{−2}) time a value f̂ that satisfies Pr[(1 − ε) f ≤ f̂ ≤ (1 + ε) f] ≥ 3/4 (Eq. (6)). Note that the running time is polynomial if min{z, n − z} = O(1). The idea of the algorithm is to reduce the computation of the probability to counting the number of satisfying assignments of some DNF formula. Given n boolean variables x_1, x_2, . . . , x_n, a DNF formula on these variables is a disjunction of clauses, where each clause is a conjunction of variables or negations of variables (e.g., (x_1 ∧ ¬x_2) ∨ (x_2 ∧ x_3 ∧ ¬x_4)).
The key observation for the reduction is that there exists a box covering at least z red points of S, without covering any blue points, if and only if there exists a similar box covering exactly z points. This follows from the fact that a box with more than z red points can be shrunk to cover exactly z of them.
For every p ∈ P, let x_p ∈ {0, 1} be the boolean, or indicator, (random) variable such that x_p = 1 if and only if p ∈ S. For every set Q ⊆ R of cardinality z, let C_Q denote the clause stating that Q ⊆ S and that no blue point of S lies in bb(Q), where bb(Q) denotes the minimum box covering Q. That is, C_Q stands for the event that Q ⊆ S and the elements of Q, all of them red, can be covered with a box without covering any blue point of S (because bb(Q) is forced to be empty of elements of B). Let F = ⋁_{Q⊆R : |Q|=z} C_Q be the DNF formula consisting of the disjunction of C_Q over all subsets Q ⊆ R with |Q| = z, which has n variables and m = (|R| choose z) clauses. Let N be the number of satisfying assignments of F. Then, note that f = N/2^n. Using the algorithm of Karp, Luby, and Madras [12], we can find in O(nm · ε^{−2}) = O(n^{O(min{z,n−z})} · ε^{−2}) time a value Ñ such that Pr[(1 − ε)N ≤ Ñ ≤ (1 + ε)N] ≥ 3/4. This implies that f̂ = Ñ/2^n satisfies Eq. (6).
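The identity f = N/2^n can be checked by brute force on tiny instances, enumerating all assignments and testing whether some clause C_Q is satisfied. A Python sketch (names are ours; exponential time, for illustration only):

```python
from itertools import combinations, product

def prob_box_at_least(red, blue, z):
    """Exact f = Pr[box(S) >= z] for presence probability 1/2 via the DNF
    view: count assignments satisfied by some clause C_Q, where Q is a
    z-subset of the present red points whose bounding box bb(Q) avoids
    every present blue point."""
    n = len(red) + len(blue)
    N = 0  # number of satisfying assignments of the DNF formula F
    for mask in product([0, 1], repeat=n):
        s_red = [p for p, m in zip(red, mask[:len(red)]) if m]
        s_blue = [p for p, m in zip(blue, mask[len(red):]) if m]
        for Q in combinations(s_red, z):
            x1, x2 = min(x for x, _ in Q), max(x for x, _ in Q)
            y1, y2 = min(y for _, y in Q), max(y for _, y in Q)
            if not any(x1 <= x <= x2 and y1 <= y <= y2 for x, y in s_blue):
                N += 1  # this assignment satisfies clause C_Q
                break
    return N / 2**n
```

For example, with red points (0, 0) and (2, 2), a blue point at (1, 1), and z = 2, only the single assignment with both red points present and the blue point absent is satisfying, giving f = 1/8.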

E Anchored Boxes
In this section, we consider the computation of Pr[box(S) ≥ k] when the box is restricted to contain the origin o of coordinates as its bottom-left vertex. We assume that R ∪ B is in general position and that k is an integer. For simplicity, we further assume that R ∪ B is contained in the first quadrant, that π(b) = 1 for every b ∈ B, that the elements b_1, b_2, . . . , b_{|B|} of B form a staircase (i.e., x(b_1) < x(b_2) < · · · < x(b_{|B|}) and y(b_1) > y(b_2) > · · · > y(b_{|B|})), and that x(b_1) < x(r) < x(b_{|B|}) for every r ∈ R. We show that Pr[box(S) ≥ k] can be computed in n^{O(k)} time. If all the simplifications are dropped, or if the box is instead required to have a side contained in a given axis-parallel line, the computation (although more detailed) can also be done within this time bound. We compute Pr[E(i, Q)] recursively (see Fig. 5). Note that Pr[|(H_i \ D_i) ∩ S| < k − |Q|] can be computed in n^{O(k)} time, since we need to consider all subsets of size less than k of (H_i \ D_i) ∩ R. Also note that if i < |B| − 1, then the probability expands as a sum over Q′ = Q ∩ D_{i+1} and T′ = T ∩ D_{i+1}. The sum has n^{O(k)} terms, and using dynamic programming on a polynomial-size table of subproblems, the overall computation takes n^{O(k)} time.

F Covering with a Strip
Let R and B be disjoint finite point sets in the plane with a total of n points, where the elements of R are colored red and the elements of B are colored blue. Let S ⊆ R ∪ B be the random point set where every point p ∈ R ∪ B is included in S independently at random with probability π(p) ∈ [0, 1]. Let strip(S) denote the random variable equal to the maximum number of red points in S that can be covered with a strip without covering any blue point. Note that p_i, q_j, and s_{i,j} are collinear for every i, j. Furthermore, for i, i′ ∈ V_1 and j, j′ ∈ V_2 such that i ≠ i′ or j ≠ j′, we have s_{i,j} ≠ s_{i′,j′}. Consider the following sets of red points: R_1 = {p_1, p_2, . . . , p_N}, R_2 = {q_1, q_2, . . . , q_N}, and R_{1,2} = {s_{i,j} | {i, j} ∈ E}, in which the only triplets of collinear points are those of the form (p_i, q_j, s_{i,j}). There exists a rational number δ > 0 such that the following set B of blue points ensures that every triangle with vertices in R_1 ∪ R_2 ∪ R_{1,2} and positive area contains a vertex of B in its interior [7]. Note that we are excluding the degenerate triangles (those with zero area) with vertices at p_i, q_j, and s_{i,j} for some i, j. Such a value of δ can be computed in polynomial time. For ε > 0, let us define s′_{i,j} = s_{i,j} + (ε, 0) and R′_{1,2} = {s′_{i,j} | {i, j} ∈ E}.
In polynomial time, we can also compute a rational value of ε such that the set R = R_1 ∪ R_2 ∪ R′_{1,2} of red points ensures that a triangle with vertices at elements of R contains a point of B in its interior if and only if the triangle does not have vertices at p_i, q_j, and s′_{i,j} for some i, j. Furthermore, for every i ∈ V_1 and j ∈ V_2 such that {i, j} ∈ E, there exists a (very thin) strip containing p_i, q_j, and s′_{i,j} and no other point of R ∪ B. These conditions imply that three red points of R can be covered with a strip that does not cover any blue point if and only if they are p_i, q_j, and s′_{i,j} for some i, j. For every u ∈ R_1 ∪ R_2 assign π(u) = 1/2, and for every v ∈ R′_{1,2} ∪ B assign π(v) = 1. When taking S ⊆ R ∪ B at random, we have strip(S) ≥ 3 if and only if V(S ∩ (R_1 ∪ R_2)) is not an independent set in G, where for U ⊆ R_1 ∪ R_2 we denote by V(U) the set of vertices of G whose corresponding points belong to U. When k ≥ 4, as in the proof of Theorem 7, for each red point s ∈ R′_{1,2} we can add k − 3 new red points with probability 1 close enough to s. The proof then follows similar arguments.