On the rectangular knapsack problem

A recent paper by Schulze et al. (Math Methods Oper Res 92(1):107–132, 2020) presented the Rectangular Knapsack Problem (Rkp) as a crucial subproblem in the study of the Cardinality-constrained Bi-objective Knapsack Problem (Cbkp). To this end, they started an investigation into its complexity and approximability. The key results are an NP-hardness proof for a more general scenario than Rkp, and a 4.5-approximation for Rkp, raising the question of improvements for either result. In this note we settle both questions conclusively: we show that (a) Rkp is indeed NP-hard in the considered setting (and even in more restricted settings), and (b) there exist both a pseudopolynomial algorithm and a fully polynomial-time approximation scheme (i.e., efficient approximability within any desired ratio α > 1) for Rkp.


Introduction
We mainly consider the Rectangular Knapsack Problem:
Definition 1 Given a, b ∈ N^n and κ ∈ N, the Rectangular Knapsack Problem (Rkp) is formulated as
$$\max\; f(x) := (a^T x)\cdot(b^T x) \quad \text{s.t.}\quad \mathbf{1}^T x \le \kappa,\quad x \in \{0,1\}^n,$$
where 1 is the all-one vector. We also define a decision version of this problem, where we are additionally given a θ ∈ N, and ask whether there exists a feasible x ∈ {0, 1}^n with f(x) ≥ θ.
Fig. 1 A graphical representation of an Rkp instance: Given a multiset of (possibly degenerate) rectangles and an integer κ (left), we ask for a subset of cardinality κ that spans the largest possible area (right). Here, we are given 9 rectangles (A, B, ..., I) with κ = 4
Rkp allows a nice geometric interpretation, see Fig. 1: We are given an integer κ ∈ N and a multiset R = {(a_i, b_i)}_{i=1,...,n} of rectangles, specified by their width and height. The rectangles are allowed to degenerate to axis-parallel line segments (i.e., have width or height 0). For a subset of rectangles, we may lay them out in the plane in a connected manner such that the (axis-parallel) bounding box has maximum area. Clearly, this is achieved by placing the rectangles in a linear sequence such that the bottom-left corner of a rectangle is the top-right corner of its predecessor. In Rkp we thus ask for a rectangle subset of cardinality at most κ that maximizes this area.
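The objective and the cardinality constraint can be made concrete with a small brute-force sketch. It is meant only as an illustration for tiny instances (it enumerates exponentially many subsets); the function names and the toy data are hypothetical and not taken from the source.

```python
from itertools import combinations

def rkp_value(a, b, chosen):
    """Objective f(x) = (a^T x) * (b^T x) for the index set `chosen`."""
    return sum(a[i] for i in chosen) * sum(b[i] for i in chosen)

def rkp_brute_force(a, b, kappa):
    """Enumerate all subsets of size <= kappa; exponential, for tiny n only."""
    n = len(a)
    best_value, best_set = 0, ()
    for size in range(min(kappa, n) + 1):
        for chosen in combinations(range(n), size):
            value = rkp_value(a, b, chosen)
            if value > best_value:
                best_value, best_set = value, chosen
    return best_value, best_set

# Hypothetical toy instance: rectangle i has width a[i] and height b[i];
# the first two rectangles are degenerate (line segments).
a = [4, 0, 3, 1]
b = [0, 5, 2, 1]
value, chosen = rkp_brute_force(a, b, kappa=2)
```

In this toy instance, combining the vertical segment with the third rectangle spans a bounding box of width 3 and height 7.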
Rkp arises naturally in the study of the Cardinality-constrained Bi-objective Knapsack Problem (Cbkp), as observed by Schulze et al. (2020):
Definition 2 Given a, b ∈ N^n and κ ∈ N, the Cardinality-constrained Bi-objective Knapsack Problem (Cbkp) is formulated as
$$\operatorname{vmax}\; (a^T x,\; b^T x) \quad \text{s.t.}\quad \mathbf{1}^T x \le \kappa,\quad x \in \{0,1\}^n.$$
In this problem, we ask for the Pareto-front of two linear profit functions, subject to a cardinality constraint. As the Pareto-front can be exponential in the size of the input, Schulze et al. (2020) consider approximative methods to find a good representation of it; this can be achieved by iteratively solving Rkp instances. The approach has also been evaluated experimentally by Paquete et al. (2022).
Schulze et al. (2020) give a 4.5-approximation for Rkp and conjecture that it is NP-hard. However, they can only show NP-hardness of two more general cases of Rkp, namely when the vectors a and b may consist of general integer values (i.e., negative components are allowed) or when the cardinality constraint is a more general knapsack constraint. The complexity status of the actual Rkp remained open.
Rkp is a special case of the Quadratic Knapsack Problem (Qkp), first introduced by Gallo et al. (1980); see Pisinger (2007) for a comprehensive survey. There, we want to optimize a quadratic function over binary variables, i.e., f_q(x) = x^T Q x, subject to a knapsack constraint. Qkp is known to be strongly NP-complete even if the knapsack constraint is a cardinality constraint, by reduction from Clique (Garey and Johnson 1979). Recall that strong NP-hardness means that the problem remains NP-hard even when all values of the numbers in the input are bounded by a polynomial in the input size; thus, such problems do not allow a pseudopolynomial algorithm unless P=NP.
Qkp with a general integral weight matrix Q does not allow a constant-factor approximation unless P=NP (Rader and Woeginger 2002). For non-negative Q, a constant-factor approximation can currently not be ruled out, but the best known approximation guarantees only a ratio of O(n^{2/5+ε}) in O(n^{9/ε}) time (Taylor 2016). Recall that, for a given quality requirement ε > 0, a (fully) polynomial-time approximation scheme, abbreviated as PTAS (FPTAS), achieves a (1 − ε)-approximation while its running time is bounded by a polynomial in the input size and an arbitrary (polynomial, respectively) function in ε^{−1}. A weight matrix Q induces an n-vertex graph G that has an edge (i, j) if and only if Q_{i,j} ≠ 0. There is an FPTAS if G has bounded tree-width and a PTAS if G is H-minor-free, for any fixed minor H (see Pferschy and Schauer (2016) for both results). However, the problem remains strongly NP-hard even if G is guaranteed to be 3-book-embeddable (Pferschy and Schauer 2016) or vertex series-parallel (Rader and Woeginger 2002). Furthermore, an FPTAS exists for a special symmetric quadratic knapsack problem, where the knapsack constraint coefficients are dependent on the matrix Q (Kellerer and Strusevich 2010; Xu 2012).
Our contribution. Our first result is to settle NP-hardness for Rkp in Sect. 3.
Next, we show that we can use an FPTAS from the literature to construct an FPTAS for Rkp in Sect. 4, concluding that Rkp is in fact only weakly NP-hard. Interestingly, we multi-objectivize the Rkp, i.e., turn the Rkp into a multi-objective optimization problem, to apply an FPTAS from the multiobjective literature.
In Sect. 5, we describe an exact pseudopolynomial time algorithm for Rkp. Our algorithm can also be used to directly and exactly solve the original Cbkp, the starting point of the investigation of Rkp by Schulze et al. (2020). Finally, in Sect. 6 we show how to use our algorithm as a building block to develop an FPTAS for Rkp and Cbkp. We show that both our pseudopolynomial algorithm and FPTAS yield improved running times compared to the FPTAS used in Sect. 4.
Let α := ‖a‖∞ and β := ‖b‖∞ denote the largest components of a and b, respectively, and let W↑ := max{α, β} and W↓ := min{α, β}. W.l.o.g., we assume α ≥ β and, thus, W↑ = α and W↓ = β. Let N be the encoding length of the Rkp instance; we observe in particular that n ≤ N and log α ≤ N. We may use these observations when showing running time bounds depending solely on the input length. Whenever we compare two vectors, e.g., x ≤ y for x, y ∈ N^n, this is understood to be component-wise.
Consider a more general version of the Cbkp: the multiobjective Knapsack problem with general knapsack constraints and an arbitrary number of objective functions.
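Definition 3 Given C ∈ N^{d×n}, w ∈ N^n, and κ ∈ N, the Multiobjective Knapsack Problem (Mokp) is formulated as
$$\operatorname{vmax}\; Cx \quad \text{s.t.}\quad w^T x \le \kappa,\quad x \in \{0,1\}^n.$$
The rows c_1, ..., c_d ∈ N^n of the objective function matrix C can be seen as individual objective functions. We denote by Y the set of value vectors Cx of all feasible solutions x, and by Y_N ⊆ Y the set of non-dominated value vectors. Computing the non-dominated set for Mokp is known to be NP-hard, since it contains the single-objective knapsack problem as a special case.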
An FPTAS for a multiobjective problem is an algorithm that, for any given quality guarantee ε ∈ (0, 1), computes in time polynomial in the input size and 1/ε a set S ⊆ Y such that for every y ∈ Y_N, there is an x ∈ S with x ≥ (1 − ε)y. While for many multiobjective optimization problems it is known that the size of Y_N can be exponential in the input size, e.g., multiobjective versions of the shortest path, spanning tree, or knapsack problems, there also always exists a set S as above of size bounded polynomially in the input size and 1/ε (cf. Papadimitriou and Yannakakis (2000)).
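The covering condition on S can be phrased as a short check. The following sketch only tests whether a given candidate set satisfies the condition for a given non-dominated set; it does not compute such a set, and the name `covers` as well as the sample points are hypothetical.

```python
def covers(approx_set, nondominated, eps):
    """True if every y in `nondominated` has some s in `approx_set`
    with s >= (1 - eps) * y in every component."""
    return all(
        any(all(s_k >= (1 - eps) * y_k for s_k, y_k in zip(s, y))
            for s in approx_set)
        for y in nondominated
    )

# Hypothetical bi-objective value vectors.
y_n = {(7, 2), (4, 5), (3, 7)}
s = {(7, 2), (3, 7)}
coarse = covers(s, y_n, eps=0.5)   # (4, 5) is covered by (3, 7)
fine = covers(s, y_n, eps=0.1)     # (4, 5) is not covered any more
```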

NP-Hardness
Before we proceed with the hardness proof, we start with an initial observation. Consider an arbitrary Rkp instance and any two vectors x, x′ ∈ {0, 1}^n with x ≤ x′. Since a and b are non-negative, we have f(x) ≤ f(x′). We can therefore deduce:
Observation 4 If there is an optimal solution (or a witness to a yes-instance) x with 1^T x < κ for some Rkp instance, then there also exists another optimal solution (yes-witness) x′ with 1^T x′ = κ. It thus suffices to search for solutions with the latter property.
The next problem has been known to be (weakly) NP-hard for decades (Garey and Johnson 1979). Intuitively, it asks whether we can partition the components of a vector c into two subsets of equal size such that the sums of the components of the two subsets coincide.

Definition 5 (Cardinality-Constraint Partition (Ccp)) Given a vector c ∈ N^n, decide whether there exists an x ∈ {0, 1}^n with 1^T x = n/2 and c^T x = (1/2) · 1^T c.
Ccp was also used by Schulze et al. (2020) to show NP-hardness for the problem that arises from Rkp when a, b are allowed to include negative components. However, for the true Rkp (which was tackled by the approximation algorithm of Schulze et al. (2020)), no hardness result was shown. By massaging Ccp instances further, we can in fact show NP-hardness for strictly non-negative (or even strictly positive) vectors a, b.

Theorem 6 Rkp is NP-hard.
Proof We reduce from Ccp to the decision variant of Rkp, establishing that the latter is NP-complete as NP membership is trivial. Let c ∈ N^n be a Ccp instance. We define the sum of all c-components C := 1^T c and a sufficiently large constant M := ‖c‖∞ · n/2. Let $\mathbf{C} := C \cdot \mathbf{1}$ and $\mathbf{M} := M \cdot \mathbf{1}$ be the n-dimensional vectors with uniform components C and M, respectively. We construct an Rkp instance (a, b, κ, θ) with
$$\kappa := n/2,\qquad a := \mathbf{M} + \kappa c,\qquad b := \mathbf{M} + \mathbf{C} - \kappa c,\qquad \theta := \kappa^2 \left(M + C/2\right)^2.$$
We can assume w.l.o.g. that n and C are even numbers (as c would otherwise be a trivial no-instance for Ccp) and non-zero. Thus we have integrality for all our constructed numbers. Since c ∈ N^n, all constructed numbers (except for perhaps the components of b) are trivially positive and thus elements of N_+. Since M = κ‖c‖∞ is at least as large as any component of κ · c, also every component of b is in fact positive and thus from N_+. The tuple (a, b, κ, θ) is hence a legal Rkp instance.
We now show that (a, b, κ, θ) is an Rkp yes-instance if and only if the original c ∈ N^n is a Ccp yes-instance by proving the equivalence directly. Assume (a, b, κ, θ) is a yes-instance for Rkp, and consider a solution vector x̂ ∈ {0, 1}^n. By Observation 4, we can assume w.l.o.g. that 1^T x̂ = κ and thus $\mathbf{M}^T \hat{x} = \kappa M$ and $\mathbf{C}^T \hat{x} = \kappa C = \kappa \cdot c^T \mathbf{1}$.
We examine the objective f(x̂):
$$f(\hat{x}) = \left(\mathbf{M}^T \hat{x} + \kappa\, c^T \hat{x}\right)\left(\mathbf{M}^T \hat{x} + \mathbf{C}^T \hat{x} - \kappa\, c^T \hat{x}\right) = \kappa^2 \left(M + c^T \hat{x}\right)\left(M + C - c^T \hat{x}\right) = \kappa^2 \left( \left(M + C/2\right)^2 - \left(c^T \hat{x} - C/2\right)^2 \right).$$
Hence f(x̂) ≥ θ is equivalent to (c^T x̂ − C/2)^2 ≤ 0. This inequality holds if and only if c^T x̂ = C/2. Thus (a, b, κ, θ) is a yes-instance for Rkp if and only if c is a yes-instance for Ccp (both witnessed by the identical solution vector x̂).
Ccp remains NP-hard even if all components of c are positive (in contrast to only being non-negative) and distinct (Garey and Johnson 1979). By the above construction we naturally obtain: Observation 7 Rkp remains NP-hard even if all components of a and all components of b are positive and distinct.
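The reduction can be sanity-checked by brute force on tiny instances. The sketch below uses one construction that is consistent with the properties stated in the proof (κ = n/2, M = ‖c‖∞ · n/2, positive integral a and b); the concrete formulas for a, b, and θ in the code are an assumption made for this illustration, and all function names are hypothetical.

```python
from itertools import combinations

def ccp_is_yes(c):
    """Brute-force Ccp: is there a subset of size n/2 summing to half the total?"""
    n, total = len(c), sum(c)
    if n % 2 or total % 2:
        return False
    return any(2 * sum(c[i] for i in s) == total
               for s in combinations(range(n), n // 2))

def reduce_ccp_to_rkp(c):
    """Assumed construction (an illustration); requires n and sum(c) to be even."""
    n, total = len(c), sum(c)
    kappa = n // 2
    m = max(c) * kappa                                # M = ||c||_inf * n / 2
    a = [m + kappa * ci for ci in c]                  # a = M + kappa * c
    b = [m + total - kappa * ci for ci in c]          # b = M + C - kappa * c
    theta = kappa ** 2 * (m + total // 2) ** 2        # kappa^2 * (M + C/2)^2
    return a, b, kappa, theta

def rkp_is_yes(a, b, kappa, theta):
    """Brute-force decision Rkp; by Observation 4, subsets of size exactly kappa suffice."""
    return any(sum(a[i] for i in s) * sum(b[i] for i in s) >= theta
               for s in combinations(range(len(a)), kappa))
```

For example, c = (1, 2, 3, 4) is a Ccp yes-instance (1 + 4 = 2 + 3), while c = (1, 1, 2, 4) is not; in both cases the constructed Rkp instance gives the same answer.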

An FPTAS via MOKP
The Mokp admits an FPTAS proposed by Erlebach et al. (2002). To apply this FPTAS, we first establish a connection between Rkp and Mokp. To this end, we define for a given Rkp instance with a, b ∈ N^n and κ ∈ N a Cbkp instance and thus a Mokp instance of the following form:
$$\operatorname{vmax}\; (a^T x,\; b^T x) \quad \text{s.t.}\quad \mathbf{1}^T x \le \kappa,\quad x \in \{0,1\}^n.$$
Lemma 8 Any optimal solution to Rkp is a Pareto-optimal solution to the respective Mokp instance.
Proof Let x* be an optimal solution to Rkp and assume its value vector is dominated in the Mokp instance by some other value vector. Let x be a solution corresponding to the latter. Clearly, f(x) = a^T x · b^T x > a^T x* · b^T x* = f(x*). As both solutions are feasible w.r.t. the cardinality constraint, x* cannot be an optimal solution to Rkp.
Algorithm 1 Given an Rkp instance (a, b, κ) and ε > 0, we solve it as an Mokp instance using the FPTAS by Erlebach et al. (2002). The latter algorithm is thereby started with a quality requirement ε′ := ε/2 to compute an approximate solution set S. Our Rkp solution is an x ∈ S with maximum a^T x · b^T x.
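The selection step of Algorithm 1 can be illustrated with a small sketch. Note the assumption: the FPTAS by Erlebach et al. (2002) is not implemented here; an exact brute-force enumeration of the non-dominated value vectors serves as a stand-in for the set S, so the sketch only shows why picking the best product from such a set works (Lemma 8). All names are hypothetical.

```python
from itertools import combinations

def pareto_front(a, b, kappa):
    """Exact non-dominated value vectors (a^T x, b^T x) by brute force.
    Stand-in for the approximate set S; exponential, for tiny n only."""
    n = len(a)
    points = {(sum(a[i] for i in s), sum(b[i] for i in s))
              for r in range(min(kappa, n) + 1)
              for s in combinations(range(n), r)}
    return {p for p in points
            if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in points)}

def rkp_from_front(front):
    """Selection step: the best product of the two objective values."""
    return max(u * v for u, v in front)

front = pareto_front([4, 0, 3, 1], [0, 5, 2, 1], kappa=2)
best = rkp_from_front(front)
```

With an approximate set S in place of the exact front, the same selection step loses at most a factor (1 − ε′) in each of the two components.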

Theorem 9 Algorithm 1 is an FPTAS for Rkp with running time bounded by O(ε^{−2} n^3 (log nα)(log nβ)).
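Proof We first prove the approximation ratio. Let x* be an optimal solution to Rkp. We also know that (a, b)^T x* is a non-dominated point of the Mokp instance by Lemma 8. Thus, the FPTAS, started with ε′ = ε/2, computes at least one solution x̃ with a^T x̃ ≥ (1 − ε′) a^T x* and b^T x̃ ≥ (1 − ε′) b^T x*. The chosen solution x̂ ∈ S has maximum a^T x̂ · b^T x̂ and thus we have
$$f(\hat{x}) \;\ge\; a^T \tilde{x} \cdot b^T \tilde{x} \;\ge\; (1-\varepsilon')^2\, a^T x^* \cdot b^T x^* \;\ge\; (1 - 2\varepsilon')\, f(x^*) \;=\; (1-\varepsilon)\, f(x^*).$$
The running time of the FPTAS by Erlebach et al. (2002) is polynomial in n, 1/ε′, and log U_i, where U_i is an upper bound on the respective objective function values; here, nα and nβ are such upper bounds. Since ε/2 ∈ Θ(ε), we can bound the running time in the above algorithm by O(ε^{−2} n^3 (log nα)(log nβ)).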

Corollary 10 Algorithm 1 is an FPTAS for Cbkp with running time bounded by O(ε^{−2} n^3 (log nα)(log nβ)).
While these results, based on using the FPTAS from the literature as a black box, already improve on the known 4.5-approximation for Rkp, we show below that we can further improve the results in terms of running time by attacking the problem more directly.

Exact pseudopolynomial algorithm
We describe an exact pseudopolynomial algorithm for Rkp. It is also based on a multiobjective optimization decomposition. Given a solution x ∈ {0, 1}^n, we map it to its evaluation tuple via
$$x \;\mapsto\; \langle a^T x,\; b^T x,\; \mathbf{1}^T x \rangle.$$
These tuples form the cornerstone of our enumeration procedure below. Observe that multiple solutions may attain the same evaluation tuple. A tuple t = ⟨t_1, t_2, t_3⟩ dominates a tuple t′ = ⟨t′_1, t′_2, t′_3⟩ if t_1 ≥ t′_1, t_2 ≥ t′_2, t_3 ≤ t′_3, and t ≠ t′. Recall that we can assume w.l.o.g. that β ≤ α.
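The two notions just introduced translate directly into code. This is a minimal sketch with hypothetical function names; tuples are represented as plain Python 3-tuples.

```python
def evaluation_tuple(a, b, x):
    """Map a 0/1 solution x to <a^T x, b^T x, 1^T x>."""
    return (sum(ai * xi for ai, xi in zip(a, x)),
            sum(bi * xi for bi, xi in zip(b, x)),
            sum(x))

def dominates(t, u):
    """t dominates u: no worse in both values, no more items, and t != u."""
    return t[0] >= u[0] and t[1] >= u[1] and t[2] <= u[2] and t != u

# Hypothetical example: choosing items 1 and 2 of a three-item instance.
t = evaluation_tuple([4, 0, 3], [0, 5, 2], [0, 1, 1])
```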

Algorithm 2
We maintain a set of tuples T. It is initialized with T = {⟨0, 0, 0⟩}; the corresponding solution is the n-dimensional all-zeros vector. The algorithm now runs in n iterations. In iteration i, we consider every current tuple t = ⟨t_1, t_2, t_3⟩ ∈ T and obtain a new tuple t′ = ⟨t′_1, t′_2, t′_3⟩ := ⟨t_1 + a_i, t_2 + b_i, t_3 + 1⟩. In solution space, this corresponds to setting the i-th component of the associated solution to 1. We discard t′ if t′_3 > κ, or if there is another tuple ⟨s, t′_2, t′_3⟩ ∈ T with s ≥ t′_1; otherwise, t′ is added to T. We denote these pruning strategies by P1 and P2, respectively.
After all n iterations, we attain the optimal objective value as max {t_1 · t_2 : ⟨t_1, t_2, t_3⟩ ∈ T}.
If we are also interested in optimal solutions, we can use standard techniques to keep track of one solution per tuple. This incurs an additional factor of O(n) in the running time later.
We observe that the computed solution is feasible. Moreover, the final set T contains every possible non-dominated tuple, and possibly some more:

Lemma 11 Algorithm 2 finds all non-dominated evaluation tuples of a given instance.
Proof The algorithm is essentially a brute-force enumeration, computing all possible evaluation tuples. However, the algorithm uses two pruning strategies.
Pruning P1 is correct since the algorithm never decreases the number of non-zero components in any considered solution. Pruning P2 only deletes (some) dominated evaluation tuples: The tuples in iteration i correspond to subsolutions only considering the first i components. It is well-known, see e.g. (Nemhauser and Ullmann 1969), that any non-dominated solution x cannot contain any dominated subsolutions. If it contained a dominated subsolution x′, let y′ denote the subsolution dominating x′; we could substitute x′ by y′ in x to achieve a new solution that then dominates x.
We now establish how the non-dominated tuples given by Algorithm 2 help us in solving Rkp. We have to be a bit more careful than in the proof of Lemma 8.

Lemma 12 The evaluation tuple of any optimal solution to Rkp is non-dominated.
Proof Let x* be an optimal solution to Rkp and assume its evaluation tuple is dominated by some other evaluation tuple. Let x be a solution corresponding to the latter. Then a^T x ≥ a^T x*, b^T x ≥ b^T x*, and 1^T x ≤ 1^T x*, with at least one of these inequalities being strict. If one of the first two is strict, x attains a better objective value than x*, a contradiction. If only the third is strict, we have 1^T x < κ and could set a further component in x to 1, obtaining a yet better objective value, which is again a contradiction.

Theorem 13 Algorithm 2 is a pseudopolynomial exact algorithm for Rkp. Its running time is bounded by O(n^3 β).
Proof The fact that Algorithm 2 computes all non-dominated evaluation tuples together with Lemma 12 establishes correctness, and it remains to discuss the running time. We can encode T as a two-dimensional array A with κβ rows and κ columns. A tuple ⟨t_1, t_2, t_3⟩ is stored as A[t_2, t_3] = t_1; we have A[t_2, t_3] = −∞ if there is no tuple ⟨·, t_2, t_3⟩ ∈ T. Thus each pruning test can be trivially performed in constant time. For later reference, let τ denote the maximum number of tuples ever in T. We have τ ≤ κ^2 β ≤ n^2 β. Over the n iterations this yields a running time of O(nτ) ⊆ O(n^3 β).
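The enumeration with prunings P1 and P2 can be sketched compactly. The sketch below is an illustration under assumptions, not a tuned implementation: it uses a dictionary keyed by (t_2, t_3) in place of the dense array A, returns only the optimal value, and the name `rkp_exact` is hypothetical.

```python
def rkp_exact(a, b, kappa):
    """Tuple enumeration with prunings P1 and P2; returns the optimal value of f."""
    n = len(a)
    kappa = min(kappa, n)
    # best[(t2, t3)] = largest t1 seen for this pair; plays the role of A[t2, t3].
    best = {(0, 0): 0}
    for i in range(n):
        # Iterate over a snapshot so that item i is added at most once per tuple.
        for (t2, t3), t1 in list(best.items()):
            if t3 + 1 > kappa:
                continue                      # P1: cardinality would exceed kappa
            key, candidate = (t2 + b[i], t3 + 1), t1 + a[i]
            if best.get(key, -1) >= candidate:
                continue                      # P2: same (t2, t3) with s >= t1'
            best[key] = candidate
    return max(t1 * t2 for (t2, _), t1 in best.items())
```

A dictionary stores only the cells that are actually reached, which is convenient for a sketch; the dense array from the proof gives the same constant-time pruning test with a predictable memory layout.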

FPTAS
Algorithm 3 For a given ε ∈ (0, 1), let $\delta := \sqrt[n]{1/(1-\varepsilon)} > 1$ and for y ∈ N let
$$\rho(y) := \begin{cases} -1 & \text{if } y = 0,\\ \lfloor \log_\delta y \rfloor & \text{otherwise.} \end{cases}$$
Given a solution x ∈ {0, 1}^n, we map it to its scaled evaluation tuple via
$$x \;\mapsto\; \langle a^T x,\; \rho(b^T x),\; \mathbf{1}^T x \rangle.$$
We reuse Algorithm 2 but modify it slightly: Instead of working on evaluation tuples, we let Algorithm 2 now work on scaled tuples. Observe that we thus initialize T with the tuple ⟨0, −1, 0⟩ that corresponds to x = 0.
The only algorithmic part we need to change is the computation of new scaled evaluation tuples from a predecessor. In Algorithm 2, we were able to deduce a new candidate tuple t′ using only the predecessor tuple t ∈ T and not an actual solution x, since only linear functions were involved. To achieve the same running time in the new algorithm, we again do not store a full solution for each tuple. Instead, for each scaled tuple t ∈ T, we additionally store a single value B(t): Let x be a solution that yields tuple t; we want B(t) := b^T x to be the value we would store as the second entry in an unscaled evaluation tuple of x. The initial tuple has B(⟨0, −1, 0⟩) = 0. For a scaled evaluation tuple t = ⟨t_1, t_2, t_3⟩ ∈ T in iteration i, we can then efficiently generate a new tuple t′ := ⟨t_1 + a_i, ρ(B(t) + b_i), t_3 + 1⟩ with B(t′) := B(t) + b_i. Tuple t′ is added to T subject to the same pruning strategies as described in Algorithm 2, working purely on the scaled evaluation tuples.
The final objective value is naturally computed as max {t_1 · B(t) : t = ⟨t_1, t_2, t_3⟩ ∈ T}.
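The scaled variant can be sketched along the same lines as the exact algorithm. Two assumptions are worth stating: the scaling function is taken to be −1 for 0 and the floor of the logarithm to base δ otherwise (matching the initial tuple ⟨0, −1, 0⟩), and logarithms are evaluated in floating point, so bucket boundaries may be off by rounding in rare cases. The names `scale` and `rkp_fptas` are hypothetical.

```python
import math

def scale(y, delta):
    """Assumed scaling: -1 for y = 0, otherwise floor(log_delta(y))."""
    return -1 if y == 0 else math.floor(math.log(y) / math.log(delta))

def rkp_fptas(a, b, kappa, eps):
    """Scaled tuple enumeration; returns the value of the solution found."""
    n = len(a)
    kappa = min(kappa, n)
    delta = (1.0 / (1.0 - eps)) ** (1.0 / n)   # n-th root of 1 / (1 - eps)
    # cells[(scaled t2, t3)] = (t1, B): best first entry and the unscaled
    # b-value B(t) of the solution that produced it.
    cells = {(-1, 0): (0, 0)}
    for i in range(n):
        for (_, t3), (t1, big_b) in list(cells.items()):
            if t3 + 1 > kappa:
                continue                        # P1
            new_b = big_b + b[i]
            key, candidate = (scale(new_b, delta), t3 + 1), t1 + a[i]
            if key in cells and cells[key][0] >= candidate:
                continue                        # P2 on scaled tuples
            cells[key] = (candidate, new_b)
    return max(t1 * big_b for t1, big_b in cells.values())
```

Since every cell stores the values of one actual solution, the returned value never exceeds the optimum; the scaling only limits how many different second entries have to be kept.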

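Lemma 15 The running time of Algorithm 3 is bounded by O(ε^{−1} n^3 log nβ).
Proof The second entry in any scaled evaluation tuple is computed by the scaling function, which can attain at most ⌊log_δ κβ⌋ + 2 different values. Thus, we can let our array A have ⌊log_δ κβ⌋ + 2 rows and κ columns. Again, let τ denote the size of A. We have:
$$\tau = \kappa \left( \lfloor \log_\delta \kappa\beta \rfloor + 2 \right) \;\le\; \kappa \left( \frac{n \ln \kappa\beta}{\ln \frac{1}{1-\varepsilon}} + 2 \right).$$
The B(t) values can be perceived as an additional entry in every cell of A and thus contribute only a constant factor to the size of A. Considering 1/ε → ∞, we have
$$\ln \frac{1}{1-\varepsilon} = \varepsilon + \frac{\varepsilon^2}{2} + \frac{\varepsilon^3}{3} + \dots \;\ge\; \varepsilon$$
by Taylor expansion. Thus, τ ∈ O(ε^{−1} n^2 log nβ). The algorithm's overall running time is thus bounded by O(nτ) = O(ε^{−1} n^3 log nβ).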
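The bound on the number of rows can be checked numerically. The sketch below compares the number of scaled values with the closed-form estimate n · ln(κβ)/ε + 2 that follows from ln(1/(1 − ε)) ≥ ε; the function names and the parameter choices are hypothetical, and the scaling is again assumed to be a floored logarithm to base δ.

```python
import math

def rows_needed(n, kappa, beta, eps):
    """Number of distinct scaled values: floor(log_delta(kappa * beta)) + 2."""
    delta = (1.0 / (1.0 - eps)) ** (1.0 / n)
    return math.floor(math.log(kappa * beta) / math.log(delta)) + 2

def rows_upper_bound(n, kappa, beta, eps):
    """Estimate n * ln(kappa * beta) / eps + 2, using ln(1 / (1 - eps)) >= eps."""
    return n * math.log(kappa * beta) / eps + 2
```

For small ε the two quantities are close, since ln(1/(1 − ε)) approaches ε; for larger ε the estimate becomes more generous.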

Theorem 16 Algorithm 3 is an FPTAS for Rkp.
Proof Consider an Rkp instance (a, b, κ) with an optimal solution x*. Lemma 15 settles the running time requirement. It remains to show that for every ε ∈ (0, 1), Algorithm 3 finds a feasible solution x̂ with f(x̂) ≥ (1 − ε) f(x*). Let X_i be a set of solutions corresponding to T after iteration i ∈ {1, ..., n}, and X_0 = {0}. Recall that in any solution x ∈ X_i, only the first i components of x may be non-zero. At any point in the algorithm, we may consider a final best solution x̄ that we could still hope to find; initially set x̄ := x*.
If the algorithm finds an optimal solution, the claim is true. Suppose an optimal solution is not found; let ℓ be the smallest iteration index such that there is no solution in X_ℓ that has the same first ℓ components as x̄. We define x̌ ∈ {0, 1}^n such that x̌_i := x̄_i for 1 ≤ i ≤ ℓ and x̌_i := 0 for ℓ < i ≤ n. That is, x̌ is equal to x̄ on the first ℓ components and 0 in all further components. Since x̌ ∉ X_ℓ, there must be some solution x̃ ∈ X_ℓ that dominates x̌ with respect to the scaled evaluation tuples.
Since x̃ dominates x̌, we have a^T x̃ ≥ a^T x̌ and 1^T x̃ ≤ 1^T x̌, and the scaled second entry of x̃ is at least the scaled second entry of x̌.
Focusing on the second inequality, if the scaled second entry of x̌ is −1, then b^T x̌ = 0 and b^T x̃ ≥ 1/δ · b^T x̌ holds trivially. If the scaled second entry of x̃ is −1, then so is that of x̌, and we are in the previous case.
If neither evaluation yields −1, we have for ν := log_δ b^T x̃ and μ := log_δ b^T x̌ that ⌊ν⌋ ≥ ⌊μ⌋, hence ν ≥ μ − 1 and thus b^T x̃ = δ^ν ≥ δ^{μ−1} = 1/δ · b^T x̌. Consequently, in all cases the inequality b^T x̃ ≥ 1/δ · b^T x̌ holds.
We now define x̄^r := x̄ − x̌, i.e., x̄^r matches x̄ on the last n − ℓ components and is 0 on the first ℓ components. Let x̄′ := x̃ + x̄^r. We see that
$$a^T \bar{x}' = a^T \tilde{x} + a^T \bar{x}^r \ge a^T \check{x} + a^T \bar{x}^r = a^T \bar{x}, \qquad b^T \bar{x}' = b^T \tilde{x} + b^T \bar{x}^r \ge \tfrac{1}{\delta}\, b^T \check{x} + b^T \bar{x}^r \ge \tfrac{1}{\delta}\, b^T \bar{x}, \qquad \mathbf{1}^T \bar{x}' \le \mathbf{1}^T \bar{x}.$$
Intuitively, while x̄ is no longer attainable after iteration ℓ, solution x̄′ is still attainable as it can arise from x̃. At the same time, x̄′ is a solution with an objective value that is at most a factor of 1/δ worse than x̄ in the second component.
We can iterate the above consideration with x̄′ assuming the role of x̄. We again look for an iteration index ℓ such that the new x̄ does not agree with some solution in X_ℓ on its first ℓ components. If such an index is not present, then the new x̄ is actually computed by the algorithm. If such an index is present, this new index is now strictly larger than the index considered before. As there are only n possible indices, we repeat this argument at most n times. For each repetition, we lose a factor of at most 1/δ in the second component.
Let x̂ be the final solution found by the algorithm. We conclude that b^T x̂ ≥ 1/δ^n · b^T x* = (1 − ε) b^T x*, while a^T x̂ ≥ a^T x* and 1^T x̂ ≤ 1^T x* ≤ κ. Consequently, x̂ is feasible and
$$f(\hat{x}) = a^T \hat{x} \cdot b^T \hat{x} \;\ge\; (1-\varepsilon)\, a^T x^* \cdot b^T x^* = (1-\varepsilon)\, f(x^*).$$

Conclusion
We answered all open questions from Schulze et al. (2020) regarding the complexity and approximability of Rkp: while the problem is indeed NP-hard, it allows not only a pseudopolynomial exact algorithm, but also an FPTAS, the theoretically strongest kind of approximation algorithm. Furthermore, our techniques also allow us to directly tackle the Cbkp, achieving the equivalent algorithmic results.
Comparing Algorithms 1 and 3, our approach achieves a better running time in all cases: O(ε^{−2} n^3 log(nW↓) log(nW↑)) ⊆ O(ε^{−2} N^5) vs. O(ε^{−1} n^3 log(nW↓)) ⊆ O(ε^{−1} N^4). In particular, our pseudopolynomial algorithm shows that Rkp can be solved in polynomial time even if only W↓ (but not W↑) is bounded by a polynomial in the input size.
Furthermore, our algorithms (and proofs) can trivially be extended to any fixed number of objective functions.
Funding Open Access funding enabled and organized by Projekt DEAL.