Approximating Subset Sum Ratio via Partition Computations

We present a new FPTAS for the Subset Sum Ratio problem, which, given a set of integers, asks for two disjoint subsets such that the ratio of their sums is as close to $1$ as possible. Our scheme makes use of exact and approximate algorithms for the closely related Partition problem, hence any progress over those -- such as the recent improvement due to Bringmann and Nakos [SODA 2021] -- carries over to our FPTAS. Depending on the relationship between the size of the input set $n$ and the error margin $\varepsilon$, we improve upon the best currently known algorithm of Melissinos and Pagourtzis [COCOON 2018] of complexity $O(n^4 / \varepsilon)$. In particular, the exponent of $n$ in our proposed scheme may decrease down to $2$, depending on the Partition algorithm used. Furthermore, while the aforementioned state of the art complexity, expressed in the form $O((n + 1 / \varepsilon)^c)$, has constant $c = 5$, our results establish that $c<5$.


Introduction
One of Karp's 21 NP-complete problems [19], Subset Sum has seen astounding progress over the last few years. Koiliaris and Xu [22], Bringmann [8] and Jin and Wu [18] have presented pseudopolynomial algorithms resulting in substantial improvements over the long-standing standard approach of Bellman [7], and the improvement by Pisinger [30]. Moreover, the latter two algorithms [8,18] match the SETH-based lower bounds proved in [1]. Additionally, there has recently been progress on approximation schemes for Subset Sum, the first such improvement in over 20 years, with a new algorithm introduced by Bringmann and Nakos [10], as well as corresponding lower bounds obtained through the lens of fine-grained complexity.
A thoroughly studied special case of Subset Sum is the Partition problem, which asks for a partition of the input set into two subsets such that the difference of their sums is minimum. Any algorithm solving the former applies to the latter, though recent progress [10,27] has shown that Partition may be solved more efficiently in the approximation setting. On the other hand, regarding exact solutions, no better algorithm has been developed, therefore Subset Sum algorithms remain the state of the art.
The Equal Subset Sum problem, which, given an input set, asks for two disjoint subsets of equal sum, is closely related to Subset Sum and Partition. It finds applications in multiple different fields, ranging from computational biology [14,11] and computational social choice [23], to cryptography [31], to name a few. In addition, it is related to important theoretical concepts such as the complexity of search problems in the class TFNP [29].
The centerpiece of this paper is the Subset Sum Ratio problem, the optimization version of Equal Subset Sum, which asks, given an input set $S \subseteq \mathbb{N}$, for two disjoint subsets $S_1, S_2 \subseteq S$, such that the following ratio is minimized:
$$\frac{\max\left\{\sum_{s_i \in S_1} s_i, \sum_{s_j \in S_2} s_j\right\}}{\min\left\{\sum_{s_i \in S_1} s_i, \sum_{s_j \in S_2} s_j\right\}}$$
We present a new approximation scheme for Subset Sum Ratio, highlighting its close relationship with the classical Partition problem. Our proposed algorithm is the first to associate these closely related problems and, depending on the relationship of the cardinality of the input set $n$ and the value of the error margin $\varepsilon$, achieves better asymptotic bounds than the current state of the art [24]. Moreover, while the complexity of the current state of the art approximation scheme expressed in the form $O((n + 1/\varepsilon)^c)$ has an exponent $c = 5$, we present an FPTAS with constant $c < 5$.
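To make the objective concrete, the following is a minimal brute-force sketch of the problem definition, not the scheme proposed in this paper. The function names `subset_sum_ratio` and `brute_force_ssr` are hypothetical, and the input is assumed to consist of distinct positive integers.

```python
from fractions import Fraction
from itertools import product

def subset_sum_ratio(s1, s2):
    """Ratio max(sum) / min(sum) of two non-empty sets of positive integers."""
    a, b = sum(s1), sum(s2)
    return Fraction(max(a, b), min(a, b))

def brute_force_ssr(elements):
    """Assign every element to S1, S2 or neither (3^n cases) and keep the best pair."""
    best = None
    for labels in product((0, 1, 2), repeat=len(elements)):
        s1 = [x for x, lab in zip(elements, labels) if lab == 1]
        s2 = [x for x, lab in zip(elements, labels) if lab == 2]
        if not s1 or not s2:
            continue  # both subsets must be non-empty
        r = subset_sum_ratio(s1, s2)
        if best is None or r < best[0]:
            best = (r, s1, s2)
    return best
```

For the input {1, 2, 4, 8} no two disjoint subsets have equal sums, and the best pair is {1, 2, 4} against {8}, of ratio 8/7. The exponential running time of this enumeration is what an FPTAS avoids.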

Related Work
Equal Subset Sum as well as its optimization version called Subset Sum Ratio [6] are closely related to problems appearing in many scientific areas. Some examples include the Partial Digest problem, which comes from computational biology [14,11], the allocation of individual goods [23], tournament construction [21], and a variation of Subset Sum, called Multiple Integrated Sets SSP, which finds applications in the field of cryptography [31]. Furthermore, it is related to important concepts in theoretical computer science; for example, a restricted version of Equal Subset Sum lies in a subclass of the complexity class TFNP, namely in PPP [29], a class consisting of search problems that always have a solution due to some pigeonhole argument, and no polynomial time algorithm is known for this restricted version.
Equal Subset Sum has been proven NP-hard by Woeginger and Yu [32] (see also the full version of [26] for an alternative proof) and several variations have been proven NP-hard by Cieliebak et al. in [12,13]. A 1.324-approximation algorithm has been proposed for Subset Sum Ratio in [32] and several FPTASs appeared in [6,28,24], the fastest so far being the one in [24] of complexity $O(n^4/\varepsilon)$, the complexity of which seems to also apply to various meaningful special cases, as shown in [25].
As far as exact algorithms are concerned, recent progress has shown that Equal Subset Sum can be solved probabilistically in $O^*(1.7088^n)$ time [26], faster than a standard "meet-in-the-middle" approach. These problems are tightly connected to Subset Sum, which has seen impressive advances recently, due to Koiliaris and Xu [22] who gave a deterministic $\tilde{O}(\sqrt{n}\,t)$ algorithm, where $n$ is the number of input elements and $t$ is the target, and Bringmann [8] who gave an $\tilde{O}(n + t)$ randomized algorithm, which is essentially optimal under SETH [1]. See also [3] for an extension of these algorithms to a more general setting. Jin and Wu subsequently proposed a simpler randomized algorithm [18] achieving the same bounds as [8], which however seems to only solve the decision version of the problem. Recently, Bringmann and Nakos [9] have presented an $O(|S_t(Z)|^{4/3} \, \mathrm{poly}(\log t))$ algorithm, where $S_t(Z)$ is the set of all subset sums of the input set $Z$ that are smaller than $t$, based on top-k convolution.
Partition shares the complexity of Subset Sum regarding exact solutions, where the meet in the middle approach [17] from the 70's remains the state of the art as far as algorithms dependent on $n$ are concerned. On the other hand, one can approximate Partition more efficiently than Subset Sum unless the min-plus convolution conjecture [15] is false. In particular, Bringmann and Nakos [10] have presented the first improvement for the latter in over 20 years, since the scheme of [20] had remained the state of the art. Moreover, in their paper they have shown that developing a significantly better algorithm would contradict said conjecture. Furthermore, they develop an approximation scheme for Partition utilizing min-plus convolution computations, improving upon the recent work of Mucha et al. [27] and circumventing the lower bounds established for Subset Sum in their work.

Our Contribution
We present a novel approximation scheme for the Subset Sum Ratio problem. Our algorithm makes use of exact and approximation algorithms for Partition, thus, any improvement over those carries over to our proposed scheme. Additionally, depending on the relationship between $n$ and $\varepsilon$, our algorithm improves upon the best existing approximation scheme of [24].
We start by presenting some necessary background in Section 2. Afterwards, in Section 3 we introduce an FPTAS for a restricted version of the problem. In the following Section 4, we explain how to make use of the algorithm presented in the previous section, in order to obtain an approximation scheme for the Subset Sum Ratio problem. The complexity of the final scheme is thoroughly analyzed in Section 5, followed by some possible directions for future research in Section 6.
Prior Work. In this paper we improve upon the results of the preliminary version [2], by using approximate and exact Partition algorithms instead of Subset Sum computations.

Preliminaries
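Let, for $x \in \mathbb{N}$, $[x] = \{0, \ldots, x\}$ denote the set of integers in the interval $[0, x]$. Given a set $S \subseteq \mathbb{N}$, denote its largest element by $\max(S)$ and the sum of its elements by $\Sigma(S) = \sum_{s \in S} s$. If we are additionally given a value $\varepsilon \in (0, 1)$, define the following partition of its elements:
- The set of its large elements as $L(S, \varepsilon) = \{s \in S \mid s \ge \varepsilon \cdot \max(S)\}$.
- The set of its small elements as $M(S, \varepsilon) = \{s \in S \mid s < \varepsilon \cdot \max(S)\}$.
In the following, since the values of the associated parameters will be clear from the context, they will be omitted and we will refer to these sets simply as $L$ and $M$.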
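The split into large and small elements can be sketched as follows. This is an illustrative helper, not code from the paper: the name `split_large_small` is hypothetical, and it assumes, as the later sections suggest, that an element counts as large when it is at least $\varepsilon \cdot \max(S)$.

```python
def split_large_small(elements, eps):
    """Split a set of positive integers into (large, small) relative to eps * max."""
    threshold = eps * max(elements)  # assumed cut-off between small and large elements
    large = [x for x in elements if x >= threshold]
    small = [x for x in elements if x < threshold]
    return large, small
```

Under this assumption the largest element is always large, so the first list is never empty.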
Definition 2 (Approximate Partition, from [27]). Given a set $X$ and error margin $\varepsilon$, compute a subset

Scheme for a Restricted Version
In this section, we present an FPTAS for the constrained version of the Subset Sum Ratio problem where we are only interested in approximating solutions that involve the largest element of the input set. In other words, one of the subsets of the optimal solution contains $\max(A) = a_n$ (assuming that $A = \{a_1, \ldots, a_n\}$ is the sorted input set); let $r_{opt}$ denote the subset sum ratio of such an optimal solution. Our FPTAS will return a solution of ratio $r$, such that $1 \le r \le (1 + \varepsilon) \cdot r_{opt}$, for a given error margin $\varepsilon \in (0, 1)$; however, we allow that the sets of the returned solution do not necessarily satisfy the aforementioned constraint (i.e. $a_n$ may not be involved in the approximate solution).

Outline of the Algorithm
We now present a rough outline of Algorithm 1:
- At first, we search for approximate solutions involving exclusively large elements from $L(A, \varepsilon)$.
- To this end, we produce the subset sums formed by these large elements. If their number exceeds $n/\varepsilon^2$, then we can find an approximate solution.
- Otherwise, there are at most $n/\varepsilon^2$ subsets of large elements. In this case, we can find a solution by running an exact or an approximate Partition algorithm for each subset.
- In the case that the optimal solution involves small elements, we show that it suffices to add elements of $M(A, \varepsilon)$ in a greedy way.

Solution Involving Exclusively Large Elements
We first search for a $(1 + \varepsilon)$-approximate solution with $\varepsilon \in (0, 1)$, without involving any of the elements that are smaller than $\varepsilon \cdot a_n$. After partitioning the input set, we split the interval $[0, n \cdot a_n]$ into smaller intervals, called bins, of size $l = \varepsilon^2 \cdot a_n$ each, as depicted in Figure 1. Thus, there are a total of $B = n/\varepsilon^2$ bins. Notice that each possible subset of the input set will belong to a respective bin constructed this way, depending on its sum. Additionally, if two sets correspond to the same bin, then the difference of their subset sums will be at most $l$.
The next step of our algorithm is to generate all the possible subset sums, occurring from the set of large elements $L$. The complexity of this procedure is $O(2^{|L|})$, where $|L|$ is the cardinality of set $L$. Notice however, that it is possible to bound the number of the produced subset sums by the number of bins $B$, since if two sums belong to the same bin they constitute a solution, as shown in Lemma 1, in which case the algorithm terminates in time $O(n/\varepsilon^2)$.
Lemma 1. If two subsets correspond to the same bin, we can find a $(1 + \varepsilon)$-approximation solution.
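Proof. Suppose there exist two sets $L_1, L_2 \subseteq L$ whose sums correspond to the same bin, with $\Sigma(L_1) \le \Sigma(L_2)$. Notice that there is no guarantee regarding the disjointness of said subsets, thus consider $L'_1 = L_1 \setminus L_2$ and $L'_2 = L_2 \setminus L_1$, for which it holds that $\Sigma(L'_1) \le \Sigma(L'_2)$ and
$$\Sigma(L'_2) - \Sigma(L'_1) = \Sigma(L_2) - \Sigma(L_1) \le l.$$
Therefore, the sets $L'_1$ and $L'_2$ constitute a $(1 + \varepsilon)$-approximation solution, since
$$\frac{\Sigma(L'_2)}{\Sigma(L'_1)} \le \frac{\Sigma(L'_1) + l}{\Sigma(L'_1)} = 1 + \frac{\varepsilon^2 \cdot a_n}{\Sigma(L'_1)} \le 1 + \varepsilon,$$
where the last inequality is due to the fact that $\Sigma(L'_1) \ge \varepsilon \cdot a_n$, since $L'_1$ is a non-empty set of large elements. It remains to show that $L'_1 \ne \emptyset$. Assume that $L'_1 = \emptyset$. This implies that $L_1 \subseteq L_2$ and since we consider each subset of $L$ only once and the input is a set and not a multiset, it holds that $L_1 \subset L_2 \implies L'_2 \ne \emptyset$. Since $L_1$ and $L_2$ correspond to the same bin, it holds that
$$\Sigma(L'_2) = \Sigma(L_2) - \Sigma(L_1) \le l = \varepsilon^2 \cdot a_n < \varepsilon \cdot a_n,$$
which is a contradiction, since $L'_2$ is a non-empty subset of $L$, which is comprised of elements greater than or equal to $\varepsilon \cdot a_n$.
Consider an $\varepsilon'$ such that $(1 + \varepsilon')/(1 - \varepsilon') \le 1 + \varepsilon$ for all $\varepsilon \in (0, 1)$ (the exact value of $\varepsilon'$ will be computed in Section 5).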
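The bin-collision argument of Lemma 1 can be illustrated with a short sketch. It is not the authors' implementation: the name `find_bin_collision` is hypothetical, the input is assumed to contain only distinct large elements (each at least $\varepsilon \cdot a_n$), and `eps` is passed as an exact `Fraction` to avoid rounding when indexing bins.

```python
from fractions import Fraction
from itertools import combinations

def find_bin_collision(large, eps):
    """Place subset sums of `large` into bins of size eps^2 * a_n.

    Returns two disjoint sets whose sums share a bin, or None if every
    non-empty subset lands in its own bin.
    """
    a_n = max(large)
    bin_size = Fraction(eps) ** 2 * a_n      # l = eps^2 * a_n
    bins = {}                                # bin index -> first subset seen there
    for k in range(1, len(large) + 1):
        for subset in combinations(large, k):
            idx = sum(subset) // bin_size    # index of the bin containing this sum
            if idx in bins:
                other = bins[idx]
                # drop common elements, as in the proof of Lemma 1
                return set(subset) - set(other), set(other) - set(subset)
            bins[idx] = subset
    return None
```

With large elements {6, 7, 10} and ε = 1/2 the bins have size 2.5, the sums 6 and 7 share a bin, and the sketch returns the pair {7}, {6} of ratio 7/6. The enumeration stops at the first collision, so it inspects at most one subset more than there are bins.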
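If every produced subset sum of the previous step belongs to a distinct bin, then we can infer that the number of subsets of large elements is bounded by $n/\varepsilon^2$. Moreover, we can prove the following lemma.
Proof. Assume that $\Sigma(S^*_1) \le \Sigma(S^*_2)$. Note that sets $S^*_1, S^*_2$ are also the optimal solution of the Partition problem on input $S^* = S^*_1 \cup S^*_2$. By running a $(1 - \varepsilon')$-approximate Partition algorithm on input set $S^*$, we obtain the sets $S_1, S_2$ with $\Sigma(S_1) \le \Sigma(S_2)$, where $S_1 = S_p$ and $S_2 = \overline{S_p} = S^* \setminus S_p$. Since $\Sigma(S_1) \ge (1 - \varepsilon') \cdot \Sigma(S^*_1)$, it follows that $\Sigma(S_2) = \Sigma(S^*) - \Sigma(S_1) \le \Sigma(S^*_2) + \varepsilon' \cdot \Sigma(S^*_1) \le (1 + \varepsilon') \cdot \Sigma(S^*_2)$. Then,
$$\frac{\Sigma(S_2)}{\Sigma(S_1)} \le \frac{(1 + \varepsilon') \cdot \Sigma(S^*_2)}{(1 - \varepsilon') \cdot \Sigma(S^*_1)} \le (1 + \varepsilon) \cdot r_{opt},$$
where we used the fact that $(1 + \varepsilon')/(1 - \varepsilon') \le 1 + \varepsilon$.
Therefore, we have proved that when the optimal solution consists of sets comprised of only large elements, it is possible to find a $(1 + \varepsilon)$-approximation solution for the constrained Subset Sum Ratio problem by running a $(1 - \varepsilon')$-approximation algorithm for Partition with input the union of said large elements. In order to do so, it suffices to consider as input all the $2^{|L|-1}$ subsets of $L$ containing $a_n$ and each time run a $(1 - \varepsilon')$-approximation Partition algorithm. The total cost of this procedure will be thoroughly analyzed in Section 5 and depends on the algorithm used.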
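The loop over subsets of large elements containing $a_n$ can be sketched as below. This is only an illustration under stated assumptions: `exact_partition` is a hypothetical brute-force stand-in for whichever exact or approximate Partition routine is plugged in, and `best_ratio_over_large_subsets` is likewise an invented name. Elements are assumed to be distinct positive integers.

```python
from fractions import Fraction
from itertools import combinations

def exact_partition(items):
    """Brute-force Partition stand-in: returns (S1, S2) with sum(S1) <= sum(S2)
    and sum(S1) as close to half of the total as possible."""
    total = sum(items)
    best = ()
    for k in range(len(items) + 1):
        for cand in combinations(items, k):
            if sum(best) < sum(cand) and 2 * sum(cand) <= total:
                best = cand
    rest = [x for x in items if x not in best]  # valid because elements are distinct
    return list(best), rest

def best_ratio_over_large_subsets(large):
    """Run the Partition routine on every subset of `large` that contains a_n."""
    a_n = max(large)
    others = [x for x in large if x != a_n]
    best = None
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            s1, s2 = exact_partition([a_n, *extra])
            if not s1:
                continue  # a single element cannot be split into two non-empty sets
            r = Fraction(sum(s2), sum(s1))
            if best is None or r < best[0]:
                best = (r, s1, s2)
    return best
```

For large elements {6, 7, 10} the best split found is {10} against {6, 7}, of ratio 13/10. Replacing `exact_partition` with a faster routine changes only the cost of each call, which is the point made in the text about improvements carrying over.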
It is important to note that by utilizing an (exact or approximation) algorithm for Partition, we establish a connection between the complexities of Partition and approximating Subset Sum Ratio in a way that any future improvement in the first carries over to the second.

General (1 + ε)-Approximation Solutions
Whereas we previously considered optimal solutions involving exclusively large elements, here we will search for approximations for those optimal solutions that use all the elements of the input set, hence include small elements, and satisfy our constraint (i.e. $a_n$ belongs to the optimal solution sets). We will prove that in order to approximate those optimal solutions, it suffices to consider only the $(1 - \varepsilon')$-apx solutions of the Partition problem corresponding to each subset of large elements and add small elements to them. In other words, instead of considering any two random disjoint subsets consisting of large elements and subsequently adding to these the small elements, we can consider only the $(1 - \varepsilon')$-approximate solutions to the Partition problem computed in the previous step, ergo, at most $B = n/\varepsilon^2$ configurations regarding the large elements. Moreover, we will prove that it suffices to add the small elements to our solution in a greedy way.
Since the algorithm has not detected a solution so far, due to Lemma 1 every computed subset sum of set $L$ belongs to a different bin. Thus, their total number is bounded by the number of bins $B$, i.e. $2^{|L|} \le n/\varepsilon^2 \implies |L| \le \log(n/\varepsilon^2)$.
We proceed by additionally involving small elements into our solutions in order to reduce the difference between the sums of the sets, thus reducing their ratio.
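One way to picture the greedy step is the sketch below. It is an assumption made for illustration, not the exact procedure of the paper: the name `add_small_elements` is hypothetical, and the rule used here (append the longest prefix of the small elements that does not overshoot the gap, located by binary search over prefix sums) is only one plausible reading of adding small elements greedily.

```python
from bisect import bisect_right
from itertools import accumulate

def add_small_elements(s1, s2, small):
    """Given disjoint s1, s2 with sum(s1) <= sum(s2), extend s1 by a prefix of `small`."""
    gap = sum(s2) - sum(s1)
    prefix = list(accumulate(small))   # prefix[i] = small[0] + ... + small[i]
    k = bisect_right(prefix, gap)      # number of leading small elements that fit in the gap
    return s1 + small[:k], s2
```

In this sketch the prefix sums are recomputed on every call to keep it self-contained; computing them once and reusing them would leave only the binary search per configuration.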
Lemma 3. Assume that we are given the $(1 - \varepsilon')$-apx solutions for the Partition problem on every subset of large elements containing $a_n$. Then, a $(1 + \varepsilon)$-approximation solution for the constrained version of Subset Sum Ratio can be found, when the optimal solution involves small elements.
Proof. Let $S^*_1, S^*_2$ be disjoint subsets that form an optimal solution for the constrained version of Subset Sum Ratio, where $S^*_1 = L^*_1 \cup M^*_1$ and $S^*_2 = L^*_2 \cup M^*_2$, with $L^*_1, L^*_2 \subseteq L$ and $M^*_1, M^*_2 \subseteq M$. Note that, due to Lemma 1, it holds that $\Sigma(L^*_1) \ne \Sigma(L^*_2)$. Moreover, let $L^*_p$ and $\overline{L^*_p}$ be the optimal solution of the Partition problem on input $L^*_1 \cup L^*_2$, while $L_p$ and $\overline{L_p}$ be the sets returned by a $(1 - \varepsilon')$-apx algorithm.

Complexity
The total complexity of the final algorithm is determined by three distinct operations, over the $n$ iterations of the algorithm:
1. The cost to compute all the possible subset sums occurring from large elements. It suffices to consider the case where this is bounded by the number of bins $B = n/\varepsilon^2$, due to Lemma 1.
2. The cost to compute a $(1 - \varepsilon')$-apx solution for Partition on each subset of large elements. The cost of this operation will be analyzed in the following subsection.
3. The cost to include small elements to the $(1 - \varepsilon')$-apx solutions for Partition. There are $B$ such solutions, and each requires $O(\log n)$ time, thus the total time required is $O\left(\frac{n}{\varepsilon^2} \cdot \log n\right)$.

Complexity of Partition Computations
Using Exact Partition Computations. Firstly, we will consider the case where we compute the optimal solution of the Partition problem. In order to do so, we will use the standard meet in the middle algorithm [17] where we used the Binomial Theorem. Consequently, the complexity to solve the Partition problem for all the subsets of large elements is
Using Approximate Partition Computations. Here we will analyze the complexity in the case we run an approximate Partition algorithm in order to compute the $(1 - \varepsilon')$-approximation solutions.
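For subset $L' \subseteq L$, we run an approximate Partition algorithm with error margin $\varepsilon'$ such that
$$\frac{1 + \varepsilon'}{1 - \varepsilon'} \le 1 + \varepsilon \iff \varepsilon' \le \frac{\varepsilon}{2 + \varepsilon}$$
and by choosing the maximum such $\varepsilon'$, it holds that
$$\varepsilon' = \frac{\varepsilon}{2 + \varepsilon} \implies \frac{1}{\varepsilon'} = O\left(\frac{1}{\varepsilon}\right).$$
Since there are at most $n/\varepsilon^2$ subsets of large elements, we will need to run said algorithm at most $n/\varepsilon^2$ times on $|L'| \le |L|$ elements and with error margin $\varepsilon'$. Note that any approximate Subset Sum algorithm could be used in order to approximate Partition, such as the one presented by Kellerer et al. [20] of complexity $O\left(\min\left\{\frac{n}{\varepsilon}, n + \frac{1}{\varepsilon^2} \cdot \log(1/\varepsilon)\right\}\right)$. In our case, with $|L| = \log(n/\varepsilon^2)$ and error margin $\varepsilon'$, the total complexity is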
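The requirement $(1 + \varepsilon')/(1 - \varepsilon') \le 1 + \varepsilon$ rearranges to $\varepsilon' \le \varepsilon/(2 + \varepsilon)$, and the largest admissible value meets the bound with equality. The short check below confirms this with exact rational arithmetic; the helper name `eps_prime` is hypothetical.

```python
from fractions import Fraction

def eps_prime(eps):
    """Largest eps' satisfying (1 + eps') / (1 - eps') <= 1 + eps."""
    return eps / (2 + eps)

# Sanity check on a few error margins in (0, 1): the bound is met with equality.
for eps in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    e = eps_prime(eps)
    assert 0 < e < 1
    assert (1 + e) / (1 - e) == 1 + eps
```

Since $\varepsilon < 1$, this choice gives $1/\varepsilon' = (2 + \varepsilon)/\varepsilon < 3/\varepsilon$, so replacing $\varepsilon$ by $\varepsilon'$ costs only a constant factor.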

Total Complexity
The total complexity of the algorithm occurs from the $n$ distinct iterations required and depends on the algorithm chosen to find the (exact or approximate) solution to the Partition problem, since all of the presented algorithms dominate the time of the rest of the operations. Thus, by choosing the fastest one (depending on the relationship between $n$ and $\varepsilon$), the final complexity is $O\left(\min\left\{\frac{n^{2.3}}{\varepsilon^{2.6}} \cdot \log(n/\varepsilon^2), \ldots\right\}\right)$

Conclusion and Future Work
The main contribution of this paper, apart from the introduction of a new FPTAS for the Subset Sum Ratio problem, is the establishment of a connection between Partition and approximating Subset Sum Ratio. In particular, we showed that any improvement over the classic meet in the middle algorithm [17], or over the approximation scheme for Partition will result in an improved FPTAS for Subset Sum Ratio.
Additionally, we establish that the complexity of approximating Subset Sum Ratio, expressed in the form $O((n + 1/\varepsilon)^c)$, has an exponent $c < 5$, which is an improvement over all the previously presented FPTASs for the problem.
It is important to note however, that there is a distinct limit to the complexity that one may achieve for the Subset Sum Ratio problem using the techniques discussed in this paper.
As a direction for future research, we consider the use of exact Subset Sum or Partition algorithms parameterized by a concentration parameter $\beta$, as described in [4,5], where they solve the decision version of Subset Sum. See also [16] for a use of this parameter under a pseudopolynomial setting. It would be interesting to investigate whether analogous arguments could be used to solve the optimization version.

Fig. 1. Split of the interval $[0, n \cdot a_n]$ to bins of size $l$.
Algorithm 1 (excerpt; the bins are of size $\varepsilon^2 \cdot a_n$).
  while filling the bins with the subset sums of L do
    if two subset sums correspond to the same bin then
      return an apx solution based on these.    ▷ $O(n/\varepsilon^2)$ complexity.
  for each subset of large elements containing $a_n$ do    ▷ $O(n/\varepsilon^2)$ subsets.
for Subset Sum, and in the following we analyze its complexity. Let subset $L' \subseteq L$ such that $|L'| = k$. The meet in the middle algorithm on the set $L'$ costs time $O(2^{|L'|/2} \cdot |L'|)$. Notice that the number of subsets of $L$ of cardinality $k$ is $\binom{|L|}{k}$.
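Summing the meet in the middle cost over all cardinalities can be sketched as follows. This is a reconstruction under the assumptions that the per-subset cost is $2^{k/2} \cdot k$ for a subset of cardinality $k$, that $|L| \le \log(n/\varepsilon^2)$, and that logarithms are taken in base 2; constants are omitted.

```latex
\sum_{k=0}^{|L|} \binom{|L|}{k} \, 2^{k/2} \, k
  \;\le\; |L| \sum_{k=0}^{|L|} \binom{|L|}{k} \left(\sqrt{2}\right)^{k}
  \;=\; |L| \left(1 + \sqrt{2}\right)^{|L|}
  \;\le\; \log(n/\varepsilon^2) \cdot \left(\frac{n}{\varepsilon^2}\right)^{\log(1 + \sqrt{2})}
  \;\approx\; \log(n/\varepsilon^2) \cdot \left(\frac{n}{\varepsilon^2}\right)^{1.272}
```

The equality in the middle is the Binomial Theorem applied to $(1 + \sqrt{2})^{|L|}$. Multiplying by the $n$ iterations of the algorithm gives roughly $n^{2.27}/\varepsilon^{2.54}$ up to the logarithmic factor, which is in line with the exponents 2.3 and 2.6 quoted for the total complexity.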