1 Introduction

One of Karp’s 21 NP-complete problems [20], Subset Sum has seen astounding progress over the last few years. Koiliaris and Xu [23], Bringmann [10] and Jin and Wu [19] have presented pseudopolynomial algorithms resulting in substantial improvements over the long-standing standard approach of Bellman [7] and the improvement by Pisinger [31]. Moreover, the latter two algorithms [10, 19] match the SETH-based lower bounds proved in [1]. Additionally, recently there has been progress in the approximation scheme of Subset Sum, the first such improvement in over 20 years, with a new algorithm introduced by Bringmann and Nakos [8], as well as corresponding lower bounds obtained through the lens of fine-grained complexity.

A thoroughly studied special case of Subset Sum is the Partition problem, which asks for a partition of the input set into two subsets such that the difference of their sums is minimized. Any algorithm solving the former applies to the latter, though recent progress [8, 16, 28] has shown that Partition may be solved more efficiently in the approximation setting. On the other hand, regarding exact solutions, no better algorithm has been developed, and therefore Subset Sum algorithms remain the state of the art.

The Equal Subset Sum problem, which, given an input set, asks for two disjoint subsets of equal sum, is closely related to Subset Sum and Partition. It finds applications in multiple different fields, ranging from computational biology [12, 13] and computational social choice [24], to cryptography [32], to name a few. In addition, it is related to important theoretical concepts such as the complexity of search problems in the class TFNP [30].

The centerpiece of this paper is the Subset Sum Ratio problem, the optimization version of Equal Subset Sum, which asks, given an input set \(S \subseteq \mathbb {N}\), for two disjoint subsets \(S_1, S_2 \subseteq S\), such that the following ratio is minimized

$$\begin{aligned} \frac{\max \left\{ \sum _{s_i \in S_1} s_i, \sum _{s_j \in S_2} s_j \right\} }{\min \left\{ \sum _{s_i \in S_1} s_i, \sum _{s_j \in S_2} s_j \right\} }. \end{aligned}$$

This problem is known to be NP-hard, and many FPTASes have been proposed over the years [6, 25, 29], all of which rely on some kind of scaling of the input elements. The current state of the art [25] achieves a running time of \(\mathcal{O}(n^4/\varepsilon )\), leaving a significant gap in comparison with known approximation algorithms for the closely related Subset Sum and Partition problems, especially with respect to n. This leads to the natural question of whether we can improve this performance and achieve an FPTAS with a running time \(\mathcal{O}(n^{c_1} / \varepsilon ^{c_2})\), where either \(c_1 < 4\) or \(c_1 + c_2<5\). We answer this question in the affirmative for both conditions, by presenting a novel approximation scheme which utilizes exact or approximate Partition algorithms and achieves running time \(\tilde{\mathcal{O}}(n^{2.3}/\varepsilon ^{2.6})\) or \(\tilde{\mathcal{O}}(n^{2}/\varepsilon ^3)\), respectively. Our proposed algorithm differs significantly from previous approaches and is the first to connect these closely related problems.

1.1 Related work

Equal Subset Sum, as well as its optimization version called Subset Sum Ratio [6], is closely related to problems appearing in many scientific areas. Some examples include the Partial Digest problem, which comes from computational biology [12, 13], the allocation of individual goods [24], tournament construction [22], and a variation of Subset Sum, called Multiple Integrated Sets SSP, which finds applications in the field of cryptography [32]. Furthermore, it is related to important concepts in theoretical computer science; for example, a restricted version of Equal Subset Sum lies in a subclass of the complexity class \(\textsf{TFNP}\), namely in \(\textsf{PPP}\) [30], a class consisting of search problems that always have a solution due to some pigeonhole argument, and no polynomial time algorithm is known for this restricted version.

Equal Subset Sum has been proven NP-hard by Woeginger and Yu [33] (see also the full version of [27] for an alternative proof) and several variations have been proven NP-hard by Cieliebak et al. [11, 14]. A 1.324-approximation algorithm has been proposed for Subset Sum Ratio in [33] and several FPTASes appeared in [6, 25, 29], the fastest so far being the one in [25] of complexity \(\mathcal{O}(n^4/\varepsilon )\), the complexity of which also applies to various meaningful special cases, as shown in [26].

As far as exact algorithms are concerned, recent progress has shown that Equal Subset Sum can be solved probabilistically in \(\mathcal{O}^{\star }(1.7088^n)\) time [27], faster than a standard “meet-in-the-middle” approach yielding an \(\mathcal{O}^{\star }(3^{n/2}) \le \mathcal{O}^{\star }(1.7321^n)\) time algorithm.

These problems are tightly connected to Subset Sum, which has seen impressive advances recently, due to Koiliaris and Xu [23] who gave a deterministic \(\tilde{\mathcal{O}}(\sqrt{n}t)\) algorithm, where n is the number of input elements and t is the target, and Bringmann [10] who gave a \(\tilde{\mathcal{O}}(n + t)\) randomized algorithm, which is essentially optimal under SETH [1]. See also [3] for an extension of these algorithms to a more general setting. Jin and Wu subsequently proposed a simpler randomized algorithm [19] achieving the same bounds as [10], which however seems to only solve the decision version of the problem. Recently, Bringmann and Nakos [9] have presented an \(\mathcal{O}\left( |\mathcal {S}_t(Z) |^{4/3} \textrm{poly}(\log t) \right) \) algorithm, where \(\mathcal {S}_t(Z)\) is the set of all subset sums of the input set Z that are smaller than t, based on top-k convolution.

Partition shares the complexity of Subset Sum regarding exact solutions, where the meet-in-the-middle approach [18] from the 1970s remains the state of the art as far as algorithms dependent on n are concerned. On the other hand, one can approximate Partition more efficiently than Subset Sum, assuming the min-plus convolution conjecture [15] holds. In particular, Bringmann and Nakos [8] have presented the first improvement in approximating Subset Sum in over 20 years, since the scheme of [21] had remained the state of the art. Moreover, in their paper they have shown that developing a significantly better algorithm would contradict said conjecture. Furthermore, they develop an approximation scheme for Partition utilizing min-plus convolution computations, improving upon the recent work of Mucha et al. [28] and circumventing the lower bounds established for Subset Sum in their work. Very recently, Deng, Jin and Mao [16] presented an even faster approximation algorithm for Partition, further widening the gap between the complexities of the two problems in the approximation setting.

1.2 Our contribution

We present a novel approximation scheme for the Subset Sum Ratio problem, which, depending on the relationship between n and \(\varepsilon \), improves upon the best existing approximation scheme of [25]. Our algorithm significantly differs from previous approaches, which in most cases rely on some kind of scaling of the input elements, and instead makes use of either exact or approximation algorithms for Partition. In particular, we first partition the input elements into small and large, and then prove that we can either easily find an approximate solution involving only large elements or there are at most \(\log (n / \varepsilon ^2)\) of them. In the latter case, in order to approximate Subset Sum Ratio it suffices to solve instances of Partition on all the subsets of large elements, i.e., polynomially many instances, each of size at most \(\log (n / \varepsilon ^2)\). By leveraging known Partition algorithms in the second case, we manage to improve upon previous FPTASes. In the case of exact computations, we show that by employing such a Partition algorithm of complexity \(\mathcal{O}^{\star }(2^{\alpha n})\), our proposed scheme runs in time \(\tilde{\mathcal{O}}( n \cdot (n / \varepsilon ^2)^{\log (1 + 2^\alpha )})\), for some constant \(\alpha > 0\). It is already known that such an algorithm exists for \(\alpha = 1/2\) [18], and any further improvements will positively affect our FPTAS. On the other hand, using the approximation algorithm of Kellerer et al. [21] we achieve a running time of \(\tilde{\mathcal{O}}(n^2 / \varepsilon ^3)\), while any improvement over it (e.g., [8, 16]) will only affect polylogarithmic factors of our scheme, as is further discussed in Sect. 5.

We start by presenting some necessary background in Sect. 2. Afterward, in Sect. 3 we introduce an FPTAS for a restricted version of the problem. Then, in Sect. 4, we explain how to make use of the algorithm presented in Sect. 3, in order to obtain an approximation scheme for the Subset Sum Ratio problem. The complexity of the final scheme is thoroughly analyzed in Sect. 5, followed by some possible directions for future research in Sect. 6.

Prior work. In the current paper we improve upon the results of the preliminary version [2], by using approximate and exact Partition algorithms instead of Subset Sum computations.

2 Preliminaries

Let, for \(x \in \mathbb {N}\), \([x] = \left\{ z \in \mathbb {N} \mid 1 \le z \le x \right\} \) denote the set of integers in the interval [1, x]. Given a set \(S \subseteq \mathbb {N}\), denote its largest element by \(\max (S)\) and the sum of its elements by \(\Sigma (S) = \sum _{s \in S} s\). If we are additionally given a value \(\varepsilon \in (0, 1)\), define the following partition of its elements:

  • The set of its large elements as \(L(S, \varepsilon ) = \left\{ s \in S \mid s \ge \varepsilon \cdot \max (S) \right\} \). Note that \(\max (S) \in L(S, \varepsilon )\), for any \(\varepsilon \in (0, 1)\).

  • The set of its small elements as \(M(S, \varepsilon ) = \left\{ s \in S \mid s < \varepsilon \cdot \max (S) \right\} \).

In the following, since the values of the associated parameters will be clear from the context, they will be omitted and we will refer to these sets simply as L and M.
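In code, the partition of this section amounts to a single threshold test. A minimal Python sketch (the function name is ours, purely illustrative):

```python
# Split a set S into large elements L (>= eps * max(S)) and small elements M,
# as defined in the preliminaries. Illustrative helper, not from the paper.

def split_large_small(S, eps):
    """Return (L, M): the large and small elements of S for margin eps."""
    threshold = eps * max(S)
    L = {s for s in S if s >= threshold}   # note: max(S) always lands in L
    M = {s for s in S if s < threshold}
    return L, M
```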

Definition 1

(Partition) Given a set X, compute a subset \(X^*_p \subseteq X\), such that \(\Sigma (X^*_p) = \max \left\{ \Sigma (Z) \mid Z \subseteq X, \Sigma (Z) \le \Sigma (X) / 2 \right\} \). Moreover, let \(\overline{X^*_p} = X {\setminus } X^*_p\).

Definition 2

(Approximate Partition, from [28]) Given a set X and error margin \(\varepsilon \), compute a subset \(X_p \subseteq X\) such that \((1 - \varepsilon ) \cdot \Sigma (X^*_p) \le \Sigma (X_p) \le \Sigma (X^*_p)\). Moreover, let \(\overline{X_p} = X \setminus X_p\).
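Since the Partition instances solved later contain only \(|{L}| \le \log (n / \varepsilon ^2)\) elements, Definition 1 can be made executable by plain enumeration. A brute-force Python sketch (this is just the specification of Definition 1, not the algorithm used in the complexity analysis):

```python
from itertools import combinations

# Exact Partition by enumeration: over all subsets of X, keep the one whose
# sum is largest while not exceeding Sigma(X)/2. Only viable for the tiny
# (logarithmic-size) instances that arise in the scheme.

def exact_partition(X):
    """Return (X_p, complement) with Sigma(X_p) maximal subject to <= Sigma(X)/2."""
    X = list(X)
    half = sum(X) / 2
    best = []
    for r in range(len(X) + 1):
        for subset in combinations(X, r):
            if sum(best) < sum(subset) <= half:
                best = list(subset)
    comp = list(X)
    for x in best:                         # complement of X_p inside X
        comp.remove(x)
    return best, comp
```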

3 Scheme for a restricted version

In this section, we present an FPTAS for the constrained version of the Subset Sum Ratio problem where we are only interested in approximating solutions that involve the largest element of the input set. In other words, one of the subsets of the optimal solution contains \(\max (A) = a_n\) (assuming that \(A = \left\{ a_1, \ldots , a_n \right\} \) is the sorted input set); let \(r_{\text { opt}}\) denote the subset sum ratio of such an optimal solution. Our FPTAS will return a solution of ratio r, such that \(1 \le r \le (1 + \varepsilon ) \cdot r_{\text { opt}}\), for a given error margin \(\varepsilon \in (0, 1)\); however, we allow that the sets of the returned solution do not necessarily satisfy the aforementioned constraint (i.e., \(a_n\) may not be involved in the approximate solution).

3.1 Outline of the algorithm

We now present a rough outline of Algorithm 1:

  • At first, we search for approximate solutions involving exclusively large elements from \(L(A,\varepsilon )\).

  • To this end, we produce the subset sums formed by these large elements. If their number exceeds \(n / \varepsilon ^2\), then we can find an approximate solution.

  • Otherwise, there are at most \(n / \varepsilon ^2\) subsets of large elements. In this case, we can find a solution by running an exact or an approximate Partition algorithm for each subset.

  • In the case that the optimal solution involves small elements, we show that it suffices to add elements of \(M(A, \varepsilon )\) in a greedy way.

Algorithm 1 ConstrainedSSR(\(A, \varepsilon , T\))

3.2 Solution involving exclusively large elements

We first search for a \((1 + \varepsilon )\)-approximate solution, with \(\varepsilon \in (0,1)\), without involving any of the elements that are smaller than \(\varepsilon \cdot a_n\). Let \(M = \left\{ a_i \in A \mid a_i < \varepsilon \cdot a_n \right\} \) be the set of small elements and \(L = A {\setminus } M = \left\{ a_i \in A \mid a_i \ge \varepsilon \cdot a_n \right\} \) be the set of large elements.

After partitioning the input set, we split the interval \([0, n \cdot a_n]\) into smaller intervals, called bins, of size \(l = \varepsilon ^2 \cdot a_n\) each, as depicted in Fig. 1.

Fig. 1 Split of the interval \([ 0, n \cdot a_n ]\) into bins of size l

Thus, there are a total of \(B = n / \varepsilon ^2\) bins. Notice that each possible subset of the input set will belong to a respective bin constructed this way, depending on its sum. Additionally, if two sets correspond to the same bin, then the difference of their subset sums will be at most l.
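The bin-collision test underlying Lemma 1 can be sketched as follows; subsets are represented as tuples, and the names are ours:

```python
# Map each generated subset sum to the bin floor(sum / l), where
# l = eps^2 * a_n. Two distinct subsets landing in the same bin immediately
# yield a (1 + eps)-approximate solution (cf. Lemma 1).

def find_bin_collision(subsets, eps, a_n):
    """Return two subsets whose sums fall in the same bin, or None."""
    l = eps ** 2 * a_n                     # bin width
    seen = {}                              # bin index -> subset
    for S in subsets:
        b = int(sum(S) // l)
        if b in seen and seen[b] != S:
            return seen[b], S              # same bin: approximate solution found
        seen[b] = S
    return None
```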

The next step of our algorithm is to generate all the possible subset sums arising from the set of large elements L. The complexity of this procedure is \(\mathcal{O}\left( 2^{|{L}|} \right) \), where \(|{L}|\) is the cardinality of L. Notice, however, that it is possible to bound the number of produced subset sums by the number of bins B, since if two sums belong to the same bin they constitute a solution, as shown in Lemma 1, in which case the algorithm terminates in time \(\mathcal{O}(n / \varepsilon ^2)\).

Lemma 1

If two subsets correspond to the same bin, we can find a \((1 + \varepsilon )\)-approximation solution.

Proof

Suppose there exist two sets \(L_1, L_2 \subseteq L\) whose sums correspond to the same bin, with \(\Sigma (L_1) \le \Sigma (L_2)\). Notice that there is no guarantee regarding the disjointness of said subsets, thus consider \(L'_1 = L_1 \setminus L_2\) and \(L'_2 = L_2 {\setminus } L_1\), for which it is obvious that \(\Sigma (L_1') \le \Sigma (L_2')\). Additionally, assume that \(L'_1 \ne \emptyset \). Then it holds that

$$\begin{aligned} \Sigma (L_2') - \Sigma (L_1') = \Sigma (L_2) - \Sigma (L_1) \le l. \end{aligned}$$

Therefore, the sets \(L'_1\) and \(L'_2\) constitute a \((1+\varepsilon )\)-approximation solution, since

$$\begin{aligned} \frac{\Sigma (L_2')}{\Sigma (L_1')}&\le \frac{\Sigma (L_1') + l}{\Sigma (L_1')} = 1 + \frac{l}{\Sigma (L_1')}\\&\le 1 + \frac{\varepsilon ^2 \cdot a_n}{\varepsilon \cdot a_n} = 1 + \varepsilon \end{aligned}$$

where the last inequality is due to the fact that \(L_1' \subseteq L\) is composed of elements \(\ge \varepsilon \cdot a_n\), thus \(\Sigma (L_1') \ge \varepsilon \cdot a_n\).

It remains to show that \(L'_1 \ne \emptyset \). Assume that \(L'_1 = \emptyset \). This implies that \(L_1 \subseteq L_2\) and since we consider each subset of L only once and the input is a set and not a multiset, it holds that \(L_1 \subset L_2 \implies L'_2 \ne \emptyset \). Since \(L_1\) and \(L_2\) correspond to the same bin, it holds that

$$\begin{aligned} \Sigma (L_2) - \Sigma (L_1) \le l \implies \Sigma (L_2') - \Sigma (L_1') \le l \implies \Sigma (L'_2) \le l \end{aligned}$$

which is a contradiction, since \(L'_2\) is a non-empty subset of L, which is comprised of elements greater than or equal to \(\varepsilon \cdot a_n\); hence \(\Sigma (L'_2) \ge \varepsilon \cdot a_n > \varepsilon ^2 \cdot a_n = l\), since \(\varepsilon < 1\). \(\square \)

Consider an \(\varepsilon '\) such that \((1 + \varepsilon ')/(1 - \varepsilon ') \le 1 + \varepsilon \) for the given \(\varepsilon \in (0, 1)\) (the exact value of \(\varepsilon '\) will be computed in Sect. 5).

If every produced subset sum of the previous step belongs to a distinct bin, then we can infer that the number of subsets of large elements is bounded by \(n / \varepsilon ^2\). Moreover, we can prove the following lemma.

Lemma 2

If the optimal ratio \(r_{\text { opt}}\) involves sets \(S^*_1, S^*_2\) consisting of only large elements, with \(S^*_1 \cup S^*_2 = S^* \subseteq L\) and \(a_n \in S^*\), then \(\Sigma (\overline{S_p}) / \Sigma (S_p) \le (1 + \varepsilon ) \cdot r_{\text { opt}}\), where \(S_p\) is a \((1 - \varepsilon ')\)-apx solution to the Partition problem on input \(S^*\).

Proof

Assume that \(\Sigma (S^*_1) \le \Sigma (S^*_2)\). Note that sets \(S_1^*, S_2^*\) are also the optimal solution of the Partition problem on input \(S^*\). By running a \((1 - \varepsilon ')\) approximate Partition algorithm on input set \(S^*\), we obtain the sets \(S_1, S_2\) with \(\Sigma (S_1) \le \Sigma (S_2)\), where \(S_1 = S_p\) and \(S_2 = \overline{S_p}\). Then,

$$\begin{aligned} \frac{\Sigma (S_2)}{\Sigma (S_1)}&\le \frac{\Sigma (S^*_2) + \varepsilon ' \cdot \Sigma (S^*_1)}{(1 - \varepsilon ') \Sigma (S^*_1)}\\&\le \frac{\Sigma (S^*_2) + \varepsilon ' \cdot \Sigma (S^*_2)}{(1 - \varepsilon ') \Sigma (S^*_1)}\\&= \frac{1 + \varepsilon '}{1 - \varepsilon '} \cdot \frac{\Sigma (S^*_2)}{\Sigma (S^*_1)}\\&\le (1 + \varepsilon ) \cdot r_{\text { opt}} \end{aligned}$$

where we used the fact that \((1 - \varepsilon ') \cdot \Sigma (S^*_1) \le \Sigma (S_1)\) as well as \(\Sigma (S_2) \le \Sigma (S^*_2) + \varepsilon ' \cdot \Sigma (S^*_1)\). \(\square \)

Therefore, we have proved that when the optimal solution consists of sets comprised of only large elements, it is possible to find a (\(1 + \varepsilon \))-approximation solution for the constrained Subset Sum Ratio problem by running a \((1 - \varepsilon ')\)-approximation algorithm for Partition with input the union of said large elements. In order to do so, it suffices to consider as input all the \(2^{|{L}|-1}\) subsets of L containing \(a_n\) and each time run a \((1 - \varepsilon ')\)-approximation Partition algorithm. The total cost of this procedure will be thoroughly analyzed in Sect. 5 and depends on the algorithm used.

It is important to note that by utilizing an (exact or approximation) algorithm for Partition, we establish a connection between the complexities of Partition and approximating Subset Sum Ratio in a way that any future improvement in the first carries over to the second.

3.3 General \((1+\varepsilon )\)-approximate solutions

Whereas we previously considered optimal solutions involving exclusively large elements, here we will search for approximations for those optimal solutions that also include small elements of the input set and satisfy our constraint (i.e., \(a_n\) belongs to the optimal solution sets). We will prove that in order to approximate those optimal solutions, it suffices to consider only the \((1 - \varepsilon ')\)-apx solutions of the Partition problem corresponding to each subset of large elements and add small elements to them. In other words, instead of considering any two disjoint subsets consisting of large elements and subsequently adding the small elements to them, we can consider only the \((1 - \varepsilon ')\)-approximate solutions to the Partition problem computed in the previous step, ergo, at most \(B = n / \varepsilon ^2\) configurations regarding the large elements. Moreover, we will prove that it suffices to add the small elements to our solution in a greedy way.

Since the algorithm has not detected a solution so far, due to Lemma 1 every computed subset sum of set L belongs to a different bin. Thus, their total number is bounded by the number of bins B, i.e.

$$\begin{aligned} 2^{|{L}|} \le \left( \frac{n}{\varepsilon ^2} \right) \iff |{L}| \le \log \left( \frac{n}{\varepsilon ^2} \right) \end{aligned}$$

We proceed by additionally involving small elements into our solutions in order to reduce the difference between the sums of the sets, thus reducing their ratio.

Lemma 3

Assume that we are given the \((1 - \varepsilon ')\)-apx solutions for the Partition problem on every subset of large elements containing \(a_n\). Then, a \((1 + \varepsilon )\)-approximation solution for the constrained version of Subset Sum Ratio can be found, when the optimal solution involves small elements.

Proof

Let \(S^*_1,S^*_2\) be disjoint subsets that form an optimal solution for the constrained version of Subset Sum Ratio, where:

  • \(\Sigma (S^*_1) \le \Sigma (S^*_2)\) and \(a_n \in S^* = S^*_1 \cup S^*_2\).

  • \(S^*_1 = L^*_1 \cup M^*_1\) and \(S^*_2 = L^*_2 \cup M^*_2\), where \(L^*_1, L^*_2 \subseteq L\) and \(M^*_1, M^*_2 \subseteq M\).

  • \(M^*_1 \cup M^*_2 \ne \emptyset \).

Moreover, let \(L^*_p\) and \(\overline{L^*_p}\) be the optimal solution of the Partition problem on input \(L^* = L_1^* \cup L_2^*\), while \(L_p\) and \(\overline{L_p}\) be the sets returned by a \((1 - \varepsilon ')\)-apx algorithm. Then, it holds that:

  • \(\Sigma (L^*_p) \le \Sigma (\overline{L^*_p})\) and \(\Sigma (\overline{L^*_p}) - \Sigma (L^*_p) \le |{\Sigma (L^* {\setminus } X) - \Sigma (X)}|, \forall X \subseteq L^*\).

  • \((1 - \varepsilon ') \cdot \Sigma (L^*_p) \le \Sigma (L_p) \le \Sigma (L^*_p)\).

  • \(\Sigma (\overline{L^*_p}) \le \Sigma (\overline{L_p}) \le \Sigma (\overline{L^*_p}) + \varepsilon ' \cdot \Sigma (L^*_p) \le (1 + \varepsilon ') \cdot \Sigma (\overline{L^*_p})\).

  • \(a_n \le \Sigma (\overline{L^*_p})\), since \(a_n \in L^*\).

Case 1. Suppose that \(\Sigma (L_p) + \Sigma (M) \ge \Sigma (\overline{L_p})\). In this case, there exists k such that \(M_k = \left\{ a_i \in M \mid i \in [k] \right\} \subseteq M\) and \(0 \le \Sigma (L_p \cup M_k) - \Sigma (\overline{L_p}) \le \varepsilon \cdot a_n\), since all elements of M have value less than \(\varepsilon \cdot a_n\). Hence,

$$\begin{aligned} 1 \le \frac{\Sigma (L_p \cup M_k)}{\Sigma (\overline{L_p})} \le 1 + \frac{\varepsilon \cdot a_n}{\Sigma (\overline{L_p})} \le 1 + \frac{\varepsilon \cdot a_n}{a_n} = 1 + \varepsilon . \end{aligned}$$

Case 2. Alternatively, it holds that \(\Sigma (L_p) + \Sigma (M) < \Sigma (\overline{L_p})\). Then,

$$\begin{aligned} \frac{\Sigma (\overline{L_p})}{\Sigma (L_p \cup M)}&= \frac{\Sigma (\overline{L_p})}{\Sigma (L_p) + \Sigma (M)}\\&\le \frac{(1 + \varepsilon ') \cdot \Sigma (\overline{L^*_p})}{ (1 - \varepsilon ') \cdot \Sigma (L^*_p) + \Sigma (M)}\\&\le \frac{1 + \varepsilon '}{1 - \varepsilon '} \cdot \frac{\Sigma (\overline{L^*_p})}{\Sigma (L^*_p) + \Sigma (M)}\\&\le (1 + \varepsilon ) \cdot \frac{\Sigma (\overline{L^*_p})}{\Sigma (L^*_p) + \Sigma (M)}. \end{aligned}$$

If \(\Sigma (L^*_p) + \Sigma (M) \ge \Sigma (\overline{L^*_p})\), then it follows that \(\frac{\Sigma (\overline{L_p})}{\Sigma (L_p \cup M)} \le 1 + \varepsilon \). On the other hand, if \(\Sigma (L^*_p) + \Sigma (M) < \Sigma (\overline{L^*_p})\), then it follows that \(\Sigma (S_1^*) = \Sigma (L^*_p \cup M)\) and \(\Sigma (S_2^*) = \Sigma (\overline{L^*_p})\), therefore \(\frac{\Sigma (\overline{L_p})}{\Sigma (L_p \cup M)} \le (1 + \varepsilon ) \cdot \frac{\Sigma (S^*_{2})}{\Sigma (S^*_{1})}\). \(\square \)

3.4 Adding small elements efficiently

Here, we will describe a method to efficiently add small elements to our sets. In particular, we search for some k such that \(0 \le \Sigma (L_p \cup M_k) - \Sigma (\overline{L_p}) \le \varepsilon \cdot a_n\), where \(M_k = \left\{ a_i \in M \mid i \in [k] \right\} \). Notice that if \(\Sigma (M) \ge \Sigma (\overline{L_p}) - \Sigma (L_p)\), such a set \(M_k\) always exists, since by definition each element of set M is smaller than \(\varepsilon \cdot a_n\). In order to determine \(M_k\), we make use of an array of partial sums \(T[k] = \Sigma (M_k)\), where \(k \le |{M}|\). Since T is sorted, we can find k in \(\mathcal{O}(\log |{M}|) = \mathcal{O}(\log n)\) time using binary search.
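This step can be sketched with Python's `bisect` standard module standing in for the binary search; T is the prefix-sum array with \(T[0] = 0\), and the names are ours:

```python
from bisect import bisect_left

# Given prefix sums T[k] = Sigma(M_k) of the sorted small elements, find the
# smallest k with Sigma(L_p) + T[k] >= Sigma(overline{L_p}) by binary search.

def smallest_k(T, sum_Lp, sum_Lp_bar):
    """Return the smallest k closing the gap, or None if Sigma(M) is too small."""
    diff = sum_Lp_bar - sum_Lp
    k = bisect_left(T, diff)               # first index with T[k] >= diff
    return k if k < len(T) else None
```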

4 Final algorithm

The algorithm presented in the previous section constitutes an approximation scheme for Subset Sum Ratio in the case where one of the solution subsets contains the maximum element of the input set. Thus, in order to solve the Subset Sum Ratio problem, it suffices to run the previous algorithm n times, where n denotes the cardinality of the input set A, each time removing the maximum element of A.

In particular, suppose that the optimal solution involves disjoint sets \(S_1^*\) and \(S_2^*\), where \(a_k = \max (S_1^* \cup S_2^*)\). There exists an iteration for which the algorithm considers as input the set \(A_k = \left\{ a_1, \ldots , a_k \right\} \). In this iteration, the element \(a_k\) is the largest element and the algorithm searches for an approximation of the optimal solution for which \(a_k\) is contained in one of the solution subsets. The optimal solution of the unconstrained version of Subset Sum Ratio has this property, so the ratio of the approximate solution returned by the algorithm of the previous section is at most \((1 + \varepsilon )\) times the optimal.

Consequently, n repetitions of the algorithm suffice to construct an FPTAS for Subset Sum Ratio. Notice that if at some repetition the sets returned by the algorithm of Sect. 3 have ratio at most \(1 + \varepsilon \), then this ratio successfully approximates the optimal ratio \(r_{\text { opt}} \ge 1\), since \(1 + \varepsilon \le (1 + \varepsilon ) \cdot r_{\text { opt}}\); therefore they constitute an approximate solution.
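The n repetitions can be sketched as a simple outer loop over the prefixes of the sorted input, so that each element plays the role of the maximum exactly once. Here `constrained_ssr` stands in for the (hypothetical) routine of Sect. 3 and is passed as a parameter:

```python
# Outer loop of the final scheme: run the constrained algorithm on every
# prefix A_k = {a_1, ..., a_k} of the sorted input and keep the best ratio.
# `constrained_ssr(S, eps)` is assumed to return a candidate ratio or None.

def ssr(A, eps, constrained_ssr):
    A = sorted(A)
    best = float("inf")
    for k in range(len(A), 1, -1):         # prefixes, largest element removed each time
        r = constrained_ssr(A[:k], eps)
        if r is not None:
            best = min(best, r)
    return best
```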

Algorithm 2 SSR(\(A, \varepsilon \))

5 Complexity

The total complexity of the final algorithm is determined by three distinct operations, over the n iterations of the algorithm:

  1. The cost to compute all the possible subset sums arising from large elements. It suffices to consider the case where this is bounded by the number of bins \(B = n / \varepsilon ^2\), due to Lemma 1.

  2. The cost to compute an exact or \((1 - \varepsilon ')\)-apx Partition solution on each subset of large elements. The cost of this operation is analyzed in the following subsection.

  3. The cost to include small elements in the Partition solutions. There are B such solutions, each requiring \(\mathcal{O}(\log n)\) time, and thus the total time required is \(\mathcal{O}\left( \frac{n}{\varepsilon ^2} \cdot \log n \right) \).

5.1 Complexity of partition computations

5.1.1 Using exact partition computations

First, we will consider the case where we compute the optimal solution of the Partition problem. In order to do so, we will use the standard meet-in-the-middle algorithm [18] for Subset Sum, and in the following we analyze its complexity.

Let \(L' \subseteq L\) be a subset with \(|{L'}| = k\), and suppose we are given an exact Partition algorithm of complexity \(\mathcal{O}(2^{\alpha k} \cdot k^{\beta })\), for some constants \(\alpha , \beta \). Notice that the number of subsets of L of cardinality k is \(\left( {\begin{array}{c}|{L}|\\ k\end{array}}\right) \) and that \(|{L}| \le \log (n / \varepsilon ^2)\). Then, it holds that

$$\begin{aligned} \sum _{k = 0}^{|{L}|} \left( {\begin{array}{c}|{L}|\\ k\end{array}}\right) \cdot 2^{\alpha k} \cdot k^{\beta }&\le |{L}|^{\beta } \cdot \sum _{k = 0}^{|{L}|} \left( {\begin{array}{c}|{L}|\\ k\end{array}}\right) \cdot 2^{\alpha k}\\&= |{L}|^{\beta } \cdot \left( 1 + 2^\alpha \right) ^{|{L}|}\\&= |{L}|^{\beta } \cdot 2^{|{L}| \log (1 + 2^\alpha )}\\&\le \log ^{\beta } (n / \varepsilon ^2) \cdot (n / \varepsilon ^2)^{\log (1 + 2^\alpha )} \end{aligned}$$

where we used the binomial theorem. By employing the meet-in-the-middle algorithm [18], where \(\alpha = 1/2\) and \(\beta = 1\), it follows that \(\log (1 + 2^\alpha ) = 1.271\ldots < 1.3\). Consequently, the complexity of solving the Partition problem for all the subsets of large elements is

$$\begin{aligned} \mathcal{O}\left( \frac{n^{1.3}}{\varepsilon ^{2.6}} \cdot \log (n / \varepsilon ^2) \right) = \tilde{\mathcal{O}}\left( \frac{n^{1.3}}{\varepsilon ^{2.6}} \right) \end{aligned}$$
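For intuition, the meet-in-the-middle idea [18] behind the \(\alpha = 1/2\) bound can be sketched as follows; this simplified illustration returns only the optimal Partition value, not the subsets themselves:

```python
from itertools import combinations
from bisect import bisect_right

# Meet-in-the-middle sketch: split X into halves, enumerate each half's
# 2^(k/2) subset sums, sort one side, and binary-search for the best
# combination not exceeding Sigma(X)/2.

def mim_partition_sum(X):
    """Return the largest subset sum of X that is <= Sigma(X)/2."""
    X = list(X)
    half = sum(X) / 2
    mid = len(X) // 2
    A, B = X[:mid], X[mid:]
    sums_A = [sum(c) for r in range(len(A) + 1) for c in combinations(A, r)]
    sums_B = sorted(sum(c) for r in range(len(B) + 1) for c in combinations(B, r))
    best = 0
    for a in sums_A:
        if a > half:
            continue
        i = bisect_right(sums_B, half - a) - 1   # largest b with a + b <= half
        if i >= 0:
            best = max(best, a + sums_B[i])
    return best
```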

5.1.2 Using approximate partition computations

Here we will analyze the complexity in the case we run an approximate Partition algorithm in order to compute the \((1 - \varepsilon ')\)-approximation solutions.

For subset \(L' \subseteq L\), we run an approximate Partition algorithm with error margin \(\varepsilon '\) such that

$$\begin{aligned} \frac{1 + \varepsilon '}{1 - \varepsilon '} \le 1 + \varepsilon \iff \varepsilon ' \le \frac{\varepsilon }{2 + \varepsilon } \end{aligned}$$

and by choosing the maximum such \(\varepsilon '\), it holds that

$$\begin{aligned} \varepsilon ' = \frac{\varepsilon }{2 + \varepsilon } \implies \frac{1}{\varepsilon '} = \frac{2 + \varepsilon }{\varepsilon } = \frac{2}{\varepsilon } + 1 \implies \frac{1}{\varepsilon '} = \mathcal{O}\left( \frac{1}{\varepsilon } \right) \end{aligned}$$
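This choice can be checked numerically with a trivial helper (ours, purely illustrative): for \(\varepsilon = 0.5\) it gives \(\varepsilon ' = 0.2\), and indeed \((1 + \varepsilon ')/(1 - \varepsilon ') = 1.5 = 1 + \varepsilon \).

```python
# The largest error margin eps' with (1 + eps') / (1 - eps') <= 1 + eps.

def partition_margin(eps):
    return eps / (2 + eps)
```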

Since there are at most \(n / \varepsilon ^2\) subsets of large elements, we will need to run said algorithm at most \(n / \varepsilon ^2\) times on \(|{L'}| \le |{L}|\) elements and with error margin \(\varepsilon '\).

Note that any approximate Subset Sum algorithm could be used in order to approximate Partition, such as the one presented by Kellerer et al. [21] of complexity \(\mathcal{O}\left( \min \left\{ \frac{n}{\varepsilon }, n + \frac{1}{\varepsilon ^2} \cdot \log (1 / \varepsilon ) \right\} \right) \). In our case, with \(|{L}| = \log (n / \varepsilon ^2)\) and error margin \(\varepsilon '\), the total complexity is

$$\begin{aligned}&\mathcal{O}\left( \frac{n}{\varepsilon ^2} \cdot \min \left\{ \frac{|{L}|}{\varepsilon '}, |{L}| + \frac{1}{(\varepsilon ')^2} \cdot \log (1 / \varepsilon ') \right\} \right) =\\&\mathcal{O}\left( \frac{n}{\varepsilon ^2} \cdot \min \left\{ \frac{\log (n / \varepsilon ^2)}{\varepsilon }, \log (n / \varepsilon ^2) + \frac{1}{\varepsilon ^2} \cdot \log (1 / \varepsilon ) \right\} \right) = \\&\tilde{\mathcal{O}} \left( \frac{n}{\varepsilon ^3} \right) . \end{aligned}$$

Using the state-of-the-art \(\tilde{\mathcal{O}}(n + (1 / \varepsilon )^{1.25})\) algorithm of Deng et al. [16] for approximating Partition, one could, in some cases, further improve the last term of the previous minimum. However, since the Partition instances that we are solving involve \(|{L}| = \log (n / \varepsilon ^2)\) elements, any improvement resulting from said approximation algorithm would only affect polylogarithmic factors. Due to this, the algorithm of Kellerer et al. performs better than other Partition approximation algorithms if we choose to ignore those factors. On the other hand, if one takes them into account, it might be preferable to use the aforementioned algorithm of Deng et al. (always depending on the relation between n and \(\varepsilon \)).

5.2 Total complexity

The total complexity of the algorithm results from the n distinct iterations required and depends on the algorithm chosen to find the (exact or approximate) solution to the Partition problem, since each of the presented algorithms dominates the cost of the remaining operations. Thus, by choosing the fastest one (depending on the relationship between n and \(\varepsilon \)), the final complexity is

$$\begin{aligned} \tilde{\mathcal{O}}\left( \min \left\{ \frac{n^{2.3}}{\varepsilon ^{2.6}}, \frac{n^2}{\varepsilon ^3} \right\} \right) \end{aligned}$$

6 Conclusion and future work

The main contribution of this paper, apart from the introduction of a new FPTAS for the Subset Sum Ratio problem, is the establishment of a connection between Partition and approximating Subset Sum Ratio. In particular, our scheme employs Partition computations, and any improvement in the latter will have an effect on its complexity.

Additionally, we establish that approximating Subset Sum Ratio is possible in time \(\tilde{\mathcal{O}}(n ^{c_1} / \varepsilon ^{c_2})\) with \(c_1 < 2.3\) and \(c_1+c_2 < 5\), which is an improvement over all the previously presented FPTASes for the problem. Moreover, the exponent of n can go down to 2 if we employ approximate Partition algorithms, a significant improvement over the \(\mathcal{O}(n^4 / \varepsilon )\) algorithm of [25].

It is important to note, however, that there is a distinct limit to the complexity one may achieve for the Subset Sum Ratio problem using the techniques discussed in this paper: although each instance has polylogarithmic size, the number of Partition instances that must be solved is \(\mathcal{O}(n^2 / \varepsilon ^2)\) in total. Consequently, an interesting natural question arising from our work is whether one can further improve the complexity of the problem, possibly developing an \(\mathcal{O}(n^{c_1} / \varepsilon ^{c_2})\) algorithm, where \(c_1 < 2\) or even \(c_1 + c_2 < 4\).

As another direction for future research, we consider the use of exact Subset Sum or Partition algorithms parameterized by a concentration parameter \(\beta \), as described in [4, 5], where they solve the decision version of Subset Sum. See also [17] for a use of this parameter under a pseudopolynomial setting. It would be interesting to investigate whether analogous arguments could be used to solve the optimization version.