\(\ell _1\)-Sparsity approximation bounds for packing integer programs


We consider approximation algorithms for packing integer programs (PIPs) of the form \(\max \{\langle c, x\rangle : Ax \le b, x \in \{0,1\}^n\}\) where A, b and c are nonnegative. We let \(W = \min _{i,j} b_i / A_{i,j}\) denote the width of A which is at least 1. Previous work by Bansal et al. (Theory Comput 8(24):533–565, 2012) obtained an \(\varOmega (\frac{1}{\varDelta _0^{1/\lfloor W \rfloor }})\)-approximation ratio where \(\varDelta _0\) is the maximum number of nonzeroes in any column of A (in other words the \(\ell _0\)-column sparsity of A). They raised the question of obtaining approximation ratios based on the \(\ell _1\)-column sparsity of A (denoted by \(\varDelta _1\)) which can be much smaller than \(\varDelta _0\). Motivated by recent work on covering integer programs (Chekuri and Quanrud, in: Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp 1596–1615. SIAM, 2019; Chen et al., in: Proceedings of the Twenty-seventh Annual ACM-SIAM Symposium on Discrete Algorithms, pp 1984–2003. SIAM, 2016) we show that simple algorithms based on randomized rounding followed by alteration, similar to those of Bansal et al. (Theory Comput 8(24):533–565, 2012) (but with a twist), yield approximation ratios for PIPs based on \(\varDelta _1\). First, following an integrality gap example from (Theory Comput 8(24):533–565, 2012), we observe that the case of \(W=1\) is as hard as maximum independent set even when \(\varDelta _1 \le 2\). In sharp contrast to this negative result, as soon as width is strictly larger than one, we obtain positive results via the natural LP relaxation. For PIPs with width \(W = 1 + \epsilon \) where \(\epsilon \in (0,1]\), we obtain an \(\varOmega (\epsilon ^2/\varDelta _1)\)-approximation. In the large width regime, when \(W \ge 2\), we obtain an \(\varOmega ((\frac{1}{1 + \varDelta _1/W})^{1/(W-1)})\)-approximation. 
We also obtain a \((1-\epsilon )\)-approximation when \(W = \varOmega (\frac{\log (\varDelta _1/\epsilon )}{\epsilon ^2})\). Viewing the rounding algorithms as contention resolution schemes, we obtain approximation algorithms in the more general setting when the objective is a non-negative submodular function.


Packing integer programs (abbr. PIPs) are an expressive class of integer programs of the form:

$$\begin{aligned} \text {maximize} \ \langle c, x\rangle \ \text {over} \ x \in \{0,1\}^n \ \text {s.t.} \ Ax \le b, \end{aligned}$$

where \(A \in {\mathbb {R}}_{\ge 0}^{m \times n}\), \(b \in {\mathbb {R}}_{\ge 0}^m\) and \(c \in {\mathbb {R}}_{\ge 0}^n\) all have nonnegative entries. Many important problems in discrete and combinatorial optimization can be cast as special cases of PIPs. These include the maximum independent set in graphs and hypergraphs, set packing, matchings and b-matchings, knapsack (when \(m=1\)), and the multi-dimensional knapsack. The maximum independent set problem (MIS), a special case of PIPs, is NP-hard and unless \(P=NP\) there is no \(n^{1-\epsilon }\)-approximation where n is the number of nodes in the graph [14, 23]. For this reason it is meaningful to consider special cases and other parameters that control the difficulty of PIPs. Motivated by the fact that MIS admits a simple \(\frac{1}{\varDelta (G)}\)-approximation where \(\varDelta (G)\) is the maximum degree of G, previous work considered approximating PIPs based on the maximum number of nonzeroes in any column of A (denoted by \(\varDelta _0\)); note that when MIS is written as a PIP, \(\varDelta _0\) coincides with \(\varDelta (G)\). As another example, when maximum weight matching is written as a PIP, \(\varDelta _0 = 2\). Bansal et al. [1] obtained a simple and clever algorithm that achieved an \(\varOmega (1/\varDelta _0)\)-approximation for PIPs via the natural LP relaxation; this improved previous work of Pritchard [17, 18] who was the first to obtain an approximation for PIPs only as a function of \(\varDelta _0\). Moreover, the rounding algorithm in [1] can be viewed as a contention resolution scheme which allows one to get similar approximation ratios even when the objective is submodular [1, 8]. It is well-understood that PIPs become easier when the entries in A are small compared to the packing constraints b. To make this quantitative we consider the well-studied notion called the width defined as \(W := \min _{i,j: A_{i,j} > 0} b_i/A_{i,j}\). Bansal et al. obtain an \(\varOmega ( (\frac{1}{\varDelta _0})^{1/\lfloor W \rfloor })\)-approximation which improves as W becomes larger. Although they do not state it explicitly, their approach also yields a \((1-\epsilon )\)-approximation when \(W = \varOmega (\frac{1}{\epsilon ^2}\log (\varDelta _0/\epsilon ))\).

\(\varDelta _0\) is a natural measure for combinatorial applications such as MIS and matchings where the underlying matrix A has entries from \(\{0,1\}\). However, in some applications of PIPs such as knapsack and its multi-dimensional generalization which are more common in resource-allocation problems, the entries of A are arbitrary rational numbers (which can be assumed to be from the interval [0, 1] after scaling). In such applications it is natural to consider another measure of column-sparsity based on the \(\ell _1\) norm. Specifically we consider \(\varDelta _1\), the maximum column sum of A. Unlike \(\varDelta _0\), \(\varDelta _1\) is not scale invariant so one needs to be careful in understanding the parameter and its relationship to the width W. For this purpose we normalize the constraints \(Ax \le b\) as follows. Let \(W = \min _{i,j: A_{i,j} > 0} b_i/A_{i,j}\) denote the width as before (we can assume without loss of generality that \(W \ge 1\) since we are interested in integer solutions). We can then scale each row \(A_i\) of A separately such that, after scaling, the i’th constraint reads as \(A_i x \le W\). After scaling all rows in this fashion, entries of A are in the interval [0, 1], and the maximum entry of A is equal to 1. Note that this scaling process does not alter the original width. We let \(\varDelta _1\) denote the maximum column sum of A after this normalization and observe that \(1 \le \varDelta _1 \le \varDelta _0\). In many settings of interest \(\varDelta _1 \ll \varDelta _0\). We also observe that \(\varDelta _1\) is a more robust measure than \(\varDelta _0\); small perturbations of the entries of A can dramatically change \(\varDelta _0\) while \(\varDelta _1\) changes minimally.
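The normalization just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `normalize_pip` is ours, the matrix is a plain list of rows, and every row containing a nonzero coefficient is assumed to have a positive right-hand side.

```python
def normalize_pip(A, b):
    """Return (A_scaled, W, delta0, delta1) for a packing system A x <= b.

    Hypothetical helper: A is a list of m rows with nonnegative entries and
    b has a positive entry for every row that has a nonzero coefficient.
    """
    # Width: the smallest ratio b_i / A_ij over the nonzero entries of A.
    W = min(b[i] / a for i, row in enumerate(A) for a in row if a > 0)
    # Scale row i by W / b_i so that constraint i reads (scaled A_i) x <= W.
    scaled = [[a * W / b[i] for a in row] for i, row in enumerate(A)]
    n = len(A[0])
    # l0-column sparsity: maximum number of nonzeroes in a column.
    delta0 = max(sum(1 for row in scaled if row[j] > 0) for j in range(n))
    # l1-column sparsity: maximum column sum after normalization.
    delta1 = max(sum(row[j] for row in scaled) for j in range(n))
    return scaled, W, delta0, delta1


# Three constraints, two variables: every column has three nonzeroes,
# but most of them are small relative to the capacity.
scaled, W, delta0, delta1 = normalize_pip([[4, 1], [1, 4], [1, 1]], [8, 8, 8])
print(W, delta0, delta1)  # W = 2.0, delta0 = 3, delta1 = 1.5
```

In this toy instance the largest normalized entry is 1 and \(\varDelta _1 = 1.5\) is half of \(\varDelta _0 = 3\), which is the kind of gap the \(\ell _1\) measure is meant to capture.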

Bansal et al. raised the question of obtaining an approximation ratio for PIPs as a function of only \(\varDelta _1\). They observed that this is not feasible via the natural LP relaxation by describing a simple example where the integrality gap of the LP is \(\varOmega (n)\) while \(\varDelta _1\) is a constant. Their example essentially shows the existence of a simple approximation preserving reduction from MIS to PIPs such that the resulting instances have \(\varDelta _1 \le 2\); thus no approximation ratio that depends only on \(\varDelta _1\) is feasible for PIPs unless \(P=NP\). These negative results seem to suggest that pursuing bounds based on \(\varDelta _1\) is futile, at least in the worst case. However, the starting point of this paper is the observation that both the integrality gap example and the hardness result are based on instances where the width W of the instance is arbitrarily close to 1. We demonstrate that these examples are rather brittle and obtain several positive results when we consider \(W \ge 1+\epsilon \) for any fixed \(\epsilon > 0\).

Our results

Our first result is on the hardness of approximation for PIPs that we already referred to. The hardness result suggests that one should consider instances with \(W > 1\). Recall that after normalization we have \(\varDelta _1 \ge 1\) and \(W \ge 1\) and the maximum entry of A is 1. We consider three regimes of W and obtain the following results, all via the natural LP relaxation, which also establish corresponding upper bounds on the integrality gap.

(i) \(1 < W \le 2\). For \(W = 1+\epsilon \) where \(\epsilon \in (0,1]\) we obtain an \(\varOmega (\frac{\epsilon ^2}{\varDelta _1})\)-approximation.

(ii) \(W \ge 2\). We obtain an \(\varOmega ( (\frac{1}{1+ \frac{\varDelta _1}{W}})^{1/(W-1)})\)-approximation which can be simplified to \(\varOmega ((\frac{1}{1+ \varDelta _1})^{1/(W-1)})\) since \(W \ge 1\).

(iii) A \((1-\epsilon )\)-approximation when \(W = \varOmega (\frac{1}{\epsilon ^2}\log (\varDelta _1/\epsilon ))\).

Our results establish approximation bounds based on \(\varDelta _1\) that are essentially the same as those based on \(\varDelta _0\) as long as the width is not too close to 1. When the matrix A is a \(\{0,1\}\)-matrix, \(\varDelta _1 = \varDelta _0\), and previous integrality gap results based on \(\varDelta _0\) [1] show that the bounds we obtain are essentially tight modulo constant factors. We describe randomized algorithms which can be derandomized via standard techniques.

All our algorithms are based on a simple randomized rounding plus alteration framework that has been successful for both packing and covering problems. Our scheme is similar to that of Bansal et al. at a high level but we make a simple but important change in the algorithm and its analysis. This is inspired by recent work on covering integer programs [5] where \(\ell _1\)-sparsity based approximation bounds from [9] were simplified.

The rounding algorithms can be viewed as contention resolution schemes, and via known techniques [1, 8], we also obtain approximation algorithms for submodular objectives. We present the results for this generalization in Sect. 6.

Other related work

We note that PIPs are equivalent to the multi-dimensional knapsack problem. When \(m=1\) we have the classical knapsack problem, which admits a very efficient FPTAS (see [3]). There is a PTAS for any fixed m [12] but unless \(P=NP\) an FPTAS does not exist for \(m=2\). Approximation algorithms for PIPs in their general form were considered initially by Raghavan and Thompson [19] and refined substantially by Srinivasan [20]. Srinivasan obtained approximation ratios of the form \(\varOmega (1/n^{1/(W+1)})\) when A had entries from \(\{0,1\}\), and a ratio of the form \(\varOmega (1/n^{1/W})\) when A had entries from [0, 1]. Pritchard [17] was the first to obtain a bound for PIPs based solely on the column sparsity parameter \(\varDelta _0\). He used iterated rounding and his initial bound was improved in [18] to \(\varOmega (1/\varDelta _0^2)\). The current state of the art is due to Bansal et al. [1]. Previously we ignored constant factors when describing the ratio. In fact [1] obtains a ratio of \(\frac{1}{e\varDelta _0 + o(\varDelta _0)}\) by strengthening the basic LP relaxation.

In terms of hardness of approximation, PIPs generalize MIS and hence one cannot obtain a ratio better than \(n^{1-\epsilon }\) unless \(P=NP\) [14, 23]. Building on MIS, [4] shows that PIPs are hard to approximate within a \(n^{\varOmega (1/W)}\) factor for any constant width W. Hardness of MIS in bounded degree graphs [21] and hardness for k-set-packing [15] imply that PIPs are hard to approximate to within \(\varOmega (1/\varDelta _0^{1-\epsilon })\) and to within \(\varOmega ((\log \varDelta _0)/\varDelta _0)\) when \(\varDelta _0\) is a sufficiently large constant. These hardness results are based on \(\{0,1\}\) matrices for which \(\varDelta _0\) and \(\varDelta _1\) coincide.

There is a large literature on deterministic and randomized rounding algorithms for packing and covering integer programs and connections to several topics and applications including discrepancy theory. \(\ell _1\)-sparsity guarantees for covering integer programs (CIPs) were first obtained by Chen, Harris and Srinivasan [9] partly inspired by [13].

Recent (and extensive) work on submodular function maximization has demonstrated that several approximation algorithms for modular objectives can be generalized to non-negative submodular objectives. Of particular relevance is the approach via the multilinear relaxation followed by rounding via contention resolution schemes. We refer the reader to [8] for the framework. This allows us to extend our results to submodular objectives.

Hardness of approximating PIPs as a function of \(\varDelta _1\)

Bansal et al. [1] showed that the integrality gap of the natural LP relaxation for PIPs is \(\varOmega (n)\) even when \(\varDelta _1\) is a constant. One can use essentially the same construction to show the following theorem.

Theorem 1

There is an approximation preserving reduction from MIS to instances of PIPs with \(\varDelta _1 \le 2\).


Let \(G = (V,E)\) be an undirected graph without self-loops and let \(n = \left| V\right| \). Let \(A \in [0,1]^{n \times n}\) be indexed by V. For all \(v \in V\), let \(A_{v,v} = 1\). For all \(uv \in E\), let \(A_{u,v} = A_{v,u} = 1/n\). For all the remaining entries in A that have not yet been defined, set these entries to 0. Consider the following PIP:

$$\begin{aligned} \text {maximize}\ \langle x, {\mathbf {1}}\rangle \ \text {over} \ x \in \{0,1\}^n \ \text {s.t.} \ Ax \le {\mathbf {1}}. \end{aligned} \tag{1}$$

Let S be the set of all feasible integral solutions of (1) and \({\mathcal {I}}\) be the set of independent sets of G. Define \(g : S \rightarrow {\mathcal {I}}\) where \(g(x) = \{v : x_v = 1\}\). The map g is well defined: if some \(x \in S\) had \(x_u = x_v = 1\) for an edge \(uv \in E\), then the left-hand side of the constraint corresponding to u would be at least \(1 + 1/n > 1\). To show g is surjective, consider a set \(I \in {\mathcal {I}}\). Let y be the characteristic vector of I. That is, \(y_v\) is 1 if \(v \in I\) and 0 otherwise. Consider the row in A corresponding to an arbitrary vertex u. If \(y_u = 1\), then \(y_v = 0\) for every neighbor v of u as I is an independent set; since the nonzero off-diagonal entries of the row corresponding to u are, by construction, exactly those indexed by the neighbors of u, the left-hand side of the constraint equals 1. If \(y_u = 0\), then the left-hand side is at most \((n-1)/n < 1\) because u has at most \(n-1\) neighbors. In either case the constraint corresponding to u is satisfied in (1). As u is an arbitrary vertex, it follows that y is a feasible integral solution to (1) and as \(I = \{v : y_v = 1\}\), \(g(y) = I\).

Define \(h : S \rightarrow {\mathbb {N}}_0\) such that \(h(x) = \left| g(x)\right| \). It is clear that \(\max _{x \in S} h(x)\) is equal to the optimal value of (1). Let \(I_{max}\) be a maximum independent set of G. As g is surjective, there exists \(z \in S\) such that \(g(z) = I_{max}\). Thus, \(\max _{x\in S} h(x) \ge \left| I_{max}\right| \). As \(\max _{x \in S} h(x)\) is equal to the optimum value of (1), it follows that a \(\beta \)-approximation for PIPs implies a \(\beta \)-approximation for maximum independent set.

Furthermore, we note that for this PIP, \(\varDelta _1 \le 2\), thus concluding the proof. \(\square \)

Unless \(P=NP\), MIS does not admit a \(n^{1-\epsilon }\)-approximation for any fixed \(\epsilon > 0\) [14, 23]. Hence the preceding theorem implies that unless \(P=NP\) one cannot obtain an approximation ratio for PIPs solely as a function of \(\varDelta _1\).
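The reduction in Theorem 1 is easy to check by brute force on a small graph. The following Python sketch builds the matrix A for a 5-cycle and compares the feasible 0/1 solutions of the PIP with the independent sets of the graph. The function names and the choice of graph are ours, and vertices are assumed to be labelled \(0, \ldots , n-1\).

```python
from itertools import product


def mis_to_pip(n, edges):
    """Constraint matrix of the reduction: 1 on the diagonal, 1/n on edges."""
    A = [[0.0] * n for _ in range(n)]
    for v in range(n):
        A[v][v] = 1.0
    for u, v in edges:
        A[u][v] = A[v][u] = 1.0 / n
    return A


def feasible(A, x):
    """Check A x <= 1 row by row for a 0/1 vector x."""
    return all(sum(a * xj for a, xj in zip(row, x)) <= 1.0 for row in A)


def is_independent(edges, x):
    """True if no edge has both endpoints selected by x."""
    return not any(x[u] and x[v] for u, v in edges)


n = 5
edges = [(i, (i + 1) % n) for i in range(n)]  # the 5-cycle
A = mis_to_pip(n, edges)

# Maximum column sum; every entry is at most 1, so no rescaling is needed.
delta1 = max(sum(A[i][j] for i in range(n)) for j in range(n))

all_vectors = list(product((0, 1), repeat=n))
pip_solutions = {x for x in all_vectors if feasible(A, x)}
independent_sets = {x for x in all_vectors if is_independent(edges, x)}

print(delta1 <= 2, pip_solutions == independent_sets)
```

Enumeration is exponential in n and is meant only as a sanity check of the construction, not as an algorithm.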

Round and alter framework

The algorithms in this paper have the same high-level structure. The algorithms first scale down the fractional solution x by some factor \(\alpha \), and then randomly round each coordinate independently. The rounded solution \(x'\) may not be feasible for the constraints. The algorithm alters \(x'\) to a feasible \(x''\) by considering each constraint separately in an arbitrary order; if \(x'\) is not feasible for constraint i, some subset S of variables are chosen to be set to 0. Each constraint corresponds to a knapsack problem and the framework (which is adapted from [1]) views the problem as the intersection of several knapsack constraints. A formal template is given in Fig. 1. To make the framework into a formal algorithm, one must define \(\alpha \) and how to choose S in the for loop. These parts will depend on the regime of interest.

Fig. 1

Randomized rounding with alteration framework

For an algorithm that follows the round-and-alter framework, the expected output of the algorithm is \({\mathbb {E}}\left[ \langle c, x''\rangle \right] = \sum _{j =1}^n c_j\cdot \Pr [x_j'' = 1]\). Independent of how \(\alpha \) is defined or how S is chosen, \(\Pr [x_j'' = 1] = \Pr [x_j'' = 1 | x_j' = 1]\cdot \Pr [x_j' = 1]\) since \(x_j'' \le x_j'\). Then we have

$$\begin{aligned} {\mathbb {E}}[\langle c, x''\rangle ] = \alpha \sum _{j=1}^n c_jx_j\cdot \Pr [x_j'' = 1 | x_j' = 1]. \end{aligned}$$

Let \(E_{ij}\) be the event that \(x_j''\) is set to 0 when ensuring constraint i is satisfied in the for loop. As \(x_j''\) is only set to 0 if at least one constraint sets \(x_j''\) to 0, we have

$$\begin{aligned} \Pr [x_j'' = 0 | x_j ' = 1] = \Pr \left[ \bigcup _{i \in [m]} E_{ij} | x_j' = 1\right] \le \sum _{i =1}^m \Pr [E_{ij} | x_j' = 1]. \end{aligned}$$

Combining these two observations, we have the following lemma, which applies to all of our subsequent algorithms.

Lemma 1

Let \({\mathcal {A}}\) be a randomized rounding algorithm that follows the round-and-alter framework given in Fig. 1. Let \(x'\) be the rounded solution obtained with scaling factor \(\alpha \). Let \(E_{ij}\) be the event that \(x_j''\) is set to 0 by constraint i. If for all \(j \in [n]\) we have \(\sum _{i =1}^m \Pr [E_{ij} | x_j' = 1] \le \gamma ,\) then \({\mathcal {A}}\) is an \(\alpha (1 - \gamma )\)-approximation for PIPs.

We will refer to the quantity \(\Pr [E_{ij}|x_j' = 1]\) as the rejection probability of item j in constraint i. We will also say that constraint i rejects item j if \(x_j''\) is set to 0 in constraint i.
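For completeness, the two displays preceding Lemma 1 combine as follows. The only assumption beyond the text is that x is an optimal solution of the LP relaxation, so that \(\langle c, x\rangle \) is at least the value of an optimal integral solution.

$$\begin{aligned} {\mathbb {E}}[\langle c, x''\rangle ]&= \alpha \sum _{j=1}^n c_jx_j\cdot \left( 1 - \Pr [x_j'' = 0 | x_j' = 1]\right) \\&\ge \alpha \sum _{j=1}^n c_jx_j\cdot \left( 1 - \sum _{i=1}^m \Pr [E_{ij} | x_j' = 1]\right) \\&\ge \alpha (1-\gamma ) \langle c, x\rangle . \end{aligned}$$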

Concentration inequalities and other useful inequalities

In the subsequent sections, for particular regimes of interest, we rely on Chernoff bounds to upper bound rejection probabilities. The following standard Chernoff bound is used to obtain a more convenient bound in Lemma 3. The proof of Lemma 3 follows directly from choosing \(\delta \) such that \((1 + \delta )\mu = W - \beta \) and applying Lemma 2. We include the proof for convenience.

Lemma 2

([16]) Let \(X_1,\ldots , X_n\) be independent random variables where \(X_i\) is defined on \(\{0, \beta _i\}\), where \(0 < \beta _i \le \beta \le 1\) for some \(\beta \). Let \(X = \sum _i X_i\) and denote \({\mathbb {E}}[X]\) as \(\mu \). Then for any \(\delta > 0\),

$$\begin{aligned} \Pr [X \ge (1 + \delta )\mu ] \le \left( \frac{e^\delta }{(1+\delta )^{1+\delta }}\right) ^{\mu /\beta } \end{aligned}$$

Lemma 3

Let \(X_1,\ldots , X_n \in [0,\beta ]\) be independent random variables for some \(0 < \beta \le 1\). Suppose \(\mu = {\mathbb {E}}[\sum _i X_i] \le \alpha W\) for some \(0< \alpha < 1\) and \(W \ge 1\) where \((1 - \alpha )W > \beta \). Then

$$\begin{aligned} \Pr \left[ \sum _i X_i > W - \beta \right] \le \left( \frac{\alpha e^{1-\alpha } W}{W - \beta }\right) ^{(W-\beta )/\beta }. \end{aligned}$$


Since the right-hand side is increasing in \(\alpha \), it suffices to assume \(\mu = \alpha W\). Choose \(\delta \) such that \((1+\delta )\mu = W - \beta \). Then \(\delta = (W - \beta - \mu )/\mu \). Because \(\mu = \alpha W\) and since \((1 - \alpha )W > \beta \), we have \(\delta = ((1-\alpha )W - \beta )/\mu > 0\). We apply the standard Chernoff bound in Lemma 2 to obtain

$$\begin{aligned} \Pr \left[ \sum _iX_i> W- \beta \right] = \Pr \left[ \sum _i X_i > (1+\delta )\mu \right] \le \left( \frac{e^\delta }{(1+\delta )^{1+\delta }}\right) ^{\mu /\beta }. \end{aligned}$$

Because \(1 + \delta = (W - \beta )/\mu \) and \(\delta = (W - \beta - \mu )/\mu \),

$$\begin{aligned} \left( \frac{e^\delta }{(1+\delta )^{1+\delta }}\right) ^{\mu /\beta } = \left( \frac{e^{W - \beta - \mu }}{((W - \beta )/\mu )^{W-\beta }}\right) ^{1/\beta }. \end{aligned}$$

Exponentiating the denominator,

$$\begin{aligned} \left( \frac{e^{W - \beta - \mu }}{((W - \beta )/\mu )^{W-\beta }}\right) ^{1/\beta } = \exp \left( \frac{1}{\beta } \left( W - \beta - \mu + (W-\beta )\ln \left( \frac{\mu }{W-\beta } \right) \right) \right) \end{aligned}$$

As \(\mu = \alpha W\),

$$\begin{aligned}&\exp \left( \frac{1}{\beta } \left( W - \beta - \mu + (W-\beta )\ln \left( \frac{\mu }{W-\beta } \right) \right) \right) \\&\quad = \exp \left( \frac{1}{\beta } \left( (1-\alpha )W - \beta + (W-\beta )\ln \left( \frac{\alpha W}{W-\beta } \right) \right) \right) \end{aligned}$$

We can rewrite the exponent to show that

$$\begin{aligned} \exp \left( \frac{1}{\beta } \left( (1-\alpha )W - \beta - (W-\beta )\ln \left( \frac{W-\beta }{\alpha W} \right) \right) \right) \le \left( \frac{\alpha e^{1-\alpha }W}{W - \beta }\right) ^{(W-\beta )/\beta }. \end{aligned}$$

\(\square \)
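As a numerical sanity check of the last step, the sketch below evaluates the right-hand sides of Lemma 2 (with \(\mu = \alpha W\) and \(\delta \) chosen as in the proof) and of Lemma 3. The function names are ours. Comparing the two exponents in the final display shows that they differ by exactly \(\alpha \beta /\beta = \alpha \), so the Lemma 3 bound should equal \(e^{\alpha }\) times the Lemma 2 bound, up to floating-point error.

```python
import math


def chernoff_lemma2(mu, delta, beta):
    """Right-hand side of the standard Chernoff bound (Lemma 2)."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** (mu / beta)


def chernoff_lemma3(alpha, W, beta):
    """Right-hand side of the more convenient bound (Lemma 3)."""
    return (alpha * math.exp(1 - alpha) * W / (W - beta)) ** ((W - beta) / beta)


def compare(alpha, W, beta):
    """Return (Lemma 2 bound at mu = alpha * W, Lemma 3 bound)."""
    # Hypotheses of Lemma 3.
    assert 0 < alpha < 1 and W >= 1 and 0 < beta <= 1
    assert (1 - alpha) * W > beta
    mu = alpha * W
    delta = (W - beta - mu) / mu  # chosen so that (1 + delta) * mu = W - beta
    return chernoff_lemma2(mu, delta, beta), chernoff_lemma3(alpha, W, beta)


for alpha, W, beta in [(0.3, 3.0, 0.7), (0.1, 2.0, 0.5), (0.5, 1.5, 0.25)]:
    l2, l3 = compare(alpha, W, beta)
    print(alpha, W, beta, l2 <= l3, math.isclose(l3, math.exp(alpha) * l2))
```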

The following three lemmas are used in the proofs bounding the rejection probabilities for different regimes of width. The inequalities are easily verified via calculus. The proofs are included for the sake of completeness.

Lemma 4

Let \(x \in (0,1]\). Then \((1/e^{1/e})^{1/x} \le x\).


Taking logs of both sides of the stated inequality and rearranging, it suffices to show that \(\ln (1/e^{1/e}) \le x\ln x\) for \(x > 0\). \(x \ln x\) is convex and its minimum is \(-1/e\) at \(x = 1/e\). Since \(\ln (1/e^{1/e}) = -1/e\), the inequality holds. \(\square \)

Lemma 5

Let \(y \ge 2\) and \(x \in (0, 1]\) . Then \(x/y \ge (1/e^{2/e})^{y/2x}\).


We start with a simple rewriting of the statement. After taking logs and rearranging, it is sufficient to show

$$\begin{aligned} (x/y) \ln (x/y) \ge (1 / 2)\ln (1/e^{2/e}) = -1/e. \end{aligned}$$

Replacing x/y with z, we see that it suffices to prove \(z \ln z \ge -1/e\) for \(0 < z \le 1/2\). We note that \(x \ln x\) is convex and its minimum is \(-1/e\) at \(x = 1/e\). Thus, \(z \ln z \ge -1/e\). This concludes the proof. \(\square \)

Lemma 6

Let \(0 < \epsilon \le 1\) and \(x \in (0, 1]\). Then \(\epsilon x /2 \ge (\epsilon /e^{2/e})^{1/x}\).


Let \(d = e^{2/e}/2\) and observe that \(d > 1\). We first do a change of variables, replacing \(\epsilon / 2\) with \(\epsilon \) and x with \(x/\epsilon \). If we take a \(\log \) of both sides, then our reformulated goal is to show that

$$\begin{aligned} x\ln x \ge \epsilon \ln (\epsilon /d) \end{aligned}$$

for \(0 < \epsilon \le 1/2\) and \(x \in (0,\epsilon ]\). Letting \(f(y) = y \ln y\) and \(g(y) = y \ln (y/d)\), we want to show that \(f(x) \ge g(\epsilon )\). We will proceed by cases.

First, suppose \(0 < \epsilon \le d/e\). It is easy to show that f is decreasing on (0, 1/e] and increasing on \([1/e, \infty )\) and that g is decreasing on (0, d/e] and increasing on \([d/e, \infty )\). As f is decreasing on (0, 1/e], for \(0 < \epsilon \le 1/e\), we have \(f(x) \ge f(\epsilon )\) as \(x \le \epsilon \). As \(d > 1\), it follows that \(f(\epsilon ) \ge g(\epsilon )\). Therefore, \(f(x) \ge g(\epsilon )\) for \(0 < \epsilon \le 1/e\). Furthermore, as g is decreasing on [1/e, d/e] and f is increasing on [1/e, d/e], we have \(f(x) \ge g(\epsilon )\) for \(0 < \epsilon \le d/e\).

For the second case, suppose \(d/e < \epsilon \le 1/2\). Note that the minimum of f on the interval (0, 1/2] is \(f(1/e)= -1/e\). Thus, it would suffice to show that \(g(\epsilon ) \le -1/e\). As we noted previously that g is increasing on [d/e, 1/2], it would suffice to show that \(g(1/2) \le -1/e\). By definition of g, we see \(g(1/2) = -1/e\). Therefore, \(f(x) \ge g(\epsilon )\). This concludes the proof. \(\square \)
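The three inequalities in Lemmas 4, 5 and 6 can also be spot-checked numerically. The Python sketch below evaluates them on a finite grid; the grid, the tolerance `TOL`, and the function names are our choices, and a grid check is of course no substitute for the proofs. The tolerance is needed because each inequality is tight at some point (for example Lemma 4 at \(x = 1/e\)).

```python
import math

TOL = 1e-12  # slack for floating-point error near the points of equality


def lemma4_holds(x):
    """(1/e^{1/e})^{1/x} <= x for x in (0, 1]."""
    return (1 / math.e ** (1 / math.e)) ** (1 / x) <= x + TOL


def lemma5_holds(x, y):
    """x/y >= (1/e^{2/e})^{y/(2x)} for y >= 2 and x in (0, 1]."""
    return x / y + TOL >= (1 / math.e ** (2 / math.e)) ** (y / (2 * x))


def lemma6_holds(eps, x):
    """eps*x/2 >= (eps/e^{2/e})^{1/x} for eps, x in (0, 1]."""
    return eps * x / 2 + TOL >= (eps / math.e ** (2 / math.e)) ** (1 / x)


grid = [k / 200 for k in range(1, 201)]  # 200 points in (0, 1]
ys = [2 + k / 4 for k in range(33)]      # points in [2, 10]

all_hold = (
    all(lemma4_holds(x) for x in grid)
    and all(lemma5_holds(x, y) for x in grid for y in ys)
    and all(lemma6_holds(eps, x) for eps in grid for x in grid)
)
print(all_hold)
```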

The large width regime: \(W \ge 2\)

In this section, we consider PIPs with width \(W \ge 2\). Recall that we assume \(A \in [0,1]^{m \times n}\) and \(b_i = W\) for all \(i \in [m]\). Therefore we have \(A_{i,j} \le W/2\) for all i, j and from a knapsack point of view all items are “small”. We apply the round-and-alter framework in a simple fashion: in each constraint i the coordinates are sorted by their coefficients in that row, the algorithm keeps the largest prefix of coordinates that fits in the capacity W, and the rest are discarded. We emphasize that this sorting step is crucial for the analysis and differs from the scheme in [1]. Figure 2 describes the formal algorithm.

Fig. 2

Round-and-alter in the large width regime. Each constraint sorts the coordinates in increasing size and greedily picks a feasible set and discards the rest
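Since the pseudocode of Fig. 2 is not reproduced here, the following Python sketch gives one possible reading of the description above. It is an illustration rather than the authors' exact procedure: the function signature is ours, and we assume that every constraint examines the rounded vector \(x'\) (not the partially altered one), which matches the way \(Y_{ij}\) is defined in the analysis.

```python
import random


def round_and_alter_by_sorting(A, W, x, alpha, rng=random):
    """Sketch of round-and-alter with sorting.

    Assumptions: A is an m x n list of rows with entries in [0, 1], every
    capacity equals W, x is a fractional solution and 0 < alpha <= 1.
    Returns the rounded vector x' and the altered vector x''.
    """
    n = len(x)
    # Rounding: keep coordinate j independently with probability alpha * x_j.
    x_round = [1 if rng.random() < alpha * x[j] else 0 for j in range(n)]
    x_final = list(x_round)
    # Alteration: each constraint looks at the rounded vector x_round.
    for row in A:
        # Rounded coordinates in increasing order of their coefficient.
        chosen = sorted((j for j in range(n) if x_round[j]), key=lambda j: row[j])
        load = 0.0
        for pos, j in enumerate(chosen):
            if load + row[j] > W:
                # j and all later coordinates in sorted order are rejected.
                for r in chosen[pos:]:
                    x_final[r] = 0
                break
            load += row[j]
    return x_round, x_final


A = [[1.0, 0.5, 0.25, 0.75], [0.2, 1.0, 0.9, 0.1]]
# With alpha = 1 and x = (1, 1, 1, 1) every coordinate is rounded up, so the
# outcome is deterministic: row 0 rejects coordinate 0, row 1 rejects coordinate 1.
print(round_and_alter_by_sorting(A, 2.0, [1.0] * 4, 1.0))
# ([1, 1, 1, 1], [0, 0, 1, 1])
```

Because the coordinates kept by a constraint form a prefix that fits within W, and later constraints only remove coordinates, the returned \(x''\) satisfies every constraint.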

The key property for the analysis. The analysis relies on obtaining a bound on the rejection probability of coordinate j by constraint i. Let \(X_j\) be the indicator variable for j being chosen in the first step. We show that \(\Pr [E_{ij} \mid X_j = 1] \le c A_{ij}\) for some c that depends on the scaling factor \(\alpha \). Thus coordinates with smaller coefficients are less likely to be rejected. The total rejection probability of j, \(\sum _{i=1}^m \Pr [E_{ij}\mid X_j=1]\), is proportional to the column sum of coordinate j which is at most \(\varDelta _1\).

An \(\varOmega (1/\varDelta _1)\)-approximation algorithm

We show that \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by}-\textsf {sorting}\) yields an \(\varOmega (1/\varDelta _1)\)-approximation if we set the scaling factor \(\alpha _1 = \frac{1}{c_1\varDelta _1}\) where \(c_1 = 4e^{1+1/e}\).

The rejection probability is captured by the following main lemma.

Lemma 7

Let \(\alpha _1= \frac{1}{c_1\varDelta _1}\) for \(c_1 = 4e^{1+1/e}\). Let \(i \in [m]\) and \(j \in [n]\). Then we have \(\Pr [E_{ij} | X_j = 1] \le \frac{A_{i,j}}{2\varDelta _1}\) in the algorithm \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by}-\textsf {sorting}(A, b, \alpha _1)\).


At iteration i of \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by}-\textsf {sorting}\), after the set \(\{A_{i,1}, \ldots , A_{i,n}\}\) is sorted, the indices are renumbered so that \(A_{i,1} \le \cdots \le A_{i,n}\). Note that j may now be a different index \(j'\), but for simplicity of notation we will refer to \(j'\) as j. Let \(\xi _\ell = 1\) if \(x_\ell ' = 1\) and 0 otherwise. Let \(Y_{ij} = \sum _{\ell = 1}^{j-1} A_{i,\ell }\xi _\ell \).

If \(E_{ij}\) occurs, then \(Y_{ij} > W - A_{i,j}\), since \(x_j''\) would not have been set to zero by constraint i otherwise. That is,

$$\begin{aligned} \Pr [E_{ij} | X_j = 1] \le \Pr [Y_{ij} > W - A_{i,j} | X_j = 1]. \end{aligned}$$

The event \(Y_{ij} >W - A_{i,j}\) does not depend on \(x_j'\). Therefore,

$$\begin{aligned} \Pr [Y_{ij} > W - A_{i,j} | X_j = 1] = \Pr [Y_{ij} > W - A_{i,j}]. \end{aligned}$$

To upper bound \({\mathbb {E}}[Y_{ij}]\), we have

$$\begin{aligned} {\mathbb {E}}[Y_{ij}] = \sum _{\ell = 1}^{j-1} A_{i,\ell } \cdot \Pr [X_\ell = 1] \le \alpha _1\sum _{\ell = 1}^n A_{i,\ell }x_\ell \le \alpha _1W. \end{aligned}$$

As \(A_{i,j} \le 1\), \(W \ge 2\), and \(\alpha _1 < 1/2\), we have \(\frac{(1-\alpha _1)W}{A_{i,j}} > 1\). Using the fact that \(A_{i,j}\) is at least as large as all entries \(A_{i,j'}\) for \(j' < j\), we satisfy the conditions to apply the Chernoff bound in Lemma 3. This implies

$$\begin{aligned} \Pr [Y_{ij} > W - A_{i,j}] \le \left( \frac{\alpha _1e^{1-\alpha _1}W}{W-A_{i,j}}\right) ^{(W-A_{i,j})/A_{i,j}}. \end{aligned}$$

Note that \(\frac{W}{W-A_{i,j}} \le 2\) as \(W \ge 2\). Because \(e^{1-\alpha _1}\le e\) and by the choice of \(\alpha _1\), we have

$$\begin{aligned} \left( \frac{\alpha _1e^{1-\alpha _1}W}{W-A_{i,j}}\right) ^{(W-A_{i,j})/A_{i,j}} \le \left( 2e\alpha _1\right) ^{(W-A_{i,j})/A_{i,j}} = \left( \frac{1}{2e^{1/e}\varDelta _1}\right) ^{(W-A_{i,j})/A_{i,j}}. \end{aligned}$$

Then we prove the final inequality in two parts. First, we see that \(W \ge 2\) and \(A_{i,j} \le 1\) imply that \(\frac{W - A_{i,j}}{A_{i,j}}\ge 1\). This implies

$$\begin{aligned} \left( \frac{1}{2\varDelta _1}\right) ^{(W-A_{i,j})/A_{i,j}} \le \frac{1}{2\varDelta _1}. \end{aligned}$$

Second, we see that

$$\begin{aligned} (1/e^{1/e})^{(W-A_{i,j})/A_{i,j}} \le (1/e^{1/e})^{1/A_{i,j}} \le A_{i,j} \end{aligned}$$

for \(A_{i,j} \le 1\), where the first inequality holds because \(W - A_{i,j} \ge 1\) and the second inequality holds by Lemma 4. This concludes the proof. \(\square \)
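Written out in full, the two parts combine as follows. The first part uses that \(\frac{1}{2\varDelta _1} < 1\) and that the exponent is at least 1, and the second part is the bound obtained from Lemma 4.

$$\begin{aligned} \Pr [E_{ij} | X_j = 1]&\le \left( \frac{1}{2e^{1/e}\varDelta _1}\right) ^{(W-A_{i,j})/A_{i,j}} \\&= \left( \frac{1}{2\varDelta _1}\right) ^{(W-A_{i,j})/A_{i,j}} \cdot \left( \frac{1}{e^{1/e}}\right) ^{(W-A_{i,j})/A_{i,j}} \\&\le \frac{1}{2\varDelta _1} \cdot A_{i,j}. \end{aligned}$$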

Theorem 2

When setting \(\alpha _1= \frac{1}{c_1\varDelta _1}\) where \(c_1 = 4e^{1+1/e}\), \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by}-\textsf {sorting}(A, b, \alpha _1)\) is a randomized \((\alpha _1/2)\)-approximation algorithm for PIPs with width \(W \ge 2\).


Fix \(j \in [n]\). By Lemma 7 and the definition of \(\varDelta _1\), we have

$$\begin{aligned} \sum _{i = 1}^m \Pr [E_{ij} | X_j = 1] \le \sum _{i = 1}^m \frac{A_{i,j}}{2\varDelta _1} \le \frac{1}{2}. \end{aligned}$$

By Lemma 1, which shows that upper bounding the sum of the rejection probabilities by \(\gamma \) for every item leads to an \(\alpha _1(1-\gamma )\)-approximation, we get the desired result. \(\square \)
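On a very small instance the guarantee of Theorem 2 can be checked exactly by enumerating all \(2^n\) outcomes of the rounding step. The sketch below does this for a hypothetical instance with \(W = 2\) and \(\varDelta _1 = 2\). The instance, the function names, and our reading of the alteration step (each constraint keeps the largest sorted prefix of the rounded coordinates that fits) are assumptions made for illustration. The fractional point x is feasible for the LP but is not claimed to be optimal; the bound \({\mathbb {E}}[\langle c, x''\rangle ] \ge (\alpha _1/2)\langle c, x\rangle \) holds for any feasible x.

```python
import math
from itertools import product


def alter_by_sorting(A, W, x_round):
    """Alteration step: per row, keep the largest sorted prefix that fits in W."""
    n = len(x_round)
    x_final = list(x_round)
    for row in A:
        chosen = sorted((j for j in range(n) if x_round[j]), key=lambda j: row[j])
        load = 0.0
        for pos, j in enumerate(chosen):
            if load + row[j] > W:
                for r in chosen[pos:]:
                    x_final[r] = 0
                break
            load += row[j]
    return x_final


def exact_expected_value(A, W, c, x, alpha):
    """E[<c, x''>], computed by summing over all 2^n rounding outcomes."""
    n = len(x)
    total = 0.0
    for outcome in product((0, 1), repeat=n):
        prob = 1.0
        for j in range(n):
            p = alpha * x[j]
            prob *= p if outcome[j] else 1.0 - p
        x_final = alter_by_sorting(A, W, outcome)
        total += prob * sum(cj * xj for cj, xj in zip(c, x_final))
    return total


# Normalized instance: entries in [0, 1], largest entry 1, capacity W = 2.
A = [[1.0, 1.0, 1.0, 0.5, 0.5], [0.5, 1.0, 0.0, 1.0, 1.0]]
W = 2.0
c = [3.0, 2.0, 1.0, 2.0, 1.0]
x = [0.5] * 5  # row sums of A x are 2.0 and 1.75, so x is LP-feasible

delta1 = max(sum(row[j] for row in A) for j in range(len(x)))
alpha1 = 1 / (4 * math.e ** (1 + 1 / math.e) * delta1)
lp_value = sum(cj * xj for cj, xj in zip(c, x))

expected = exact_expected_value(A, W, c, x, alpha1)
guarantee = (alpha1 / 2) * lp_value
print(guarantee <= expected <= alpha1 * lp_value)
```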

An \(\varOmega (\frac{1}{(1 +\varDelta _1/W)^{1/(W-1)}})\)-approximation

We improve the bound from the previous section by setting \(\alpha _1 = \frac{1}{c_2(1 +\varDelta _1/W)^{1/(W-1)}}\) where \(c_2 = 4e^{1+2/e}\). Note that the scaling factor becomes larger as W increases.

Lemma 8

Let \(\alpha _1= \frac{1}{c_2(1 +\varDelta _1/W)^{1/(W-1)}}\) for \(c_2 = 4e^{1 + 2/e}\). Let \(i \in [m]\) and \(j \in [n]\). Then in the algorithm \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by}-\textsf {sorting}(A, b, \alpha _1)\), we have \(\Pr [E_{ij} | X_j = 1] \le \frac{A_{i,j}}{2\varDelta _1}\).


The proof proceeds similarly to the proof of Lemma 7. Since \(\alpha _1 < 1/2\), everything up to and including the application of the Chernoff bound there applies. This gives that for each \(i\in [m]\) and \(j \in [n]\),

$$\begin{aligned} \Pr [E_{ij} | X_j = 1] \le \left( 2e\alpha _1\right) ^{(W-A_{i,j})/A_{i,j}} = \left( \frac{1}{2e^{2/e}(1 +\varDelta _1/W)^{1/(W-1)}}\right) ^{(W-A_{i,j})/A_{i,j}}, \end{aligned}$$

where the equality is by choice of \(\alpha _1\). We prove the final inequality in two parts. First, note that \(\frac{W - A_{i,j}}{A_{i,j}} \ge W - 1\) since \(A_{i,j} \le 1\). Thus,

$$\begin{aligned} \left( \frac{1}{2(1 +\varDelta _1/W)^{1/(W-1)}}\right) ^{(W-A_{i,j})/A_{i,j}} \le \frac{1}{2^{W-1}(1 + \varDelta _1/W)} \le \frac{W}{2\varDelta _1}. \end{aligned}$$

Second, we see that

$$\begin{aligned} \left( \frac{1}{e^{2/e}}\right) ^{(W-A_{i,j})/A_{i,j}} \le \left( \frac{1}{e^{2/e}}\right) ^{W/(2A_{i,j})} \le \frac{A_{i,j}}{W} \end{aligned}$$

for \(A_{i,j} \le 1\), where the first inequality holds because \(W \ge 2\) and the second inequality holds by Lemma 5. \(\square \)

If we replace Lemma 7 with Lemma 8 in the proof of Theorem 2, we obtain the following stronger guarantee.

Theorem 3

When setting \(\alpha _1= \frac{1}{c_2(1 +\varDelta _1/W)^{1/(W-1)}}\) where \(c_2 = 4e^{1 + 2/e}\), for PIPs with width \(W \ge 2\), \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by}-\textsf {sorting}(A, b, \alpha _1)\) is a randomized \((\alpha _1 /2)\)-approximation.
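To see how the guarantee of Theorem 3 behaves as the width grows, the scaling factors of Theorems 2 and 3 can be evaluated directly. The following Python sketch does so; the function names and the sample value \(\varDelta _1 = 50\) are ours.

```python
import math


def alpha_theorem2(delta1):
    """Scaling factor 1 / (c_1 * Delta_1) with c_1 = 4 e^{1 + 1/e}."""
    return 1 / (4 * math.e ** (1 + 1 / math.e) * delta1)


def alpha_theorem3(delta1, W):
    """Scaling factor 1 / (c_2 (1 + Delta_1/W)^{1/(W-1)}) with c_2 = 4 e^{1 + 2/e}."""
    assert W >= 2
    c2 = 4 * math.e ** (1 + 2 / math.e)
    return 1 / (c2 * (1 + delta1 / W) ** (1 / (W - 1)))


delta1 = 50.0
# Approximation ratio alpha / 2 of Theorem 3 for increasing widths.
ratios = [(W, alpha_theorem3(delta1, W) / 2) for W in (2, 3, 5, 10, 50)]
for W, ratio in ratios:
    print(W, ratio, alpha_theorem2(delta1) / 2)
```

For fixed \(\varDelta _1\) the quantity \((1 + \varDelta _1/W)^{1/(W-1)}\) decreases in W, so the ratio increases with the width and approaches \(1/(2c_2)\).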

A \((1-O(\epsilon ))\)-approximation when \(W \ge \varOmega (\frac{1}{\epsilon ^2}\ln (\frac{\varDelta _1}{\epsilon }))\)

In this section, we give a randomized \((1-O(\epsilon ))\)-approximation for the case when \(W \ge \varOmega (\frac{1}{\epsilon ^2}\ln (\frac{\varDelta _1}{\epsilon }))\). We use the algorithm \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by}-\textsf {sorting}\) in Fig. 2 with the scaling factor \(\alpha _1 = 1 - \epsilon \). The analysis follows the same structure as the analyses for the lemmas bounding the rejection probabilities from the previous sections.

Lemma 9

Let \(0< \epsilon < \frac{1}{e}\), \(\alpha _1 = 1 - \epsilon \), and \(W = \frac{2}{\epsilon ^2}\ln (\frac{\varDelta _1}{\epsilon }) + 1\). Let \(i \in [m]\) and \(j \in [n]\). Then in \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by}-\textsf {sorting}(A, b, \alpha _1)\), we have \(\Pr [E_{ij} | X_j = 1] \le e\cdot \frac{\epsilon A_{i,j}}{\varDelta _1}\).


Renumber indices so that \(A_{i,1} \le \cdots \le A_{i,n}\) and if the index of j changes to \(j'\), we still refer to \(j'\) as j. Let \(Y_{ij} = \sum _{\ell = 1}^{j-1} A_{i,\ell } \xi _\ell \) where \(\xi _\ell = 1\) if \(x_\ell ' = 1\) and 0 otherwise. We first note that

$$\begin{aligned} \Pr [E_{ij} | X_j = 1] \le \Pr [Y_{ij} > W - A_{i,j}]. \end{aligned}$$

By the choice of \(\alpha _1\) and the fact that \(A_{i,j} \le 1\) and \(W = \frac{2}{\epsilon ^2}\ln (\frac{\varDelta _1}{\epsilon }) + 1\), we have \(((1- \alpha _1)W)/A_{i,j} \ge \epsilon W = \frac{2}{\epsilon } \ln (\frac{\varDelta _1}{\epsilon }) + \epsilon \). A direct argument via calculus shows \(\frac{2}{\epsilon }\ln (\frac{\varDelta _1}{\epsilon }) + \epsilon > 1\) for \(\epsilon \in (0,\frac{1}{e})\). Thus, \((1-\alpha _1)W > A_{i,j}\).

By the LP constraints, \({\mathbb {E}}[Y_{ij}] \le \alpha _1 W\). Then as \(A_{i,j'} \le A_{i,j}\) for all \(j' < j\), we can apply the Chernoff bound in Lemma 3 to obtain

$$\begin{aligned} \Pr [Y_{ij} \ge W - A_{i,j}] \le \left( \frac{\alpha _1 e^{1-\alpha _1} W}{W - A_{i,j}}\right) ^{(W-A_{i,j})/A_{i,j}}. \end{aligned}$$

We bound the right-hand side in two steps. First, as \(A_{i,j} \le 1\),

$$\begin{aligned} \left( \frac{W}{W- A_{i,j}}\right) ^{(W-A_{i,j})/A_{i,j}} \le \left( \frac{W}{W- 1}\right) ^{W - 1} \le e, \end{aligned}$$

where the last inequality follows from the fact that \((1-1/z)^{z-1} \ge 1/e\) for all \(z \ge 1\).

Second, by the choice of \(\alpha _1\),

$$\begin{aligned} \left( \alpha _1 e^{1-\alpha _1}\right) ^{(W-A_{i,j})/A_{i,j}} = \left( (1-\epsilon )e^{\epsilon }\right) ^{(W-A_{i,j})/A_{i,j}}. \end{aligned}$$

For \(0< \epsilon < \frac{1}{e}\), we have \(1 - \epsilon \le \exp (-\epsilon - \frac{\epsilon ^2}{2})\). As \(W = \frac{2}{\epsilon ^2}\ln (\frac{\varDelta _1}{\epsilon }) + 1\) and \(A_{i,j} \le 1\),

$$\begin{aligned} \left( (1-\epsilon ) e^\epsilon \right) ^{(W-A_{i,j})/A_{i,j}} \le \left( e^{-\epsilon ^2/2}\right) ^{(W-1)/A_{i,j}} = \exp \left( -\frac{\ln (\frac{\varDelta _1}{\epsilon })}{A_{i,j}}\right) . \end{aligned}$$

Observe that \(\frac{1}{A_{i,j}} - \ln (\frac{e}{A_{i,j}}) \ge 0\). For \(A_{i,j} \in [0,1]\), a direct argument shows \(\frac{\ln (t)}{A_{i,j}} - \ln (\frac{t}{A_{i,j}})\) is increasing in t for \(t \ge e\). As \(\varDelta _1 / \epsilon > e\), we have \(\frac{\ln (\frac{\varDelta _1}{\epsilon })}{A_{i,j}} \ge \ln (\frac{\varDelta _1}{\epsilon A_{i,j}})\). Therefore,

$$\begin{aligned} \exp \left( -\frac{\ln (\frac{\varDelta _1}{\epsilon })}{A_{i,j}}\right) \le \exp \left( -\ln \left( \frac{\varDelta _1}{\epsilon A_{i,j}}\right) \right) = \frac{\epsilon A_{i,j}}{\varDelta _1}. \end{aligned}$$

This concludes the proof. \(\square \)
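As a sanity check on Lemma 9, the sketch below compares the Chernoff bound from the proof with the claimed bound \(e\epsilon A_{i,j}/\varDelta _1\) on a small grid. It works with logarithms to avoid underflow. The function names and the grid values are hypothetical, and the grid assumes \(\varDelta _1 \ge 1\) so that \(\varDelta _1/\epsilon > e\); a numerical check of this kind illustrates the inequality but does not replace the proof.

```python
import math


def lemma9_log_sides(eps: float, delta1: float, a: float):
    """Return (log of the Chernoff bound, log of e*eps*a/delta1) for Lemma 9."""
    width = 2 / eps**2 * math.log(delta1 / eps) + 1
    exponent = (width - a) / a
    # log of alpha_1 * e^(1 - alpha_1) * W / (W - a), with alpha_1 = 1 - eps
    log_base = math.log1p(-eps) + eps + math.log1p(a / (width - a))
    return exponent * log_base, 1 + math.log(eps * a / delta1)


def lemma9_holds_on_grid() -> bool:
    """Check the inequality on a small hypothetical parameter grid."""
    for eps in (0.01, 0.1, 0.3, 0.36):
        for delta1 in (1.0, 5.0, 100.0):
            for a in (0.01, 0.5, 1.0):
                lhs, rhs = lemma9_log_sides(eps, delta1, a)
                if lhs > rhs:
                    return False
    return True
```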

Lemma 9 implies that we can upper bound the sum of the rejection probabilities for any item j by \(e\epsilon \), leading to the following theorem.

Theorem 4

Let \(0< \epsilon < \frac{1}{e}\) and \(W = \frac{2}{\epsilon ^2}\ln (\frac{\varDelta _1}{\epsilon }) + 1\). When setting \(\alpha _1 = 1 - \epsilon \) and \(c = e+1\), \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by}-\textsf {sorting}(A,b,\alpha _1)\) is a randomized \((1 - c\epsilon )\)-approximation algorithm.


Fix \(j \in [n]\). By Lemma 9 and the definition of \(\varDelta _1\),

$$\begin{aligned} \sum _{i =1}^m \Pr [E_{ij} | X_j = 1] \le \sum _{i=1}^m \frac{e\epsilon A_{i,j}}{\varDelta _1} \le e\epsilon . \end{aligned}$$

By Lemma 1, which shows that an upper bound on the rejection probabilities of \(\gamma \) leads to an \(\alpha _1(1 - \gamma )\)-approximation, we have an \(\alpha _1(1 - e\epsilon )\)-approximation. Then note that \(\alpha _1(1 - e\epsilon ) = (1 - \epsilon )(1-e\epsilon ) \ge 1 - (e+1)\epsilon \). This concludes the proof. \(\square \)
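For completeness, the final inequality can be checked by expanding the product; the term \(e\epsilon ^2\) is nonnegative and is simply dropped:

$$\begin{aligned} (1 - \epsilon )(1-e\epsilon ) = 1 - (e+1)\epsilon + e\epsilon ^2 \ge 1 - (e+1)\epsilon . \end{aligned}$$

Note that the resulting guarantee \(1 - (e+1)\epsilon \) is positive only when \(\epsilon < \frac{1}{e+1} \approx 0.27\).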

The small width regime: \(W = 1+\epsilon \)

We now consider the regime when the width is small. Let \(W = 1+ \epsilon \) for some \(\epsilon \in (0,1]\). We cannot apply the simple sorting based scheme that we used for the large width regime. We borrow from [1] the idea of splitting the coordinates in each constraint into big and small; here the definition is more refined and depends on \(\epsilon \). Moreover, the small coordinates and the big coordinates have their own reserved capacity in the constraint, which is crucial for the analysis. We provide more formal details below.

We set \(\alpha _2\) to be \(\frac{\epsilon ^2}{c_3\varDelta _1}\) where \(c_3 = 8e^{1+2/e}\). The alteration step differentiates between “small” and “big” coordinates as follows. For each \(i \in [m]\), let \(S_i = \{j : A_{i,j} \le \epsilon /2\}\) and \(B_i = \{j : A_{i,j} > \epsilon /2\}\). We say that an index j is small for constraint i if \(j \in S_i\). Otherwise we say it is big for constraint i when \(j \in B_i\). For each constraint, the algorithm is allowed to pack a total of \(1+\epsilon \) into that constraint. The algorithm separately packs small indices and big indices. In an \(\epsilon \) amount of space, small indices that were chosen in the rounding step are sorted in increasing order of size and greedily packed until the constraint is no longer satisfied. The big indices are packed by arbitrarily choosing one and packing it into the remaining space of 1. The rest of the indices are removed to ensure feasibility. Figure 3 gives pseudocode for the randomized algorithm \(\textsf {round}-\textsf {alter}-\textsf {small}-\textsf {width}\) which yields an \(\varOmega (\epsilon ^2 / \varDelta _1)\)-approximation.
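The alteration step described above can be sketched in Python as follows. This is a minimal sketch reconstructed from the prose, not a transcription of the paper's pseudocode. It assumes the normalization in which every entry of A lies in \([0,1]\) and every right-hand side equals \(1+\epsilon \). The function names, the tie-breaking by index, and the rule of keeping the lowest-indexed big item (the text allows an arbitrary choice) are illustrative assumptions.

```python
import random


def alter_small_width(A, chosen, eps):
    """Alteration step: drop items from `chosen` until every row fits in 1 + eps."""
    rejected = set()
    for row in A:
        small = sorted((j for j in chosen if row[j] <= eps / 2),
                       key=lambda j: (row[j], j))
        big = sorted(j for j in chosen if row[j] > eps / 2)
        # Small items: greedy prefix in increasing size within capacity eps.
        used = 0.0
        for pos, j in enumerate(small):
            if used + row[j] > eps:
                rejected.update(small[pos:])
                break
            used += row[j]
        # Big items: keep one (here the lowest index) in the remaining capacity 1.
        rejected.update(big[1:])
    return [j for j in chosen if j not in rejected]


def round_alter_small_width(A, x, eps, alpha2, rng):
    """Scale x by alpha2, sample each item independently, then alter."""
    chosen = [j for j, xj in enumerate(x) if rng.random() < alpha2 * xj]
    return alter_small_width(A, chosen, eps)
```

Feasibility of the output depends only on the alteration step, not on \(\alpha _2\), so calling `round_alter_small_width` with `alpha2 = 1.0` is a convenient way to stress-test it; the scaling factor matters only for the rejection probabilities analyzed below.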

Fig. 3 By setting the scaling factor \(\alpha _2 = \frac{\epsilon ^2}{c\varDelta _1}\) for a sufficiently large constant c, \(\textsf {round}-\textsf {alter}-\textsf {small}-\textsf {width}\) is a randomized \(\varOmega (\epsilon ^2 / \varDelta _1)\)-approximation for PIPs with width \(W = 1 + \epsilon \) for some \(\epsilon \in (0,1]\) (see Theorem 5).

It remains to bound the rejection probabilities. Recall that for \(j \in [n]\), we define \(X_j\) to be the indicator random variable \(\mathbb {1}(x_j' = 1)\) and \(E_{ij}\) is the event that j was rejected by constraint i.

We first consider the case when index j is big for constraint i. Note that it is possible that there may not exist any big indices for a given constraint. The same holds true for small indices.

Lemma 10

Let \(\epsilon \in (0,1]\) and \(\alpha _2 = \frac{\epsilon ^2}{c_3\varDelta _1}\) where \(c_3 = 8e^{1+2/e}\). Let \(i \in [m]\) and \(j \in B_i\). Then in \(\textsf {round}-\textsf {alter}-\textsf {small}-\textsf {width}(A, b, \epsilon , \alpha _2)\), we have \(\Pr [E_{ij} | X_j = 1] \le \frac{A_{i,j}}{2\varDelta _1}\).


Let \({\mathcal {E}}\) be the event that there exists \(j' \in B_i\) such that \(j' \ne j\) and \(X_{j'} = 1\). Observe that if \(E_{ij}\) occurs and \(X_j = 1\), then it must be the case that at least one other element of \(B_i\) was chosen in the rounding step. Thus,

$$\begin{aligned} \Pr [E_{ij} | X_j=1] \le \Pr [{\mathcal {E}}] \le \sum _{\begin{array}{c} \ell \in B_i\\ \ell \ne j \end{array}} \Pr [X_\ell = 1] \le \alpha _2\sum _{\ell \in B_i} x_{\ell }, \end{aligned}$$

where the second inequality follows by the union bound. Observe that for all \(\ell \in B_i\), we have \(A_{i,\ell } > \epsilon /2\). By the LP constraints, we have \(1 + \epsilon \ge \sum _{\ell \in B_i} A_{i,\ell } x_{\ell } > \frac{\epsilon }{2}\cdot \sum _{\ell \in B_i} x_{\ell }\). Thus, \(\sum _{\ell \in B_i} x_\ell \le \frac{1+\epsilon }{\epsilon /2} = 2/\epsilon + 2\).

Using this upper bound for \(\sum _{\ell \in B_i} x_{\ell }\), we have

$$\begin{aligned} \alpha _2 \sum _{\ell \in B_i} x_{\ell } \le \frac{\epsilon ^2}{c_3\varDelta _1}\left( \frac{2}{\epsilon } + 2\right) \le \frac{4\epsilon }{c_3\varDelta _1} \le \frac{A_{i,j}}{2\varDelta _1}, \end{aligned}$$

where the second inequality utilizes the fact that \(\epsilon \le 1\) and the third inequality holds because \(c_3 \ge 16\) and \(A_{i,j} > \epsilon /2\). \(\square \)

Next we consider the case when index j is small for constraint i. The analysis here is similar to that in the preceding section with width at least 2.

Lemma 11

Let \(\epsilon \in (0,1]\) and \(\alpha _2 = \frac{\epsilon ^2}{c_3\varDelta _1}\) where \(c_3 = 8e^{1+2/e}\). Let \(i \in [m]\) and \(j \in S_i\). Then in \(\textsf {round}-\textsf {alter}-\textsf {small}-\textsf {width}(A, b, \epsilon , \alpha _2)\), we have \(\Pr [E_{ij} | X_j = 1] \le \frac{A_{i,j}}{2\varDelta _1}\).


Renumbering as in the proof of Lemma 7 and defining \(Y_{ij} = \sum _{\ell = 1}^{j-1} A_{i,\ell }\cdot \mathbb {1}[x_\ell ' = 1]\) in the same manner, we have

$$\begin{aligned} \Pr [E_{ij} | X_j = 1] \le \Pr [Y_{ij} \ge \epsilon - A_{i,j}]. \end{aligned}$$

Let \(A_{i,\ell }' = \frac{2}{\epsilon }\cdot A_{i,\ell }\) for \(\ell \in [j]\). As \(A_{i,\ell } \le \epsilon /2\) for all \(\ell \in [j]\), we have \(A_{i,\ell }'\in [0,1]\). Let \(Y_{ij}' = \sum _{\ell = 1}^{j-1}A_{i,\ell }' \xi _\ell \). Then

$$\begin{aligned} \Pr [Y_{ij} \ge \epsilon - A_{i,j}] = \Pr [Y_{ij}' \ge 2 - A_{i,j}']. \end{aligned}$$

To upper bound \({\mathbb {E}}[Y_{ij}']\), we use the LP constraints and the value of \(\alpha _2\) to see that \({\mathbb {E}}[Y_{ij}'] \le \frac{2\epsilon (1 + \epsilon )}{c_3\varDelta _1}\). Let \(\alpha _2' = \frac{2\epsilon }{c_3\varDelta _1}\) and \(W = 2\). Then \({\mathbb {E}}[Y_{ij}'] \le \alpha _2'W\). With these parameter choices, we see that \((1-\alpha _2')W > A_{i,j}'\). Therefore, as \(A_{i,\ell }' \le A_{i,j}'\) for all \(\ell < j\), we can apply the Chernoff bound in Lemma 3 to obtain

$$\begin{aligned} \Pr [Y_{ij}' \ge 2 - A_{i,j}'] \le \left( \frac{\alpha _2'e^{1-\alpha _2'}W}{W - A_{i,j}'}\right) ^{(W-A_{i,j}')/A_{i,j}'}. \end{aligned}$$

Observe that \(e^{1-\alpha _2'} \le e\) and \(\frac{W}{W- A_{i,j}'} \le 2\) since \(W = 2\) and \(A_{i,j}' \le 1\). By our choice of \(\alpha _2'\),

$$\begin{aligned} \left( \frac{\alpha _2'e^{1-\alpha _2'}W}{W - A_{i,j}'}\right) ^{(W-A_{i,j}')/A_{i,j}'} \le \left( 2e\alpha _2'\right) ^{(W-A_{i,j}')/A_{i,j}'} = \left( \frac{\epsilon }{2e^{2/e}\varDelta _1}\right) ^{(W-A_{i,j}')/A_{i,j}'} \end{aligned}$$

Then note that \(\frac{W-A_{i,j}'}{A_{i,j}'} \ge \frac{1}{A_{i,j}'} \ge 1\) since \(W = 2\) and \(A_{i,j}' \le 1\). So

$$\begin{aligned} \left( \frac{\epsilon }{2e^{2/e}\varDelta _1}\right) ^{(W-A_{i,j}')/A_{i,j}'} = \left( \frac{1}{2\varDelta _1}\right) ^{(W-A_{i,j}')/A_{i,j}'}\cdot \left( \frac{\epsilon }{e^{2/e}}\right) ^{(W-A_{i,j}')/A_{i,j}'} \le \frac{1}{2\varDelta _1}\cdot \frac{\epsilon A_{i,j}'}{2}, \end{aligned}$$

where the inequality follows by Lemma 6. We have shown \(\Pr [E_{ij} | X_j = 1]\le \frac{\epsilon A_{i,j}'}{4\varDelta _1}\). Since \(A_{i,j}' = A_{i,j}\cdot \frac{2}{\epsilon }\), the result follows. \(\square \)
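The chain of inequalities in this proof can be spot-checked numerically. The sketch below evaluates the Chernoff bound with the rescaled quantities \(A_{i,j}'\), \(\alpha _2'\) and \(W = 2\), and compares it with \(\frac{A_{i,j}}{2\varDelta _1}\) in log space. The function names and grid values are hypothetical, and the grid assumes \(\varDelta _1 \ge 1\); the check is illustrative only.

```python
import math

C3 = 8 * math.exp(1 + 2 / math.e)


def lemma11_log_sides(eps: float, delta1: float, a: float):
    """Return (log Chernoff bound, log(a / (2*delta1))) for a small index, a <= eps/2."""
    a_scaled = 2 * a / eps            # A'_{i,j}, lies in (0, 1]
    alpha = 2 * eps / (C3 * delta1)   # alpha_2'
    width = 2.0                       # rescaled width W = 2
    exponent = (width - a_scaled) / a_scaled
    log_base = math.log(alpha) + (1 - alpha) + math.log(width / (width - a_scaled))
    return exponent * log_base, math.log(a / (2 * delta1))


def lemma11_holds_on_grid() -> bool:
    """Check the inequality on a small hypothetical parameter grid."""
    for eps in (0.1, 0.5, 1.0):
        for delta1 in (1.0, 3.0, 50.0):
            for frac in (0.01, 0.5, 1.0):
                lhs, rhs = lemma11_log_sides(eps, delta1, frac * eps / 2)
                if lhs > rhs:
                    return False
    return True
```

The bound is tightest when \(A_{i,j} = \epsilon /2\), that is, when \(A_{i,j}' = 1\) and the exponent equals 1.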

As Lemmas 10 and 11 show that the rejection probability is small for both big and small indices, we can prove the following approximation guarantee much like in Theorems 2 and 3.

Theorem 5

Let \(\epsilon \in (0,1]\). When setting \(\alpha _2 = \frac{\epsilon ^2}{c_3\varDelta _1}\) for \(c_3 = 8e^{1+2/e}\), for PIPs with width \(W = 1 + \epsilon \), \(\textsf {round}-\textsf {alter}-\textsf {small}-\textsf {width}(A, b, \epsilon , \alpha _2)\) is a randomized \((\alpha _2/2)\)-approximation algorithm.


Fix \(j \in [n]\). Then by Lemmas 10 and 11 and the definition of \(\varDelta _1\), we have

$$\begin{aligned} \sum _{i = 1}^m \Pr [E_{ij} | X_j = 1] \le \sum _{i = 1}^m \frac{A_{i,j}}{2\varDelta _1} \le \frac{1}{2}. \end{aligned}$$

Recall that Lemma 1 gives an \(\alpha _2(1-\gamma )\)-approximation where \(\gamma \) is an upper bound on the sum of the rejection probabilities for any item. This concludes the proof. \(\square \)

An upper bound on the integrality gap. We showed in Theorem 5 that the integrality gap is \(\varOmega (\epsilon ^2/\varDelta _1)\). The example that we used in the proof of Theorem 1 can be easily adapted to show that the gap is \(O(\epsilon )\) when \(\varDelta _1 = O(1)\). It is an interesting open problem to resolve the integrality gap as a function of \(\epsilon \).

Approximating with a submodular objective

We show that the results from the previous sections can be generalized to the case where the objective function is a nonnegative submodular set function. Recall that a real-valued set function \(f : 2^N \rightarrow {\mathbb {R}}\) over a finite ground set N is submodular iff \(f(A) + f(B) \ge f(A \cup B) + f(A \cap B)\) for all \(A,B\subseteq N\). Here we are interested in non-negative submodular functions \(f : 2^N \rightarrow {\mathbb {R}}_+\). Also of interest are monotone submodular functions which satisfy the additional property that \(f(A) \le f(B)\) for all \(A \subset B\).
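The definitions above can be checked by brute force on small ground sets. The following sketch tests submodularity and monotonicity of a set function by enumerating all pairs of subsets; the helper names and the coverage function used as an example are illustrative assumptions, and the enumeration is exponential in \(|N|\), so it is only suitable for toy instances.

```python
from itertools import combinations


def all_subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_submodular(f, ground, tol=1e-12):
    """Brute-force check of f(A) + f(B) >= f(A | B) + f(A & B) for all A, B."""
    subsets = list(all_subsets(ground))
    return all(f(a) + f(b) + tol >= f(a | b) + f(a & b)
               for a in subsets for b in subsets)


def is_monotone(f, ground, tol=1e-12):
    """Brute-force check of f(A) <= f(B) whenever A is a subset of B."""
    subsets = list(all_subsets(ground))
    return all(f(a) <= f(b) + tol for a in subsets for b in subsets if a <= b)


# Hypothetical example: a coverage function over three sets.
COVER = {0: {"u", "v"}, 1: {"v", "w"}, 2: {"w"}}


def coverage(S):
    """Number of elements covered by the sets indexed by S."""
    return len(set().union(*(COVER[i] for i in S)))
```

Coverage functions are a standard example of monotone submodular functions, whereas \(f(S) = |S|^2\) fails the inequality already for two disjoint singletons.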

The formal problem we are interested in is the following. Let \(N = \{1,2,\ldots , n\}\) be the ground set and let \(f : 2^N \rightarrow {\mathbb {R}}_+\) be a nonnegative submodular set function. We assume that one has oracle access to f; given any set \(S \subseteq N\), we can obtain f(S) in constant time. Our goal is to approximate the following problem:

$$\begin{aligned} \text {maximize } f(S) \text { over } S \subseteq N \text { s.t.} \sum _{j\in S} A_{i,j} \le b_i, \ \forall i \in [m]. \end{aligned}$$

Equivalently, if we let \({\mathbf {1}}_S \in \{0,1\}^n\) denote the characteristic vector of a set \(S \subseteq N\) then we wish to approximate the problem: \(\max f(S) \text { s.t. } A {\mathbf {1}}_S \le b\).

The rounding algorithms from the previous sections fall under the framework of contention resolution schemes (CR schemes) that allow one to extend the results to submodular objectives via the multilinear relaxation approach. We briefly outline this framework and follow [8].

Multilinear relaxation and CR scheme based rounding. Let N be a finite ground set and \(f:2^N \rightarrow {\mathbb {R}}_+\) be a non-negative submodular set function. Let \({\mathcal {I}}\subseteq 2^N\) be a downward closed family of sets (see Footnote 2) which abstractly models some constraints. We are then interested in the optimization problem \(\max _{S \in {\mathcal {I}}} f(S)\). The multilinear relaxation approach for approximating this problem is to solve a continuous optimization problem \(\max _{x \in P_{{\mathcal {I}}}} F(x)\) and then round it. Here \(P_{{\mathcal {I}}} \supseteq {\text {conv}}\{{\mathbf {1}}_S : S \in {\mathcal {I}}\}\) is a convex set that serves as a relaxation for the constraint set, and F is a continuous extension of f to \([0,1]^n\). Specifically, \(F:[0,1]^n \rightarrow {\mathbb {R}}_+\) is the multilinear extension of f and is defined as

$$\begin{aligned} F(x) = \sum _{S \subseteq N} f(S) \prod _{i \in S} x_i\prod _{j \not \in S}(1 - x_j). \end{aligned}$$

We say that \(P_{{\mathcal {I}}}\) is solvable if one can efficiently optimize linear functions over it. For a scalar \(\alpha \in (0,1]\) we use \(\alpha P_{{\mathcal {I}}}\) to denote the set \(\{\alpha x \mid x \in P_{{\mathcal {I}}}\}\). Let \(\mathrm{OPT}\) denote the value of the relaxation \(\max _{x \in P_{{\mathcal {I}}}} F(x)\), which provides an upper bound on the value of an optimum integral solution. Finding \(\mathrm{OPT}\) is NP-Hard even for the simple cardinality constraint; however, randomized constant factor approximations are known whenever \(P_{{\mathcal {I}}}\) is solvable and f is a nonnegative submodular function. In particular, for any \(\alpha \in (0,1]\) one can obtain a point \(x \in \alpha P_{{\mathcal {I}}}\) such that \(F(x) \ge (1-1/e^{\alpha })\mathrm{OPT}\) when f is monotone [22] and such that \(F(x) \ge \alpha e^{-\alpha } \mathrm{OPT}\) when f is non-negative [11] (see Footnote 3).

The second step in devising an algorithm is to round the fractional solution. We focus on a particular strategy based on CR schemes. It is motivated by the definition of F(x) as the expected value \({\mathbb {E}}[f(R(x))]\), where R(x) is a random set obtained by independently picking each \(i \in N\) with probability \(x_i\). Thus randomly rounding x preserves the objective F(x) in expectation. However the resulting set R(x) can violate the constraints. Thus we would like to alter R(x) to a feasible set \(R'\) while not losing too much in the objective. We would also like to scale x down by an \(\alpha \) factor for some \(\alpha \in (0,1]\) and work with \(R(\alpha x)\) since this can be useful in the alteration step as we have seen for PIPs.
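The identity \(F(x) = {\mathbb {E}}[f(R(x))]\) can be illustrated on a toy instance. The sketch below evaluates the multilinear extension exactly by enumerating all subsets and compares it with a Monte Carlo estimate obtained by sampling R(x). The function names and the small coverage function are hypothetical; the exact evaluation takes time exponential in n, so in practice F is estimated by sampling.

```python
import itertools
import random


def multilinear_exact(f, x):
    """F(x) = sum over S of f(S) * prod_{i in S} x_i * prod_{j not in S} (1 - x_j)."""
    total = 0.0
    for bits in itertools.product((0, 1), repeat=len(x)):
        prob = 1.0
        for xi, bit in zip(x, bits):
            prob *= xi if bit else 1.0 - xi
        total += prob * f(frozenset(i for i, bit in enumerate(bits) if bit))
    return total


def multilinear_sampled(f, x, samples, rng):
    """Monte Carlo estimate of E[f(R(x))] with R(x) drawn coordinate-wise."""
    acc = 0.0
    for _ in range(samples):
        acc += f(frozenset(i for i, xi in enumerate(x) if rng.random() < xi))
    return acc / samples


# Hypothetical coverage function: set i covers the listed elements.
SETS = [{0, 1}, {1, 2}, {2, 3}]


def coverage(S):
    """Number of elements covered by the sets indexed by S."""
    return len(set().union(*(SETS[i] for i in S)))
```

For \(x = (\frac{1}{2}, \frac{1}{2}, \frac{1}{2})\), elements 0 and 3 are each covered with probability \(\frac{1}{2}\) and elements 1 and 2 with probability \(\frac{3}{4}\), so \(F(x) = 2.5\); at an integral point F agrees with f.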

In [8] a formal and abstract definition of alteration schemes called CR schemes was provided, and it was shown that these schemes can be used in conjunction with the multilinear relaxation. In this setting an alteration scheme for \(P_{{\mathcal {I}}}\) is viewed as a (potentially randomized) algorithm \({\mathcal {A}}\) that takes as input \(x \in P_{{\mathcal {I}}}\) and a set \(B \subseteq N\) satisfying \(B \subseteq {\text {support}}(x)\) (see Footnote 4) and outputs a feasible set \({\mathcal {A}}(B,x) \in {\mathcal {I}}\).

Definition 1

Let \(\alpha ,\beta \in (0,1]\). We say that an algorithm \({\mathcal {A}}\) is a (randomized) \((\alpha ,\beta )\)-balanced CR scheme for \(P_{{\mathcal {I}}}\) if \({\mathcal {A}}\) returns a (random) set \({\mathcal {A}}(B,x)\) for all inputs \(x \in \alpha P_{{\mathcal {I}}}\) and \(B \subseteq N\) and satisfies the following properties:

  1. for all \(B \subseteq N\), with probability 1, \({\mathcal {A}}(B,x) \subseteq B \cap {\text {support}}(x)\) and \({\mathcal {A}}(B,x) \in {\mathcal {I}}\).

  2. for all \(i \in {\text {support}}(x)\), \(\Pr [i \in {\mathcal {A}}(R(x), x) \mid i \in R(x)] \ge \beta \).

  3. for all \(B_1 \subseteq B_2 \subseteq N\) and \(i \in B_1\), \(\Pr [i \in {\mathcal {A}}(B_1, x)] \ge \Pr [i \in {\mathcal {A}}(B_2,x)]\).

The first property guarantees that the output of \({\mathcal {A}}\) is a feasible set. The second property gives a lower bound on the probability of an element being in the output conditioned on it being chosen in the first randomized rounding step. The last property requires that the alteration scheme is monotone from each element’s perspective.
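To make properties 1 and 3 concrete, the sketch below implements a deterministic sort-and-greedy alteration of the kind analyzed in the preceding sections and verifies both properties exhaustively on a tiny instance. The code is an assumption-laden illustration: the function names, the tie-breaking by index, and the instance are hypothetical, and the scheme ignores x because it is deterministic. Property 2 depends on the random rounding and the scaling factor, so it is not checked here.

```python
from itertools import combinations


def alter_by_sorting(A, b, B):
    """Deterministic alteration: per row, keep the longest prefix (by size) that fits."""
    rejected = set()
    for row, cap in zip(A, b):
        order = sorted(B, key=lambda j: (row[j], j))
        used = 0.0
        for pos, j in enumerate(order):
            if used + row[j] > cap:
                rejected.update(order[pos:])
                break
            used += row[j]
    return frozenset(j for j in B if j not in rejected)


def subsets(n):
    """Yield every subset of {0, ..., n-1} as a frozenset."""
    for r in range(n + 1):
        for combo in combinations(range(n), r):
            yield frozenset(combo)


def check_properties(A, b):
    """Exhaustively verify properties 1 and 3 of Definition 1 for this scheme."""
    out = {B: alter_by_sorting(A, b, B) for B in subsets(len(A[0]))}
    feasible = all(
        kept <= B and all(sum(row[j] for j in kept) <= cap for row, cap in zip(A, b))
        for B, kept in out.items())
    # Deterministic scheme: property 3 says an item kept for B2 is kept for any B1 inside B2.
    monotone = all((out[B2] & B1) <= out[B1]
                   for B1 in out for B2 in out if B1 <= B2)
    return feasible, monotone
```

For a deterministic scheme, the probabilities in property 3 are 0 or 1, so the property reduces to the set inclusion tested in `check_properties`.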

The following is a paraphrased version of the results in [8] that combines the algorithms for solving the multilinear relaxation followed by rounding with a CR scheme.

Theorem 6

([8]) Consider the problem \(\max _{S \in {\mathcal {I}}} f(S)\) for a non-negative submodular function \(f:2^N \rightarrow {\mathbb {R}}_+\) and its multilinear relaxation \(\max _{x \in P_{{\mathcal {I}}}} F(x)\). Combining an \((\alpha e^{-\alpha })\)-approximation to \(\max _{x \in \alpha P_{{\mathcal {I}}}}F(x)\) and an \((\alpha , \beta )\)-balanced CR scheme, one can obtain a randomized \((\alpha e^{-\alpha }\beta )\)-approximation to \(\max _{S \in {\mathcal {I}}} f(S)\). The approximation ratio improves to \((1 - \frac{1}{e^\alpha })\beta \) when f is additionally monotone.
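The two approximation factors in Theorem 6 are easy to compare numerically. The helper below is a minimal sketch; its name and argument checks are our own choices and are not taken from [8].

```python
import math


def cr_guarantee(alpha: float, beta: float, monotone: bool = False) -> float:
    """Approximation factor from Theorem 6 for an (alpha, beta)-balanced CR scheme."""
    if not (0 < alpha <= 1 and 0 < beta <= 1):
        raise ValueError("alpha and beta must lie in (0, 1]")
    if monotone:
        return (1 - math.exp(-alpha)) * beta
    return alpha * math.exp(-alpha) * beta
```

Since \(e^{\alpha } \ge 1 + \alpha \), the monotone factor \((1 - e^{-\alpha })\beta \) is never smaller than the general factor \(\alpha e^{-\alpha }\beta \); for small \(\alpha \) both behave like \(\alpha \beta \).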

Applying to packing constraints. We now apply the preceding theorem to the setting of packing constraints formalized earlier as (2). We have \({\mathcal {I}}= \{ S \subseteq N \mid A {\mathbf {1}}_S \le b\}\) and we use the natural LP relaxation \(P_{{\mathcal {I}}} = \{ x \in [0,1]^n \mid Ax \le b\}\). Clearly \(P_{{\mathcal {I}}}\) is solvable via linear programming.

We now interpret the rounding algorithms in the preceding sections as CR schemes. In terms of the language used in the previous sections, \(\alpha \) is the scaling factor used in obtaining the rounded solution and \(1 - \beta \) is the upper bound on the rejection probability of every item. We note that the alteration schemes are deterministic.

Let \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by-sorting-sub}(A,b,\alpha _1, x, B)\) be the same algorithm as \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by}-\textsf {sorting}\) but now \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by-sorting-sub}\) is given a fractional solution x to \(\max _{x \in \alpha _1P_{{\mathcal {I}}}} F(x)\) and B will be the rounded solution \(R(\alpha _1 x)\). We make the same changes to \(\textsf {round}-\textsf {alter}-\textsf {small}-\textsf {width}\) to obtain the algorithm \(\textsf {round}-\textsf {alter}-\textsf {small}-\textsf {width}-\textsf {sub}(A,b,\epsilon ,\alpha _2, x, B)\).

Lemma 12

  1. For \(W \ge 2\), \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by-sorting-sub}(A,b,\alpha _1,x,B)\) is an \(\left( \alpha _1, \frac{1}{2}\right) \)-balanced CR scheme where \(\alpha _1 = \frac{1}{c_2(1 +\varDelta _1/W)^{1/(W-1)}}\) and \(c_2 = 4e^{1 + 2/e}\).

  2. For \(W = 1 + \epsilon \), \(\textsf {round}-\textsf {alter}-\textsf {small}-\textsf {width}-\textsf {sub}(A,b,\epsilon ,\alpha _2,x,B)\) is an \(\left( \alpha _2,\frac{1}{2}\right) \)-balanced CR scheme where \(\alpha _2=\frac{\epsilon ^2}{c_3\varDelta _1}\) and \(c_3 = 8e^{1 + 2/e}\).


We only prove (1) as the proof for (2) is similar. \(\textsf {round}-\textsf {and}-\textsf {alter}-\textsf {by-sorting-sub}\) takes an integral solution and guarantees feasibility by satisfying each constraint individually, setting variables to 0 if necessary, so (1) of Definition 1 is satisfied. Let \(E_{ij}\) be the event that item j is rejected by constraint i. By Lemma 8, the rejection probability of item j is at most \(\sum _{i =1}^m \Pr [E_{ij} | x_j' = 1] \le \sum _{i=1}^m \frac{A_{i,j}}{2\varDelta _1} \le \frac{1}{2}\). In the notation of Definition 1, this implies (2) of Definition 1 is satisfied with \(\beta = 1/2\). (3) of Definition 1 is also satisfied as the probability that an item is rejected only decreases if fewer items are chosen in the rounding step. \(\square \)

Combining Lemma 12 and Theorem 6, we immediately get the following result.

Theorem 7

Let \({\mathcal {I}}= \{ S \subseteq N \mid A {\mathbf {1}}_S \le b\}\) and let \(f : 2^N \rightarrow {\mathbb {R}}_+\) be a nonnegative submodular set function.

  1. Assume \(W \ge 2\). Let \(\alpha _1= \frac{1}{c_2(1 +\varDelta _1/W)^{1/(W-1)}}\) where \(c_2 = 4e^{1 + 2/e}\). There exists an \(((\alpha _1e^{-\alpha _1})\frac{1}{2})\)-approximation to \(\max _{S \in {\mathcal {I}}} f(S)\). Assuming f is also monotone, there exists a \(((1 - \frac{1}{e^{\alpha _1}})\frac{1}{2})\)-approximation.

  2. Let \(\epsilon \in (0,1]\) and assume \(W = 1 + \epsilon \). Let \(\alpha _2 = \frac{\epsilon ^2}{c_3\varDelta _1}\) where \(c_3 = 8e^{1+2/e}\). There exists an \(((\alpha _2e^{-\alpha _2})\frac{1}{2})\)-approximation to \(\max _{S \in {\mathcal {I}}}f(S)\). Assuming f is also monotone, there exists a \(((1 - \frac{1}{e^{\alpha _2}})\frac{1}{2})\)-approximation.

We close this section with two remarks. First, CR schemes for different classes of constraints can be composed gracefully [8] and hence the ones here could be useful in conjunction with schemes for other constraints. Second, certain dependent rounding schemes for matroid and matroid intersection constraints satisfy concentration bounds, similar to Chernoff bounds, for non-negative sums; since our analysis for PIPs relied essentially only on Chernoff bounds, one can extend the analysis even under dependent rounding. This allows for some applications when one combines packing constraints with matroid or matroid intersection type constraints. We refer the reader to [6, 7] for more details.


  1. We can allow the variables to have general integer upper bounds instead of restricting them to be boolean. As observed in [1], one can reduce this more general case to the \(\{0,1\}\) case without too much loss in the approximation.

  2. We say a family of subsets \({\mathcal {I}}\subseteq 2^N\) is downward closed if for all \(A \subseteq B \subseteq N\), if \(B \in {\mathcal {I}}\), then \(A \in {\mathcal {I}}\).

  3. For non-negative functions there have been subsequent improvements in the approximation ratio [2, 10] but the dependence on \(\alpha \) is unclear and since the precise approximation ratios are not the focus in this paper, we confine ourselves to the simpler algorithm and bound from [11].

  4. For \(x \in [0,1]^n\) we use \({\text {support}}(x)\) to denote the set \(\{i \mid x_i > 0\}\).


  1. Bansal, N., Korula, N., Nagarajan, V., Srinivasan, A.: Solving packing integer programs via randomized rounding with alterations. Theory Comput. 8(24), 533–565 (2012). https://doi.org/10.4086/toc.2012.v008a024

  2. Buchbinder, N., Feldman, M.: Constrained submodular maximization via a nonsymmetric technique. Math. Oper. Res. 44(3), 988–1005 (2019)

  3. Chan, T.M.: Approximation schemes for 0–1 knapsack. In: 1st Symposium on Simplicity in Algorithms (2018)

  4. Chekuri, C., Khanna, S.: On multidimensional packing problems. SIAM J. Comput. 33(4), 837–851 (2004)

  5. Chekuri, C., Quanrud, K.: On approximating (sparse) covering integer programs. In: Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1596–1615. SIAM (2019)

  6. Chekuri, C., Vondrák, J., Zenklusen, R.: Dependent randomized rounding via exchange properties of combinatorial structures. In: 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pp. 575–584. IEEE (2010)

  7. Chekuri, C., Vondrák, J., Zenklusen, R.: Multi-budgeted matchings and matroid intersection via dependent rounding. In: Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1080–1097. SIAM (2011)

  8. Chekuri, C., Vondrák, J., Zenklusen, R.: Submodular function maximization via the multilinear relaxation and contention resolution schemes. SIAM J. Comput. 43(6), 1831–1879 (2014)

  9. Chen, A., Harris, D.G., Srinivasan, A.: Partial resampling to approximate covering integer programs. In: Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1984–2003. SIAM (2016)

  10. Ene, A., Nguyen, H.L.: Constrained submodular maximization: beyond 1/e. In: 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pp. 248–257. IEEE (2016)

  11. Feldman, M., Naor, J., Schwartz, R.: A unified continuous greedy algorithm for submodular maximization. In: 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, pp. 570–579. IEEE (2011)

  12. Frieze, A., Clarke, M.: Approximation algorithms for the m-dimensional 0–1 knapsack problem: worst-case and probabilistic analyses. Eur. J. Oper. Res. 15(1), 100–109 (1984)

  13. Harvey, N.J.: A note on the discrepancy of matrices with bounded row and column sums. Discrete Math. 338(4), 517–521 (2015)

  14. Håstad, J.: Clique is hard to approximate within \(n^{1- \epsilon }\). Acta Math. 182(1), 105–142 (1999)

  15. Hazan, E., Safra, S., Schwartz, O.: On the complexity of approximating k-set packing. Comput. Complex. 15(1), 20–39 (2006)

  16. Mitzenmacher, M., Upfal, E.: Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, Cambridge (2005)

  17. Pritchard, D.: Approximability of sparse integer programs. In: European Symposium on Algorithms, pp. 83–94. Springer (2009)

  18. Pritchard, D., Chakrabarty, D.: Approximability of sparse integer programs. Algorithmica 61(1), 75–93 (2011)

  19. Raghavan, P., Thompson, C.D.: Randomized rounding: a technique for provably good algorithms and algorithmic proofs. Combinatorica 7(4), 365–374 (1987)

  20. Srinivasan, A.: Improved approximation guarantees for packing and covering integer programs. SIAM J. Comput. 29(2), 648–670 (1999)

  21. Trevisan, L.: Non-approximability results for optimization problems on bounded degree instances. In: Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing, pp. 453–461. ACM (2001)

  22. Vondrák, J.: Optimal approximation for the submodular welfare problem in the value oracle model. In: Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, pp. 67–74. ACM (2008)

  23. Zuckerman, D.: Linear degree extractors and the inapproximability of max clique and chromatic number. In: Proceedings of the Thirty-Eighth Annual ACM Symposium on Theory of Computing, pp. 681–690. ACM (2006)


Author information



Corresponding author

Correspondence to Manuel R. Torres.

Additional information


C. Chekuri and K. Quanrud supported in part by NSF Grant CCF-1526799. M. Torres supported in part by fellowships from NSF and the Sloan Foundation. A preliminary version of this work appeared in the proceedings of the 20th conference on Integer Programming and Combinatorial Optimization (IPCO 2019)


Cite this article

Chekuri, C., Quanrud, K. & Torres, M.R. \(\ell _1\)-Sparsity approximation bounds for packing integer programs. Math. Program. 183, 195–214 (2020). https://doi.org/10.1007/s10107-020-01472-7



  • Approximation algorithms
  • Sparse packing integer programs
  • Randomized rounding
  • Submodular optimization

Mathematics Subject Classification

  • Primary 68W25
  • Secondary 90C59