The smoothed number of Pareto-optimal solutions in bicriteria integer optimization

A well-established heuristic approach for solving bicriteria optimization problems is to enumerate the set of Pareto-optimal solutions. The heuristics following this principle are often successful in practice. Their running time, however, depends on the number of enumerated solutions, which is exponential in the worst case. We study bicriteria integer optimization problems in the model of smoothed analysis, in which inputs are subject to a small amount of random noise, and we prove an almost tight polynomial bound on the expected number of Pareto-optimal solutions. Our results give rise to tight polynomial bounds for the expected running time of the Nemhauser-Ullmann algorithm for the knapsack problem and they improve known results on the running times of heuristics for the bounded knapsack problem and the bicriteria shortest path problem.


Introduction
We study integer optimization problems with two objective functions (say profit and weight) that are to be optimized simultaneously. A common approach for solving such problems is generating the set of Pareto-optimal solutions, also known as the Pareto set. Pareto-optimal solutions are optimal compromises of the two criteria in the sense that any improvement of one criterion implies an impairment of the other. In other words, a solution x is Pareto-optimal if there does not exist another solution y that dominates x, in the sense that y has at least the same profit and at most the same weight as x and at least one of these inequalities is strict. Generating the Pareto set is of great interest in many scenarios and widely used in practice. This approach fails to yield reasonable results in the worst case because even integer optimization problems with a simple combinatorial structure can have exponentially many Pareto-optimal solutions. In practice, however, generating the Pareto set is often feasible because typically the number of Pareto-optimal solutions does not attain its worst-case bound.
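To make the notion of dominance concrete, the following small sketch (our own illustration, with a hypothetical function name, not code from the paper) enumerates the Pareto set of a list of (profit, weight) pairs by brute force, with profit maximized and weight minimized.

```python
from typing import List, Tuple

def pareto_set(solutions: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the Pareto-optimal (profit, weight) pairs; profit is
    maximized and weight is minimized."""
    pareto = []
    for p, w in solutions:
        # y dominates x if y is at least as profitable, at most as heavy,
        # and strictly better in at least one of the two criteria.
        dominated = any(
            (p2 >= p and w2 <= w) and (p2 > p or w2 < w)
            for p2, w2 in solutions
        )
        if not dominated:
            pareto.append((p, w))
    return pareto

# (3, 2) dominates (2, 3); the Pareto set is [(3, 2), (1, 1)].
print(pareto_set([(2, 3), (3, 2), (1, 1)]))  # prints [(3, 2), (1, 1)]
```

This quadratic-time check is only for illustration; the algorithms discussed later exploit more structure.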
The discrepancy between practical experience and worst-case results motivates the study of the number of Pareto-optimal solutions in a more realistic scenario. One possible approach is to study the average number of Pareto-optimal solutions rather than the worst-case number. In order to analyze the average, one has to define a probability distribution on the set of instances with respect to which the average is taken. In most situations, however, it is not clear how to choose a probability distribution that reflects typical inputs. In order to bypass the limitations of worst-case and average-case analysis, Spielman and Teng [22] introduced the notion of smoothed analysis. They consider a semi-random input model in which first an adversary specifies an input that is afterwards slightly perturbed at random. The perturbation is motivated by the observation that in most practical applications instances are to some extent influenced by random events like, for example, measurement errors or numerical imprecision. Intuitively, the perturbation rules out pathological worst-case instances that are rarely observed in practice but dominate the worst-case analysis.
We consider bicriteria integer optimization problems in the framework of smoothed analysis. For this we assume that an adversary specifies an arbitrary set S ⊆ D^n of solutions, where D ⊆ Z denotes a finite set of integers, and two objective functions, profit p : S → R and weight w : S → R. We assume that the profit is to be maximized while the weight is to be minimized. This assumption is without loss of generality as our results are not affected by changing the optimization direction of either objective function. In our model, the weight function w can be chosen arbitrarily by the adversary, whereas the profit function p has to be linear of the form p(x) = p_1 x_1 + · · · + p_n x_n.
In a classical worst-case analysis, the adversary can choose the coefficients p_1, . . . , p_n exactly so as to maximize the number of Pareto-optimal solutions. If he chooses these coefficients and the objective function w such that p(x) = w(x) for all solutions x ∈ S and such that all solutions from S have pairwise different profits, then all solutions from S are Pareto-optimal. In the model of smoothed analysis considered in this article, the adversary is less powerful: instead of being able to specify the coefficients exactly, he can only specify a probability distribution for each coefficient according to which it is chosen independently of the other coefficients. Allowing arbitrary distributions would include deterministic instances as a special case and hence, in order for the model to make sense, we need to ensure that the adversary is not too powerful and cannot concentrate the probability mass in too small a region. We achieve this by restricting the adversary to probability distributions that can be described by probability density functions that are bounded from above by some parameter φ ≥ 1. The parameter φ can be seen as a measure specifying how close the analysis is to a worst-case analysis: the larger φ, the more concentrated the probability mass can be and hence, the closer the analysis is to a worst-case analysis. To avoid that the effect of the perturbation is diminished by scaling, we assume that the distributions are normalized such that the expected absolute value of each profit is bounded from above by a constant.
To illustrate this model, let us consider the following example more reminiscent of Spielman and Teng's original model of smoothed analysis: first the adversary chooses an arbitrary vector of profits (p_1, . . . , p_n) ∈ [−1, 1]^n, and then an independent Gaussian random variable with mean 0 and standard deviation σ ≤ 1 is added to each profit p_i. In this example, the standard deviation σ takes over the role of φ: the smaller σ, the closer the analysis is to a worst-case analysis. For Gaussians with mean 0 and standard deviation σ, the maximum density is 1/(√(2π) σ) < 1/σ and the expected absolute value is σ·√(2/π) < σ ≤ 1. Hence, this model of Gaussian perturbations is covered by our model for φ = 1/σ. In our model the adversary is even more powerful because he is not only allowed to choose the expected value of each profit p_i but also the type of noise as long as the density is bounded from above by φ. In particular, the adversary could choose for each profit p_i an interval I_i ⊆ [0, 1] of length 1/φ from which it is chosen independently uniformly at random.
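The two noise models discussed above can be sketched as follows; this is our own illustration, and the function names and the random choice of interval starts are assumptions for the example, not part of the paper.

```python
import random

def perturbed_profits_gaussian(p_bar, sigma):
    # Spielman-Teng-style perturbation: add independent N(0, sigma^2) noise
    # to each adversarial profit; the density bound here is phi = 1/sigma,
    # since the Gaussian density is at most 1/(sqrt(2*pi)*sigma) < 1/sigma.
    return [p + random.gauss(0.0, sigma) for p in p_bar]

def perturbed_profits_uniform(phi, n=5):
    # Alternative noise from the text: each profit is drawn uniformly from an
    # adversarially chosen interval I_i of length 1/phi inside [0, 1]; the
    # interval starts are chosen at random here purely for illustration.
    starts = [random.uniform(0.0, 1.0 - 1.0 / phi) for _ in range(n)]
    return [random.uniform(s, s + 1.0 / phi) for s in starts]
```

In both cases the resulting densities are bounded by φ, which is all the upper-bound proof uses.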
The smoothed number of Pareto-optimal solutions is defined to be the maximum expected number of Pareto-optimal solutions for any choice of the set of solutions S, the weight function w, and the distributions of the profits subject to the bounds on the maximum density and the expected absolute value. We present a new method for bounding the smoothed number of Pareto-optimal solutions, which yields an upper bound that is polynomial in the number n of variables, the parameter max_{a∈D} |a|, and the maximum density φ. This immediately implies polynomial upper bounds on the smoothed running time of several heuristics for generating the Pareto set of problems like the bounded knapsack problem. Previous results of this kind were restricted to the case of binary optimization problems. For this special case, our method yields an improved upper bound, which matches the known lower bound.

Related work
Multiobjective optimization is a well-studied research area. Several algorithms for generating the Pareto set of various optimization problems such as the (bounded) knapsack problem [11,17], the bicriteria shortest path problem [6,21], and the bicriteria network flow problem [9,12] have been proposed. The running times of these algorithms depend crucially on the number of Pareto-optimal solutions and hence none of them runs in polynomial time in the worst case. In practice, however, generating the Pareto set is tractable in many situations. For instance, Müller-Hannemann and Weihe [13] study the number of Pareto-optimal solutions in multi-criteria shortest path problems experimentally. They consider networks that arise from computing the set of best train connections (in view of travel time, fare, and number of train changes) and conclude that in this application scenario generating the complete Pareto set is tractable even for large instances. For more examples, we refer the reader to [8].
One way of coping with the bad worst-case behavior is to relax the requirement of finding the complete Pareto set. Papadimitriou and Yannakakis present a general framework for finding approximate Pareto sets. A solution x is ε-dominated by another solution y if p(x)/p(y) ≤ 1 + ε and w(y)/w(x) ≤ 1 + ε. We say that P_ε is an ε-approximation of a Pareto set P if for any solution x ∈ P there is a solution y ∈ P_ε that ε-dominates it. Papadimitriou and Yannakakis [18] show that for any Pareto set P, there is an ε-approximation of P with polynomially (in the input size and 1/ε) many points. Furthermore, they give a necessary and sufficient condition for the existence of an FPTAS for approximating the Pareto set of a multi-criteria optimization problem. Vassilvitskii and Yannakakis [24] and Diakonikolas and Yannakakis [7] investigate the problem of computing ε-approximate Pareto sets of small size.
After the seminal work of Spielman and Teng about the simplex algorithm [22], smoothed analysis has proven to be a good tool for narrowing the gap between practical experience and theoretical results, and it has been used to explain the practical success of algorithms in various areas. Some of these results are summarized in the surveys by Spielman and Teng [23] and by Manthey and Röglin [15].
Beier and Vöcking [5] initiated the study of the smoothed number of Pareto-optimal solutions. They consider the special case of our model in which variables can take only binary values, i.e., S ⊆ {0, 1}^n, and show that the smoothed number of Pareto-optimal solutions is bounded from above by O(n^4 φ). Furthermore, they present a lower bound of Ω(n^2) on the expected number of Pareto-optimal solutions for profits that are chosen uniformly from the interval [0, 1]. Brunsch et al. [2] improve this lower bound to Ω(n^2 φ).
Röglin and Teng [20] study the same binary model, except that they allow any finite number d of perturbed linear objective functions (plus one arbitrary adversarial objective function) and that they require the densities according to which the coefficients are chosen to have bounded support in [−1, 1]. They obtain an upper bound of (nφ)^{f(d)} for the smoothed number of Pareto-optimal solutions, where f grows exponentially in d. This result has been improved by Moitra and O'Donnell [14] to O(n^{2d} φ^{d(d+1)/2}) and afterwards by Brunsch and Röglin [3] to O(n^{2d} φ^d) under the additional assumption that all densities are quasiconcave. Under this assumption they also prove an upper bound of O((n^{2d} φ^d)^c) for the c-th moment of the smoothed number of Pareto-optimal solutions for any constant c ∈ N, which gives rise to non-trivial concentration bounds. Furthermore, they also consider the case that S ⊆ {0, 1, . . . , k}^n and they analyze the effect of zero-preserving perturbations, in which the adversary can choose for each coefficient either a φ-bounded density function according to which it is drawn or set it deterministically to zero. The analyses in these articles are significantly more complicated than the upper bound presented in this article because they are targeted to problems with more than two objective functions.
Brunsch et al. [2] also prove a lower bound of Ω(n^{d−1.5} φ^d) for the smoothed number of Pareto-optimal solutions in the case of d perturbed linear objective functions.

Model and notation
Let D ⊆ Z be a finite set of integers and let S ⊆ D^n denote an arbitrary set of solutions. We are interested in the cardinality of the set P ⊆ S of Pareto-optimal solutions with respect to the objective functions profit p : S → R and weight w : S → R, which are to be maximized and minimized, respectively. While the objective function w can be chosen arbitrarily by an adversary, the objective function p is assumed to be linear of the form p(x) = p_1 x_1 + . . . + p_n x_n, where x = (x_1, . . . , x_n)^T ∈ S. By abuse of notation, let p not only denote the objective function but also the vector (p_1, . . . , p_n)^T. Then the profit p(x) of a solution x ∈ S can be written as p · x.
We assume that each p_i is a random variable that (independently of the other p_j) follows a density f_i with f_i(x) ≤ φ_i for all x ∈ R. Furthermore, we denote by μ_i the expected absolute value of p_i, i.e., μ_i = E[|p_i|]. For given parameters D, n, φ, and μ we assume that an adversary chooses the set S, the objective function w, and the densities f_i so as to maximize the expected number of Pareto-optimal solutions. We refer to the largest expected number of Pareto-optimal solutions he can achieve as the smoothed number of Pareto-optimal solutions (with respect to D, n, φ, and μ).
We denote by [n] the set {1, . . . , n}, we use the notation d = |D| and Δ = max{a − b | a, b ∈ D}, and we denote by H_n the nth harmonic number, i.e., H_n = Σ_{i=1}^n 1/i.

Our results
In this article, we present a new approach for bounding the smoothed number of Pareto-optimal solutions for bicriteria integer optimization problems. This approach follows to a large extent the analysis in [4] with one improvement from [19], which removes a factor of H_d. Altogether we obtain the following bound.
This implies that the smoothed number of Pareto-optimal solutions is bounded from above by O(Δd · n^2 φμ). For D = {0, . . . , k} and constant expected absolute value μ, the bound simplifies to O(n^2 φ · k^2). For the binary case D = {0, 1} the bound further simplifies to O(n^2 φ). This improves significantly upon the previously known bound of O(n^4 φ) due to Beier and Vöcking [5]. The bound O(n^2 φ) follows also from the result of Moitra and O'Donnell [14] (which appeared after the conference version of this article [4]). However, the proof of Theorem 1 is much simpler than the analysis of Moitra and O'Donnell, which is targeted to problems with more than two objective functions. It is also much simpler than the analysis of Beier and Vöcking [5] and it allows unbounded coefficients p_i (as long as the expected absolute value is bounded), whereas all results for more than two objectives assume that the coefficients have bounded support. Furthermore, all results for more than two objective functions in the literature either consider only the binary case (Moitra and O'Donnell [14]) or their dependence on k is much worse than k^2 (it is k^32 in Brunsch and Röglin [3]). Hence none of these results implies Theorem 1.
Additionally, we also present two lower bounds on the smoothed number of Pareto-optimal solutions. In the following lower bound we use the term ranking to refer to the objective function w. The higher the ranking, the smaller the weight. In the proof of this lower bound we assume that the profits are chosen uniformly at random from the interval [−1, 1] and hence, the lower bound holds for any φ ≥ 1. The lower bound matches the upper bound in terms of n and k. We also obtain a slightly weaker lower bound for the case that the adversary is restricted to linear weight functions.
If the profits are drawn according to the uniform distribution over some interval [0, a] with a > 0, then the above term equals the expected number of Pareto-optimal solutions.
While Theorem 2 shows that the upper bound in Theorem 1 is tight in terms of n and k, it is an open problem to find a lower bound that is additionally tight in terms of φ.

Knapsack Problem
The Nemhauser-Ullmann algorithm solves the knapsack problem by enumerating the set of Pareto-optimal solutions [17]. This means that the capacity of the knapsack is neglected and all knapsack fillings that are Pareto-optimal with respect to profit and weight are enumerated. Then, the optimal solution of the knapsack problem is the Pareto-optimal solution with the highest weight not exceeding the capacity. The running time of this algorithm on an instance with n items is Θ(Σ_{i=1}^n q_i), where q_i denotes the number of Pareto-optimal solutions of the knapsack instance that consists only of the first i items.
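The enumeration scheme can be sketched as follows; this is our own compact rendering of the Nemhauser-Ullmann idea (maintain the current Pareto list and, for each new item, merge it with a copy shifted by that item), not code from the paper.

```python
def nemhauser_ullmann(items):
    """items: list of (profit, weight) pairs. Returns the Pareto-optimal
    (profit, weight) pairs over all knapsack fillings, ignoring the capacity."""
    pareto = [(0.0, 0.0)]                      # the empty filling
    for p, w in items:
        shifted = [(pp + p, ww + w) for pp, ww in pareto]
        # Merge both lists by increasing weight (ties: higher profit first)
        # and keep a point only if it improves on every lighter point.
        merged = sorted(pareto + shifted, key=lambda s: (s[1], -s[0]))
        pareto, best_profit = [], float("-inf")
        for pp, ww in merged:
            if pp > best_profit:
                pareto.append((pp, ww))
                best_profit = pp
    return pareto

def solve_knapsack(items, capacity):
    # The optimum is the most profitable Pareto point within the capacity.
    return max(p for p, w in nemhauser_ullmann(items) if w <= capacity)
```

Each round costs time linear in the current Pareto list, which is how the Θ(Σ q_i) running time stated above arises.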
We assume that an adversary can choose arbitrary weights and that he chooses a probability distribution for every profit arbitrarily subject to the bounds on the maximum density and the expected absolute value. Using their bound of O(n^4 φ) on the smoothed number of Pareto-optimal solutions and linearity of expectation, Beier and Vöcking [5] show that the smoothed running time of the Nemhauser-Ullmann algorithm is bounded by O(n^5 φ). Here the term smoothed running time refers to the maximum expected running time that can be achieved by choosing weights and probability distributions for the profits. Based on our improved bound on the expected number of Pareto-optimal solutions presented in Theorem 1, we conclude the following corollary.

Corollary 1 The smoothed running time of the Nemhauser-Ullmann algorithm for the knapsack problem is O(n^3 φ).
For uniformly distributed profits Beier and Vöcking present a lower bound on the expected running time of Ω(n^3). Hence, we obtain tight bounds on the running time of the Nemhauser-Ullmann algorithm in terms of the number of items n.

Bounded Knapsack problem
In the bounded knapsack problem, a number k ≥ 2 and a set of n items with weights and profits are given, and it is assumed that k − 1 copies of each of the n items are available. We assume that an adversary can choose arbitrary weights and that he chooses a probability distribution for every profit arbitrarily subject to the bounds on the maximum density and the expected absolute value. Then according to Theorem 1 the expected number of Pareto-optimal solutions is bounded from above by O(n^2 k^2 φ).
This observation alone does not yet imply that the Pareto set can be computed efficiently. However, Kellerer et al. [10] describe how an instance of the bounded knapsack problem with n items can be transformed into an instance of the knapsack problem with nK items, where K = Θ(log k). We will call the items of this instance of the knapsack problem virtual items in the following. In the transformation, K numbers ℓ_1, . . . , ℓ_K ∈ {0, 1, . . . , k − 1} with Σ_{i=1}^K ℓ_i = k − 1 are chosen and every item i in the bounded knapsack instance with profit p_i and weight w_i is replaced by K virtual items with profits ℓ_1 p_i, . . . , ℓ_K p_i and weights ℓ_1 w_i, . . . , ℓ_K w_i. Using this transformation, the bounded knapsack problem can be solved by the Nemhauser-Ullmann algorithm in running time Θ(Σ_{i=1}^{nK} q_i), where q_i denotes the number of Pareto-optimal solutions of the knapsack instance that consists only of the first i virtual items.
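The transformation can be sketched with the standard binary splitting of multiplicities; the helper names are ours, and this concrete choice of the numbers ℓ_j is one valid option achieving K = O(log k), not necessarily the exact choice of Kellerer et al.

```python
def split_multiplicities(k):
    """Choose l_1, ..., l_K with sum k - 1 and K = O(log k) such that every
    multiplicity in {0, ..., k-1} equals a sum of a subset of the l_j."""
    ls, remaining, power = [], k - 1, 1
    while remaining > 0:
        l = min(power, remaining)   # 1, 2, 4, ... and a final remainder
        ls.append(l)
        remaining -= l
        power *= 2
    return ls

def virtual_items(items, k):
    # Each item (p_i, w_i) becomes K virtual items (l_j * p_i, l_j * w_i).
    ls = split_multiplicities(k)
    return [(l * p, l * w) for p, w in items for l in ls]
```

Selecting a subset of the K virtual items of item i then corresponds exactly to packing some multiplicity between 0 and k − 1 of that item.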
In order to obtain an upper bound on the expected value of q_i, we can directly use Theorem 1. To see this, observe that the knapsack instance with only the first i virtual items can be viewed as an instance of the bounded knapsack problem in which only certain multiplicities of every item are allowed. Hence, for an appropriately chosen set S_i of solutions Theorem 1 applies and yields a bound of O(n^2 k^2 φ) for the expected value of q_i. Based on this, we obtain the following corollary.


Bicriteria shortest path problem

Different algorithms have been proposed for enumerating the Pareto set in bicriteria shortest path problems [6,21]. An instance of the bicriteria shortest path problem is described by a graph with n nodes and m edges in which each edge has certain costs and a certain length. The costs and the length of a path are then simply the sums of the costs and lengths of its edges, and the goal is to compute all Pareto-optimal paths. Given this Pareto set, one can in particular solve the constrained shortest path problem, in which a budget is given and the goal is to find the shortest path whose costs do not exceed the given budget. As in the knapsack problem, the optimal solution to this problem is the Pareto-optimal solution with the largest costs not exceeding the budget.
Corley and Moon [6] suggest a modified version of the Bellman-Ford algorithm for enumerating the Pareto set of the bicriteria shortest path problem. Beier [1] shows that the running time of this algorithm is O(nmU), where U is an upper bound on the number of Pareto-optimal solutions in certain sub-problems. These sub-problems can be described by sets S of solutions that are subsets of {0, 1}^m. Given the bound on the smoothed number of Pareto-optimal solutions from [5], Beier concludes that the smoothed running time of this modified Bellman-Ford algorithm is O(nm^5 φ) if either the costs or the lengths of the edges are perturbed. Based on Theorem 1, we obtain the following improved bound.
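A label-correcting sketch of this idea (our own simplification, not Corley and Moon's exact algorithm): run Bellman-Ford rounds on sets of (cost, length) labels per node and prune dominated labels after each round.

```python
def prune(labels):
    # Keep only non-dominated (cost, length) pairs; both criteria minimized.
    return {
        (c, l) for c, l in labels
        if not any((c2 <= c and l2 <= l) and (c2, l2) != (c, l)
                   for c2, l2 in labels)
    }

def bicriteria_bellman_ford(n, edges, source):
    """edges: list of (u, v, cost, length) arcs. Returns, for every node,
    the Pareto-optimal (cost, length) labels of paths from the source."""
    labels = [set() for _ in range(n)]
    labels[source] = {(0, 0)}
    for _ in range(n - 1):        # n - 1 relaxation rounds, as in Bellman-Ford
        for u, v, c, l in edges:
            labels[v] |= {(cu + c, lu + l) for cu, lu in labels[u]}
        labels = [prune(lab) for lab in labels]
    return labels
```

The work per round is proportional to the sizes of the label sets, which is why the bound U on the number of Pareto-optimal solutions of the sub-problems governs the running time.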

Corollary 3 The smoothed running time of the modified Bellman-Ford algorithm is O(nm^3 φ) if either the costs or the lengths of the edges are perturbed.
In the following two sections, we prove the upper and lower bounds on the smoothed number of Pareto-optimal solutions.

Upper bound on the smoothed number of Pareto-optimal solutions
Since the profits are continuous random variables, the probability that there exist two solutions with exactly the same profit is zero. Hence, we can ignore this event and assume that no two solutions with the same profit exist. Furthermore, we assume without loss of generality that there are no two solutions with the same weight. If the adversary specifies a weight function in which two solutions have the same weight, we apply an arbitrary tie-breaking, which cannot decrease the expected number of Pareto-optimal solutions. We will now prove the upper bound on the smoothed number of Pareto-optimal solutions.
Proof We start the proof by defining d subsets of the Pareto set. We say that a Pareto-optimal solution x belongs to class a ∈ D if there exists an index i ∈ [n] with x_i ≠ a such that the succeeding Pareto-optimal solution y satisfies y_i = a, where succeeding Pareto-optimal solution refers to the Pareto-optimal solution with the smallest weight among all solutions with higher profit than x (see Fig. 1). The Pareto-optimal solution with the highest profit, which does not have a succeeding Pareto-optimal solution, is not contained in any of the classes, but every other Pareto-optimal solution belongs to at least one of these classes. Let q denote the number of Pareto-optimal solutions and let q_a denote the number of Pareto-optimal solutions in class a. Since q ≤ 1 + Σ_{a∈D} q_a, linearity of expectation implies

E[q] ≤ 1 + Σ_{a∈D} E[q_a].  (1)

Fig. 1 The solutions from S are depicted as points, where a solution z ∈ S corresponds to the point at (w(z), p(z)). Black points correspond to Pareto-optimal solutions and they dominate all solutions below the step function, which are depicted in gray. The Pareto-optimal solution y succeeds the Pareto-optimal solution x, which is class-a Pareto-optimal because there exists an index i with x_i ≠ a and y_i = a.

The following lemma, whose proof can be found below, shows an upper bound for the expected number of class-0 Pareto-optimal solutions.

Lemma 1 The smoothed number of class-0 Pareto-optimal solutions is at most
To conclude the proof of the theorem, we show that counting the expected number of class-a Pareto-optimal solutions for a ∈ D with a ≠ 0 can be reduced to counting the expected number of class-0 Pareto-optimal solutions.
Starting from the original set S, we obtain a modified set S_a by subtracting the vector (a, . . . , a) from each solution vector x ∈ S, that is, S_a = {x − (a, . . . , a) | x ∈ S}. This way, the profit of each solution x in S_a is smaller than the profit of its counterpart x + (a, . . . , a) in S by exactly a · Σ_{i=1}^n p_i if we extend the linear profit function p from S to S_a. Let us additionally define a weight function w* : S_a → R that assigns to every solution x ∈ S_a the weight that w assigns to its counterpart in S.
Claim A solution in S a is Pareto-optimal with respect to p and w * if and only if its counterpart in S is Pareto-optimal with respect to p and w.
Proof Let x and y be two solutions from S and let x_a and y_a denote their counterparts in S_a. Then w(x) = w*(x_a) and w(y) = w*(y_a). Furthermore p(x) = p(x_a) + a · Σ_{i=1}^n p_i and p(y) = p(y_a) + a · Σ_{i=1}^n p_i. Hence, p(x) > p(y) if and only if p(x_a) > p(y_a). Overall this implies that x dominates y if and only if x_a dominates y_a. Since this is the case for every pair of solutions, the claim follows.
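The shift argument can be checked numerically; the following sketch (our own, with hypothetical helper names) verifies on random data that subtracting a constant from every coordinate, which shifts all profits by the same amount, leaves the set of Pareto-optimal indices unchanged.

```python
import random

def pareto_indices(profits, weights):
    # Indices of Pareto-optimal solutions (profit maximized, weight minimized).
    n = len(profits)
    return {
        i for i in range(n)
        if not any(
            profits[j] >= profits[i] and weights[j] <= weights[i]
            and (profits[j] > profits[i] or weights[j] < weights[i])
            for j in range(n)
        )
    }

random.seed(1)
n, a = 4, 2
S = [[random.choice([0, 1, 2, 3]) for _ in range(n)] for _ in range(8)]
p = [random.uniform(-1, 1) for _ in range(n)]
w = [random.uniform(0, 1) for _ in S]   # arbitrary weights, transported to S_a
profits = [sum(pi * xi for pi, xi in zip(p, x)) for x in S]
shifted = [sum(pi * (xi - a) for pi, xi in zip(p, x)) for x in S]
# Subtracting (a, ..., a) changes every profit by the same constant a * sum(p),
# so the dominance relation, and hence the Pareto set, is unchanged.
assert pareto_indices(profits, w) == pareto_indices(shifted, w)
```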
A solution x is class-a Pareto-optimal in S if and only if the corresponding solution x − (a, . . . , a) is class-0 Pareto-optimal in S_a. Hence, the number q_a of class-a Pareto-optimal solutions in S corresponds to the number q_0(S_a) of class-0 Pareto-optimal solutions in S_a, which can be bounded by Lemma 1. Note that by definition of D_a = {b − a | b ∈ D} we have |D_a| = |D| = d as well as max{b − c | b, c ∈ D_a} = Δ. Combining Equation (1) and Lemma 1 yields the claimed bound on E[q], which proves the theorem.
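The class structure used in this proof can be illustrated as follows; the sketch is ours and assumes the reading that a Pareto-optimal solution x belongs to class a when its succeeding solution y satisfies y_i = a while x_i ≠ a.

```python
def classes(pareto_sorted):
    """pareto_sorted: Pareto-optimal solution vectors sorted by increasing
    profit. Returns, for each solution except the most profitable one, the
    set of classes a it belongs to (successor has y_i = a while x_i != a)."""
    return [
        {yi for xi, yi in zip(x, y) if xi != yi}
        for x, y in zip(pareto_sorted, pareto_sorted[1:])
    ]

# Consecutive Pareto-optimal solutions differ in at least one coordinate,
# so every solution except the last belongs to at least one class.
print(classes([[0, 0], [1, 0], [1, 2]]))  # prints [{1}, {2}]
```

This is exactly the covering used to derive the inequality q ≤ 1 + Σ_a q_a above.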
To conclude the proof of Theorem 1, we prove Lemma 1.

Proof of Lemma 1
We assume 0 ∈ D as otherwise there are no class-0 solutions. This assumption yields |a| ≤ Δ for all a ∈ D.
The main part of the proof is an upper bound on the probability that there exists a class-0 Pareto-optimal solution whose profit lies in a small interval [t − ε, t), for some given t ∈ R and ε > 0. Roughly speaking, if ε is smaller than the smallest profit difference of any two Pareto-optimal solutions, then this probability equals the expected number of class-0 Pareto-optimal solutions in the interval [t − ε, t). Then we can divide R into intervals of length ε and sum these expectations to obtain the desired bound on the expected number of Pareto-optimal solutions.
Let t ∈ R be chosen arbitrarily. We define x* to be the solution from S with the lowest weight among all solutions satisfying the constraint p · x ≥ t, that is, x* = argmin{w(x) | x ∈ S, p · x ≥ t}. If there does not exist a solution x ∈ S with p · x ≥ t then x* does not exist. Otherwise, the solution x* is Pareto-optimal. Let x̂ denote the Pareto-optimal solution that precedes x*, that is, x̂ = argmax{p · x | x ∈ S, w(x) < w(x*)}. See Fig. 2 for an illustration of these definitions. We aim at bounding the probability that x̂ is a class-0 Pareto-optimal solution whose profit falls into the interval [t − ε, t).
For this we classify class-0 Pareto-optimal solutions as ordinary or extraordinary. Considering only ordinary solutions allows us to prove a bound that depends not only on the length ε of the interval but also on |t|, the distance to zero. This captures the intuition that it becomes increasingly unlikely to observe solutions whose profits are much larger than the expected profit of the most profitable solution. We would like to mention that the classification into ordinary and extraordinary solutions is only necessary because we allow density functions with unbounded support for the p_i. If all densities had bounded support on, e.g., [−1, 1], then the separate treatment of extraordinary solutions would not be necessary.
We classify solutions as ordinary or extraordinary as follows. Let x be a class-0 Pareto-optimal solution and let y be the succeeding Pareto-optimal solution, which must exist by definition. We say that x is extraordinary if for all indices i ∈ [n] with x_i ≠ 0 and y_i = 0, all Pareto-optimal solutions z that precede x satisfy z_i ≠ 0. In other words, for those indices i that make x class-0 Pareto-optimal, y is the Pareto-optimal solution with the smallest profit that is independent of p_i (see Fig. 3). We classify a class-0 Pareto-optimal solution as ordinary if it is not extraordinary. For every index i ∈ [n], there can be at most one extraordinary class-0 Pareto-optimal solution. In the following, we restrict ourselves to solutions x̂ that are ordinary, and we denote by P_0 the set of ordinary class-0 Pareto-optimal solutions. We define the loser gap Λ(t) to be the slack of the solution x̂ from the threshold t (see Figure 2), that is, Λ(t) = t − p · x̂ if x̂ exists and x̂ ∈ P_0, and Λ(t) = ∞ otherwise. Hence, if Λ(t) ≤ ε then there exists a solution x ∈ P_0 with p · x ∈ [t − ε, t). The converse is not true because it might be the case that x̂ ∉ P_0 and that there exists another solution x ∈ P_0 with p · x ∈ [t − ε, t). If, however, ε is smaller than the minimum profit difference of any two Pareto-optimal solutions, then the existence of a solution x ∈ P_0 with p · x ∈ [t − ε, t) implies x̂ = x and hence Λ(t) ≤ ε. Let F(ε) denote the event that there are two Pareto-optimal solutions whose profits differ by at most ε. In the following, we estimate, for a given b > 0, the expected number of Pareto-optimal solutions whose profits lie in the interval (−b, b]. For this, we partition the interval (−b, b] into 2bm sub-intervals of length 1/m each, and we let the number 2bm of sub-intervals tend to infinity. For m ∈ N and i ∈ {0, . . . , 2bm − 1}, we set I_i^m = (−b + i/m, −b + (i + 1)/m]. Since the number of Pareto-optimal solutions is always bounded by |S| ≤ d^n, the event F(1/m) contributes at most d^n · Pr[F(1/m)] to the expected number of Pareto-optimal solutions in (−b, b]. The probability that two given solutions have a profit difference of at most ε can be bounded from above by 2εφ.
In order to see this, consider two solutions x ≠ y and choose an index i with x_i ≠ y_i. Then use the principle of deferred decisions and assume that all p_j with j ≠ i are already fixed arbitrarily. Then the event |p · x − p · y| ≤ ε is equivalent to the event that p_i takes a value in a fixed interval (depending on the p_j with j ≠ i) of length at most 2ε. Since the density of p_i is bounded from above by φ, the probability of the event |p · x − p · y| ≤ ε is at most 2εφ. Hence, a union bound over all pairs of solutions (there are at most d^{2n} pairs) yields Pr[F(1/m)] ≤ d^{2n} · 2φ/m, which tends to 0 when m tends to infinity. Under the condition ¬F(1/m), every interval I_i^m can contain at most one Pareto-optimal solution, and hence, under this condition, the probability that I_i^m contains a Pareto-optimal solution from P_0 equals the expected number of Pareto-optimal solutions from P_0 in I_i^m, yielding together with (2) and (3) an upper bound on the expected number of ordinary class-0 Pareto-optimal solutions with profits in (−b, b] in terms of the probabilities Pr[Λ(t) ≤ ε]. The only missing part is to analyze the probability of the event Λ(t) ≤ ε for given t ∈ R and ε > 0, which is done in the following lemma.
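The 2εφ bound from the principle of deferred decisions can be checked empirically; in this sketch (our own; the two solutions and all parameters are illustrative choices) each p_i is drawn uniformly from [0, 1/φ], which is a φ-bounded density.

```python
import random

def collision_probability(eps, phi, trials=200_000):
    # Estimate Pr[|p.x - p.y| <= eps] for two fixed integer solutions x != y,
    # where each p_i is uniform on [0, 1/phi] (density exactly phi).
    x, y = [2, 0, 1], [0, 1, 1]   # illustrative solutions, differing in coords 0 and 1
    hits = 0
    for _ in range(trials):
        p = [random.uniform(0, 1 / phi) for _ in range(3)]
        diff = sum(pi * (xi - yi) for pi, xi, yi in zip(p, x, y))
        if abs(diff) <= eps:
            hits += 1
    return hits / trials

random.seed(0)
est = collision_probability(eps=0.01, phi=5.0)
assert est <= 2 * 0.01 * 5.0   # consistent with the 2*eps*phi bound
```

Fixing all p_j except one coordinate in which x and y differ confines the event to an interval of length at most 2ε for p_i, which carries probability mass at most 2εφ.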
Lemma 2 For all t ∈ R and ε > 0,

Lemma 2 yields the following upper bound on (4): we consider Pr[Δ · Σ_{i=1}^n |p_i| ≥ |t|] as a function of t. Because the density of each p_i is bounded from above, this function is continuous. Therefore, by the definition of the Riemann integral, we can rewrite the previous limit as an integral over t. The resulting term is an upper bound on the expected number of ordinary class-0 Pareto-optimal solutions in the interval (−b, b]. Letting b tend to infinity and using that the expected absolute value of the profit p_i is μ_i yields a bound on the expected number of ordinary class-0 Pareto-optimal solutions. Since there are at most n extraordinary class-0 Pareto-optimal solutions, this proves the lemma.
We conclude the proof of Theorem 1 and Lemma 1 by proving Lemma 2.

Proof of Lemma 2
In order to analyze the probability of the event Λ(t) ≤ ε, we define a set of auxiliary random variables such that Λ(t) is guaranteed to always take a value also taken by at least one of the auxiliary random variables. Then we analyze the auxiliary random variables and use a union bound to conclude the desired bound for Λ(t).
For i ∈ [n] and v ∈ D, let S_{x_i=v} = {x ∈ S | x_i = v}. We denote by x*^(i) the solution from S_{x_i=0} with the lowest weight among all solutions with profit at least t, that is, x*^(i) = argmin{w(x) | x ∈ S_{x_i=0}, p · x ≥ t}. For each i ∈ [n] we define the set L_i as follows. If there does not exist a solution x ∈ S_{x_i=0} with p · x ≥ t then x*^(i) does not exist. If x*^(i) does exist and there also exists a solution in S_{x_i=0} with profit smaller than t, then L_i is defined as the set that consists of all solutions from ∪_{v∈D∖{0}} S_{x_i=v} that have smaller weight than x*^(i); otherwise L_i = ∅. Let x̂^(i) denote the Pareto-optimal solution from the set L_i with the highest profit, that is, x̂^(i) = argmax{p · x | x ∈ L_i}. Note that we must have p · x̂^(i) < p · x*^(i) because otherwise p · x̂^(i) ≥ p · x*^(i) ≥ t and w(x̂^(i)) < w(x*^(i)), contradicting the choice of x*^(i). Finally, we define for each i ∈ [n] the auxiliary random variable Λ_i(t) = t − p · x̂^(i) if x̂^(i) exists, and Λ_i(t) = ∞ otherwise. Observe that the definitions of Λ(t) and Λ_i(t) are very similar. The only difference in the definitions of x* and x*^(i) is that we require the i-th coordinate of x*^(i) to be 0. The only difference in the definitions of x̂ and x̂^(i) is that we require the i-th coordinate of x̂^(i) to be nonzero. The reason for these additional constraints is that they will help us to apply the principle of deferred decisions. Intuitively, even if all p_j with j ≠ i are fixed arbitrarily, the randomness of p_i suffices to bound the probability of the event Λ_i(t) ∈ (0, ε].

Claim If Λ(t) ≤ ε, then there exists an index i ∈ [n] with Λ_i(t) = Λ(t).

Proof Assume that Λ(t) ≤ ε. Then by definition, x* and x̂ exist and x̂ ∈ P_0, i.e., x̂ is an ordinary class-0 Pareto-optimal solution. Since x̂ is class-0 Pareto-optimal and x* is the succeeding Pareto-optimal solution, there exists an index i ∈ [n] such that (a) x*_i = 0 and x̂_i = v ≠ 0 for some v ∈ D, and (b) there exists a solution x ∈ S_{x_i=0} with profit smaller than t. The second condition is a consequence of the assumption that x̂ is not extraordinary, that is, there exists a Pareto-optimal solution z with z_i = 0 that has smaller profit than x̂ and hence smaller profit than t (this is important because otherwise by definition L_i = ∅).
Recall that x^{*(i)} is defined to be the solution with the smallest weight in S_{x_i=0} among those with p · x ≥ t. As x^* ∈ S_{x_i=0}, we get x^* = x^{*(i)}. Moreover, L_i consists of all solutions from ∪_{v∈D\{0}} S_{x_i=v} that have smaller weight than x^*. Thus, x̂ ∈ L_i. By construction, x̂ has the highest profit among the solutions in L_i and therefore x̂^{(i)} = x̂ and Λ_i(t) = Λ(t).
We continue the proof by analyzing the probability of the event Λ_i(t) ∈ (0, ε]. If Λ_i(t) ∈ (0, ε], which in particular implies Λ_i(t) ≠ ∞, then the following three events must occur simultaneously:

E_1: there exists a solution x ∈ S_{x_i=0} with p · x ≥ t (i.e., x^{*(i)} exists),
E_2: there exists a solution x ∈ S_{x_i=0} with p · x < t,
E_3: the solution x̂^{(i)} exists and its profit lies in the interval [t − ε, t).
The events E_1 and E_2 depend only on the profits p_j, j ≠ i, because the profit of a solution from S_{x_i=0} does not depend on p_i. The existence and identity of x̂^{(i)} depends additionally on p_i, but the profits p_j, j ≠ i, determine a set X̂^{(i)} of at most |D| − 1 candidate solutions such that x̂^{(i)} ∈ X̂^{(i)} whenever x̂^{(i)} exists.
For each i ∈ [n], we partition the set L_i = ∪_{v∈D\{0}} L^{(i,v)} as follows. If L_i = ∅, then we define L^{(i,v)} = ∅ for every v. Otherwise, L^{(i,v)} consists of all solutions from L_i ∩ S_{x_i=v}, i.e., all solutions from S_{x_i=v} that have smaller weight than x^{*(i)}. Let x̂^{(i,v)} denote the Pareto-optimal solution from the set L^{(i,v)} with the highest profit, that is, x̂^{(i,v)} = argmax{ p · x : x ∈ L^{(i,v)} }. Let X̂^{(i)} denote the set that contains all x̂^{(i,v)} that exist (i.e., for which L^{(i,v)} ≠ ∅). For all v ∈ D \ {0}, the existence and identity of x̂^{(i,v)} is completely determined by the profits p_j, j ≠ i. Hence, if we fix all profits except for p_i, then x̂^{(i,v)} is fixed and its profit equals κ^{(i,v)} + v · p_i for some constant κ^{(i,v)} that depends only on the profits p_j with j ≠ i.
With these definitions, the event E_3 is equivalent to the event that the maximum profit among the solutions in X̂^{(i)} lies in the interval [t − ε, t). To analyze the probability of this event, we split X̂^{(i)} into the sets X̂^{(i,<0)} = { x̂^{(i,v)} ∈ X̂^{(i)} : v < 0 } and X̂^{(i,>0)} = { x̂^{(i,v)} ∈ X̂^{(i)} : v > 0 }. Then, in order for E_3 to be true, at least one of the following events must occur:

E_{3,<0}: the maximum profit among the solutions in X̂^{(i,<0)} lies in [t − ε, t),
E_{3,>0}: the maximum profit among the solutions in X̂^{(i,>0)} lies in [t − ε, t).

Given X̂^{(i,<0)} ≠ ∅, let p_< = max{ (t − κ^{(i,v)})/v : v ∈ D, v < 0, L^{(i,v)} ≠ ∅ }.

Claim If the event E_{3,<0} occurs, then p_i ∈ [p_<, p_< + ε].

Proof Consider an element v ∈ D for which the maximum in the definition of p_< is attained. For this v, we have κ^{(i,v)} + v · p_< = t. If p_i < p_<, then κ^{(i,v)} + v · p_i > t because v < 0, so the maximum profit among the solutions in X̂^{(i,<0)} is at least t. If p_i > p_< + ε, then for every v' < 0 with L^{(i,v')} ≠ ∅ we have κ^{(i,v')} + v' · p_i < κ^{(i,v')} + v' · p_< + v' · ε ≤ t − ε because v' ≤ −1, so the maximum profit among the solutions in X̂^{(i,<0)} is smaller than t − ε. In order for E_{3,<0} to be true, we therefore must have p_i ∈ [p_<, p_< + ε].
Given X̂^{(i,>0)} ≠ ∅, we analogously let p_> = min{ (t − ε − κ^{(i,v)})/v : v ∈ D, v > 0, L^{(i,v)} ≠ ∅ }.

Claim If the event E_{3,>0} occurs, then p_i ∈ [p_>, p_> + ε].

Proof Consider an element v ∈ D for which the minimum in the definition of p_> is attained. For this v, we have κ^{(i,v)} + v · p_> = t − ε. If p_i < p_>, then κ^{(i,v')} + v' · p_i < t − ε for every v' > 0 with L^{(i,v')} ≠ ∅, so the maximum profit among the solutions in X̂^{(i,>0)} is smaller than t − ε. If p_i > p_> + ε, then κ^{(i,v)} + v · p_i > κ^{(i,v)} + v · p_> + v · ε ≥ t because v ≥ 1. In order for E_{3,>0} to be true, we therefore must have p_i ∈ [p_>, p_> + ε].
Note that p_< and p_> depend only on the profits already fixed, i.e., the profits p_j with j ≠ i. Since the events E_1 and E_2 do not depend on p_i, and in our analysis of E_3 we assumed that all p_j with j ≠ i can be fixed arbitrarily, the probability that E_1, E_2, and E_3 occur simultaneously is bounded by the probability of E_1 ∩ E_2 times the probability that p_i falls into one of the two intervals [p_<, p_< + ε] and [p_>, p_> + ε].¹ To conclude the proof, we apply a union bound over all i ∈ [n] and Lemma 3.

¹ The event E_2 implies −Δ Σ_{j=1}^{n} |p_j| < t, which is equivalent to Δ Σ_{j=1}^{n} |p_j| ≥ −t because the probability that the inequality is satisfied with equality is zero. For t > 0, the event E_1 implies Δ Σ_{j=1}^{n} |p_j| ≥ t. Hence, for every t ∈ R, one of the events implies Δ Σ_{j=1}^{n} |p_j| ≥ |t|.

Lower bounds on the smoothed number of Pareto-optimal solutions
In this section we first present a lower bound of Ω(n²k log k) on the smoothed number of Pareto-optimal solutions for S = {0, . . . , k}^n, generalizing a bound for the binary domain presented in [5]. Afterwards we prove a stronger bound of Ω(n²k²) under stronger assumptions. The weaker bound provides a vector of weights w_1, . . . , w_n such that the bound holds for a linear weight function w · x. For the stronger bound we can only prove that there is some weight function w : S → R for which the bound holds, but this function might not be linear.

Lower bound for linear weight functions
For linear weight functions, we prove the lower bound on the expected number of Pareto-optimal solutions stated in Theorem 3. A lower bound of Ω(n²k log k) can be obtained similarly for the case that f is the density of a Gaussian random variable with mean 0. Since all weights w_i are larger than 0, an item with a negative profit cannot be contained in any Pareto-optimal solution. Hence, we can ignore such items. Restricted to the interval [0, ∞), the density of a Gaussian random variable with mean 0 is non-increasing, and hence we can apply Theorem 3 when taking into account that, with high probability, at least a constant fraction of the random variables take positive values.

Proof of Theorem 3
The set S = {0, . . . , k}^n corresponds to the solution set of the bounded knapsack problem, in which up to k identical copies of each item can be put into the knapsack. For the sake of a simple presentation, we describe our construction in terms of this knapsack problem. We fix the weights of all items by setting w_i = (k + 1)^i for all i ∈ [n]. This way, the lexicographic order of the solutions in S coincides with the order defined by the weight w · x of the solutions. Since the density function of the profits is assumed to be non-increasing, the distribution function F : R_{≥0} → [0, 1] is concave, as F' = f. Furthermore, F(0) = 0. Observe that such a function is sub-additive, that is, F(a + b) ≤ F(a) + F(b) for all a, b ≥ 0. Let S_j denote the set of the first (k + 1)^j solutions in the lexicographic order, which are exactly those solutions that contain only copies of the items 1, . . . , j. We define P_j = k Σ_{i=1}^{j} p_i and we denote by P_j the set of Pareto-optimal solutions over S_j. Observe that the last solution in S_j has profit P_j and it is Pareto-optimal with probability 1.
For any given α > 0, let X^j_α denote the number of Pareto-optimal solutions in P_j with profit at least P_j − α, not counting the last solution in this sequence, which has profit exactly P_j. We bound X^j_α from below by induction on j.
Claim For j = 1, the base case of the induction, the claimed bound holds.

Proof Since w_1 > 0 and p_1 > 0 with probability one, we have P_1 = S_1 = {(i, 0, . . . , 0) | i ∈ {0, . . . , k}}. By definition, X^1_α counts the number of solutions in P_1 with profit at least P_1 − α = kp_1 − α, not counting the last solution (k, 0, . . . , 0). Hence, X^1_α attains a given value if and only if the corresponding event A_1 occurs, which yields the base case.

Now we consider the case j > 1. We group the solutions in S_j into k + 1 blocks, with block ℓ ∈ {0, . . . , k} containing all solutions with x_j = ℓ. Block 0 corresponds to S_{j−1} and the other blocks are shifted copies of block 0. Let ℓ ≥ 1. Our choice of the weights guarantees that each solution in block ℓ has more weight than any solution from a block ℓ' < ℓ, and hence solutions from block ℓ cannot dominate solutions from a block ℓ' < ℓ. On the other hand, any solution from a block ℓ' < ℓ has profit at most P_{j−1} + (ℓ − 1)p_j, and hence solutions from block ℓ with a larger profit cannot be dominated by solutions from a block ℓ' < ℓ. Since every block is a shifted copy of block 0, these solutions correspond to solutions from block 0 with a profit larger than P_{j−1} − p_j. Hence, each Pareto-optimal solution in S_{j−1} with profit in the interval (P_{j−1} − p_j, P_{j−1}] gives rise to one new Pareto-optimal solution in each of the k following blocks (see Fig. 4, in which the blocks are depicted as rectangles, the points inside the blocks depict the Pareto-optimal solutions from P_{j−1}, the Pareto-optimal solutions in P_j are marked in black, and there are three such solutions in the example). In the event A_j, this yields a lower bound in which the last inequality follows from the induction hypothesis. We can further rewrite this bound using the fact that the function F is sub-additive, which completes the inductive step.
If every profit is chosen uniformly at random from some interval [0, a] with a > 0, then this term equals exactly the expected number of Pareto-optimal solutions. Now let Y_j = |P_j| − |P_{j−1}| denote the number of new Pareto-optimal solutions in P_j. Observe that Y_j = kX^{j−1}_{p_j} + k. This follows from the fact that each Pareto-optimal solution in P_{j−1} with profit in the interval (P_{j−1} − p_j, P_{j−1}] gives rise to k new Pareto-optimal solutions (see Fig. 4). The additive k is due to the fact that the last solution in P_{j−1} is not counted in X^{j−1}_{p_j} but yields k new solutions in P_j. Since p_j and X^{j−1}_α are independent, the induction hypothesis yields a lower bound on the expected value of Y_j. Furthermore, the number of Pareto-optimal solutions in P_n is q = 1 + Σ_{j=1}^{n} Y_j, where the additional 1 is due to the first solution (0, . . . , 0), which is always Pareto-optimal. The random variable F(p_j) is uniformly distributed over the interval [0, 1]. To see this, observe that for any α ∈ [0, 1] we have Pr[F(p_j) ≤ α] = Pr[p_j ≤ F^{−1}(α)] = F(F^{−1}(α)) = α, where F^{−1} denotes the inverse function of F. This function is not unique in general because the distribution function F is not injective in general; however, the argument works for any choice of F^{−1}. If the profits are drawn according to the uniform distribution over some interval [0, a] with a > 0, then the above inequality holds with equality.
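The construction in this proof is easy to explore numerically. The following sketch (our illustration, not part of the proof; the function name and the choice of profits are ours) enumerates S = {0, . . . , k}^n with the weights w_i = (k + 1)^i and counts the Pareto-optimal solutions by a single sweep in order of increasing weight.

```python
import itertools

def pareto_count(n, k, profits):
    # Weights w_i = (k+1)^i turn the weight order of S = {0,...,k}^n into the
    # lexicographic order (the base-(k+1) value of the solution vector), so a
    # single sweep with a running profit maximum finds the Pareto set.
    weights = [(k + 1) ** i for i in range(n)]
    solutions = sorted(itertools.product(range(k + 1), repeat=n),
                       key=lambda x: sum(w * xi for w, xi in zip(weights, x)))
    count, best = 0, float("-inf")
    for x in solutions:          # increasing weight; all weights are distinct
        p = sum(pi * xi for pi, xi in zip(profits, x))
        if p > best:             # strictly higher profit => Pareto-optimal
            best, count = p, count + 1
    return count

# n = 3 items, up to k = 2 copies each, illustrative profits 1, 2, 4.
q = pareto_count(3, 2, [1.0, 2.0, 4.0])
```

Since all weights are distinct, a solution is Pareto-optimal exactly if its profit is strictly larger than that of every lighter solution, which justifies the running-maximum sweep.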

Lower bound for general weight functions
Every weight function induces a ranking on the set of solutions, and in the following we use the terms weight function and ranking synonymously. We assume that k is a function of n with (5(c + 1) + 1) log n ≤ k ≤ n^c for some constant c. We use the probabilistic method to show that, for each sufficiently large n ∈ N, a ranking exists for which the expected number of Pareto-optimal solutions is lower bounded by Ω(n²k²). That is, we create a ranking at random (but independently of the profits) and show that the expected number of Pareto-optimal solutions (where the expectation is taken over both the random ranking and the random profits) satisfies the desired lower bound. This implies that, for each sufficiently large n ∈ N, there must exist a deterministic ranking on {0, . . . , k}^n for which the expected number of Pareto-optimal solutions (where the expectation is now taken only over the random profits) is Ω(n²k²). Before we describe how the ranking is created, we want to give a short overview of the ideas of the proof. In order to show a lower bound of Ω(n²k²) on the expected number of Pareto-optimal solutions for S = {0, . . . , k}^n, we use a similar approach as in Sect. 3.1. However, in order to obtain the higher bound, we need a larger number of items. We use the n original items to create new items called virtual items; as virtual items we allow specific subsets of the n original items. We randomly choose some of the virtual items and, again similar to Sect. 3.1, we create a ranking that can be represented as a linear function on binary combinations of the virtual items. Note that we might not be able to represent such a ranking by a linear function on the original items. The proof that this creates an expected number of at least Ω(n²k²) Pareto-optimal solutions consists of two parts.
In the first part we show that it is likely that the random set of virtual items creates a feasible instance and that, in case the profits of the original items are chosen uniformly at random, the distribution of the profits of the possible virtual items is likely to be close to a uniform distribution. In the second part we then show how to apply a similar proof as for Theorem 3 for a set of randomly chosen virtual items.
In order to describe how the ranking is created, we define virtual items. Let [n] be the set of original items and assume that we have k instances of each of these n items. A virtual item is a vector x ∈ D n . Intuitively, adding the virtual item x to the knapsack corresponds to inserting x i instances of the ith original item into the knapsack for every i ∈ [n].
Assume that a sequence x^{(1)}, . . . , x^{(ℓ)} of virtual items is given. Based on this sequence, we create a ranking on the set of solutions D^n similar to the ranking used in Theorem 3, but for the binary case in which every virtual item can be "contained" at most once in every solution. That is, we create a ranking such that solutions that "contain" the ith virtual item cannot dominate solutions that "consist" only of a subset of the first i − 1 virtual items. Let S_0 = {(0, . . . , 0)} and assume that the solution (0, . . . , 0) has the highest rank, i.e., that it cannot be dominated by any other solution. Let S_i denote the set of solutions that can be obtained by adding a subset of the first i virtual items, that is, S_i = { Σ_{j∈I} x^{(j)} : I ⊆ {1, . . . , i} }, and let S*_i = S_i \ S_{i−1}. In the ranking we define, each solution from S*_i is ranked lower than every solution from S_{i−1}. It remains to define the ranking among two solutions x, y ∈ S*_i. The solutions x and y can uniquely be written as x = x' + x^{(i)} and y = y' + x^{(i)} for some x', y' ∈ S_{i−1}. Based on this observation, we define the ranking between x and y to be the same as the one between x' and y'. Furthermore, we define the ranking in such a way that all solutions in S \ S_ℓ are ranked lower than all solutions in S_ℓ. Hence, we do not need to consider the solutions in S \ S_ℓ anymore. For a given sequence of virtual items, this yields a fixed ranking among the solutions in S_ℓ.
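The recursive ranking above can be realized concretely by enumerating subsets of the ℓ virtual items in order of increasing bitmask value: every solution containing the ith virtual item then comes after all solutions built from the first i − 1 items only. The following sketch is ours, for illustration only.

```python
def virtual_item_pareto_count(items, profits):
    # items: list of l virtual items, each a vector over the n original items.
    # Enumerating subsets by increasing bitmask value realizes the ranking from
    # the text: solutions containing the ith virtual item are ranked lower
    # than all solutions built from the first i-1 virtual items only.
    n, l = len(profits), len(items)
    best, count, seen = float("-inf"), 0, set()
    for mask in range(1 << l):
        sol = tuple(sum(items[i][j] for i in range(l) if mask >> i & 1)
                    for j in range(n))
        if sol in seen:  # duplicates: only the highest-ranked copy is kept
            continue
        seen.add(sol)
        p = sum(pj * xj for pj, xj in zip(profits, sol))
        if p > best:     # higher profit than every higher-ranked solution
            best, count = p, count + 1
    return count

# Two binary virtual items over n = 2 original items with profits 1 and 2.
q = virtual_item_pareto_count([(1, 0), (0, 1)], [1.0, 2.0])
```

A solution counts as Pareto-optimal here exactly if its profit strictly exceeds that of every higher-ranked solution, mirroring the sweep used for linear weight functions.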

Lemma 4
The probability that the sequence of virtual items is not valid because more than k copies of one original item are contained in the virtual items is at most 1/(nk)^5.
Proof For i ∈ [n], let L_i denote the number of instances of item i that are contained in the virtual items. We can bound the probability that L_i exceeds k, and a union bound over all i ∈ [n] then yields the claimed bound of 1/(nk)^5.

We prove Theorem 2 in two steps. First we prove the following lemma about how the profits of the virtual items in V are distributed, where the profit of a virtual item x ∈ {0, 1}^n is defined as p · x. Observe that scaling all profits by the same factor does not change the number of Pareto-optimal solutions. Hence, we can assume that the profits are chosen uniformly at random from the interval [−u, u] for an arbitrary u > 0.

Lemma 5
If the profits p_1, . . . , p_n of the original items are chosen independently and uniformly at random from the interval [−n^{c+1}, n^{c+1}], then there exist constants γ > 0 and p > 0 depending only on c such that, with probability at least p, for each j ∈ {0, . . . , n^{c+1} − 1}, the set V contains at least n/γ virtual items whose profits lie in the interval (j, j + 1).
Let us remark that we scaled the profits of the original items to the interval [−n^{c+1}, n^{c+1}] in the lemma above only to keep the notation less cumbersome. The benefit of scaling is that we can consider intervals (j, j + 1) for integer values of j. Throughout the entire remainder of this section we will assume that the profits of the original items are chosen uniformly from [−n^{c+1}, n^{c+1}] without mentioning this explicitly anymore. This applies in particular to the proof of Theorem 2 below. This assumption is without loss of generality because scaling all profits does not change the set of Pareto-optimal solutions. Furthermore, we adapt the lower bound of Ω(n²) in [5] for the binary case from uniformly random profits to profits that are chosen only "nearly" uniformly at random. To make this more precise, consider a knapsack instance with n items in which the ith item has weight 2^i and the profits of the items are chosen independently according to a probability distribution F : R → R_{≥0}. Assume that F consists of two components, that is, there exists a constant δ > 0 such that F = δ · U + (1 − δ) · G for two probability distributions U and G. Furthermore, assume that U has the property that for each j ∈ {0, . . . , T − 1} it holds that Pr[X ∈ (j, j + 1)] = 1/T for a random variable X distributed according to U and some T ≥ n.

Lemma 6
The expected number of Pareto-optimal solutions in the aforementioned scenario is at least δ²n²/128. Together, Lemmas 4, 5, and 6 and the upper bound on the expected number of Pareto-optimal solutions presented in Theorem 1 imply Theorem 2.

Proof of Theorem 2
Assume that the ranking on the set of solutions is determined as described above, that is, the ranking is induced by randomly chosen virtual items from V. Let F_1 denote the event that there exists some j ∈ {0, . . . , n^{c+1} − 1} for which fewer than n/γ elements in V have a profit in (j, j + 1). Due to Lemma 5, the probability of the event F_1 is at most 1 − p. Intuitively, the failure event F_1 occurs if the profits p_1, . . . , p_n are chosen such that the profit distribution of the virtual items is not uniform enough.
We first analyze the number q of Pareto-optimal solutions in a different random experiment. In this random experiment, we do not care whether the sequence of virtual items is valid, that is, we assume D = {0, . . . , ℓ}, for which the sequence is always valid. We later use q as an auxiliary random variable to analyze the number of Pareto-optimal solutions for the setting D = {0, . . . , k}, which we actually care about.
Proof Let X denote the random variable that describes the profit of a uniform random virtual item chosen from V . We argue in the following that under the assumption that F 1 does not occur, we can write the distribution of X in a form such that Lemma 6 is applicable.
Under the assumption that F_1 does not occur, for every j ∈ {0, . . . , n^{c+1} − 1} at least n/γ virtual items have a profit in the interval (j, j + 1). For each interval (j, j + 1) we choose exactly n/γ virtual items with a profit in that interval arbitrarily. We call these virtual items good and all other virtual items bad. Altogether there are n^{c+2}/γ good items. Since there are (n')^{c+2} virtual items in V in total, where n' = n/(c + 2), the probability that one of the good items is chosen is δ = n^{c+2}/(γ(n')^{c+2}) = (c + 2)^{c+2}/γ. Under the condition that a good item is chosen, which happens with probability δ, the probability that X takes a value in the interval (j, j + 1) is exactly 1/n^{c+1} for every j ∈ {0, . . . , n^{c+1} − 1}. This corresponds to the distribution U in the setting of Lemma 6.
The number of virtual items in the sequence is ℓ = nk/(2e(c + 2)). These virtual items are chosen independently, and if we assign the ith virtual item x^{(i)} a weight of 2^i, we can apply Lemma 6, yielding that the expected number of Pareto-optimal solutions of a knapsack instance with the virtual items is at least δ²ℓ²/128. Altogether, we have shown E[q | ¬F_1] ≥ κ'n²k² for a suitable constant κ' > 0.
Observe that it can happen that different subsets of the virtual items represent the same original solution, i.e., there exist I, J ⊆ {1, . . . , ℓ} with I ≠ J such that Σ_{i∈I} x^{(i)} = Σ_{j∈J} x^{(j)}. Due to the definition of S*_i as S_i \ S_{i−1}, only the solution with the highest ranking among these is considered in our construction. Since all solutions of the knapsack instance that represent the same original solution have the same profit, only the one with the highest ranking can be a Pareto-optimal solution. Hence, leaving out the other solutions in our construction does not affect the number of Pareto-optimal solutions. Now we take into account that the sequence of virtual items might not be valid for D = {0, . . . , k} because more than k copies of one original item are contained in the virtual items. Let F_2 denote the event that the sequence of virtual items is not allowed because it contains more than k instances of one item. Due to Lemma 4, we know that Pr[F_2] ≤ 1/(nk)^5. Remember that if this failure event occurs, the ranking is set to an arbitrary ranking on D^n. Let q̃ denote the number of Pareto-optimal solutions in this setting. By definition of q and the failure event F_2, we know that E[q̃ | ¬F_2] = E[q | ¬F_2]. Furthermore, since F_2 does not affect the choice of the profits, we can use Theorem 1 to bound E[q | F_2], but we have to take into account that in the modified random experiment for which q is defined we have D = {0, . . . , ℓ}. Hence, we obtain E[q | F_2] ≤ κ''n⁴k² for a sufficiently large constant κ''.
Putting these results together yields the claimed lower bound of Ω(n²k²) on the expected number of Pareto-optimal solutions.

Proof of Lemma 5
In order to prove Lemma 5, we analyze an auxiliary random experiment first. A well-studied random process is the experiment of placing n balls uniformly and independently at random into m bins. In this random allocation process, the expected load of each bin is n/m and one can use Chernoff bounds to show that in the case n ≥ m it is unlikely that there exists a bin whose load deviates by more than a logarithmic factor from its expectation. In this section, we consider a random experiment in which the locations of the balls are chosen as linear combinations of independent random variables. Since the same random variables appear in linear combinations for different balls, the locations of the balls are dependent in a special way.
Let c ∈ N with c ≥ 2 be an arbitrary constant and assume that we are given n independent random variables that are chosen uniformly at random from the interval [−n^{c+1}, n^{c+1}]. We assume that n is a multiple of c + 2, and we partition the set of random variables into c + 2 groups with n' = n/(c + 2) random variables each. For i ∈ {1, . . . , c + 2} and j ∈ {1, . . . , n'}, let p^i_j denote the jth random variable in the ith group.
For every ℓ ∈ [c + 2], we consider a random experiment in which the set of balls is [n']^ℓ and the bins are the intervals (−ℓn^{c+1}, −ℓn^{c+1} + 1), . . . , (ℓn^{c+1} − 1, ℓn^{c+1}). In the following, bin j denotes the interval (j, j + 1). Hence, the number of balls is (n')^ℓ and the number of bins is 2ℓn^{c+1}. Instead of placing these balls independently into the bins, the location of a ball a = (a_1, . . . , a_ℓ) ∈ [n']^ℓ is chosen to be p^1_{a_1} + · · · + p^ℓ_{a_ℓ}, that is, the ball is placed in the bin that contains this point. We will refer to this random process as round ℓ in the following. We show that, despite these dependencies, the allocation process generates a more or less balanced allocation with constant probability. We use the following weighted Chernoff bound, whose proof can be found in Appendix A.
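The dependent allocation process can be simulated directly. The following sketch (ours, with small parameters far below the asymptotic regime of the analysis) places all (n')^{c+2} balls of round c + 2 according to sums of the group variables.

```python
import itertools, math, random

def bin_loads(n, c, seed=0):
    # n' = n/(c+2) variables per group, each uniform on [-n^(c+1), n^(c+1)].
    rng = random.Random(seed)
    n_prime = n // (c + 2)
    groups = [[rng.uniform(-n ** (c + 1), n ** (c + 1)) for _ in range(n_prime)]
              for _ in range(c + 2)]
    loads = {}
    # Ball a = (a_1, ..., a_{c+2}) lands at p^1_{a_1} + ... + p^{c+2}_{a_{c+2}};
    # bin j is the interval (j, j+1), so the bin index is the floor.
    for a in itertools.product(range(n_prime), repeat=c + 2):
        j = math.floor(sum(groups[i][a[i]] for i in range(c + 2)))
        loads[j] = loads.get(j, 0) + 1
    return loads

loads = bin_loads(n=20, c=2, seed=1)
```

Because every group variable appears in the locations of many balls, the loads are correlated; the point of the lemma is that the allocation is nevertheless reasonably balanced with constant probability.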
Lemma 7 Let X_1, . . . , X_n be independent discrete random variables with values in [0, z] for some z > 0. Let X = Σ_{i=1}^{n} X_i and μ = E[X]. Then, for every x > 0, Pr[X ≥ x] ≤ (eμ/x)^{x/z}.

We will first study how the random experiments for different values of ℓ are related. In round 1, there are n' balls that are placed at the positions p^1_1, . . . , p^1_{n'}. These positions are chosen independently and uniformly at random from the interval [−n^{c+1}, n^{c+1}]. The bins correspond to the intervals (j, j + 1) for j ∈ {−n^{c+1}, . . . , n^{c+1} − 1}. Hence, each ball is placed in each bin with probability 1/(2n^{c+1}) and the process in round 1 corresponds to the well-studied setting in which n' balls are independently and uniformly at random allocated to 2n^{c+1} bins. We define F_1 to be the event that there exists a bin that contains more than one ball after the first round. Since the probability that two specific balls are assigned to the same bin is 1/(2n^{c+1}) and c ≥ 2, a union bound over the at most (n')² pairs of balls implies Pr[F_1] ≤ (n')²/(2n^{c+1}) ≤ 1/(2n^{c−1}). Let ℓ ≥ 2. We can describe round ℓ as follows: replace each ball from round ℓ − 1 by n' identical copies; then these n' copies are moved, where the location of the jth copy is obtained by adding p^ℓ_j to the current location. Let X^ℓ_j denote the number of balls in bin j after the ℓth round.
For ℓ ∈ {2, . . . , c}, let F_ℓ denote the event that after round ℓ there exists a bin that contains more than (2c + 4)^{ℓ−1} balls. Furthermore, let F_{c+1} denote the event that after round c + 1 there exists a bin that contains more than ln n balls.

Lemma 8 After round c+1, the average number of balls per bin is a constant depending on c, and with probability 1 − o(1), the maximal number of balls in any bin is bounded from above by ln n.
Proof The number of balls in round c + 1 is (n')^{c+1} and the number of bins is 2(c + 1)n^{c+1}. Hence, the average number of balls per bin is (n')^{c+1}/(2(c + 1)n^{c+1}) = 1/(2(c + 1)(c + 2)^{c+1}). Let ℓ ∈ {2, . . . , c}. Assume that the random variables in the first ℓ − 1 groups are already fixed in such a way that the event F_{ℓ−1} does not occur. Under this assumption, the variables X^{ℓ−1}_j are also fixed and have values of at most (2c + 4)^{ℓ−2}. Consider a bin j after round ℓ − 1 and assume that to all balls in that bin the dth element of the ℓth group is added. The locations of the balls obtained this way lie in the interval (j + p^ℓ_d, j + p^ℓ_d + 2), that is, they lie in one of two consecutive bins. Hence, when the random variables in the first ℓ − 1 groups are fixed such that F_{ℓ−1} does not occur, X^ℓ_j is bounded by the sum of independent discrete random variables Y^{ℓ−1}_{j,p^ℓ_d} that take only values from the set {0, . . . , 2(2c + 4)^{ℓ−2}}. The expected value of X^ℓ_j is bounded from above by (n')^ℓ/(2n^{c+1}) < 1/n. Altogether, this implies that we can use Lemma 7 to bound the probability that X^ℓ_j exceeds (2c + 4)^{ℓ−1}, and applying a union bound over all 2ℓn^{c+1} bins j yields Pr[F_ℓ | ¬F_{ℓ−1}] = o(1). Now consider round c + 1. The expected value of X^{c+1}_j is bounded from above by (n')^{c+1}/(2n^{c+1}) < 1, and the same arguments as for the previous rounds show that X^{c+1}_j can be bounded by the sum of independent random variables with values from the set {0, . . . , 2(2c + 4)^{c−1}} when the random variables in the first c groups are fixed such that F_c does not occur. Hence, we can again apply Lemma 7 and a union bound over all 2(c + 1)n^{c+1} bins to obtain Pr[F_{c+1} | ¬F_c] ≤ (2(c + 1)n^{c+1}) · n^{−ln ln n/(2(2c+4)^{c−1})+1} = o(1). Now we can bound the probability that F_{c+1} occurs by summing the conditional failure probabilities over all rounds, which yields Pr[F_{c+1}] = o(1). Based on Lemma 8, we prove the following lemma about the allocation after round c + 2, which directly implies Lemma 5.
To see this implication, observe that the auxiliary random experiment that we analyze in this section corresponds exactly to the setting of Lemma 5, where the virtual items correspond to the balls in round c + 2 and the intervals ( j, j +1) in Lemma 5 correspond to the bins. The random variables p i j for i ∈ {1, . . . , c + 2} and j ∈ {1, . . . , n } with n = n/(c + 2) correspond to the profits p 1 , . . . , p n of the original items.

Lemma 9
For every constant c ≥ 2, there exist constants γ > 0 and p > 0 such that, with probability at least p, the process described above yields after round c + 2 an allocation of the (n')^{c+2} balls to the 2(c + 2)n^{c+1} bins in which every bin j ∈ {0, . . . , n^{c+1} − 1} contains at least n/γ balls.
Proof In order to analyze the last round, we need, besides ¬F_{c+1}, one additional property that has to be satisfied after round c + 1. Let Y denote the number of balls after round c + 1 that are assigned to bins j with j ∈ {0, . . . , n^{c+1} − 1}. The probability that a fixed ball a ∈ [n']^{c+1} is placed in one of these bins is at least 1/(2(c + 1)). Hence, the expected value of Y is at least (n')^{c+1}/(2(c + 1)). Let Y' denote the number of balls after round c + 1 that are not assigned to bins in {0, . . . , n^{c+1} − 1}. The expected value of Y' is at most (n')^{c+1}(2c + 1)/(2c + 2). Applying Markov's inequality yields a bound on the probability of the event G that Y' is too large and hence Y is too small. Conditioned on ¬F_{c+1}, the random variable X^{c+2}_j can again be written as a sum of independent random variables of the form Y^{c+1}_{j,p}, which we already used in the proof of Lemma 8; these take only values in the interval {0, . . . , 2 ln n} because each X^{c+1}_i is at most ln n. We apply Lemma 7 to bound the probability that X^{c+2}_j deviates from its mean. Let F denote the event that there exists a bin j ∈ {0, . . . , n^{c+1} − 1} whose load is smaller than n/(2(c + 2)^{c+5}). We can bound the probability of F by Pr[F | ¬F_{c+1} ∩ ¬G] ≤ n^{c+1} · 2e^{−n/(4(c+2)^{c+5} ln n)} = o(1).

Altogether, this implies that with constant probability none of the failure events occurs, which yields the lemma.

Proof of Lemma 6
Beier and Vöcking [5] prove a lower bound of Ω(n²) on the expected number of Pareto-optimal knapsack fillings for exponentially growing weights and profits that are chosen independently and uniformly at random from the interval [0, 1]. In this section, we adapt their proof to a random experiment in which the profits are chosen only "nearly" uniformly at random. Assume that we are given n items and that the ith item has weight w_i = 2^i. Furthermore, let T ∈ N be given and assume that T ≥ n. In order to determine the profit p_i of the ith item, first one of the intervals (0, 1), (1, 2), . . . , (T − 1, T) is chosen uniformly at random. Then an adversary is allowed to choose the exact profit within the randomly chosen interval. We call an item whose profit is chosen this way a nearly uniform item. We prove that also in this scenario the expected number of Pareto-optimal solutions is bounded from below by Ω(n²).

Lemma 10
For instances consisting of n nearly uniform items, the expected number of Pareto-optimal solutions is bounded from below by n²/16.

Proof
The proof follows along the lines of the proof of Theorem 3 for the binary case. Let P_j denote the set of Pareto-optimal solutions over the first j items, and let P_j denote the total profit of the first j items. For j ∈ [n] and α ≥ 0, let X^j_α denote the number of Pareto-optimal solutions from P_j with profit in the interval [P_j − α, P_j). Observe that p_j > α implies X^j_α = X^{j−1}_α. For integral α ∈ [T], the adversary cannot influence the event p_j < α, as the interval from which he is allowed to pick values for p_j lies either completely to the left or completely to the right of α. Hence, for α ∈ [T] we can bound the expected value of X^j_α recursively in terms of X^{j−1}. In the following, we prove by induction on j that for every α ∈ [T], E[X^j_α] ≥ jα/(2T). For j = 1 and α ∈ [T], we obtain E[X^1_α] = Pr[p_1 < α] = α/T ≥ α/(2T). Using the induction hypothesis and (5) completes the inductive step. This yields the claimed lower bound of n²/16 on the expected number of Pareto-optimal solutions.

We further generalize the scenario considered above and analyze the expected number of Pareto-optimal solutions for instances that consist not only of nearly uniform items but also of some adversarial items. To be more precise, we assume that the profit of each item is chosen as follows. First a coin is tossed which comes up heads with probability δ > 0. If the coin comes up heads, then the profit of the item is chosen as for nearly uniform items, that is, an interval is chosen uniformly at random and after that an adversary may choose an arbitrary profit in that interval. If the coin comes up tails, then an arbitrary non-integer profit can be chosen by an oblivious adversary who does not know the outcomes of the previous profits.
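The nearly uniform model is straightforward to simulate. In the sketch below (ours; the midpoint rule stands in for an arbitrary adversary), profits are drawn by choosing an interval (j, j + 1) uniformly from {0, . . . , T − 1}, and the Pareto set for weights 2^i is counted by a sweep over bitmasks in increasing value.

```python
import random

def pareto_count_binary(profits):
    # Items i = 0..n-1 with weight 2^i: ordering the subsets by total weight
    # is the same as ordering bitmasks by value, so a single sweep with a
    # running profit maximum identifies the Pareto-optimal subsets.
    n = len(profits)
    count, best = 0, float("-inf")
    for mask in range(1 << n):
        p = sum(profits[i] for i in range(n) if mask >> i & 1)
        if p > best:
            best, count = p, count + 1
    return count

def nearly_uniform_profits(n, T, rng, pick=lambda j: j + 0.5):
    # An interval (j, j+1) is chosen uniformly from {0, ..., T-1}; the
    # "adversary" (here simply the midpoint) picks the exact profit inside it.
    return [pick(rng.randrange(T)) for _ in range(n)]

rng = random.Random(42)
n, T, trials = 8, 10, 50
avg = sum(pareto_count_binary(nearly_uniform_profits(n, T, rng))
          for _ in range(trials)) / trials
```

Since all profits are positive, the subsets {1, . . . , i} are always Pareto-optimal, so every trial produces at least n + 1 Pareto-optimal solutions, comfortably above the n²/16 bound at this small size.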

Proof of Lemma 6
First of all, we show that the presence of adversarial items does not affect the lower bound on the expected number of Pareto-optimal solutions. That is, we show that if there are n̄ nearly uniform items and an arbitrary number of adversarial items, one can still apply Lemma 10 to obtain a lower bound of n̄²/16 on the expected number of Pareto-optimal solutions. For this, consider the situation that the first j items are nearly uniform items and that item j + 1 is an adversarial item. Due to Lemma 10, the expected value of X^j_α is bounded from below by j · α/(2T) for every α ∈ [T]. We show that the expected value of X^{j+1}_α is lower bounded by the same value. For this, consider the two alternatives that the adversary has: he can either choose p_{j+1} > α or p_{j+1} < α. In the former case, we have X^{j+1}_α = X^j_α. In the latter case, we have X^{j+1}_α ≥ X^j_α. Hence, the adversarial profit of item j + 1 does not affect the lower bound on the expected number of Pareto-optimal solutions. One can apply this argument inductively to show the desired lower bound of n̄²/16.
In expectation, the number n̄ of nearly uniform items is δn, and applying a Chernoff bound yields that with high probability n̄ ≥ δn/2. For sufficiently large n, we can bound the probability that n̄ < δn/2 from above by 1/2. Hence, with probability at least 1/2 the expected number of Pareto-optimal solutions is at least (δn/2)²/16, and hence the expected number of Pareto-optimal solutions is bounded from below by (δn)²/128.

Funding Open Access funding enabled and organized by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

A Weighted Chernoff bound
In this section, we prove Lemma 7 (which corresponds to Exercise 4.19 from [16]). Typically Chernoff bounds are formulated for sums of independent Poisson trials. In this section, we derive a Chernoff bound for general discrete random variables. The proof is based very closely on the one for sums of Poisson trials in [16]. In fact, the only part which needs to be exchanged is an upper bound on the moment generating function.
For a random variable X, let M_X(t) = E[e^{tX}] denote its moment generating function. Assume that X is the sum of independent random variables X_1, . . . , X_n, where each X_i is a discrete random variable taking only values in [0, 1]. Fix an index i and consider the random variable X_i. Let p : W → R_{≥0} be its distribution, where W ⊆ [0, 1] is a countable set.
We can write the moment generating function of X_i as follows:

M_{X_i}(t) = Σ_{w∈W} p(w) e^{tw} ≤ Σ_{w∈W} p(w)(w · e^t + 1 − w) = 1 + E[X_i] · (e^t − 1),    (6)

where the inequality follows from the convexity of the function f(x) = e^{tx}, since e^{tw} = f(w) = f(w · 1 + (1 − w) · 0) ≤ w · f(1) + (1 − w) · f(0) = w · e^t + 1 − w.
Inequality (6) yields M_{X_i}(t) ≤ e^{E[X_i](e^t − 1)}, where we have used the fact that, for any y ∈ R, 1 + y ≤ e^y. Since the random variables X_1, . . . , X_n are assumed to be independent, the moment generating function of X is simply the product of the moment generating functions of the X_i's. Hence, we obtain M_X(t) ≤ e^{μ(e^t − 1)}. Now we are ready to prove the following Chernoff bound. For any δ > 0, we can set t = ln(1 + δ) > 0 to get Pr[X ≥ (1 + δ)μ] ≤ M_X(t)/e^{t(1+δ)μ} ≤ (e^δ/(1 + δ)^{1+δ})^μ.

This yields the theorem, since (e^δ/(1 + δ)^{1+δ})^μ ≤ (e/(1 + δ))^{(1+δ)μ} = (eμ/x)^x for x = (1 + δ)μ.

By an appropriate scaling, we obtain Lemma 7 as a variant of Theorem 4.
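A quick numeric sanity check of the scaled bound is possible; the sketch below (ours) uses the form Pr[X ≥ x] ≤ (eμ/x)^{x/z}, which is our reading of the scaled variant, and compares it against an empirical tail frequency for two-valued variables in [0, z].

```python
import math, random

def chernoff_bound(mu, x, z):
    # Scaled Chernoff bound for a sum X of independent variables in [0, z]:
    # Pr[X >= x] <= (e * mu / x)^(x / z), where mu = E[X] (our reading of
    # the scaled variant; it is only meaningful for x > e * mu).
    return (math.e * mu / x) ** (x / z)

rng = random.Random(7)
n, z, q = 200, 0.5, 0.02
mu = n * z * q          # X_i equals z with probability q, else 0
x = 10 * mu
trials = 5000
hits = sum(
    sum(z if rng.random() < q else 0.0 for _ in range(n)) >= x
    for _ in range(trials)
)
empirical = hits / trials
```

With these parameters the threshold x lies far in the tail, so the empirical frequency stays below the analytic bound.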