New bounds for scheduling on two unrelated selfish machines

We consider the minimum makespan problem for $n$ tasks and two unrelated parallel selfish machines. Let $R_n$ be the best approximation ratio of randomized monotone scale-free algorithms. This class contains the most efficient known algorithms for truthful scheduling on two machines. We propose a new Min-Max formulation for $R_n$, as well as upper and lower bounds on $R_n$ based on this formulation. For the lower bound, we exploit pointwise approximations of cumulative distribution functions (CDFs). For the upper bound, we construct randomized algorithms using distributions with piecewise rational CDFs. Our method improves upon the existing bounds on $R_n$ for small $n$. In particular, we obtain almost tight bounds for $n=2$ showing that $|R_2-1.505996|<10^{-6}$.


1 Introduction and main results.
Scheduling on unrelated parallel machines is a classical discrete optimization problem: the goal is to allocate $n$ independent indivisible tasks to $m$ simultaneously working, unrelated machines so as to minimize the time needed to complete all the tasks. This time is called the makespan, so the scheduling problem is also called makespan minimization. Lenstra et al. [8] proved that the problem is NP-hard; moreover, no polynomial-time algorithm can achieve an approximation ratio less than 3/2 unless P = NP. For $m = 2$ there is a linear-time algorithm by Potts [17] and a polynomial-time algorithm by Shchepin and Vakhania [19] which provide a 3/2-approximation. Both algorithms are based on solving discrete linear optimization problems by relaxing integrality constraints and applying rounding techniques.
We analyze the version of makespan minimization in the setting of algorithmic mechanism design. Every machine belongs to a rational agent who is paid for performing tasks. So not only the tasks, but also the payments should be distributed among the machines. The processing times in this setting are private information of the agents, and the agents can lie about this information to maximize their utilities. The problem combines game theory and scheduling and occurs in economic models where jobs need to be optimally distributed among selfish participants. This approach was first proposed by Nisan and Ronen [15] to model interactions in the Internet, such as routing and information load balancing.
We restrict our attention to the case $m = 2$ and a particular type of task allocation algorithms: randomized monotone, task independent, scale-free (further called randomized MIS) algorithms. These algorithms were used to improve upper bounds on approximation ratios for scheduling in earlier research [1,9,10,11,15]. MIS algorithms have a number of useful properties, which we briefly discuss in this section; for more detail about MIS algorithms and the works where they were analyzed, see Section 2.1. The first useful property of MIS algorithms is monotonicity. Since the processing times of each agent are known only to this agent, the agents may lie about their times to increase their utility. Monotone algorithms prevent this kind of cheating. A deterministic algorithm is monotone if it assigns a higher load to a machine whenever the running times on this machine decrease. The earlier mentioned linear programming 1.5-approximations for two machines by Potts [17] and Shchepin and Vakhania [19] do not guarantee monotonicity. Second, every monotone algorithm on two machines with a finite approximation ratio is task independent; for this reason, we restrict our analysis to such algorithms. A deterministic algorithm is task independent if the allocation of any task does not change as long as none of this task's own processing times on the machines changes. Finally, a deterministic algorithm is scale-free if scaling all running times by some positive number does not influence the output. Following Lu [9], we note that for $m = 2$, scale-freeness and task independence together imply that the allocation of each task depends only on the ratio of this task's running times.
Nisan and Ronen [15] show that no deterministic monotone algorithm can achieve an approximation ratio less than 2, but randomized algorithms can do better in expectation. Thus we consider randomized MIS algorithms. Here and further we say that a randomized allocation algorithm has some property, e.g., monotonicity, if this property holds with probability one.
Denote by $R_n$ the best expected worst-case approximation ratio of randomized MIS algorithms for makespan minimization on two machines with $n$ tasks. Further we call $R_n$ the best approximation ratio for simplicity. This paper conveys the following contributions:

1. A new Min-Max formulation of the optimization problem for $R_n$, provided in (5).

2. A unified approach to construct upper and lower bounds on $R_n$, described in Section 4. This approach can also be used for general Min-Max problems where the objective is minimized over a set of functions. In formulation (5), the outer minimization is over multivariate cumulative distribution functions (CDFs) and the inner maximization is over the positive orthant in two dimensions. This problem is in general not tractable, so we build bounds on the optimal value. Lower bounds are the result of restricting the inner maximization to a finite subset of the positive orthant. To obtain upper bounds, we restrict the outer minimization to the set of piecewise constant CDFs.
3. New upper and lower bounds on $R_n$ for $n \in \{2, 3, 4\}$ and the task allocation algorithms corresponding to the given upper bounds. This means an improvement in the known upper bounds for truthful scheduling on two machines (see Table 1).
4. Almost tight bounds on $R_2$ (see Table 1). For $n = 2$ tasks, the initial problem (5) simplifies to a new problem (20) where the outer minimization is over univariate CDFs. Using piecewise rational CDFs, upper and lower bounds are obtained with a gap no larger than $10^{-6}$.

The outline of the paper is as follows. In Section 2 we introduce randomized MIS algorithms for two machines, describe results from earlier research and formulate the basic optimization problem for $R_n$. In Section 3 we exploit the symmetry of this problem to analyze the performance of MIS algorithms and to prove our Min-Max formulation (5) for $R_n$. In Section 4 bounds on the optimal value of the Min-Max problem are constructed and computed for several small $n$. In Section 5 we analyze the case with two tasks in more detail and improve the bounds for two tasks. Section 6 concludes the paper. Section 7 provides the proofs omitted throughout the rest of the paper. All computations are done in MATLAB R2017a on a computer with an Intel® Core™ i5-3210M CPU @ 2.5 GHz and 7.7 GiB of RAM. Linear programs are solved with the IBM ILOG CPLEX 12.6.0 solver. All MATLAB code is available upon request from the first author.

2 Preliminaries.
Throughout the paper we work with the set of real numbers $\mathbb{R}$. Unless otherwise specified, lower-case letters denote numbers, bold lower-case letters denote vectors, and capital letters denote matrices. Let $\mathbb{R}^n$ be the set of real vectors with $n$ entries. The notation $\mathbb{R}^n_+$ and $\mathbb{R}^n_{++}$ is used for nonnegative and strictly positive real vectors respectively. The input for the minimum makespan problem with $m$ machines and $n$ tasks is a matrix of processing times $T = (T_{ij})$, $i \in [m]$, $j \in [n]$. We describe the solution to the problem by a task allocation matrix $X \in \{0,1\}^{m \times n}$, such that $X_{ij} = 1$ if task $j$ is processed on machine $i$ and $X_{ij} = 0$ otherwise. For given $X$ and $T$, the makespan of machine $i$ is $M_i(X,T) := \sum_{j=1}^{n} T_{ij}X_{ij}$, the total makespan is $M(X,T) := \max_i M_i(X,T)$, and the optimal makespan for $T$ is
$$M^*(T) := \min_{X} M(X,T), \qquad (1)$$
where the minimum is over allocation matrices assigning every task to exactly one machine. For an allocation algorithm $A$ and input $T$, $X^{A,T} \in \{0,1\}^{m \times n}$ denotes the output of $A$ on $T$. If $A$ is randomized, $X^{A,T}$ is a random variable. The (expected) worst-case approximation ratio of algorithm $A$ equals the supremum over all time matrices $T$ of the (expected) value of the ratio $M(X^{A,T},T)/M^*(T)$; if $A$ is randomized, $M(X^{A,T},T)$ is replaced by its expectation. A comprehensive discussion of randomized algorithms can be found in Motwani and Raghavan [12].
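For example, let $m = n = 2$ and
$$T = \begin{pmatrix} 1 & 4 \\ 2 & 1 \end{pmatrix}, \qquad X = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$
so that task 1 runs on machine 1 and task 2 runs on machine 2. Then $M_1(X,T) = 1$, $M_2(X,T) = 1$ and $M(X,T) = 1$; every other allocation places some task on its slower machine and has makespan at least 2, so $M^*(T) = 1$.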
In algorithmic mechanism design, agents who own the machines are paid to perform the tasks. To solve the minimum makespan problem in this setting, an allocation mechanism is used which consists of two elements: an algorithm that allocates tasks and an algorithm that assigns payments. The time matrix $T$ is needed for both algorithms, but it is private information of the agents, who can lie about their processing times to increase their utilities. To avoid the uncertainty related to this fact, one can construct a combination of payments and task allocations that motivates the agents to reveal their true processing times. Mechanisms with this property are called truthful; they are discussed in detail by, for example, Mu'alem and Schapira [13], Nisan and Ronen [15] or Saks and Yu [18].
Saks and Yu [18] showed that all truthful mechanisms have monotone task allocation algorithms; therefore we restrict our analysis to these algorithms. We use the definition of monotonicity by Koutsoupias and Vidali [7]: a deterministic task allocation algorithm is called monotone if for every two processing time matrices $T$ and $T'$ which differ only on machine $i$, the corresponding outputs $X$ and $X'$ satisfy $\sum_{j=1}^{n} (X_{ij} - X'_{ij})(T_{ij} - T'_{ij}) \le 0$. That is, a higher load is assigned to a machine as long as the running times on this machine decrease. We consider randomized monotone algorithms since randomization helps to achieve better expected performance in comparison to the deterministic case, as was mentioned already in the seminal paper by Nisan and Ronen [15].
We work with randomized monotone algorithms which are in addition task independent and scale-free. We call them randomized MIS algorithms. For such algorithms there exists a payment allocation procedure which makes the full allocation mechanism truthful. The procedure is described by Lu and Yu [9,10,11]. Hence payments are not mentioned further in this paper.
Next we define what it means for an algorithm to be task independent and scale-free. We use the definitions given by Lu [9]. A deterministic algorithm is task independent if the allocation of a task depends only on its own running times. To be precise, for any two time matrices $T$ and $T'$ such that $T_{ij} = T'_{ij}$ for task $j$ and all $i \in [m]$, the allocation of task $j$ is the same for $T$ and $T'$. Any monotone allocation algorithm for two machines with a finite approximation ratio is task independent (Dobzinski and Sundararajan [4]). Thus, looking for the smallest approximation ratio, we can restrict our attention to such algorithms without loss of generality. A deterministic algorithm is scale-free if multiplying all running times by the same positive number does not change the allocation. That is, for any $T \in \mathbb{R}^{2\times n}_{++}$ and $\lambda > 0$ the outputs of the algorithm on the inputs $T$ and $\lambda T$ are identical.
For $m = 2$ the performance of MIS algorithms depends only on the worst two running time ratios, which is reflected in Section 3.1, and Proposition 3 in particular. Deterministic MIS algorithms for $m = 2$ were characterized by Lu [9]:

Theorem 1 [Lu [9]]. A deterministic MIS algorithm for scheduling two unrelated machines is of the following form. For every task $j \in [n]$ assign a threshold $z_j \in \mathbb{R}_{++}$ and one of the following two conditions: $T_{1j} < z_j T_{2j}$ or $T_{1j} \le z_j T_{2j}$. The task goes to the first machine if and only if the corresponding condition is satisfied.
From Theorem 1, each randomized MIS algorithm is equivalent to an algorithm which randomly assigns a threshold $z_j$ and a condition $T_{1j} < z_j T_{2j}$ or $T_{1j} \le z_j T_{2j}$ to each task $j$ and then proceeds as in the deterministic case.
Let $\mathcal{P}_n$ be the family of Borel probability measures such that $\mathrm{supp}(\mathcal{P}) \subseteq \mathbb{R}^n_{++}$ for all $\mathcal{P} \in \mathcal{P}_n$, where $\mathrm{supp}(\mathcal{P})$ is the support of $\mathcal{P}$. Further in the paper, the notion of a probability measure and the notion of the corresponding probability distribution are used interchangeably, depending on which notion is more suitable in a given context. For $\mathcal{P} \in \mathcal{P}_n$ define algorithm $A_{\mathcal{P}}$ as follows:

Algorithm 1: Monotone, task independent, scale-free task allocation algorithm for 2 machines
Input: processing time matrix $T = (T_{ij}) \in \mathbb{R}^{2\times n}_{++}$
Output: allocation $X \in \{0,1\}^{2\times n}$
1. Draw a vector of thresholds $[z_1, z_2, \dots, z_n]$ from $\mathcal{P}$
2. For each task $j = 1, 2, \dots, n$ do
3. If $T_{1j} < z_j T_{2j}$: $X_{1j} \leftarrow 1$, $X_{2j} \leftarrow 0$; Else: $X_{1j} \leftarrow 0$, $X_{2j} \leftarrow 1$

Output X
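The following Python sketch (ours, not the authors' MATLAB code) implements Algorithm 1. The threshold sampler stands in for the measure $\mathcal{P}$; the i.i.d. uniform choice below is only a placeholder, since $\mathcal{P}$ need not be a product measure.

```python
import numpy as np

def algorithm_1(T, draw_thresholds, rng):
    """Allocate n tasks on 2 machines using thresholds drawn from P.

    T : (2, n) array of positive processing times.
    draw_thresholds : callable (rng, n) -> positive threshold vector z.
    Returns an allocation matrix X in {0,1}^(2 x n).
    """
    n = T.shape[1]
    z = draw_thresholds(rng, n)          # step 1: sample thresholds from P
    X = np.zeros((2, n), dtype=int)
    for j in range(n):                   # step 2: allocate each task
        if T[0, j] < z[j] * T[1, j]:     # step 3: threshold condition
            X[0, j] = 1                  # task j goes to machine 1
        else:
            X[1, j] = 1                  # otherwise to machine 2
    return X

# Placeholder measure P: i.i.d. uniform thresholds on [0.5, 2].
rng = np.random.default_rng(0)
uniform_P = lambda rng, n: rng.uniform(0.5, 2.0, size=n)
T = np.array([[1.0, 4.0], [2.0, 1.0]])
print(algorithm_1(T, uniform_P, rng))
```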
Denote the family of all algorithms of the form above by $\mathcal{A}_{\mathcal{P}_n} := \{A_{\mathcal{P}} : \mathcal{P} \in \mathcal{P}_n\}$.
Proposition 1. The best approximation ratio over all randomized MIS algorithms equals the best approximation ratio over all algorithms in $\mathcal{A}_{\mathcal{P}_n}$.

The proof of the proposition is provided in Section 7.1. As we are interested in the best approximation ratio of randomized MIS algorithms, Proposition 1 lets us restrict ourselves to the family $\mathcal{A}_{\mathcal{P}_n}$.

2.1 Related work.
The best approximation ratio for all monotone task allocation algorithms is not known. For deterministic task allocation algorithms with $n$ tasks and $m$ machines, the ratio lies in the interval $[1+\phi,\, m]$, where $\phi$ is the golden ratio. The lower bound is obtained by Koutsoupias and Vidali [7] and holds when $n$ and $m$ tend to infinity (bounds for several particular $m$ are provided as well); the upper bound is computed by Nisan and Ronen [15]. For randomized algorithms, the best approximation ratio lies in the interval $[2 - \frac{1}{m},\, 0.83685m]$. The lower bound is established by Mu'alem and Schapira [13], the upper bound is found by Lu and Yu [10]. The gap between the bounds grows with $m$, and the case with the smallest number of machines, $m = 2$, attracts much attention.
For $m = 2$ the best approximation ratio of deterministic monotone algorithms for any finite $n$ is equal to 2 (Nisan and Ronen [15]). The ratio for randomized monotone algorithms lies in the interval $[1.5, 1.5861]$. The upper bound is obtained by Chen et al. [1] using an algorithm $A_{\mathcal{P}}$ in which the thresholds are drawn independently. The lower bound comes from the earlier mentioned paper by Mu'alem and Schapira [13]. Tighter lower bounds are known for certain cases. Lu [9] shows that algorithms from the family $\mathcal{A}_{\mathcal{P}_n}$ (and thus, by Proposition 1, all randomized MIS algorithms) cannot achieve a ratio better than $25/16 = 1.5625$ for sufficiently large $n$. Chen et al. [1] prove that algorithm $A_{\mathcal{P}}$ cannot do better than 1.5852 when $\mathcal{P}$ is a product measure, i.e., when the thresholds are drawn independently.
The cases with $m = 2$ and small $n > 2$ are not well studied. Chen et al. [1] present upper bounds for various values of $n$, but their computations may be imprecise: to find the bounds, Chen et al. [1] have to solve nonconvex optimization problems, and the numerical method they use does not guarantee global optimality.
The case with $m = 2$, $n = 2$ is the simplest one, but even here the best approximation ratio is unknown. The ratio for algorithms from the family $\mathcal{A}_{\mathcal{P}_n}$ for two tasks lies in the interval $[1.505949, 1.5068]$. The upper bound is computed by Chen et al. [1]; the lower bound is proposed by Lu [9]. Notice that Lu [9] states that the lower bound is 1.506, but we repeated the calculations from this paper and obtained the number 1.505949. Thus, when reporting results, this number is used as the currently best lower bound. We improve this bound and show that $|R_2 - 1.505996| < 10^{-6}$; in particular, $R_2 < 1.506$.
The existing bounds are obtained using ad hoc procedures, while this paper develops a unified approach to construct bounds on $R_n$ for any fixed $n$. The approach provides new upper and lower bounds for $m = 2$ and $n \in \{2, 3, 4\}$, which improves the earlier mentioned upper bounds for truthful scheduling on two machines.

2.2 Best approximation ratio of randomized MIS algorithms.
Let $n$ be fixed. Consider a measure $\mathcal{P} \in \mathcal{P}_n$ and the corresponding algorithm $A_{\mathcal{P}} \in \mathcal{A}_{\mathcal{P}_n}$. Let $X^{\mathcal{P},T} \in \{0,1\}^{2\times n}$ be a randomized allocation produced by $A_{\mathcal{P}}$ on an instance $T$. Let $E_{\mathcal{P}}[\cdot]$ and $P_{\mathcal{P}}[\cdot]$ denote expectation and probability over the measure $\mathcal{P}$ respectively. For every $\mathcal{P}$, we define the expected makespan of $A_{\mathcal{P}}$ on $T$ as
$$M(\mathcal{P}, T) := E_{\mathcal{P}}\left[M(X^{\mathcal{P},T}, T)\right].$$
Recall that $M^*(T)$, defined in (1), denotes the optimal makespan for $T$. Let $R_n(\mathcal{P},T)$ be the (expected) approximation ratio of $A_{\mathcal{P}}$ on $T$ and $R_n(\mathcal{P})$ be the worst-case approximation ratio:
$$R_n(\mathcal{P},T) := \frac{M(\mathcal{P},T)}{M^*(T)}, \qquad R_n(\mathcal{P}) := \sup_{T \in \mathbb{R}^{2\times n}_{++}} R_n(\mathcal{P},T).$$
Then we have
$$R_n = \inf_{\mathcal{P} \in \mathcal{P}_n} R_n(\mathcal{P}). \qquad (2)$$
It could happen that for some $\mathcal{P}$ the ratio $R_n(\mathcal{P},T)$ is unbounded in $T$. We are not interested in those cases as we know that $R_n \le 1.5861$ (see Section 2.1). To avoid technicalities, we work on $\overline{\mathbb{R}}_+ = \mathbb{R}_+ \cup \{\infty\}$ so that the supremum $\sup_{T \in \mathbb{R}^{2\times n}_{++}} R_n(\mathcal{P},T)$ is always defined.
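Although $R_n(\mathcal{P},T)$ is an exact expectation, it can be estimated numerically for a fixed instance. The sketch below (ours; the i.i.d. uniform thresholds are again only a placeholder for $\mathcal{P}$) approximates $M(\mathcal{P},T)/M^*(T)$ by sampling, with $M^*(T)$ computed by enumerating all $2^n$ allocations.

```python
import itertools
import numpy as np

def makespan(X, T):
    # M(X, T) = max_i sum_j T_ij X_ij
    return (T * X).sum(axis=1).max()

def optimal_makespan(T):
    # M*(T): enumerate all 2^n allocations of n tasks to 2 machines
    n = T.shape[1]
    best = np.inf
    for rows in itertools.product((0, 1), repeat=n):
        X = np.zeros((2, n), dtype=int)
        X[rows, range(n)] = 1
        best = min(best, makespan(X, T))
    return best

def estimate_ratio(T, rng, samples=20_000):
    # Monte Carlo estimate of R_n(P, T) = E_P[M(X^{P,T}, T)] / M*(T)
    n = T.shape[1]
    total = 0.0
    for _ in range(samples):
        z = rng.uniform(0.5, 2.0, size=n)   # placeholder measure P
        X = np.zeros((2, n), dtype=int)
        first = T[0] < z * T[1]             # threshold condition per task
        X[0, first] = 1
        X[1, ~first] = 1
        total += makespan(X, T)
    return total / samples / optimal_makespan(T)

rng = np.random.default_rng(0)
T = np.array([[1.0, 1.0], [1.0, 1.0]])
print(estimate_ratio(T, rng))
```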
3 Using symmetry of the problem for optimization.
In this section we exploit the fact that problem (2) is invariant under permuting the tasks to simplify formulation (2), obtaining formulation (5) in Section 3.1.

Remark 1. It can be shown that problem (2) is invariant under permuting the machines as well. Using this type of symmetry is more complex, but provides no additional improvement of the computed bounds.
Let $S_n$ be the group which acts on $\mathbb{R}^n$ by permuting the elements of a vector $x \in \mathbb{R}^n$. We prove that problem (2) is convex and invariant under the action of $S_n$. Therefore, to find the infimum in (2), it is enough to optimize over the distributions invariant under this action. This approach is regularly used in convex programming, see de Klerk et al. [2], Dobre and Vera [3] or Gatermann and Parrilo [5].
First, we define the action of $S_n$ on $\mathcal{P}_n$. Given $\mathcal{P} \in \mathcal{P}_n$, $\pi \in S_n$ and a random variable $z \sim \mathcal{P}$, consider the transformation $z \to z\pi$. Define $\mathcal{P}\pi \in \mathcal{P}_n$ as the distribution of $z\pi$. Next, we examine $R_n(\mathcal{P})$, the objective of problem (2). Theorem 6 in Section 7.2 states that $R_n(\mathcal{P})$ is invariant under the action of $S_n$. Lemma 5 in Section 7.3 shows that $R_n(\mathcal{P})$ is convex. Let $C_n \subset \mathcal{P}_n$ be the family of probability measures invariant under the action of $S_n$:
$$C_n := \{\mathcal{P} \in \mathcal{P}_n : \mathcal{P}\pi = \mathcal{P} \text{ for all } \pi \in S_n\}. \qquad (3)$$
Invariance and convexity of $R_n(\mathcal{P})$ and $\mathcal{P}_n$ imply the main result of this section:

Theorem 2. $\inf_{\mathcal{P} \in \mathcal{P}_n} R_n(\mathcal{P}) = \inf_{\mathcal{P} \in C_n} R_n(\mathcal{P})$.

The proof of Theorem 2 is provided in Section 7.3. Further we use the following straightforward property of invariant distributions:

Proposition 2. Let $\mathcal{P} \in C_n$. Then $\mathcal{P}$ has a cumulative distribution function (CDF) invariant under variable permutations. Moreover, for $0 < k < n$, all $k$-variate marginal distributions are identical. In particular, $\mathcal{P}$ is a joint distribution of $n$ identically distributed random variables.

3.1 New formulation for the best approximation ratio.
This section provides a reformulation of problem (2) based on the invariance results from the previous section. By Proposition 2, if $\mathcal{P} \in C_n$, then all univariate marginal distributions of $\mathcal{P}$ are identical and all bivariate marginal distributions of $\mathcal{P}$ are identical. Thus one univariate and one bivariate marginal distribution are enough to describe them all. Denote the corresponding univariate and bivariate CDFs by $F_{\mathcal{P}}$ and $H_{\mathcal{P}}$ respectively. Using these CDFs, we define the function $\varphi_{\mathcal{P}}(x,y)$ in (4). First, consider the following result:

Proposition 3 [Chen et al. [1], using results of Lu and Yu [11]]. For $\mathcal{P} \in C_n$ define $\varphi_{\mathcal{P}}$ as in (4); then for any $T \in \mathbb{R}^{2\times n}_{++}$, the ratio $R_n(\mathcal{P},T)$ is bounded from above by the maximum of $\varphi_{\mathcal{P}}$ over the pairs of running time ratios of the tasks in $T$.

Note that this upper bound is determined by only two tasks out of $n$. Using Proposition 3, we show the following formulation for $R_n(\mathcal{P})$:

Theorem 3. For $\mathcal{P} \in C_n$ define $\varphi_{\mathcal{P}}$ as in (4); then
$$R_n(\mathcal{P}) = \sup_{x,y \in \mathbb{R}_{++}} \varphi_{\mathcal{P}}(x,y).$$
The proof of Theorem 3 is provided in Section 7.4.
Remark 2. Theorem 3 implies that the worst-case approximation ratio for $n$ tasks and $\mathcal{P} \in \mathcal{P}_n$ coincides with the worst-case approximation ratio for two tasks and the bivariate marginal distribution of $\mathcal{P}$.
Corollary 1. Define $\varphi_{\mathcal{P}}$ as in (4) and $C_n$ as in (3); then
$$R_n = \inf_{\mathcal{P} \in C_n}\ \sup_{x,y \in \mathbb{R}_{++}} \varphi_{\mathcal{P}}(x,y). \qquad (5)$$
Proof. The result follows from Theorem 2 and Theorem 3.
4 Upper and lower bounds on the best approximation ratio.
To find $R_n$ using problem (5), one needs to optimize over distribution functions. This is computationally intractable, therefore we construct upper and lower bounds on the optimal value of the problem. The idea is to restrict attention to some subset of feasible distributions or some subset of $\mathbb{R}^2_{++}$, over which it is easier to solve problem (5).
1. For the lower bound, we take a finite set $S \subset \mathbb{R}_{++}$ and take the supremum in (5) over $x, y \in S$ only. One conventional approach to lower bounds is to propose several good-guess time tables, build a randomized instance over them and apply Yao's minimax principle (see Motwani and Raghavan [12] for more detail). The currently best lower bound for $(m = 2, n = 2)$ is obtained by Lu [9] using this approach. Mu'alem and Schapira [13] use Yao's minimax principle as well to find the best general lower bound for any $m$. Our approach is different as we evaluate randomized algorithms on deterministic instances.
2. For the upper bound, we find a good-guess distribution and solve the inner maximization problem for this distribution. The distribution is built using the solution to the lower bound problem for $n \in \{2, 3, 4\}$. For $n = 2$, a more efficient approach is proposed in the next section. A toy numerical illustration of both bounding directions is given after this list.
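To make the two bounding directions concrete, the following toy example (ours, unrelated to the actual functions in (5)) bounds the value of a small min-max problem $\min_c \max_{x \in [0,1]} f(c,x)$: restricting the inner maximum to a finite grid gives a lower bound, and solving the full inner maximum at any fixed feasible $c$ gives an upper bound.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy objective: f(c, x) = (x - c)^2; the true value of
# min_c max_{x in [0,1]} f(c, x) is 1/4, attained at c = 1/2.
f = lambda c, x: (x - c) ** 2

# Lower bound: inner max restricted to a finite grid S.
S = [0.2, 0.6]
lower = minimize_scalar(lambda c: max(f(c, x) for x in S),
                        bounds=(0, 1), method="bounded").fun

# Upper bound: fix a good-guess c and solve the full inner max
# (f is convex in x, so the max over [0,1] is at an endpoint).
c_guess = 0.5
upper = max(f(c_guess, 0.0), f(c_guess, 1.0))

print(lower, upper)   # lower <= 1/4 <= upper
```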
To implement the ideas above, we define some notions following Nelsen [14]. We need them to characterize the CDFs which we use to construct the bounds. For $x, y \in \mathbb{R}^n$ such that $x_i \le y_i$ for all $i \in [n]$, the $n$-box $B_{xy}$ is defined as $B_{xy} := [x_1, y_1] \times \dots \times [x_n, y_n]$. For $S \subseteq \mathbb{R}_{++}$, let $G_n(S)$ denote the class of functions $g : S^n \to [0,1]$ satisfying the CDF-type Conditions 1-4 following Nelsen [14]; in particular, $g$ has to be $n$-increasing, i.e., to assign a nonnegative volume (6) to every $n$-box with vertices in $S^n$. By Nelsen [14], a function $G : \mathbb{R}^n_{++} \to [0,1]$ is a CDF of some $\mathcal{P} \in \mathcal{P}_n$ if and only if $G \in G_n(\mathbb{R}_{++})$.

Lemma 1. Let $S \subseteq \mathbb{R}_{++}$ be a finite set. Then $g \in G_n(S)$ if and only if there exists $G \in G_n(\mathbb{R}_{++})$ such that $g = G|_{S^n}$. That is, $g$ is a restriction of $G$ to $S^n$.
Proof. If there is $G \in G_n(\mathbb{R}_{++})$ such that $g = G|_{S^n}$, then $g \in G_n(S)$ by the definition of $G_n(\mathbb{R}_{++})$. On the other hand, let $g \in G_n(S)$ and consider a number $a > \max\{s : s \in S\}$.
Let $S_a = S \cup \{a\}$ and define a new function $\hat g : S_a^n \to [0,1]$ such that $\hat g(z) = g(z)$ for $z \in S^n$. For $z \notin S^n$, construct a new vector $y$ by replacing all occurrences of $a$ in $z$ with $\infty$ and define $\hat g(z) = g(y)$. Consider the piecewise constant function $G$ defined in (7), which is constant on the cells of the grid induced by $S_a^n$. It is straightforward to show that $G \in G_n(\mathbb{R}_{++})$ and $g = G|_{S^n}$. See Figure 1 for an illustration of the case $n = 1$. Finally, for a finite $S \subset \mathbb{R}_{++}$ and a function $g \in G_n(S)$, define $\varphi_g(x,y)$ as in (8) for all $x, y \in S$.
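For $n = 1$, one way to realize such a piecewise constant extension (our sketch; the paper's construction (7) may differ in details such as the treatment of grid points) is a right-continuous step function that agrees with $g$ on $S$ and jumps only at points of $S_a$:

```python
import bisect

def step_cdf(S, g_values, a):
    """Right-continuous step CDF G with G(s) = g(s) for s in S (n = 1).

    S: sorted grid points; g_values: g(s) for s in S, nondecreasing
    values in [0, 1]; a: point above max(S) where G jumps to 1.
    """
    grid = list(S) + [a]
    vals = list(g_values) + [1.0]   # assumption: g(infinity) = 1
    def G(x):
        # index of the largest grid point <= x (right continuity)
        i = bisect.bisect_right(grid, x) - 1
        return 0.0 if i < 0 else vals[i]
    return G

G = step_cdf([0.5, 1.0, 2.0], [0.2, 0.5, 0.8], a=3.0)
print(G(0.4), G(0.5), G(1.7), G(3.0), G(10.0))  # 0.0 0.2 0.5 1.0 1.0
```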
By Lemma 1, $\varphi_g = \varphi_{\mathcal{P}}|_{S^2}$ for some $\mathcal{P} \in \mathcal{P}_n$. Lemma 1 implies the following upper and lower bounds on $R_n$:

Theorem 4. Let $\mathcal{P} \in C_n$. For a finite $S \subset \mathbb{R}_{++}$ and a function $g \in G_n(S)$, define $\varphi_g$ as in (8). Then
$$R_n \ge R_n(S) := \inf_{g}\ \sup_{x,y \in S} \varphi_g(x,y), \qquad (9)$$
where the infimum is over the functions $g \in G_n(S)$ invariant under variable permutations, and
$$R_n \le R_n(\mathcal{P}) = \sup_{x,y \in \mathbb{R}_{++}} \varphi_{\mathcal{P}}(x,y). \qquad (10)$$
Proof. The upper bound (10) follows immediately from Corollary 1. To prove the lower bound (9), we first note that every $\mathcal{P} \in C_n$ has a CDF $G_{\mathcal{P}} \in G_n(\mathbb{R}_{++})$ invariant under variable permutations by Proposition 2. At the same time, every $G \in G_n(\mathbb{R}_{++})$ invariant under variable permutations corresponds to some $\mathcal{P}_G \in C_n$. Using this fact and Corollary 1, we obtain a chain of equalities relating $R_n$ to the optimization over restricted CDFs; the last equality holds by Lemma 1: if $G \in G_n(\mathbb{R}_{++})$ is invariant under variable permutations, then so is $g := G|_{S^n}$, and if $g \in G_n(S)$ is invariant under variable permutations, then so is the $G$ in (7).

4.1 Implementing the bounds.
To compute the lower bound $R_n(S)$ for any given finite $S$, we use the epigraph form of formulation (9), with a scalar variable bounding $\varphi_g(x,y)$ for all $x, y \in S$ and with the invariance constraints imposed for all $\pi \in S_n$, $z \in S^n$. This is a finite linear problem since $S$ is finite. We use the invariance of $g$ (the second constraint) and Conditions 3-4 in the definition of $G_n(S)$ to reduce the number of variables in the problem (the size of $g$). To ensure that $g$ is $n$-increasing (6), it is enough to consider only $x, y \in S^n$ such that $x_i, y_i$ are sequential points in $S$ for all $i \in [n]$. This helps to reduce the number of constraints in the problem.

To compute the upper bound $R_n(\mathcal{P})$ using formulation (10), we first construct $\mathcal{P}$. Given a set $S$ and the solution $g$ to the lower bound problem (9) on $S$, we use the distribution $\mathcal{P}_g$ which corresponds to the CDF (7) based on $g$. To construct this CDF, we choose a number $a > \max\{s : s \in S\}$, as explained in the proof of Lemma 1. Further in this section we work with $S_a = S \cup \{a\}$. To solve (10) for $\mathcal{P}_g$, we define the set of intervals into which $S_a$ splits $\mathbb{R}_+$; this set of intervals covers $\mathbb{R}_+$, and we can write
$$R_n(\mathcal{P}_g) = \sup_{x,y \in \mathbb{R}_{++}} \varphi_{\mathcal{P}_g}(x,y) = \max_{i,j}\ \sup_{x \in I_i,\, y \in I_j} \varphi_{\mathcal{P}_g}(x,y). \qquad (11)$$
We solve the inner maximization problem in (11) for each pair of intervals $I_i, I_j$. To simplify computing the optimal $x, y$ in the case when the line $xy = 1$ crosses the rectangle $I_i \times I_j$, we restrict our attention to $S$ of an appropriate type. Consider a collection of $k-1$ positive real numbers $r_1 < r_2 < \dots < r_{k-1} < 1$ and let
$$S_k := \{r_1, \dots, r_{k-1},\ 1,\ 1/r_{k-1}, \dots, 1/r_1\}, \qquad (12)$$
so that $S_{k,a} = S_k \cup \{a\}$ splits $\mathbb{R}_+$ into the set of $2k+1$ intervals $I_{S_k}$. First, consider a pair of intervals $I_i, I_j \in I_{S_k}$ such that $i \notin \{1, 2k+1\}$ and $j \notin \{1, 2k+1\}$. Due to the choice of $S_k$, the line $xy = 1$ crosses the rectangle $I_i \times I_j$ if and only if $i + j = 2k + 1$. Denote the bivariate marginal CDF of $\mathcal{P}_g$ by $H_g$ and the univariate marginal CDF by $F_g$. Then we can write $\varphi_{\mathcal{P}_g}$ (4) in terms of $H_g$ and $F_g$ as in (13). By the construction of (7), the marginals satisfy (14); that is, the marginal CDFs are constant on $I_i \times I_j$ and $I_i$ respectively. As the range of a CDF is $[0,1]$, we conclude that for $x \in I_i$, $y \in I_j$, $\varphi_{\mathcal{P}_g}(x,y)$ is non-increasing in $x$ and non-decreasing in $y$. The latter holds since for any $\mathcal{P} \in \mathcal{P}_2$ invariant under $S_2$ and $x, y \in \mathbb{R}$, $H_{\mathcal{P}}(x,y) = H_{\mathcal{P}}(y,x)$. Hence the optimal value of the inner maximization problem in (11) can be obtained by first substituting the CDFs (14) into the function (13) and then substituting $x = s_{i-1}$, $y = s_j$. Note that this optimum is not attained. For the case $i + j = 2k + 1$, i.e., when the line $xy = 1$ crosses the rectangle $I_i \times I_j$, the result holds due to the choice of $S_k$. When $i \in \{1, 2k+1\}$ or $j \in \{1, 2k+1\}$, the function $\varphi_{\mathcal{P}_g}(x,y)$ simplifies, and such cases are solved separately. The solution approach resembles the one from the previous paragraph.
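The interval structure of $S_k$ is easy to verify numerically. The snippet below (ours) builds a set of the form (12) and checks, in exact rational arithmetic, that the hyperbola $xy = 1$ passes through the interior of $I_i \times I_j$ exactly when $i + j = 2k + 1$, for the interior intervals:

```python
from fractions import Fraction as F

k = 4
r = [F(1, 5), F(2, 5), F(7, 10)]                  # r_1 < ... < r_{k-1} < 1
S_k = r + [F(1)] + [1 / x for x in reversed(r)]   # form (12): 2k-1 points
a = F(10)                                         # a > max(S_k)
edges = [F(0)] + S_k + [a, float("inf")]          # 2k+1 intervals I_1..I_{2k+1}

def crosses(i, j):
    """Does xy = 1 meet the open rectangle I_i x I_j? (1-based i, j)"""
    return edges[i - 1] * edges[j - 1] < 1 < edges[i] * edges[j]

# interior intervals only: i, j not in {1, 2k+1}
for i in range(2, 2 * k + 1):
    for j in range(2, 2 * k + 1):
        assert crosses(i, j) == (i + j == 2 * k + 1), (i, j)
print("crossing pattern verified")
```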
To obtain numerical results, in this section we use uniform sets $S^u_k$ of the form (16). Table 2 shows the best obtained bounds and the $k$ used to compute these bounds. In Table 2, the lower bounds $R_n(S^u_k)$ defined in (9) are rounded down, and the upper bounds $R_n(\mathcal{P}_g)$ defined in (10) are rounded up. All upper bounds are verified in exact arithmetic using the MATLAB symbolic package. First, the optimal solution $g$ to problem (9), the elements of the set $S^u_k$ and the number $a$ are rounded to the 8th digit. Next, the rounded values are transformed into rational numbers and $R_n(\mathcal{P}_g)$ is computed as a rational number. By Lemma 1 and Theorem 4, the rounded $g$ provides the algorithm $A_{\mathcal{P}_g}$ with the worst-case approximation ratio $R_n(\mathcal{P}_g)$. The upper bound for $n = 2$ in Table 2 is worse than the best existing upper bound; we improve our result in the next section.
5 More precise bounds for the case of two tasks.
We analyze the case with $n = 2$ tasks and $m = 2$ machines in more detail. Now, to obtain an upper bound, we do not simply substitute some distribution as we did before; instead we optimize over a subset of $C_n$. Moreover, as a side result of this optimization, we obtain a non-uniform set $S_k$ which produces a better lower bound than the one from Table 2 of Section 4.1.
When $n = 2$, we can simplify problem (5). To do this, given $F \in G_1(\mathbb{R}_{++})$, define
$$H(x,y) := \max\{F(x) + F(y) - 1,\ 0\}. \qquad (17)$$
$H$ is a copula applied to the margins $F$; i.e., there is $\mathcal{P}_{H,F} \in \mathcal{P}_2$ for which $H$ is the CDF and $F$ is the marginal CDF. See Nelsen [14] for a detailed description of copulas and their properties. Moreover, by construction $\mathcal{P}_{H,F} \in C_2$. Using this fact, we prove the following result:

Theorem 5.
$$R_2 = \inf_{F \in G_1(\mathbb{R}_{++})}\ \sup_{x,y \in \mathbb{R}_{++}} \varphi_{\mathcal{P}_{H,F}}(x,y). \qquad (18)$$
Proof. Consider any $\mathcal{P} \in C_2$ with the univariate CDF $F_{\mathcal{P}}$. Define $\varphi_{\mathcal{P}}(x,y)$ as in (4). From (15), for all $x, y \in \mathbb{R}_{++}$, $\varphi_{\mathcal{P}_{H,F_{\mathcal{P}}}}(x,y) \le \varphi_{\mathcal{P}}(x,y)$. On the other hand, for all $F \in G_1(\mathbb{R}_{++})$ we can construct the copula $H$ from (17) with the corresponding distribution $\mathcal{P}_{H,F} \in C_2$. Hence the infimum in (5) over $C_2$ equals the infimum in (18) over univariate CDFs.

Remark 5. Nelsen [14] shows that for $n > 2$ the function $\max\{\sum_{i=1}^{n} F(x_i) - n + 1,\ 0\}$ is not a CDF. We could not propose any other suitable $n$-variate CDF which would have the bivariate margin $H$ from (17). As a result, the proof of Theorem 5 fails when applied to $n > 2$.
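The failure for $n > 2$ in Remark 5 is witnessed by a standard computation (a classical fact about the lower Fréchet-Hoeffding bound, added here for completeness): for $W_n(u) := \max\{\sum_{i=1}^n u_i - n + 1,\ 0\}$ on $[0,1]^n$, the $W_n$-volume of the $n$-box $[1/2, 1]^n$ is
$$V_{W_n}\big([1/2,1]^n\big) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} \max\Big\{1 - \frac{k}{2},\ 0\Big\} = 1 - \frac{n}{2},$$
which is negative for $n \ge 3$, so $W_n$ is not $n$-increasing and hence not a CDF for $n > 2$.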

5.1 Improving the upper bound for two tasks.
To improve our upper bound on $R_2$, we optimize the objective from (18) in Theorem 5 over a family of piecewise rational univariate CDFs. The domain of each CDF is split into pieces by a set of points $S_k$ of the form (12). Using $S_k$, we introduce the set of intervals $I_{S_k}$; note that this set is built with the points from $S_k$ only, which is different from Section 4.1. Now, given a family of continuous functions $\mathcal{F}$, we define the family $C_{\mathcal{F}}(S_k)$ of CDFs which "piecewisely" belong to $\mathcal{F}$ as the functions $F : \mathbb{R}_{++} \to [0,1]$ of the form (19). By construction, $F$ is a CDF and thus $C_{\mathcal{F}}(S_k) \subset G_1(\mathbb{R}_{++})$. Let $R_2(C_{\mathcal{F}}(S_k))$ be the upper bound on $R_2$ obtained by restricting the minimization in problem (18) to $C_{\mathcal{F}}(S_k)$. Since $F$ has additional symmetry, the formulation for $R_2(C_{\mathcal{F}}(S_k))$ can be simplified:

Proposition 5. In problem (18) restricted to $C_{\mathcal{F}}(S_k)$, it suffices to take the supremum over the pairs $(x,y)$ with $xy \ge 1$, which yields problem (20).
Proof. Let $F \in C_{\mathcal{F}}(S_k)$ and denote the objective in problem (18) by $\varphi_F(x,y)$. By the construction of (19), $F$ is piecewise rational on the intervals in $I_{S_k}$. Now let $I_i, I_j \in I_{S_k}$ and $\hat x \in I_i$, $\hat y \in I_j$ be such that $\hat x \hat y < 1$. The set $S_k$ is finite, therefore there is a sequence $\{(x_t, y_t)\}_{t=1}^{\infty}$ with $x_t y_t \ge 1$ for all $t$ along which the suprema in (18) and (20) can be compared. Finally, $F$ is right continuous at $(\hat x, \hat y)$, and so is $\varphi_F(\hat x, \hat y)$. Hence $\varphi_F(\hat x, \hat y) \le \sup_t \varphi_F(x_t, y_t)$, where the last inequality follows from $\frac{1/y_t}{1/x_t} > 1$. The result holds for all $\hat x, \hat y$ with $\hat x \hat y < 1$, which implies (20).

5.2 Implementing the new upper bound for two tasks.
Further we choose $\mathcal{F}$ to be the family of linear functions of one variable:
$$\mathcal{F} := \{c^0 + c^1 x : c^0, c^1 \in \mathbb{R}\}. \qquad (23)$$
The corresponding family $C_{\mathcal{F}}(S_k)$ includes, in particular, the CDFs from earlier research [1,9,10,15], where piecewise CDFs with the domains split into 2, 4 or 6 intervals are investigated. We observe that upper bounds are better when the domains are split more times or when each piece has a more complex form than just a constant function, i.e., when $c^1$ can be nonzero. So we improve the existing upper bounds by using a larger number of pieces and letting $c^1$ be nonzero for each piece. Define $\varphi_F(x,y)$ as in (24) and let $\mathcal{X} := \{(x,y) \in \mathbb{R}^2_{++} : xy \ge 1\}$. Consider two formulations, (25) and (26), for $R_2(C_{\mathcal{F}}(S_k))$ which follow from Proposition 5; in both, the constraints are imposed for all $i + j \ge 2k + 1$.
The last line applies since for $S_k$ of the form (12), $xy \ge 1$ holds only for $x \in I_i$, $y \in I_j$ with $i + j \ge 2k + 1$. We construct problems (25) and (26) to approximate $R_2(C_{\mathcal{F}}(S_k))$ with precision $10^{-8}$. Relaxations of (25) are used to find lower bounds on $R_2(C_{\mathcal{F}}(S_k))$; feasible solutions to (26) are used to find upper bounds on $R_2(C_{\mathcal{F}}(S_k))$.
For $F \in C_{\mathcal{F}}(S_k)$ with (19) and (23), optimization problem (25) is linear in its variables $t$ and has infinitely many constraints: each $(x,y) \in \mathcal{X}$ induces a linear constraint. Such problems can be well approximated using the cutting-plane approach introduced by Kelley [6]. Namely, we start with a finite set $Y \subset \mathcal{X}$ and restrict the set of constraints in (25) to its finite subset generated by $(x,y) \in Y$. As a result, we obtain a finite linear problem. Denote its optimal solution by $(\underline{F}, \underline{t})$. Then $\underline{t}$ is a lower bound on $R_2(C_{\mathcal{F}}(S_k))$. Next, we substitute $\underline{F}$ in (26) and find a feasible $\overline{t}$. We compute the supremum for each pair $i, j \in [2k]$ with $i + j \ge 2k + 1$ using all critical points of $\varphi_{\underline{F}}(x,y)$ from (24) restricted to $I_i \times I_j$. We consider possible critical points from the first order conditions and from the boundary points in which $xy \ge 1$ holds. More detail about this procedure is presented in Section 7.5. Since the set $\mathcal{X}_{ij} = \{(x,y) \in I_i \times I_j : xy \ge 1\}$ is convex and $\varphi_{\underline{F}}(x,y)$ restricted to $I_i \times I_j$ is continuous, the critical point with the highest value of $\varphi_{\underline{F}}(x,y)$ provides the supremum. The maximum of these suprema over all $i, j \in [2k]$ can be used as a feasible $\overline{t}$ for (26); thus $\overline{t}$ is an upper bound on $R_2(C_{\mathcal{F}}(S_k))$. Let $(x^*, y^*)$ be a point in which $\varphi_{\underline{F}}(x,y)$ from (24) reaches $\overline{t}$. If $|\overline{t} - \underline{t}| > 10^{-8}$, we proceed from the beginning by restricting formulation (25) to the updated set $Y \leftarrow Y \cup \{(x^*, y^*)\}$. Otherwise we stop.
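The following is a minimal sketch of Kelley's cutting-plane loop on a toy linear semi-infinite problem (ours, not the paper's problem (25): here we compute the best uniform linear approximation of $e^x$ on $[0,1]$). The separation oracle is a grid search; in the paper's setting it is the closed-form critical point analysis of Section 7.5.

```python
import numpy as np
from scipy.optimize import linprog

# Toy SIP: min t  s.t.  |exp(x) - (c0 + c1*x)| <= t  for all x in [0, 1].
# Decision variables v = (c0, c1, t); each x gives two linear constraints.
def solve_master(Y):
    A_ub, b_ub = [], []
    for x in Y:
        A_ub.append([-1.0, -x, -1.0]); b_ub.append(-np.exp(x))  # c0+c1*x+t >= e^x
        A_ub.append([1.0, x, -1.0]); b_ub.append(np.exp(x))     # c0+c1*x-t <= e^x
    res = linprog(c=[0, 0, 1], A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * 3)
    return res.x

Y = [0.0, 1.0]                          # initial finite constraint set
for _ in range(50):
    c0, c1, t_lo = solve_master(Y)      # t_lo: lower bound on the SIP value
    xs = np.linspace(0, 1, 10_001)      # separation oracle: most violated x
    err = np.abs(np.exp(xs) - (c0 + c1 * xs))
    x_star, t_up = xs[err.argmax()], err.max()  # t_up: upper bound at (c0, c1)
    if t_up - t_lo <= 1e-8:
        break
    Y.append(x_star)                    # add a cut and resolve the master
print(c0, c1, t_lo, t_up)
```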
The problem is solved on uniform sets $S^u_k$ of the form (16). On each set we define the piecewise function $F$ of the form (19) with inputs (23). We initialize the cutting-plane procedure using $Y = \{(x,y) : x, y \in S^u_k,\ xy \ge 1\}$. The best obtained upper bound is indicated in bold in Table 3; it is stronger than the currently best upper bound 1.5068. We verify the upper bound 1.5059964 in exact arithmetic in a similar way as for the upper bounds in Table 2 of Section 4.1.

5.3 Improving the lower bound for two tasks.
The cutting-plane approach from Section 5.1 generates not only the upper bound with the corresponding CDF, but also the set of points $Y$. Using $Y$, we build a new set $S^*_k$ of the form (12), which is not uniform (16). We consider all $(x,y) \in Y$ involved in the binding constraints of problem (25) at the last cutting-plane iteration, take the corresponding $x$, $y$ and their inverses, order them ascending, round to the 8th digit and obtain $S^*_k$ with $k = 82$. For this set, the lower bound $R_n(S^*_k)$ from (9) is 1.5059953, which improves our lower bound from Table 2. As a result, the lower and upper bounds become very close to each other:
$$1.5059953 \le R_2 \le 1.5059964.$$

6 Conclusion.
This paper analyzes randomized monotone, task independent, scale-free (randomized MIS) approximation algorithms for the minimum makespan problem on two unrelated parallel selfish machines. We propose a new Min-Max formulation (5) to find $R_n$, the best approximation ratio of randomized MIS algorithms. The minimization goes over distributions and the maximization goes over $\mathbb{R}^2_{++}$. The problem is generally intractable, therefore we build upper and lower bounds on the optimal value. For the lower bound, the initial problem is solved on a finite subset of $\mathbb{R}^2_{++}$. Using the resulting solution, we construct a piecewise constant cumulative distribution function (CDF) for which the worst-case performance is easy to estimate. In this way we obtain an upper bound on $R_n$. We implement this approach and find upper and lower bounds for $n \in \{2, 3, 4\}$ tasks.
For $n = 2$, the best CDF is known as a function of its univariate margins (a copula). We parametrize these margins as piecewise rational functions of degree at most one. The resulting upper bound problem (20) is a linear semi-infinite problem. We solve it by the cutting-plane approach, which provides the upper bound 1.5059964 and the CDF for which the algorithm achieves this bound.
As a side result of the cutting-plane approach, we obtain a better lower bound, 1.5059953, so $|R_2 - 1.505996| \le 10^{-6}$. Finally, we indicate that the existing lower bound 1.506 obtained by Lu [9] for randomized MIS algorithms is computed incorrectly; the actual lower bound provided by the approach of Lu [9] is 1.505949.
This work leaves several questions for further research. First, using the unified approach suggested in this paper, the bounds for $m = 2$ machines could still be improved, for example, by using column generation in the lower bound problem (9) or by parametrizing distributions of more than two variables in the upper bound problem (10). Second, the piecewise and pointwise constructions we use are suitable for other problems with optimization over low dimensional functions. Finally, we consider only the case $m = 2$, but there are algorithms for $m > 2$ machines with similar properties, e.g., by Lu and Yu [11], so our approach could be used to analyze approximation ratios for more than two machines.

7 Proofs.

7.1 Proof of Proposition 1.
To prove the proposition, we need the following additional result:

Lemma 2. Let $c\mathcal{P}_n$ be the set of probability distributions in $\mathcal{P}_n$ with non-atomic univariate margins. The set $c\mathcal{P}_n$ is dense in $\mathcal{P}_n$.
Proof. First notice that for $n = 1$ it is enough to show the density of non-atomic measures on $\mathbb{R}$, which follows from, e.g., Chapter 2, Corollary 8.1 in Parthasarathy [16]. For $n > 1$, consider a Dirac measure in $\mathcal{P}_n$. This is a product measure of $n$ univariate Dirac measures. Hence there is a sequence of product measures in $c\mathcal{P}_n$ which has the given Dirac measure in $\mathcal{P}_n$ as a limit. Since the set of finite sums of Dirac measures is dense in $\mathcal{P}_n$ (Chapter 2, Theorem 6.3 in Parthasarathy [16]), $c\mathcal{P}_n$ is dense in $\mathcal{P}_n$.
Proof of Proposition 1. Fix $n$. Let $L := \{\text{"}\le\text{"}, \text{"}<\text{"}\}^n$. This is a finite set with $2^n$ elements. Let $\mathcal{P}_L$ denote the set of all probability distributions over $L$. By Theorem 1, a randomized MIS algorithm draws a vector from a distribution on $L \times \mathbb{R}^n_{++}$. Denote the set of all such distributions by $\mathcal{D}_n$. Every $D \in \mathcal{D}_n$ can be encoded as a combination of a marginal distribution on $L$ and $2^n$ distributions from $\mathcal{P}_n$ obtained by conditioning $D$ on each $\alpha \in L$. Hence $\mathcal{D}_n$ is in bijection with $\mathcal{P}_L \times \mathcal{P}_n^{2^n}$, where $D \in \mathcal{D}_n$ can be viewed as $D_{\delta,\omega}$ such that $(\delta, \omega) \in \mathcal{P}_L \times \mathcal{P}_n^{2^n}$. The first equality in the second line of the corresponding derivation holds since for every $\mathcal{P} \in c\mathcal{P}_n$ the events with thresholds equal to time ratios have measure zero. Hence we can fix $\delta = \delta_<$.

7.2 Proof of invariance of $R_n(\mathcal{P})$ under the action of $S_n$.

For this proof we need several auxiliary results. Recall that $S_n$ is the permutation group defined in Section 3. We say that $S_n$ acts on a matrix in $\mathbb{R}^{2\times n}$ by permuting the $n$ columns of this matrix. Namely, for $A \in \mathbb{R}^{2\times n}$ and $\pi \in S_n$ we define the action of $\pi \in S_n$ on $A$ by $A\pi := (A_{i,j\pi})$, where $j\pi$ is the notation for $\pi(j)$. We first show that for any $T \in \mathbb{R}^{2\times n}_{++}$ the optimal makespan $M^*(T)$ is invariant under the action of $S_n$ on $T$, and the expected makespan $M(\mathcal{P},T)$ is invariant under the simultaneous action of $S_n$ on $T$ and $\mathcal{P}$. This implies the invariance of $R_n(\mathcal{P},T)$.
Lemma 3. For any $T \in \mathbb{R}^{2\times n}_{++}$ and $\pi \in S_n$, $M^*(T\pi) = M^*(T)$.

Proof. Consider a time table $T \in \mathbb{R}^{2\times n}_{++}$ and an action $\pi \in S_n$. Let $X^* = (X^*_{ij})$ be an optimal allocation matrix for $T$, so that $\max_i \sum_j T_{ij} X^*_{ij} = M^*(T)$. The allocation $X^*\pi$ on the time table $T\pi$ gives the same machine loads, hence $M^*(T\pi) \le M^*(T)$.
Analogously, considering the time table $T\pi$ and the action $\pi^{-1} \in S_n$, we obtain $M^*(T) \le M^*(T\pi)$, hence $M^*(T\pi) = M^*(T)$. Next we analyze $M(\mathcal{P}, T)$:

Lemma 4. For any $\mathcal{P} \in \mathcal{P}_n$, $T \in \mathbb{R}^{2\times n}_{++}$ and $\pi \in S_n$, $M(\mathcal{P}\pi, T\pi) = M(\mathcal{P}, T)$ and $R_n(\mathcal{P}, T) = R_n(\mathcal{P}\pi, T\pi)$.
Proof. For $z \in \mathbb{R}^n_{++}$ let $A_z$ be the algorithm in $\mathcal{A}_{\mathcal{P}_n}$ with the thresholds fixed at $z$. Let $X^{z,T}$ and $M(z,T)$ be the output and the makespan of $A_z$ on the time table $T$ respectively. Let $y = z\pi$. Then $A_z$ sends task $j$ to machine $i$ on the table $T$ if and only if $A_y$ sends task $j\pi$ to machine $i$ on the table $T\pi$. As a result, $T_{ij} X^{z,T}_{ij} = T_{i,j\pi} X^{y,T\pi}_{i,j\pi}$ for all $i, j$, and
$$M(z, T) = \max_i \sum_j T_{i,j\pi} X^{y,T\pi}_{i,j\pi} = M(y, T\pi). \qquad (27)$$
The statement (27) holds for all $z \sim \mathcal{P}$. Therefore $M(\mathcal{P}, T) = M(\mathcal{P}\pi, T\pi)$, which by Lemma 3 implies $R_n(\mathcal{P}, T) = R_n(\mathcal{P}\pi, T\pi)$.
Now we can prove the main result of this section, the invariance of $R_n(\mathcal{P})$:

Theorem 6. For any $\mathcal{P} \in \mathcal{P}_n$ and $\pi \in S_n$, $R_n(\mathcal{P}) = R_n(\mathcal{P}\pi)$.

7.3 Proof of Theorem 2.
To prove Theorem 2, we need the following lemma:

Lemma 5. $R_n(\mathcal{P})$ is convex; that is, for $\alpha \in (0,1)$ and $\mathcal{P}, \mathcal{P}' \in \mathcal{P}_n$,
$$R_n\left(\alpha\mathcal{P} + (1-\alpha)\mathcal{P}'\right) \le \alpha R_n(\mathcal{P}) + (1-\alpha) R_n(\mathcal{P}').$$

Proof. By the construction of $\alpha\mathcal{P} + (1-\alpha)\mathcal{P}'$ and $\mathcal{A}_{\mathcal{P}_n}$, for any $T \in \mathbb{R}^{2\times n}_{++}$, $M(\alpha\mathcal{P} + (1-\alpha)\mathcal{P}', T) = \alpha M(\mathcal{P}, T) + (1-\alpha) M(\mathcal{P}', T)$, and the claim follows by taking the supremum over $T$.

Proof of Theorem 2. As $C_n \subseteq \mathcal{P}_n$, $\inf_{\mathcal{P} \in C_n} R_n(\mathcal{P}) \ge \inf_{\mathcal{P} \in \mathcal{P}_n} R_n(\mathcal{P})$. To prove the opposite inequality, we show that for any distribution $\mathcal{P} \in \mathcal{P}_n$ there is a distribution $Q \in C_n$ such that $R_n(Q) \le R_n(\mathcal{P})$. Given $\mathcal{P} \in \mathcal{P}_n$, consider the convex combination
$$Q := \frac{1}{n!} \sum_{\pi \in S_n} \mathcal{P}\pi.$$
By construction, $Q \in C_n$, and by Lemma 5 and Theorem 6, $R_n(Q) \le \frac{1}{n!} \sum_{\pi \in S_n} R_n(\mathcal{P}\pi) = R_n(\mathcal{P})$.
7.5 Critical points for the upper bound computation.

Next we consider three possible cases for $i, j$. For each of them we substitute $F(x)$, $F(y)$ from (19) and find the analytical solution to the system $\frac{\partial \varphi_F(x,y)}{\partial x} = 0$, $\frac{\partial \varphi_F(x,y)}{\partial y} = 0$. For this purpose we use Wolfram|Alpha [20]. Further, the obtained solution is denoted by $(x^*, y^*)$.
Case 1. In this case $F(x) = 1 - c^0_i - c^1_i x$, $F(y) = c^0_j + c^1_j/y$ and $y \in (0,1)$. Hence $\varphi_F(x,y)$ is nondecreasing in $y$; the latter holds since $F(y) \le 1$, $F(y)$ is nondecreasing by construction, and $y \in (0,1)$. The sign of the derivative with respect to $x$ does not depend on $x$, so the function $\varphi_F(x,y)$ is either non-increasing or non-decreasing in $x$. We do not know this in advance, so we use the set $\{(s_{i-1}, s_j), (s_i, s_j)\}$ of possible critical points.
Case 2. As in Case 1, the sign of the derivative with respect to $x$ does not depend on $x$; hence $\varphi_F(x,y)$ is non-increasing or non-decreasing in $x$. We do not know this in advance, so we start with the set $\{s_{i-1}, s_i\} \times \{s_{j-1}, s_j, y^*\}$ of possible critical points. We check all resulting pairs for feasibility and exclude the infeasible ones.
Case 3. In this case $F(x) = c^0_i + c^1_i/x$, $F(y) = 1 - c^0_j - c^1_j y$, and the signs of the derivatives are unknown, so we start with the set $\{s_{i-1}, s_i, x^*\} \times \{s_{j-1}, s_j, y^*\}$ of possible critical points. We check all resulting pairs for feasibility and exclude the infeasible ones.
When $x \in I_1$ or $y \in I_{2k}$, by the construction of (19) the function $\varphi_F(x,y)$ simplifies even more. In our computations we analyze these situations separately.
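As an illustration of this step (a toy stand-in of ours; the actual $\varphi_F$ from (24) is built from the pieces in (19) and is more involved), SymPy can play the role of Wolfram|Alpha in solving the first order conditions of a rational objective:

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)

# Toy rational objective standing in for phi_F(x, y) from (24).
phi = (x + y) / (1 + x * y)

# First order conditions: interior candidate critical points (x*, y*).
foc = [sp.simplify(sp.diff(phi, v)) for v in (x, y)]
interior = sp.solve(foc, [x, y], dict=True)

# Candidates are then combined with the rectangle corners and the
# boundary points where xy >= 1 binds; infeasible pairs are excluded.
print(foc)        # e.g. [(1 - y**2)/(x*y + 1)**2, (1 - x**2)/(x*y + 1)**2]
print(interior)   # with positive symbols, the only solution is x = y = 1
```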