1 Introduction

The field of Algorithmic Mechanism Design [19, 20, 22] focuses on optimization problems where the input is provided by self-interested agents that participate in the mechanism by reporting their private information. These agents are utility maximizers, so they may misreport their private information to the mechanism if doing so results in a more favorable outcome. Given the agents’ reports as input, a mechanism is a function that maps the input to an allocation and, if monetary transfers are allowed, payments. The goal of the mechanism designer is twofold. On the one hand, the objective is to motivate agents to always report truthfully, regardless of the strategies the other agents follow; on the other hand, the aim is to optimize a specific objective function that measures the quality of the outcome, subject to a polynomial-time implementability constraint. However, these objectives are usually incompatible, so we often need to trade off one against the other. One standard approach is to maintain the truthfulness of the mechanism and approximately optimize the objective function (e.g., social welfare maximization, revenue maximization, or cost minimization). The approximation ratio is the canonical measure for evaluating the performance of a truthful mechanism toward this goal. It compares the performance of the truthful mechanism against that of the optimal mechanism, which is not necessarily truthful, over all possible inputs.

The approximation ratio is defined in the worst-case sense, much like the worst-case time complexity of algorithms. These are strong but very pessimistic measures. On the one hand, if a small worst-case ratio can be obtained, it is a very solid guarantee of the mechanism’s performance, no matter what the inputs are. On the other hand, if it turns out to be a large value, one can hardly be certain about the mechanism’s performance, as it may perform well on most inputs and poorly on only a few. To address this issue, Deng et al. [7] and Gao and Zhang [9] propose an alternative measure, the average-case approximation ratio, which compares the performance of the truthful mechanism against the optimal mechanism, averaged over all possible inputs drawn from a specific distribution. Although average-case analysis is usually more involved than worst-case analysis, it complements it.

In this paper, we study the problem of scheduling unrelated machines without money. Scheduling is one of the primary problems in algorithmic mechanism design. In the general setting, the problem is to schedule a set of tasks on n unrelated machines with private processing times so as to minimize the makespan. The machines (or agents, in game-theoretic terms) are rational and want to minimize their execution time; they may attempt this by misreporting their processing times to the mechanism. No monetary payments are allowed in this problem. The objective is to design truthful mechanisms with a good approximation ratio. One important version of the problem is studied by Koutsoupias [11], in which the machines are bound by their declarations. More specifically, if a machine declares a time that is longer than its actual time for a task and is allocated the task, then its processing time in practice is the declared value. This is because machines are under observation during the execution of the task and cannot afford to be caught lying about the execution times (an unaffordable penalty would apply). Koutsoupias [11] devises a truthful-in-expectation mechanism that achieves the tight bound of \(\frac{n+1}{2}\) for a single task; for multiple tasks, this generalizes to \(\frac{n(n+1)}{2}\) when the objective is minimizing the makespan and \(\frac{n+1}{2}\) when the objective is minimizing the social cost. We note that the tight-bound instance arises when the ratio of the minimum processing time to the maximum processing time approaches 0. Such an instance is very unlikely to occur in practice. Therefore, it is interesting to understand how well the optimal mechanism given in [11] performs on average when the instances are drawn from a given distribution.

1.1 Our Contribution

This paper provides a novel perspective on the performance of the mechanism developed by Koutsoupias [11]. In particular, we show the following result:

  • The average-case approximation ratio of the mechanism devised by Koutsoupias [11] is upper bounded by a constant when the inputs are independent and identically distributed random variables.

In contrast, the worst-case approximation ratio of the mechanism shown in [11] is \(\frac{n+1}{2}\), which grows linearly with n; the two measures are thus asymptotically different.

A major criticism of average-case analysis is that the results usually depend on the input distribution, as real-world data are not guaranteed to follow any particular distribution. Even for the same mechanism, the real-world distribution may vary across application areas. For example, Deng et al. [7] show that their average-case result holds for the uniform distribution, and the positive results in the Bayesian analysis of auctions usually assume that the hazard rate function is monotone non-decreasing. In this paper, we develop powerful techniques to show a constant bound on the average-case approximation ratio for any i.i.d. distribution. Our results complement the worst-case approximation ratio of the known mechanisms, which is asymptotically large.

1.2 Related Work

The field of Algorithmic Mechanism Design was initiated by Nisan and Ronen [19] and further advanced by Procaccia and Tennenholtz [22] to approximate mechanism design without money. For a more detailed treatment, we refer the reader to Nisan et al. [20].

The scheduling problem has been extensively studied. However, after nearly two decades, little progress has been made in resolving this challenge: the known upper bounds on the approximation ratio are rather discouraging, and there is a large gap between the lower and upper bounds. For the model presented by Nisan and Ronen [19], where payments are allowed to facilitate the design of truthful mechanisms, the best known upper bound is given in their original paper and is achieved by allocating each task independently using the classical VCG mechanism, while the best known lower bound is 2.61 [12]. Ashlagi et al. [3] prove that the upper bound of n is tight for anonymous mechanisms. For randomized mechanisms, the best known upper bound is \(\frac{n+1}{2}\), shown by Mu’alem and Schapira [18]. For the special case of related machines, where each machine’s private information is a single value, Archer and Tardos [1] give a randomized 3-approximation mechanism. Lavi and Swamy [15] show a constant approximation ratio for the special case in which the processing time of each task can take one of two fixed values. Yu [25] generalizes this result to two-range values, and Lu and Yu [17] and Lu [16] show constant bounds for the case of two machines. For the case where payments are not allowed, Koutsoupias [11] first considers the setting in which the machines are bound by their declarations. This is influenced by the notion of impositions that appears in [8] for facility location problems, as well as the notion of verification that appears in [4]. Penna and Ventre [21] present a general construction of collusion-resistant mechanisms with verification that return optimal solutions for a wide class of mechanism design problems, including the scheduling problem. The mechanism presented in [11] has a tight approximation ratio of \(\frac{n+1}{2}\) for the single-task setting; by running the mechanism independently on multiple tasks, a tight bound of \(\frac{n+1}{2}\) is achieved for social cost minimization and an upper bound of \(\frac{n(n+1)}{2}\) for makespan minimization. Kovács and Vidali [14] further apply mechanism design with monitoring techniques to the truthful RAM allocation problem. There is also work on characterizing truthful mechanisms for scheduling problems, such as Kovács and Vidali [13], and on scheduling with uncertain execution times, such as Conitzer and Vidali [6].

In [7], the authors propose the study of average-case and smoothed approximation ratios and conduct these analyses for the one-sided matching problem. They show that, although the asymptotically best truthful mechanism for the problem is Random Priority and its worst-case approximation ratio is \(\varTheta (\sqrt{n})\), Random Priority has a constant average-case approximation ratio when the inputs follow a uniform distribution, as well as a constant smoothed approximation ratio. Gao and Zhang [9] extend the constant-approximation results to more general distributions.

Notably, average-case approximation ratio analysis is similar in spirit to, but fundamentally different from, Bayesian analysis. In the Bayesian auction design literature [5, 10], the focus is on how well a truthful mechanism can approximately maximize the expected revenue when instances are drawn from the entire input space. More specifically, the dominant approach in the study of Bayesian auction design is the ratio of expectations. A more detailed comparison of the two metrics is given in the next section, after their definitions.

2 Preliminaries

In the problem of scheduling unrelated machines without payment, there is a set of self-interested machines (or self-interested agents, in game-theoretic terms) and a set of tasks. The general setting comprises n machines and m tasks; in this paper we consider the setting of a single task. The machines are lazy and prefer not to execute any tasks, and there are no monetary tools to incentivize them to do so. Machine i needs time (or cost) \(t_i\) to execute the task, \(i\in [n]\), and these \(t_i\)’s are independent of each other. For m tasks, there are two natural objectives: one is to allocate the tasks so that the makespan is minimized; the other is to allocate them so that the social cost is minimized. The makespan is the total length of the schedule, and the social cost is the sum of all agents’ costs. In the single-task setting, these two objectives coincide. Obviously, allocating the task to the machine with minimum execution time is the optimal solution.

However, the mechanism has no access to the values \(t_i\). Instead, each machine reports an execution time \(\tilde{t}_i\) to the mechanism, where \(\tilde{t}_i\) is not necessarily equal to \(t_i\), \(\forall i\in [n]\). A mechanism is a (possibly randomized) algorithm which computes an allocation based on the declarations \(\tilde{t}_i\) of the machines. Denote the output of the mechanism by \(\mathbf{p}=(p_i)_{i\in [n]}\), where \(p_i\) is an indicator variable in deterministic mechanisms and, in randomized mechanisms, the probability that machine i is allocated the task. We follow the standard literature and consider the case where machines are bound by their reports; that is, the cost of machine i for the task is \(\max \{t_i,\tilde{t}_i\}\). So if a machine i declares \(\tilde{t}_i \ge t_i\) and is allocated the task, its actual cost is the declared value \(\tilde{t}_i\) rather than \(t_i\). This is in the spirit that machines are observed during the execution of the task and cannot afford to be caught lying about the execution times (a high penalty would apply). Therefore, the expected cost of machine i is \(c_i=c_i(t_i,\tilde{\mathbf{t}})=p_i(\tilde{\mathbf{t}}) \max (t_i,\tilde{t}_i)\).

In approximate mechanism design, we restrict our interest to the class of truthful mechanisms. A mechanism is truthful if, for any reports of the other agents, the expected cost \(c_i\) of agent i is minimized when \(\tilde{t}_i = t_i\). We note that this weak notion of truthfulness, truthfulness-in-expectation, allows us to consider a richer class of mechanisms than universal truthfulness. However, as mentioned in the Introduction, even within this richer class, the performance of truthful mechanisms remains very limited in terms of the approximation ratio. The canonical measure of efficiency of a truthful mechanism \(\mathrm {M}\) is the worst-case approximation ratio,

$$\begin{aligned} r_{\text {worst}}(\mathrm {M}) = \sup _{\mathbf{t} \in \mathcal {T}} \frac{SC_{\mathrm M}(\mathbf{t})}{SC_{\mathrm {OPT}}(\mathbf{t})}, \end{aligned}$$

where \(SC_{\mathrm {OPT}}(\mathbf{t})= \min _{\mathbf{p} \in \mathcal {P}}\sum _{i=1}^{n}c_i\) is the optimal social cost, which is simply the minimum \(t_i\) over all \(i\in [n]\); \(SC_{\mathrm M}(\mathbf{t})\) is the social cost of the mechanism \(\mathrm {M}\) on the input \(\mathbf{t}\); and \(\mathcal {T}\) is the input space. This ratio compares the social cost of the truthful mechanism \(\mathrm {M}\) against the social cost of the optimal mechanism \(\mathrm {OPT}\) over all possible inputs \(\mathbf{t}\).

In [11], the author devises the following randomized mechanism.

Mechanism M: Given the input \(\mathbf{t}=(t_1,\ldots ,t_n)\), without loss of generality, let the values of \(t_i\)’s be in ascending order \(0< t_1 \le t_2 \le \cdots \le t_n\). Then the allocation probabilities are

$$\begin{aligned} p_1&=\frac{1}{t_1} \int _{0}^{t_1} \prod _{i=2}^{n} \left( 1-\frac{y}{t_i}\right) dy, \\ p_k&=\frac{1}{t_1 t_k} \int _{0}^{t_1} \int _{0}^{y} \prod _{\begin{array}{c} i=2,\ldots ,n \\ i\ne k \end{array}} \left( 1-\frac{x}{t_i} \right) \,dx\,dy, \quad \text {for } k\ne 1. \end{aligned}$$

Note that this is a symmetric mechanism, so it suffices to describe it when \(0<t_1 \le t_2 \le \cdots \le t_n\). It is shown in [11] that this mechanism is truthful and achieves an approximation ratio tight bound of \(\frac{n+1}{2}\).
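To make the allocation rule concrete, the following minimal sketch evaluates these probabilities numerically with a midpoint rule (the function name and the grid parameter are ours, purely for illustration; the formulas are from [11]):

```python
# A minimal numerical sketch of Mechanism M's allocation rule. The
# integrals are approximated with a simple midpoint rule; a finer grid
# gives better accuracy.

def allocation_probabilities(t, grid=500):
    """Return (p_1, ..., p_n) for processing times t, sorted ascending."""
    t = sorted(t)
    n, t1 = len(t), t[0]
    dy = t1 / grid

    def prod_term(x, skip=None):
        # \prod_{i=2,...,n, i != skip} (1 - x / t_i), with 0-based indices
        result = 1.0
        for i in range(1, n):
            if i != skip:
                result *= 1.0 - x / t[i]
        return result

    # p_1 = (1/t_1) * int_0^{t_1} prod_{i>=2} (1 - y/t_i) dy
    p = [sum(prod_term((j + 0.5) * dy) for j in range(grid)) * dy / t1]

    # p_k = (1/(t_1 t_k)) * int_0^{t_1} int_0^{y} prod_{i>=2, i!=k} (1 - x/t_i) dx dy
    for k in range(1, n):
        inner, outer = 0.0, 0.0
        for j in range(grid):
            inner += prod_term((j + 0.5) * dy, skip=k) * dy  # ~ int_0^{y} ... dx
            outer += inner * dy
        p.append(outer / (t1 * t[k]))
    return p

probs = allocation_probabilities([1.0, 2.0, 4.0])
print(probs)                          # faster machines receive higher probability
assert abs(sum(probs) - 1.0) < 1e-2   # the allocation probabilities sum to 1
```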

Analogously to the definition of the average-case approximation ratio of mechanisms for social welfare maximization in [7], we define it for social cost minimization as follows:

$$\begin{aligned} r_{\text {average}}({\mathrm{M}}) = {\mathbb{E}}_{\mathbf{t} \sim {\mathrm {D}}} \left[ \frac{SC_{\mathrm{M}}({\mathbf{t}})}{SC_{\mathrm {OPT}}({\mathbf{t}})} \right] , \end{aligned}$$

where the input \(\mathbf{t}=(t_1,\ldots ,t_n)\) is chosen from a distribution \(\mathrm {D}\). Hence, the metric we study in this paper is the expectation of the ratio.
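As a usage example, this expectation can be estimated empirically by sampling inputs from \(\mathrm {D}\). The sketch below reuses allocation_probabilities from the sketch above; the uniform sampler and the sample sizes are purely illustrative:

```python
import random

# Monte Carlo estimate of the average-case ratio of Mechanism M, reusing
# allocation_probabilities from the sketch above. Any i.i.d. model for D
# can be plugged in via sample_cost.

def average_ratio(n, samples=200, sample_cost=lambda: random.uniform(1.0, 10.0)):
    total = 0.0
    for _ in range(samples):
        t = sorted(sample_cost() for _ in range(n))
        p = allocation_probabilities(t)
        sc_m = sum(pi * ti for pi, ti in zip(p, t))  # SC_M(t) = sum_i p_i * t_i
        total += sc_m / t[0]                         # SC_OPT(t) = min_i t_i
    return total / samples

print(average_ratio(n=10))  # compare with the worst-case bound (n+1)/2 = 5.5
```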

2.1 Comparison with the Bayesian Approach

In Bayesian mechanism design [5, 10], there is likewise a prior distribution from which the agent types are drawn. However, the objective is to characterize the maximum ratio (for a given distribution of agent types) of the expected social welfare (or social cost) of a truthful mechanism to the expected social welfare (or social cost) of the optimal mechanism. So the metric in the Bayesian approach is the ratio of expectations; that is, the objective is to characterize the ratio r in the following formula,

$$\begin{aligned} r \cdot \mathbb {E} \left[ SC_{\mathrm {OPT}}(\mathbf{t}) \right] \le \mathbb {E} \left[ SC_{\mathrm {M}}(\mathbf{t}) \right] . \end{aligned}$$

Therefore, in the Bayesian approach, the optimal mechanism is defined with respect to the entire input space: it outputs the optimal solution in expectation, when the inputs are drawn from the prior distribution. In contrast, in the analysis of the average-case approximation ratio, the optimal mechanism is defined with respect to each individual input instance.

In light of this difference, and because the expectation of the ratio is a nonlinear function of the two random variables, the analysis of the average-case approximation ratio introduces additional technical challenges. In some specific scenarios, a constant average-case approximation ratio would imply a constant approximation ratio under the Bayesian approach.
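To see how the two metrics can differ, consider a toy example (ours, purely for illustration) with two equiprobable instances: on the first, \(SC_{\mathrm {M}} = 2\) and \(SC_{\mathrm {OPT}} = 1\); on the second, \(SC_{\mathrm {M}} = SC_{\mathrm {OPT}} = 2\). Then

$$\begin{aligned} \mathbb {E} \left[ \frac{SC_{\mathrm {M}}}{SC_{\mathrm {OPT}}} \right] = \frac{1}{2}\left( \frac{2}{1} + \frac{2}{2}\right) = \frac{3}{2}, \qquad \frac{\mathbb {E} \left[ SC_{\mathrm {M}}\right] }{\mathbb {E} \left[ SC_{\mathrm {OPT}}\right] } = \frac{2}{3/2} = \frac{4}{3}, \end{aligned}$$

so the expectation of the ratio and the ratio of expectations need not coincide.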

3 Average-Case Approximation Ratio

In this section we show that the average-case approximation ratio of the mechanism M is upper bounded by a constant when the inputs \(t_i\) are independent and identically distributed according to any distribution \(\mathrm {D}\) supported on \([t_{\min }, \infty )\), where \(t_{\min }\) is the minimum processing time for the task.

Let h be a constant, and denote by \(\mathrm {A}=\{ t_{\frac{n}{2}} \le h \cdot t_{\min }\}\) the event that the \(\frac{n}{2}\)-th order statistic of the inputs \(t_i\) is at most \(h \cdot t_{\min }\). First, we show that if \(\mathrm {A}\) holds, then the social cost of the mechanism \(\mathrm {M}\) is upper bounded by a constant times \(t_1\).

Lemma 1

For any constant \(h>0\), given that event \(\mathrm {A}\) holds, we have

$$\begin{aligned} SC_{\mathrm {M}}(\mathbf{t}) \le (2h + 1) t_1. \end{aligned}$$

Proof

The expected cost of the mechanism \(\mathrm {M}\) is

$$\begin{aligned} SC_{\mathrm {M}}(\mathbf{t})&= \sum _{i=1}^{n} p_i \cdot t_i = \int _{0}^{t_1} \prod _{i=2}^{n} \left( 1-\frac{y}{t_i} \right) dy \\&\quad +\, \sum _{k=2}^{n} \frac{1}{t_1} \int _{0}^{t_1} \int _{0}^{y} \prod _{\begin{array}{c} i=2,\ldots ,n \\ i\ne k \end{array}} \left( 1-\frac{x}{t_i} \right) dx dy \end{aligned}$$

Since \(1-\frac{y}{t_i} \le 1\) for any \(y\in \left[ 0,t_1\right] , i=2,\dots ,n\), we can simply bound the first term by

$$\begin{aligned} \int _{0}^{t_1} \prod _{i=2}^{n} \left( 1-\frac{y}{t_i} \right) dy \le \int _{0}^{t_1} 1 dy = t_1 \end{aligned}$$

Because event \(\mathrm {A}\) holds, i.e., \(t_{\frac{n}{2}} \le h \cdot t_{\min }\), we have \(\prod _{\begin{array}{c} i=2,\ldots ,n \\ i\ne k \end{array}} \left( 1-\frac{x}{t_i} \right) \le \left( 1-\frac{x}{h \cdot t_{\min }} \right) ^{\frac{n}{2}-2} \cdot 1^{\frac{n}{2}}, \forall k=2,\ldots ,n\). So we can bound the second term as follows.

$$\begin{aligned}&\sum _{k=2}^{n} \frac{1}{t_1} \int _{0}^{t_1} \int _{0}^{y} \prod _{\begin{array}{c} i=2,\ldots ,n \\ i\ne k \end{array}} \left( 1-\frac{x}{t_i} \right) \,dx\,dy \\&\quad \le \sum _{k=2}^{n} \frac{1}{t_1} \int _{0}^{t_1} \int _{0}^{y} \left( 1-\frac{x}{h \cdot t_{\min }} \right) ^{\frac{n}{2}-2} \cdot 1^{\frac{n}{2}}\,dx\,dy \\&\quad \le \frac{n-1}{t_1} \int _{0}^{t_1} \int _{0}^{y} \left( 1-\frac{x}{h \cdot t_1} \right) ^{\frac{n}{2}-2}\,dx\,dy \\&\quad = \frac{n-1}{t_1} \int _{0}^{t_1} \left( \frac{2h t_1}{n-2} - \frac{2h t_1}{n-2}\left( 1-\frac{y}{h t_1} \right) ^{\frac{n}{2}-1} \right) dy \\&\quad = \frac{n-1}{t_1} \left[ \frac{2h t_1^2}{n-2} + \frac{4h^2 t_1^2}{n(n-2)} \left( \left( 1 - \frac{1}{h} \right) ^{\frac{n}{2}} -1 \right) \right] \\&\quad \le \frac{n-1}{n-2} \cdot 2h t_1 \end{aligned}$$

The last term approaches \(2h t_1\) as n approaches infinity. So, \(SC_{\mathrm {M}}(\mathbf{t}) = \sum _{i=1}^{n} p_i \cdot t_i \le (2h + 1) t_1.\) \(\square\)
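As a remark on the computation in the proof: the two equalities in the last display both follow from the elementary antiderivative

$$\begin{aligned} \int _{0}^{y} \left( 1-\frac{x}{c} \right) ^{m} dx = \frac{c}{m+1} \left( 1- \left( 1-\frac{y}{c} \right) ^{m+1} \right) , \end{aligned}$$

applied first with \(c = h t_1\) and \(m = \frac{n}{2}-2\), and then once more to integrate the result over \(y \in [0,t_1]\). The final inequality drops the non-positive term \(\left( 1-\frac{1}{h}\right) ^{\frac{n}{2}}-1\), which is valid for \(h \ge 1\).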

Since \(SC_{\mathrm {OPT}}({\mathbf{t}}) = t_1\), we obtain the following corollary.

Corollary 1

Conditioned on event \(\mathrm {A}\), we have

$$\begin{aligned} {\mathbb{E}}_{\mathbf{t}\sim \mathrm{D}} \left[ \frac{SC_{\mathrm{M}}({\mathbf{t}})}{SC_{\mathrm{OPT}}({\mathbf{t}})} \,\Big |\, \mathrm {A} \right] \le 2h + 1 . \end{aligned}$$

Obviously, Lemma 1 and Corollary 1 hold regardless of the distribution.

Second, we show that there exists a constant h such that event \(\mathrm {A}\) occurs with large probability. Intuitively, the larger h is, the higher the probability that event \(\mathrm {A}\) occurs. We will need the following lemma to find such an h.

Lemma 2

For any \(n>1\), we have \(\binom{n}{n/2} \le \frac{e}{\pi \sqrt{n}} \cdot 2^n\), where e is the base of the natural logarithm.

Proof

According to the estimate in [24],

$$\begin{aligned} n!=\sqrt{2\pi } n^{n+\frac{1}{2}} e^{-n+r(n)} , \end{aligned}$$

where \(\frac{1}{12n+1}<r(n)<\frac{1}{12n}\). Here we only need a looser bound to prove our lemma, i.e.,

$$\begin{aligned} \sqrt{2\pi } n^{n+\frac{1}{2}} e^{-n}< n!< \sqrt{2\pi } n^{n+\frac{1}{2}} e^{-n+\frac{1}{12n}} < n^{n+\frac{1}{2}} e^{-n+1} . \end{aligned}$$

We have

$$\begin{aligned} \binom{n}{n/2} = \frac{n!}{(\frac{n}{2})!\,(\frac{n}{2})!} \le \frac{e\sqrt{n}\left(\frac{n}{e}\right)^n}{\left( \sqrt{2\pi \frac{n}{2}} \left( \frac{n}{2e}\right) ^{\frac{n}{2}}\right) ^2} = \frac{e}{\pi \sqrt{n}} \cdot 2^n \end{aligned}$$

\(\square\)
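A quick numerical check of this bound (our own sanity test, not part of the proof):

```python
import math

# Sanity check of Lemma 2: C(n, n/2) <= (e / (pi * sqrt(n))) * 2**n,
# checked here for even n, which is how the bound is used.
for n in range(2, 201, 2):
    lhs = math.comb(n, n // 2)
    rhs = math.e / (math.pi * math.sqrt(n)) * 2 ** n
    assert lhs <= rhs, n
print("Lemma 2 bound holds for all even n up to 200")
```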

Next we show that, with a properly chosen h, event \(\mathrm {A}\) occurs with large probability.

Lemma 3

For any \(n>1\), there exists a constant h such that \(F(h t_{\min }) \ge \frac{11}{12}\), and we have

$$\begin{aligned} \Pr [\mathrm {A}] \ge 1- \frac{e}{2\pi } \cdot \frac{1}{n}, \end{aligned}$$

where F is the cumulative distribution function of the \(t_i\)’s.

Proof

Since \(\mathrm {A}=\{ t_{\frac{n}{2}} \le h \cdot t_{\min }\}\), the probability that event \(\mathrm {A}\) occurs can be calculated by

$$\begin{aligned} \Pr [\mathrm {A}]&= \Pr [t_{\frac{n}{2}} \le h \cdot t_{\min }] \nonumber \\&= \sum _{k=\frac{n}{2}}^{n} \binom{n}{k} \left( F(h t_{\min })\right) ^k \left( 1-F(h t_{\min }) \right) ^{n-k} \nonumber \\&=1 - \sum _{k=0}^{\frac{n}{2}-1} \binom{n}{k} \left( F(h t_{\min })\right) ^k \left( 1-F(h t_{\min }) \right) ^{n-k} \end{aligned}$$
(1)

Since \(\left( 1-F(h t_{\min }) \right) ^{n-k} \le \left( 1-F(h t_{\min }) \right) ^{n/2}\), \(\binom{n}{k} \le \binom{n}{n/2}\) for \(k=0, \ldots , \frac{n}{2}-1\), and \(\left( F(h t_{\min })\right) ^k < 1\), we get

$$\begin{aligned} (1)&\ge 1 - \sum _{k=0}^{\frac{n}{2}-1} \binom{n}{n/2} \left( 1-F(h t_{\min }) \right) ^{\frac{n}{2}} \nonumber \\&= 1 - \frac{n}{2} \binom{n}{n/2} \left( 1-F(h t_{\min }) \right) ^{\frac{n}{2}} \nonumber \\&\ge 1 - \frac{n}{2} \cdot \frac{e}{\pi \sqrt{n}} \cdot 2^n \cdot \left( 1-F(h t_{\min }) \right) ^{\frac{n}{2}} \end{aligned}$$
(2)

By choosing h such that \(F(h t_{\min }) \ge \frac{11}{12}\), we get

$$\begin{aligned} (2)&\ge 1 - \frac{n}{2} \cdot \frac{e}{\pi \sqrt{n}} \cdot 2^n \cdot \left( \frac{1}{12} \right) ^{\frac{n}{2}} \\&= 1 - \frac{e}{2\pi } \cdot \sqrt{n} \cdot 3^{-\frac{n}{2}} \\&\ge 1 - \frac{e}{2\pi } \cdot \frac{1}{n} \end{aligned}$$

The last inequality is due to \(3^n \ge n^3, \forall n>1\).

Therefore, \(\Pr [\mathrm {A}] \ge 1- \frac{e}{2\pi } \cdot \frac{1}{n}\). \(\square\)
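The binomial-tail estimate can likewise be checked numerically (a sketch; we take \(F(h t_{\min }) = \frac{11}{12}\), the smallest value permitted by the choice of h):

```python
import math

# Sanity check of Lemma 3 with F(h * t_min) = 11/12:
# Pr[A] = sum_{k=n/2}^{n} C(n,k) F^k (1-F)^(n-k) >= 1 - (e/(2*pi)) / n.
F = 11.0 / 12.0
for n in range(2, 201, 2):
    pr_a = sum(math.comb(n, k) * F ** k * (1 - F) ** (n - k)
               for k in range(n // 2, n + 1))
    assert pr_a >= 1 - math.e / (2 * math.pi) / n, n
print("Lemma 3 bound holds for all even n up to 200")
```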

We can then bound the probability that the expected cost of the mechanism \(\mathrm {M}\) exceeds \((2h+1)t_1\).

Lemma 4

For the choice of h in Lemma 3, we have

$$\begin{aligned} \Pr \left[ SC_{\mathrm {M}}({\mathbf{t}}) > (2h+1)t_1 \right] \le \frac{e}{2\pi } \cdot \frac{1}{n}. \end{aligned}$$

Proof

According to Lemma 1, event \(\mathrm {A}\) implies \(SC_{\mathrm {M}}(\mathbf{t}) \le (2h+1) t_1\), so we have

$$\begin{aligned} \Pr [\mathrm {A}] \le \Pr [SC_{\mathrm {M}}(\mathbf{t}) \le (2h+1) t_1] \end{aligned}$$

Hence,

$$\begin{aligned} 1 - \Pr [\mathrm {A}] \ge \Pr [SC_{\mathrm {M}}(\mathbf{t}) > (2h+1) t_1] \end{aligned}$$

According to Lemma 3, we have

$$\begin{aligned} 1 - \Pr [\mathrm {A}] \le \frac{e}{2\pi } \cdot \frac{1}{n} \end{aligned}$$

So,

$$\begin{aligned} \Pr \left[ SC_{\mathrm {M}}(\mathbf{t}) > (2h+1) t_1 \right] \le \frac{e}{2\pi } \cdot \frac{1}{n} \end{aligned}$$

\(\square\)

We have now established the necessary building blocks. By carefully choosing the parameter h, we can partition the input space into two events: \(\{SC_{\mathrm {M}}(\mathbf{t})\le (2h+1)t_1\}\) and \(\{SC_{\mathrm {M}}(\mathbf{t}) > (2h+1)t_1\}\). Finally, we use Corollary 1 and Lemma 4 to prove our main result: Corollary 1 upper bounds the expected approximation ratio in the first event, and Lemma 4 upper bounds the probability of the second event, in which the ratio is always at most the worst-case bound of \(\frac{n+1}{2}\) from Koutsoupias [11]. Adding the two contributions yields our upper bound.

Theorem 1

For any distribution on \([t_{\min }, +\infty )\) and a constant h such that \(F(h t_{\min }) \ge \frac{11}{12}\), the average-case approximation ratio of the mechanism \(\mathrm {M}\) is upper bounded by \(2 h + 1.33\). That is,

$$\begin{aligned} r_{\text {average}} = \mathbb {E}_{\mathbf{t}\sim \mathrm {D}} \left[ \frac{SC_{\mathrm {M}}(\mathbf{t})}{SC_{\mathrm {OPT}}(\mathbf{t})} \right] < 2 h + 1.33 \end{aligned}$$

Proof

The two events defined above are mutually exclusive and collectively exhaustive, so we can condition on them; bounding the ratio by \(2h+1\) in the first event and by the worst-case bound \(\frac{n+1}{2}\) in the second, and applying Lemma 4, we have

$$\begin{aligned} r_{\text {average}}&= \mathbb {E}_{\mathbf{t}\sim \mathrm {D}} \left[ \frac{SC_{\mathrm {M}}(\mathbf{t})}{SC_{\mathrm {OPT}}(\mathbf{t})} \right] \\&= \Pr \left[ SC_{\mathrm {M}}(\mathbf{t})\le (2h+1)t_1 \right] \cdot \mathbb {E} \left[ \frac{SC_{\mathrm {M}}(\mathbf{t})}{SC_{\mathrm {OPT}}(\mathbf{t})} \,\Big |\, SC_{\mathrm {M}}(\mathbf{t})\le (2h+1)t_1 \right] \\&\quad +\,\Pr \left[ SC_{\mathrm {M}}(\mathbf{t})> (2h+1)t_1 \right] \cdot \mathbb {E} \left[ \frac{SC_{\mathrm {M}}(\mathbf{t})}{SC_{\mathrm {OPT}}(\mathbf{t})} \,\Big |\, SC_{\mathrm {M}}(\mathbf{t})> (2h+1)t_1 \right] \\&\le \Pr \left[ SC_{\mathrm {M}}(\mathbf{t})\le (2h+1)t_1 \right] \cdot (2h+1) +\,\Pr \left[ SC_{\mathrm {M}}(\mathbf{t}) > (2h+1)t_1 \right] \cdot \frac{n+1}{2} \\&\le 1 \cdot (2h+1) + \frac{e}{2\pi } \cdot \frac{1}{n} \cdot \frac{n+1}{2} \\&= 2h+1 + \frac{e}{4\pi } \cdot \frac{n+1}{n} \\&\le 2h+1 + \frac{3e}{8\pi } \\&< 2h+1.33 \end{aligned}$$

Therefore, the average-case approximation ratio of the mechanism \(\mathrm {M}\) is upper bounded by \(2 h + 1.33\). \(\square\)

In hindsight, even when the costs \(t_i\) of the machines follow a heavy-tailed distribution, the mechanism \(\mathrm {M}\) has a constant average-case approximation ratio. This was not intuitively foreseeable, as the social cost of the mechanism \(\mathrm {M}\) depends on how often the inputs contain large \(t_i\) and on how large they are.

In the following, we give a few example distributions to illustrate the choice of h and the resulting constant upper bounds.

Example 1: Pareto Distribution

The Pareto distribution is a power law distribution that is widely used in the description of social, scientific, geophysical, actuarial, and many other types of observable phenomena. According to the influential studies by Arlitt and Williamson [2] and Reed and Jorgensen [23] as well as the references therein, the distributions of web server workload and of Internet traffic which uses the TCP protocol match well with the Pareto distribution.

That is, for a random variable T drawn from the Pareto distribution, the probability that T is smaller than a value t is given by

$$\begin{aligned} F(t)=\Pr (T<t)= {\left\{ \begin{array}{ll} 1 - \left( \frac{t_{\min }}{t} \right) ^{\alpha } &{} t \ge t_{\min } \\ 0 &{} t < t_{\min } \end{array}\right. } \end{aligned}$$

where \(\alpha >0\) is the tail index of the distribution.

Note that in the proof of Theorem 1, the only place where the particular distribution matters is Lemma 3. So, by choosing the constant h appropriately for the Pareto distribution, we obtain the following result.
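Concretely, solving \(F(h t_{\min }) \ge \frac{11}{12}\) under the Pareto cumulative distribution function yields the required constant:

$$\begin{aligned} 1 - \left( \frac{t_{\min }}{h t_{\min }} \right) ^{\alpha } \ge \frac{11}{12} \iff h^{\alpha } \ge 12 \iff h \ge 12^{\frac{1}{\alpha }}. \end{aligned}$$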

Theorem 2

For the Pareto distribution, let \(h=12^{\frac{1}{\alpha }}\). The average-case approximation ratio of the mechanism \(\mathrm {M}\) is upper bounded by \(2 \cdot 12^{\frac{1}{\alpha }} + 1.33\).

Proof

For the Pareto distribution, let \(h=12^{\frac{1}{\alpha }}\). In Lemma 3, we would have

$$\begin{aligned} \Pr [\mathrm {A}]&= \Pr [t_{\frac{n}{2}} \le h \cdot t_{\min }] \\&= \sum _{k=\frac{n}{2}}^{n} \binom{n}{k} \left( F(h t_{\min })\right) ^k \left( 1-F(h t_{\min }) \right) ^{n-k} \\&= \sum _{k=\frac{n}{2}}^{n} \binom{n}{k} \left( 1-\frac{1}{h^{\alpha }} \right) ^{k} \left( \frac{1}{h^{\alpha }} \right) ^{n-k} \\&=1 - \sum _{k=0}^{\frac{n}{2}-1} \binom{n}{k} \left( 1-\frac{1}{h^{\alpha }} \right) ^{k} \left( \frac{1}{h^{\alpha }} \right) ^{n-k} \\&\ge 1 - \sum _{k=0}^{\frac{n}{2}-1} \binom{n}{n/2} \left( \frac{1}{h^{\alpha }} \right) ^{\frac{n}{2}} \\&= 1 - \frac{n}{2} \binom{n}{n/2} \left( \frac{1}{h^{\alpha }} \right) ^{\frac{n}{2}} \\&\ge 1 - \frac{n}{2} \cdot \frac{e}{\pi \sqrt{n}} \cdot 2^n \left( \frac{1}{h^{\alpha }} \right) ^{\frac{n}{2}} \\&= 1 - \frac{n}{2} \cdot \frac{e}{\pi \sqrt{n}} \cdot 2^n \left( \frac{1}{12} \right) ^{\frac{n}{2}} \\&= 1 - \frac{e}{2\pi } \cdot \sqrt{n} \cdot 3^{-\frac{n}{2}} \\&\ge 1 - \frac{e}{2\pi } \cdot \frac{1}{n} \end{aligned}$$

The rest of the proof follows the proof of Theorem 1. \(\square\)
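As a usage example, the bound of Theorem 2 can be probed empirically by drawing Pareto inputs via inverse-transform sampling; the sketch below reuses average_ratio from the earlier sketch (the choice \(\alpha = 2\) is illustrative):

```python
import random

# Pareto(t_min, alpha) sampling by inverse transform: t = t_min * U**(-1/alpha)
# with U uniform on (0, 1]. Reuses average_ratio from the earlier sketch.
t_min, alpha = 1.0, 2.0
pareto = lambda: t_min * (1.0 - random.random()) ** (-1.0 / alpha)

est = average_ratio(n=10, sample_cost=pareto)
print(est, "vs. Theorem 2 bound:", 2 * 12 ** (1 / alpha) + 1.33)  # bound ~ 8.26
```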

Example 2: Exponential Distribution

We then consider the case where the machines’ costs \(t_{i}\) are independent random variables following a truncated Exponential distribution \(\mathrm {D}[t_{\min },\infty )\). That is, for a random variable T drawn from this Exponential distribution, the probability that T is smaller than a value t is given by

$$\begin{aligned} F(t)=\Pr (T<t)= {\left\{ \begin{array}{ll} 1 - e^{-\lambda t} &{} t \ge t_{\min } \\ 0 &{} t < t_{\min } \end{array}\right. } \end{aligned}$$

where \(\lambda >0\) is the rate parameter of the distribution.

Theorem 3

For the Exponential distribution, let \(h=\frac{1}{\lambda t_{\min }} \ln 12\). The average-case approximation ratio of the mechanism \(\mathrm {M}\) is upper bounded by \(2 \cdot \frac{1}{\lambda t_{\min }} \ln 12 + 1.33\).

The proof is similar to that for the Pareto distribution.
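For this distribution, the stated h again comes from solving \(F(h t_{\min }) \ge \frac{11}{12}\):

$$\begin{aligned} 1 - e^{-\lambda h t_{\min }} \ge \frac{11}{12} \iff e^{\lambda h t_{\min }} \ge 12 \iff h \ge \frac{1}{\lambda t_{\min }} \ln 12. \end{aligned}$$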

Example 3: Log-logistic Distribution

The log-logistic distribution is the probability distribution of a random variable whose logarithm has a logistic distribution. It is similar in shape to the log-normal distribution but has heavier tails. It is used in networking to model the transmission times of data. The cumulative distribution function is

$$\begin{aligned} F(t; \alpha , \beta ) = \frac{t^{\beta }}{\alpha ^{\beta } + t^{\beta }} \end{aligned}$$

where \(\alpha >0\) is a scale parameter and \(\beta >0\) is a shape parameter. For simplicity, we take \(\alpha =1\) and obtain the following result.

Theorem 4

For the Log-logistic distribution, let \(h=\frac{1}{t_{\min }} \cdot e^{\frac{\ln 11}{\beta }}\). The average-case approximation ratio of the mechanism \(\mathrm {M}\) is upper bounded by \(2 \cdot \frac{1}{t_{\min }} \cdot e^{\frac{\ln 11}{\beta }} + 1.33\).
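As before, the choice of h follows from solving \(F(h t_{\min }) \ge \frac{11}{12}\), here with \(\alpha = 1\):

$$\begin{aligned} \frac{(h t_{\min })^{\beta }}{1 + (h t_{\min })^{\beta }} \ge \frac{11}{12} \iff (h t_{\min })^{\beta } \ge 11 \iff h \ge \frac{1}{t_{\min }} \cdot e^{\frac{\ln 11}{\beta }}. \end{aligned}$$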

4 Conclusion and Future Work

In this paper, we extended the worst-case approximation ratio analysis of the scheduling problem studied in [11] to an average-case approximation ratio analysis. We showed that, when the costs of the machines are independent and identically distributed, the average-case approximation ratio of the optimal mechanism \(\mathrm {M}\) has a constant bound, which is asymptotically better than its worst-case approximation ratio. While in the worst case the expected cost of the mechanism is \(\varTheta (n)\) times the optimal cost, our results offer some relief for deploying the mechanism \(\mathrm {M}\) in practice.

Many problems remain open. First, just as worst-case analysis serves as a framework for comparing truthful mechanisms, average-case analysis can serve the same purpose. Although the mechanism \(\mathrm {M}\) in [11] is optimal for the problem in terms of the worst-case ratio, other mechanisms may perform better than \(\mathrm {M}\) in the average case. Note that this comparison may need to be done on a per-distribution basis.

Second, it would be interesting to show lower bounds on the average-case ratio of any truthful mechanism; one should expect considerably more involved arguments than for their worst-case counterparts, and, again, different distributions would very likely need to be handled differently.

One might also ask about a smoothed analysis of the mechanism \(\mathrm {M}\). We note, though, that unlike the Random Priority mechanism studied in [7], which has a constant smoothed approximation ratio, the smoothed ratio of the mechanism \(\mathrm {M}\) would not be asymptotically different from its worst-case ratio. To see this, recall that the tight-bound example in the worst-case analysis of the mechanism \(\mathrm {M}\) is obtained when the ratio of the minimum processing time to the maximum processing time approaches 0, i.e., \(t_1/t_n \rightarrow 0\); any small perturbation around such inputs does not change this fact.

Many more approximate mechanism design problems deserve average-case analysis, both to understand the nature of the problems and to assess the performance of mechanisms designed to optimize their worst-case performance.