Minimizing the mean slowdown in the M/G/1 queue

We consider the optimal scheduling problem in the M/G/1 queue. While this is a thoroughly studied problem when the target is to minimize the mean delay, there are still open questions related to some other objective functions. In this paper, we focus on minimizing mean slowdown among non-anticipating policies, which may utilize the attained service of the jobs but not their remaining service time when making scheduling decisions. By applying the Gittins index approach, we give necessary and sufficient conditions for the jobs’ service time distribution under which the well-known scheduling policies first come first served and foreground background are optimal with respect to the mean slowdown. Furthermore, we characterize the optimal non-anticipating policy in the multi-class case for certain types of service time distributions. In fact, our results cover a more general objective function than just the mean slowdown, since we allow the holding costs for a job to depend on its own service time S via a generic function c(S). When minimizing the mean slowdown, this function reads as c(x) = 1/x.


Introduction
according to an independent Poisson process. Thus, the aggregate arrival process is a Poisson process, too. The total arrival rate of jobs is denoted by λ. We assume that the total load ρ = λE[S] < 1 in order to have a stable M/G/1 queue. The delay (a.k.a. sojourn time or response time) of a job is denoted by $T^\pi$, where π refers to the scheduling policy applied. We assume that the applied scheduling policy allows preemptions.
The optimal scheduling policy depends naturally on the objective function but also on the information available to the scheduler. Policy π is said to be non-anticipating if it is aware of the arrival time and the attained service of each job in the system, while an anticipating scheduler knows even the remaining service times. If the aim is to minimize the mean delay $E[T^\pi]$, then the optimal anticipating scheduling policy is SRPT (Shortest Remaining Processing Time) [18,23]. In the special case where the service times are deterministic, SRPT coincides with the ordinary FCFS (first come first served) discipline or any other non-preemptive and work-conserving scheduling policy.
The optimal non-anticipating policy minimizing the mean delay, however, depends essentially on the service time distribution. For example, FCFS is optimal when the service time distribution belongs to the family of NBUE (New Better Used in Expectation) distributions. On the other hand, for more variable DHR (Decreasing Hazard Rate) service times the optimal non-anticipating policy is FB (foreground background), which is another well-known non-anticipating scheduling policy [16].
All these results can be justified by utilizing the concept of Gittins index. It is known that the optimal non-anticipating policy minimizing the mean delay is the Gittins index policy [2,3,8,19,21], which always chooses the job with the highest index $G_{\mathrm{del}}(a)$ defined by

$$G_{\mathrm{del}}(a) = \sup_{\Delta > 0} \frac{P\{S \le a + \Delta \mid S > a\}}{E[\min\{S, a + \Delta\} - a \mid S > a]}, \tag{1}$$

where S denotes the (original) service time and a the (currently) attained service of the job.

While the optimal scheduling problem in the M/G/1 queue is a thoroughly studied problem when the target is to minimize the mean delay, there are still open questions related to other objective functions. In this paper, we focus on minimizing mean slowdown $E[T^\pi/S]$, i.e., the expectation of the ratio between the delay and the service time of a job. Among the anticipating policies, the optimal policy is known to be SPTP (Shortest Processing Time Product) [12,27,28]. But the optimal non-anticipating scheduler with respect to the mean slowdown has long been an open problem [1,5,6,11]. It is only known that FB is the optimal non-anticipating policy if the service time distribution of all jobs is such that the ratio h(x)/x is decreasing [7], where h(x) denotes the hazard rate.
In this paper, we prove that the condition given above is not only sufficient but also necessary for the optimality of FB (see Theorem 2 and Corollary 4 in Sect. 3). We also give necessary and sufficient conditions under which FCFS minimizes the mean slowdown among the non-anticipating policies (see Theorem 1 and Corollary 3 in Sect. 3). Furthermore, we characterize the optimal non-anticipating policy in the multi-class case for certain types of service time distributions (see Theorems 3 and 4 together with Corollaries 5 and 6 in Sect. 4).
Our approach is based on the Gittins index. In fact, we consider an even more general objective function than just the mean slowdown. More precisely, we assume that the holding costs for a job with service time S accrue at rate c(S) > 0. If the aim is to minimize the mean slowdown, then the holding cost rate function c(x) is given by c(x) = 1/x. On the other hand, the choice c(x) = 1 corresponds to minimizing mean delay. It was recently shown in [19,21] that the Gittins index approach is applicable even for this kind of general setting of holding costs: The optimal non-anticipating policy is the index policy that always chooses the job with the highest index $G_c(a)$ defined by

$$G_c(a) = \sup_{\Delta > 0} \frac{E[c(S)\, 1_{\{S \le a + \Delta\}} \mid S > a]}{E[\min\{S, a + \Delta\} - a \mid S > a]}, \tag{2}$$

where $1_A$ refers to the indicator function of event A. Starting from this formula, we are able to prove the results mentioned above.

The rest of the paper is organized as follows. In Sect. 2, we consider a single job and derive certain important properties of the Gittins index function defined in (2). These properties are then utilized in Sects. 3 and 4, where we characterize the optimal scheduling policy with respect to the general holding costs in the single-class and the multi-class cases, respectively. The main results related to minimizing mean slowdown are then illustrated by numerical examples in Sect. 5.

Properties of the Gittins index
In this section, we consider a single job with service time S. The aim is to derive such properties of the Gittins index (2) that enable us (later on in Sect. 3) to characterize for which type of service time distributions FCFS or FB are optimal with respect to the general holding costs. Thus, we assume here that the holding costs for the job accrue at rate c(S), which depends on its own service time S. We assume that the cost rate function c(x) is right-continuous with left limits.
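The service time S is assumed to have a general distribution with the cumulative distribution function denoted by

$$F(x) = P\{S \le x\}.$$

Let $\bar F(x) = 1 - F(x)$ denote the corresponding tail distribution function, and assume that $\bar F(x) > 0$ for all x ≥ 0. In addition, we assume that the service time distribution has a right-continuous density function f(x) with left limits. Let h(x) denote the corresponding hazard rate function,

$$h(x) = \frac{f(x)}{\bar F(x)}.$$

In addition, we introduce the following auxiliary functions that depend both on the service time distribution and the cost rate function c(x):

$$h_c(x) = c(x)\, h(x), \qquad H_c(x) = \frac{\int_x^\infty c(y) f(y)\, dy}{\int_x^\infty \bar F(y)\, dy}. \tag{4}$$

Since [...] Let us now rewrite the Gittins index (2) for this job as follows:

$$G_c(a) = \sup_{b > a} J_c(a, b), \tag{5}$$

where a denotes the attained service of the job and $J_c(a, b)$ refers to the following efficiency function:

$$J_c(a, b) = \frac{\int_a^b c(y) f(y)\, dy}{\int_a^b \bar F(y)\, dy}, \quad b > a, \tag{6}$$

continuously continued by defining

$$J_c(a, a) = h_c(a), \qquad J_c(a, \infty) = H_c(a). \tag{7}$$

Define finally

$$b^*(a) = \sup\{b \ge a : G_c(a) = J_c(a, b)\} \tag{8}$$

to be the (largest) maximizer b of $J_c(a, b)$.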
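To make the efficiency-function form of the index concrete, the following minimal sketch approximates the supremum of $J_c(a, b)$ over a uniform grid of values b > a, using the trapezoid rule for both integrals. The function name `gittins_index`, the truncation point `b_max` (standing in for b = ∞) and the grid size `n` are assumptions made for this illustration only; they are not part of the analysis in this paper.

```python
import math

def gittins_index(a, cf, Fbar, b_max=40.0, n=4000):
    """Approximate sup_{b > a} J_c(a, b) on a uniform grid (illustrative sketch).

    cf(y)   : the product c(y) * f(y) of cost rate and density
    Fbar(y) : the tail distribution function P{S > y}
    b_max   : assumed truncation point used in place of b = infinity
    Returns (approximate index value, largest maximizing grid point b).
    """
    step = (b_max - a) / n
    num = 0.0  # trapezoid approximation of the integral of c*f over [a, b]
    den = 0.0  # trapezoid approximation of the integral of Fbar over [a, b]
    best_val, best_b = -math.inf, a
    for i in range(1, n + 1):
        y0, y1 = a + (i - 1) * step, a + i * step
        num += 0.5 * step * (cf(y0) + cf(y1))
        den += 0.5 * step * (Fbar(y0) + Fbar(y1))
        if den > 0.0:
            val = num / den
            if val >= best_val:  # ">=" keeps the largest maximizer on the grid
                best_val, best_b = val, y1
    return best_val, best_b

# Example: c(x) = 1/x with a Weibull tail exp(-(mu*y)^2), so that
# c(y) f(y) = 2 mu^2 exp(-(mu*y)^2) and J_c(a, b) is constant in b.
mu = math.sqrt(math.pi) / 2
value, _ = gittins_index(
    0.0,
    lambda y: 2 * mu**2 * math.exp(-(mu * y) ** 2),
    lambda y: math.exp(-(mu * y) ** 2),
)
print(value, 2 * mu**2)
```

Passing the product c(y) f(y) as a single function avoids evaluating c(y) = 1/y at y = 0. A grid search of this kind only gives a lower approximation of the supremum, so it is meant as a numerical aid rather than a replacement for the analytical results below.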

Properties related to the H c -function
In this section, we study connections between the function $H_c(x)$ defined in (4) and the corresponding Gittins index $G_c(x)$. In particular, we derive sufficient and necessary conditions under which $G_c(x) = H_c(x)$ for some x.
Proof By (4), we have By rearranging the terms in last inequality, we get from which we get, by (6), This completes the proof.

Proposition 1
Let a ≥ 0. The following three statements are equivalent: Proof Note first that, by (5) and (7), the equivalence between (i) and (iii) follows immediately from Lemma 1. Below we prove the equivalence between (ii) and (iii) in two parts. (5) and (7). Thus, Let then x ∈ (m * , ∞]. Now, by defining p ∈ (0, 1) so that it follows from (10) that which implies that, for any x ∈ (m * , ∞], By continuity, we also have From (11) and (12) it follows that which contradicts our assumption that By Proposition 1, we get the following immediate consequence describing further the connections between the service time distribution and the corresponding Gittins index.

Corollary 1
Let a ≥ 0. The following three statements are equivalent: Note that, in this paper, we use the terms 'increasing' and 'decreasing' in their weak forms. So, the functions H c (x) and G c (x) in the claims above are not required to be strictly increasing.

Properties related to the h c -function
In this section, we study connections between the function $h_c(x)$ defined in (4) and the corresponding Gittins index $G_c(x)$. In particular, we derive sufficient and necessary conditions under which $G_c(x) = h_c(x)$ for some x.
Since, by (6), the claim follows clearly from the previous inequality.

Proposition 2
Let a ≥ 0. The following two statements are equivalent: However, this is equivalent to the claim that Here we need to study two separate cases (2.1 • and 2.2 • below). 2

However, this contradicts our assumption that
Clearly, p(x) ∈ (0, 1), and we have However, this contradicts our assumption that (9) and (8), we have In addition, let x ∈ (a, b * (a)), and define Clearly, p(x) ∈ (0, 1). Note also that we may write which implies that By Propositions 2 and 3, we get the following immediate consequence further describing the connections between the Gittins index and the service time distribution.
Corollary 2 Let a ≥ 0. The following three statements are equivalent:

Characterization of the optimal policy in the single-class case
In this section, we consider the single-class case, i.e., all jobs have the same service time distribution function F(x). We reveal the properties the service time distribution should have in order for FCFS or FB to be the optimal non-anticipating policy with respect to the generalized holding costs introduced in Sect. 1. As before, let c(x) denote the corresponding holding cost rate function, which is now common to all the jobs. Recall also our assumptions made in Sect. 2 that the density function f (x) of the service time distribution and the cost rate function c(x) are right-continuous with left limits.
Definition 1 Let I denote the set of current jobs in the system. A scheduling policy belongs to the MAS class if it chooses the job with the most attained service,

$$\arg\max_{i \in I} a_i,$$

where $a_i$ denotes the attained service of job i.

Note that MAS is a whole family of scheduling policies consisting of all non-anticipating policies that are work-conserving and non-preemptive. In particular, FCFS is such a policy. Other well-known examples are non-preemptive LCFS (Last Come First Served) and SIRO (Service In Random Order).

Definition 2
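Let I denote the set of current jobs in the system. The FB policy chooses the job with the least attained service,

$$\arg\min_{i \in I} a_i,$$

where $a_i$ denotes the attained service of job i.

Theorem 1 Assume the single-class case. Any MAS policy minimizes the generalized holding costs among the non-anticipating policies if and only if

$$H_c(a) \ge H_c(0) \quad \text{for all } a \ge 0,$$

where $H_c(a)$ is defined in (4).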

Proof
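The result follows immediately from Proposition 1 with choice a = 0 and the optimality of the Gittins index policy [19,21].

Theorem 2 Assume the single-class case. The FB policy minimizes the generalized holding costs among the non-anticipating policies if and only if $h_c(a)$ is a decreasing function of a, where $h_c(a)$ is defined in (4).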

Proof
The result follows immediately from Corollary 2 with choice a = 0 and the optimality of the Gittins index policy [19,21].
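The selection rules of Definitions 1 and 2 can be sketched in a few lines. The representation of a job as a pair (arrival time, attained service) and the function names `pick_mas` and `pick_fb` are assumptions made for this illustration. Among jobs with equal attained service, the MAS sketch breaks ties by earliest arrival, which yields FCFS; other tie-breaking rules give other members of the MAS class.

```python
def pick_mas(jobs):
    """Return the id of the job with the most attained service.

    jobs: dict mapping job id -> (arrival_time, attained_service).
    Ties are broken in favor of the earliest arrival (the FCFS member of MAS).
    """
    return max(jobs, key=lambda i: (jobs[i][1], -jobs[i][0]))

def pick_fb(jobs):
    """Return the id of a job with the least attained service.

    If several jobs share the least attained service, FB serves them
    at equal rates; this sketch simply returns one of them.
    """
    return min(jobs, key=lambda i: jobs[i][1])

# Hypothetical system state: job "a" is in service, "b" and "c" are waiting.
jobs = {"a": (0.0, 0.7), "b": (1.0, 0.0), "c": (2.0, 0.0)}
print(pick_mas(jobs))  # "a": the job already in service is never preempted
print(pick_fb(jobs))   # "b": one of the jobs with no attained service
```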

Minimizing mean slowdown in the single-class case
We now spell out the implications of Theorems 1 and 2 for the specific case of minimizing mean slowdown $E[T^\pi/S]$, which corresponds to holding cost rate function c(x) = 1/x. The corresponding $h_c$-function is

$$h_c(x) = \frac{h(x)}{x},$$

which is called the scaled hazard rate in the sequel. In addition, we denote the corresponding $H_c$-function by

$$H_{\mathrm{sld}}(x) = \frac{\int_x^\infty \frac{1}{y} f(y)\, dy}{\int_x^\infty \bar F(y)\, dy} \tag{17}$$

and the corresponding Gittins index $G_c(x)$ by $G_{\mathrm{sld}}(x)$. By Theorems 1 and 2, we have the following characterizations for the optimality of the MAS and FB policies.
This result was already anticipated in [1], where it is presented as a conjecture without any proof. It is easy to show that the family of service time distributions satisfying (18) is a (proper) subset of the NBUE distributions, which were mentioned in Sect. 1. For example, any Weibull distribution with shape parameter k ∈ [1, 2) belongs to NBUE but does not satisfy condition (18).

Feng and Misra [7] already proved that FB minimizes the mean slowdown among the non-anticipating policies if the scaled hazard rate h(x)/x of the service time distribution is a decreasing function of x. Such distributions clearly include all DHR distributions, for which h(x) is required to be decreasing. Here we complete the result by proving that this condition is not only sufficient but also necessary for the optimality of FB.

Characterization of the optimal policy in the multi-class case
In this section, we assume that there are multiple job classes and the scheduler is aware of the class of each job. Let $F_j(x)$ denote the service time distribution function and $c_j(x)$ the holding cost rate function of class j. As before, we assume that the density function $f_j(x)$ of the service time distribution and the cost rate function $c_j(x)$ are right-continuous with left limits for all classes j. Our aim is to characterize, for certain types of service time distributions, the optimal non-anticipating policy with respect to the generalized holding costs introduced in Sect. 1.
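Fix index i for the moment and consider job i. Let j refer to its class. In line with (4), we define job i's $h_c$-function $h_{c,i}(x)$ and $H_c$-function $H_{c,i}(x)$ as follows:

$$h_{c,i}(x) = c_j(x)\, h_j(x), \qquad H_{c,i}(x) = \frac{\int_x^\infty c_j(y) f_j(y)\, dy}{\int_x^\infty \bar F_j(y)\, dy}, \tag{19}$$

where $h_j(x)$ refers to the hazard rate function and $\bar F_j(x) = 1 - F_j(x)$ to the tail distribution function of service time distribution function $F_j(x)$.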

Definition 3
Let I denote the set of current jobs in the system. The MAX-H policy chooses the job that maximizes the current value of the $H_c$-function,

$$\arg\max_{i \in I} H_{c,i}(a_i),$$

where $a_i$ is the attained service of job i and $H_{c,i}(\cdot)$ refers to its $H_c$-function as defined in (19).

Definition 4
Let I denote the set of current jobs in the system. The MAX-h policy chooses the job that maximizes the current value of the $h_c$-function,

$$\arg\max_{i \in I} h_{c,i}(a_i),$$

where $a_i$ is the attained service of job i and $h_{c,i}(\cdot)$ refers to its $h_c$-function as defined in (19).
Note that, if the aim is to minimize mean delay, which corresponds to $c_j(x) = 1$ for all job classes j, then MAX-H is the same as the SERPT (Shortest Expected Remaining Processing Time) policy, since function $H_{c,i}(x)$ equals, in this case, the inverse of the mean residual lifetime function,

$$H_{c,i}(x) = \frac{\bar F_j(x)}{\int_x^\infty \bar F_j(y)\, dy} = \frac{1}{E[S_i - x \mid S_i > x]},$$

where $S_i$ denotes the service time of job i. Correspondingly, MAX-h is the same as the HHR (Highest Hazard Rate) policy, since $h_{c,i}(x)$ is the hazard rate function of job i's service time distribution in this special case.
Note also that functions $H_{c,i}(x)$ and $h_{c,i}(x)$ are common to all jobs i belonging to the same class, say j. Therefore, we may, as well, refer to them by $H_{c,j}(x)$ and $h_{c,j}(x)$, respectively, without any danger of confusion.
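Both MAX-H and MAX-h are index policies driven by class-wise functions, so a single selection routine covers Definitions 3 and 4. The sketch below assumes a job is represented by a pair (class, attained service) and that the class-wise index functions are supplied as Python callables; the name `pick_max_index` and this data layout are hypothetical and chosen only for illustration.

```python
def pick_max_index(jobs, index_by_class):
    """Return the id of the job whose class-wise index is largest.

    jobs           : dict mapping job id -> (class_id, attained_service)
    index_by_class : dict mapping class_id -> callable x -> index value,
                     e.g. x -> H_{c,j}(x) for MAX-H or x -> h_{c,j}(x) for MAX-h
    """
    def current_index(job_id):
        class_id, attained = jobs[job_id]
        return index_by_class[class_id](attained)
    return max(jobs, key=current_index)

# Example with c_j(x) = 1 and exponential service times with rates 1 and 3.
# In that case both the hazard rate and the inverse mean residual lifetime
# are constant and equal to the rate, so MAX-H and MAX-h coincide.
index_by_class = {1: lambda x: 1.0, 2: lambda x: 3.0}
jobs = {"u": (1, 0.4), "v": (2, 0.0)}
print(pick_max_index(jobs, index_by_class))  # "v": the class with rate 3
```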

Theorem 3 Assume the multi-class case. The MAX-H policy minimizes the generalized holding costs among the non-anticipating policies if class-wise functions $H_{c,j}(x)$ are increasing for all classes j.
Proof The result follows immediately from Corollary 1 with choice a = 0 and the optimality of the Gittins index policy [19,21].

Theorem 4 Assume the multi-class case. The MAX-h policy minimizes the generalized holding costs among the non-anticipating policies if and only if class-wise functions $h_{c,j}(x)$ are decreasing for all classes j.
Proof The result follows immediately from Corollary 2 with choice a = 0 and the optimality of the Gittins index policy [19,21].
The precondition of Theorem 4 is essentially the multi-class analogue of the precondition of Theorem 2. However, the precondition of Theorem 3 is stricter than the multi-class analogue of the precondition of Theorem 1. Below, we discuss why this difference occurs. See Fig. 1 for an accompanying illustration. For MAX-H to be optimal, in the single-class case, we require only that $H_c(x)$ is minimized at x = 0, whereas in the multi-class case, we require $H_{c,j}(x)$ to be increasing at all x ≥ 0. The issue with MAX-H when we only have $H_{c,j}(x)$ minimized at x = 0 is that while MAX-H correctly prioritizes jobs that have not begun service (because $H_{c,j}(0) = G_{c,j}(0)$ by Proposition 1), it may incorrectly prioritize jobs that have begun service. This is not a problem in the single-class case, as all that matters is that jobs that have begun service have priority over jobs that have not yet begun service. But in the multi-class case, we may need to compare a class j job in service to a class j′ job that has not yet begun service.
However, there are cases where MAX-H is optimal even when the precondition of Theorem 3 is not satisfied. It turns out that if the class-wise Gittins index functions $G_{c,j}(x)$ are minimized at x = 0 but have non-overlapping values, i.e., $G_{c,j}(x) \ge G_{c,j'}(y)$ for all x, y ≥ 0 and all j < j′, then the class-wise $H_{c,j}(x)$ functions are also minimized at x = 0 and have non-overlapping values, thanks to Proposition 1 and the fact that $H_{c,j}(x) \le G_{c,j}(x)$. In this non-overlapping case, illustrated in Fig. 1c, the Gittins index policy reduces to preemptive class-based priority: class j has priority over class j′ for j < j′, with jobs served in FCFS order within each class. But MAX-H reduces to the same policy, so it is also optimal.
We have shown that the precondition of Theorem 3 is sufficient but not necessary for MAX-H to be optimal. Exactly stating the necessary condition, or even just a more lenient sufficient condition, seems to be difficult. For example, one might consider the condition that the $H_{c,j}(x)$ functions are minimized at x = 0 and have non-overlapping values. But this is neither necessary, because we might have $G_{c,j}(x) = H_{c,j}(x)$ whenever there is overlap; nor sufficient, because while non-overlapping Gittins index functions $G_{c,j}(x)$ suffice, this is not implied by non-overlapping $H_{c,j}(x)$ functions.

Minimizing mean slowdown in the multi-class case
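We now spell out the implications of Theorems 3 and 4 for minimizing the mean slowdown, which corresponds to holding cost rate functions $c_j(x) = 1/x$ for all job classes j. We have the following characterizations for the optimality of the MAX-H and MAX-h policies as direct consequences of Theorems 3 and 4, respectively.

Corollary 5 Assume the multi-class case. The MAX-H policy minimizes the mean slowdown among the non-anticipating policies if the class-wise functions $H_{c,j}(x)$ are increasing for all classes j, where $H_{c,j}(x)$ is given (cf. (17)) as follows:

$$H_{c,j}(x) = \frac{\int_x^\infty \frac{1}{y} f_j(y)\, dy}{\int_x^\infty \bar F_j(y)\, dy}.$$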

Corollary 6 Assume the multi-class case. The MAX-h policy minimizes the mean slowdown among the non-anticipating policies if and only if class-wise scaled hazard rate functions $h_j(x)/x$ are decreasing for all classes j.

Numerical examples
In this section, we illustrate numerically the main results related to the minimization of the mean slowdown. Thus, we assume that the holding cost rate is given by c(x) = 1/x for all jobs. In addition, we give examples of the corresponding Gittins index $G_{\mathrm{sld}}(a)$ as a function of attained service a.
For the illustration, we use the Weibull service time distribution, for which $E[S] = \frac{1}{\mu}\Gamma(1 + \frac{1}{k})$ and the tail distribution function is given by

$$\bar F(x) = e^{-(\mu x)^k},$$

where k > 0 is the shape parameter and μ > 0 the scale parameter. With k = 1, we have the exponential distribution as a special case. The scaled hazard rate for a Weibull(k, μ) distribution reads as

$$\frac{h(x)}{x} = k \mu^k x^{k-2},$$

and the corresponding $H_c$-function as

$$H_{\mathrm{sld}}(x) = \frac{\int_x^\infty k \mu^k y^{k-2} e^{-(\mu y)^k}\, dy}{\int_x^\infty e^{-(\mu y)^k}\, dy}.$$

Note that, for k = 2, both the scaled hazard rate h(x)/x and the $H_c$-function $H_{\mathrm{sld}}(x)$ reduce to the same constant value for all x > 0:

$$\frac{h(x)}{x} = H_{\mathrm{sld}}(x) = 2\mu^2.$$

In addition, we note that the scaled hazard rate h(x)/x is decreasing (satisfying the condition of Corollary 4) when k ≤ 2, and the $H_c$-function $H_{\mathrm{sld}}(x)$ is increasing (satisfying the condition of Corollary 3) when k ≥ 2.
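The Weibull expressions above are easy to probe numerically. The sketch below evaluates the scaled hazard rate in closed form and approximates $H_{\mathrm{sld}}(x)$ with a plain midpoint rule on a truncated interval. The function names, the truncation parameter `upper` and the grid size `n` are assumptions of this illustration; for shape parameters below 2 the integrand is singular at the origin, so the sketch is best used with x > 0 in that case.

```python
import math

def scaled_hazard(x, k, mu):
    """Scaled hazard rate h(x)/x = k * mu^k * x^(k-2) of a Weibull(k, mu)."""
    return k * mu**k * x ** (k - 2)

def H_sld(x, k, mu, upper=30.0, n=20000):
    """Midpoint-rule approximation of H_sld(x) for a Weibull(k, mu).

    Both integrals over [x, infinity) are truncated at x + upper/mu,
    an assumed cut-off beyond which the tail is negligible for k >= 1.
    """
    step = (upper / mu) / n
    num = 0.0  # integral of (1/y) f(y) = k mu^k y^(k-2) exp(-(mu y)^k)
    den = 0.0  # integral of the tail exp(-(mu y)^k)
    for i in range(n):
        y = x + (i + 0.5) * step
        tail = math.exp(-(mu * y) ** k)
        num += k * mu**k * y ** (k - 2) * tail
        den += tail
    return num / den  # the common factor `step` cancels

# k = 2: both functions equal the constant 2 * mu^2.
mu = math.gamma(1 + 1 / 2)  # unit-mean scale parameter for k = 2
print(scaled_hazard(0.5, 2, mu), H_sld(1.3, 2, mu), 2 * mu**2)

# k = 4: H_sld is increasing; k = 1.5: H_sld is decreasing.
print([H_sld(x, 4, math.gamma(1.25)) for x in (0.0, 0.5, 1.0)])
print([H_sld(x, 1.5, math.gamma(1 + 1 / 1.5)) for x in (0.1, 0.5, 1.0)])
```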
The behavior of the Weibull distribution with different shape parameter values k is illustrated in Fig. 2.

Example 1 Consider first the single-class case where all jobs have the same Weibull service time distribution with shape parameter k and scale parameter $\mu = \Gamma(1 + \frac{1}{k})$. In Fig. 3, we have drawn the mean slowdown with loads ρ = 0.5 (upper panel) and ρ = 0.8 (lower panel) as a function of inverse shape parameter 1/k for the scheduling policies FCFS and FB based on the following known formulas:

$$E\left[\frac{T^{\mathrm{FCFS}}}{S}\right] = 1 + \frac{\lambda E[S^2]}{2(1 - \rho)}\, E\left[\frac{1}{S}\right], \qquad E\left[\frac{T^{\mathrm{FB}}}{S}\right] = \int_0^\infty \frac{1}{x} \left( \frac{\lambda m_2(x)}{2 (1 - \rho(x))^2} + \frac{x}{1 - \rho(x)} \right) f(x)\, dx,$$

where f(x) refers to the density function, $\rho(x) = \lambda E[\min\{S, x\}]$, and $m_2(x) = E[\min\{S, x\}^2]$. For comparison, we have also drawn the mean slowdown for the PS (Processor Sharing) policy based on the following known formula:

$$E\left[\frac{T^{\mathrm{PS}}}{S}\right] = \frac{1}{1 - \rho}.$$

The shape parameter now ranges continuously over k ∈ (1, ∞) so that 1/k ∈ (0, 1).
In the upper panel, the load takes value ρ = 0.5, and in the lower one, we have ρ = 0.8. Note that, in line with Corollary 3, FCFS is optimal when 1/k ≤ 1/2. However, when 1/k ≥ 1/2, the performance of FCFS deteriorates rapidly as 1/k increases. In fact, the mean slowdown of the FCFS scheduling policy approaches ∞ as 1/k → 1.
On the other hand, while FB is optimal when 1/k ≥ 1/2 (in line with Corollary 4), its performance degrades markedly when 1/k ≤ 1/2, the degradation being even worse with a higher load.
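The FCFS and PS curves of Example 1 can be reproduced in closed form. The sketch below assumes the standard M/G/1 results that the FCFS waiting time is independent of the job's own service time, with mean λE[S²]/(2(1 − ρ)), and that the mean slowdown under PS is 1/(1 − ρ). For a unit-mean Weibull distribution with k > 1 we use $E[S^2] = \Gamma(1 + 2/k)/\mu^2$ and $E[1/S] = \mu\,\Gamma(1 - 1/k)$. The function names are hypothetical, and the FB curve is omitted because it requires numerical integration.

```python
import math

def slowdown_fcfs(k, rho):
    """Mean slowdown under FCFS for unit-mean Weibull service times, k > 1."""
    mu = math.gamma(1 + 1 / k)             # scale parameter giving E[S] = 1
    lam = rho                              # arrival rate, since rho = lam * E[S]
    second_moment = math.gamma(1 + 2 / k) / mu**2
    inv_moment = mu * math.gamma(1 - 1 / k)  # E[1/S], finite only for k > 1
    mean_wait = lam * second_moment / (2 * (1 - rho))
    return 1 + mean_wait * inv_moment

def slowdown_ps(rho):
    """Mean slowdown under PS, which is insensitive to the distribution."""
    return 1 / (1 - rho)

for k in (4.0, 2.0, 1.25):
    print(1 / k, slowdown_fcfs(k, 0.5), slowdown_ps(0.5))
```

At k = 2 the two values coincide, since $E[S^2]\,E[1/S] = 2$ for the unit-mean Weibull distribution with shape parameter 2. This is consistent with the Gittins index being constant in that case.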
Example 2 Next we consider the multi-class case with two job classes. The service times for each class j ∈ {1, 2} follow the Weibull distribution with shape parameter $k_j$ and scale parameter $\mu_j = \Gamma(1 + \frac{1}{k_j})$. We choose $k_1 = 2$ and $k_2 = 4$ so that the condition of Corollary 5 is satisfied, meaning the Gittins index policy reduces to MAX-H. In addition, we use the same arrival rate for both classes: $\lambda_1 = \lambda_2 = \lambda/2$, where λ denotes the total job arrival rate. Based on simulations, we have estimated the mean slowdown for MAX-H, FCFS, PS, and FB with various load levels ρ = λE[S]. In each simulation run with a fixed scheduling policy and load, we have gathered the system statistics until there are $10^6$ job arrivals.

The results are presented in Fig. 4. In the top panel, we have the estimated total mean slowdown for all jobs as a function of load ρ. In the middle and bottom panels, there are corresponding curves for the jobs of classes 1 and 2, respectively. Note that, in line with Corollary 5, MAX-H is better than the other scheduling policies for the total mean slowdown. In addition, FCFS is consistently second best, and FB performs the worst in this example, as expected.

Interestingly, the classwise results for the two best scheduling policies, MAX-H and FCFS, are very different: MAX-H "prefers" class 1 to class 2, whereas FCFS does just the opposite. The intuition for this is as follows. As shown in Fig. 2, under MAX-H, the initial index of class 1 ($k_1 = 2$) jobs is greater than that of class 2 jobs ($k_2 = 4$), thus explaining MAX-H's preference for class 1 jobs. This makes sense: class 1 jobs are more likely to be very small than class 2 jobs. Treating all jobs the same, as in FCFS, thus leads to worse mean slowdown for class 1 jobs.
The question remains: given the very different classwise preferences, why do MAX-H and FCFS have such similar overall mean slowdown? We believe this is due to the fact that while a class 2 ($k_2 = 4$) job may, when it first arrives, have a worse Gittins index than a class 1 ($k_1 = 2$) job, after just a small amount of service, a class 2 job reaches an index higher than that of class 1 jobs (see Fig. 2). This means that the average fraction of time that FCFS is serving a job other than the one of maximal Gittins index is relatively small, consisting entirely of the short first segments of class 2 jobs.
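The ordering of the initial indices in Example 2 can be checked in closed form. For c(x) = 1/x the $H_c$-function at zero is $E[1/S]/E[S]$, and for a unit-mean Weibull distribution with k > 1 this equals $\Gamma(1 + 1/k)\,\Gamma(1 - 1/k)$. The sketch below assumes the unit-mean scale parameters used in the examples; the function name `initial_index` is hypothetical.

```python
import math

def initial_index(k):
    """H_sld(0) = E[1/S] / E[S] for a unit-mean Weibull with shape k > 1."""
    mu = math.gamma(1 + 1 / k)               # scale parameter giving E[S] = 1
    inv_moment = mu * math.gamma(1 - 1 / k)  # E[1/S]
    return inv_moment / 1.0                  # divide by E[S] = 1

class_1 = initial_index(2.0)  # equals 2 * mu^2 = pi / 2 for k = 2
class_2 = initial_index(4.0)
print(class_1, class_2, class_1 > class_2)
```

Since both shape parameters satisfy k ≥ 2, these values are also the initial values of the MAX-H index for the two classes, and the comparison agrees with the preference of MAX-H for class 1 jobs that have not yet received service.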

Example 3
In the last example, we again consider the multi-class case with two job classes where the service times for both classes follow the Weibull distribution with unit mean, $E[S] = E[S_1] = E[S_2] = 1$. Now we choose the shape parameters as follows: $k_1 = 1$ and $k_2 = 2$. Thus, the condition of Corollary 6 is satisfied, meaning the Gittins index policy reduces to MAX-h in this case. As in the previous example, we use the same arrival rate for both classes: $\lambda_1 = \lambda_2 = \lambda/2$, where λ denotes the total job arrival rate. Based on simulations, we have estimated the mean slowdown for scheduling policies MAX-h, PS, and FB with various load levels ρ = λE[S]. FCFS is left out since its performance is much worse than that of the other three policies in this case. In each simulation run with a fixed scheduling policy and load, we have gathered the system statistics until there are $10^6$ job arrivals. The results are presented in Fig. 5. In the top panel, we have the estimated total mean slowdown for all jobs as a function of load ρ. In the middle and bottom panels, there are corresponding curves for the jobs of classes 1 and 2, respectively. Note that, in line with Corollary 6, MAX-h is better than the other scheduling policies for the total mean slowdown. In addition, FB is consistently better than PS. Note also that the classwise results for the two best policies, MAX-h and FB, are very different: MAX-h clearly "prefers" class 2 to class 1, whereas FB gives roughly similar performance to both classes.

Conclusion and discussion
We considered the optimal scheduling problem in the M/G/1 queue with rather general holding costs, which cover, for example, the minimization of the mean slowdown. To determine the optimal scheduling rule among the non-anticipating policies, which are aware of the attained services of the jobs but not of their remaining service times, we applied the Gittins index approach. In the single-class case, we found the necessary and sufficient conditions under which the FCFS rule (or any other work-conserving and non-preemptive scheduling policy) is optimal (Theorem 1). In addition, we found the necessary and sufficient conditions under which the FB rule is optimal (Theorem 2). In the multi-class case, where the scheduler can identify the class of each job, we derived a sufficient condition under which the MAX-H rule (Definition 3) is optimal (Theorem 3) and a necessary and sufficient condition under which the MAX-h rule (Definition 4) is optimal (Theorem 4). To prove these optimality results, we needed the following technical assumptions: the service time distributions have a right-continuous density function with left limits and the holding cost rates are right-continuous functions of the service time with left limits.
There are a number of directions that could be fruitful to explore in future work. Recently, several "near optimality" results for the Gittins index or related policies have been shown for the constant holding cost setting (c(x) = 1), so a natural question is whether such results hold with general holding cost functions. One example is multiserver systems: Scully et al. [20] prove mean delay bounds for the constant-holding-cost Gittins index in the M/G/k, which Grosof et al. [10] extend to the case where some jobs occupy multiple servers at once during service. Can we prove analogous multiserver performance bounds for the Gittins index with general holding costs? Coming back to the single-server setting, another question is whether we need the full power of the Gittins index to have good performance. Scully et al. [22] show that a variant of SERPT is a constant-factor approximation for mean delay in the M/G/1. Can we show that (a variant of) MAX-H, which is the natural generalization of SERPT to general holding costs, is a constant-factor approximation for mean holding cost?
Another potential future direction has to do with extending the ideas behind the Gittins index to even more general objective functions. For example, there are many objective functions which demand a time-varying holding cost, such as metrics related to deadlines. Yu et al. [29] show how such problems can be viewed through the lens of restless bandits and thus approached using the Whittle index, a generalization of the Gittins index. The Whittle index has been used for other queue scheduling problems [4,9,15], including recent work on the age-of-information metric [13,14,24]. Unlike the Gittins index, we should not expect the Whittle index to yield optimal policies in general, but it often yields policies that are in some sense asymptotically optimal. For the Gittins index, we now have a general theory of its optimality in the M/G/1. Can we develop a similarly general theory of the Whittle index's asymptotic optimality?