Queue length asymptotics for the multiple-server queue with heavy-tailed Weibull service times

We study the occurrence of large queue lengths in the GI / GI / d queue with heavy-tailed Weibull-type service times. Our analysis hinges on a recently developed sample path large-deviations principle for Lévy processes and random walks, following a continuous mapping approach. Also, we identify and solve a key variational problem which provides physical insight into the way a large queue length occurs. In contrast to the regularly varying case, we observe several subtle features such as a non-trivial trade-off between the number of big jobs and their sizes and a surprising asymmetric structure in asymptotic job sizes leading to congestion.


Introduction
The purpose of this paper is to obtain results for the challenging multiple-server queue, with unlimited waiting room and the FCFS policy, in the case that the job size distribution has a tail of the form e −L(x)x α , α ∈ (0, 1). Our approach exploits the large deviation theory framework and in particular the large deviations principle for random walks with heavy-tailed Weibull increments. Consequently, we can estimate the probability of a large queue length of the G/G/d queue with heavy-tailed Weibull service times and extract information about "the most likely way" so that a large queue length build up occurs. The many-server queue with heavy-tailed service times has so far mainly been considered in the case of regularly varying service times, see [1,2]. In this paper, we show that, for γ ∈ (0, ∞), with c * the value of the optimization problem min d i=1 x α i subject to (2) l (s; where λ is the arrival rate, and service times are normalized to have unit mean. Note that this problem is equivalent to an L α -norm minimization problem with α ∈ (0, 1). Such problems also appear in applications such as compressed sensing, and are strongly NP hard in general, see [3] and references therein. In our particular case, we can analyze this problem exactly, and if γ ≥ 1/(λ − λ ), the solution takes the simple form ( This simple minimization problem has at most two optimal solutions, which represent the most likely number of big jumps that are responsible for a large queue length to occur, and the most likely buildup of the queue length is through a linear path. For smaller values of γ, asymmetric solutions can occur, leading to a piecewise linear buildup of the queue length; we refer to Section 3 for more details. Note that the intuition that the solution to (2) yields is qualitatively different from the case in which service times have a power law. In the latter case, the optimal number of big jobs equals the minimum number of servers that need to be removed to make the system unstable, which equals d − λ . In the Weibull case, there is a nontrivial trade-off between the number of big jobs as well as their size, and this trade-off is captured by (2) and (3). This essentially answers a question posed by Sergey Foss at the Erlang Centennial conference in 2009. Our result is consistent with a more refined asymptotic estimate obtained for d = 2 and λ < 1 in [1]. For earlier conjectures on this problem we refer to [4]. An overview of related results for the case of regularly varying service times, can be found in [2].
We derive (1) by utilizing a tail bound for Q(t), which are derived in [5]. These tail bounds are given in terms of functionals of superpositions of random walks. We show these functionals are (almost) continuous in the M 1 topology, introduced in [6], making this fit into the large deviations framework introduced in the companion paper [7].
The paper is organized as follows. Section 2 provides a model description and some useful tools used in our proofs. Section 3 provides our main result and some mathematical insights associated with it. Lastly, section 4 contains the Lemmas and proofs needed to construct Theorem 1.

Model Description And Preliminary Results
We consider the FCFS G/G/d queuing model with d servers in which inter-arrival times are independent and identically distributed (i.i.d) random variables (r.v.s) and service times are i.i.d r.v.s Let A and S be the generic inter-arrival time and the generic service time respectively, with non-negative support. In addition, we make the following assumptions: 1) There exists θ + such that E(e θA ) < ∞ for every θ ≤ θ + 2) P(S ≥ x) = e −L(x)x α , α ∈ (0, 1) and that L(x)x α−1 is decreasing.
Let Q(t) denote the queue length process at time t in the FCFS G/G/d queuing system with inter-arrival times drawn i.i.d distributed on A and service times drawn i.i.d distributed on S. Our goal is to identify the limit behavior of P(Q(γn) > n) as n → ∞ in terms of the distributions A and S. Let λ = 1/E[A] and µ = 1/E[S]. Let M be the renewal process associated with A. That is,  n (t) = N (i) (nt)/n for t ≥ 0. Our analysis hinges on Theorem 3 of [5], which for the G/G/d queue states Result 1 For all x > 0 and t ≥ 0, Now, from (4), we conclude that for each γ, β ∈ (0, ∞) and we show that (5) is an asymptotically tight upper bound in cases of big delays. It should be noted that in [5], A 1 and S (i) 1 are defined to have the residual distribution to make M and N (i) equilibrium renewal processes, but such assumptions are not necessary for (5) itself.
In view of the above, a natural way to proceed is to establish LDPs forM n andN (i) n 's. We invoke a recent result in [7] on sample path large deviations for random walks with heavy-tailed Weibull increments. Let D[0, 1] denote the Skorokhod space-space of càdlàg functions from [0, 1] to R-and T M 1 denote the M 1 Skorokhod topology on D[0, 1]. We say that ξ ∈ D[0, 1] is a pure jump function if ξ = ∞ i=1 x i 1 [ui,1] for some x i 's and u i 's such that x i ∈ R and u i ∈ [0, 1] for each i and u i 's are all distinct. Let D ↑ p [0, 1] denote the subspace of D[0, 1] consisting of non-decreasing pure jump functions which non-negative values at the origin. The following theorem is the result used in the core of our analysis,

Result 2 LetX
(i) n , i = 1, . . . , d, satisfy an LDP on (D, T M 1 ) with speed L(n)n α , where α ∈ (0, 1). Then, (X T M 1 with speed L(n)n α and the rate function Although we have a proper framework to study (5), note that,M n andN n 's depend on random number of A j 's and S (i) j 's, and hence may depend on arbitrarily large number of A j 's and S (i) j 's with strictly positive probabilities. This does not exactly correspond to the large deviations framework we mentioned earlier. To accommodate such a context, we introduce the following maps. Fix γ > 0. Define for µ > 0, Ψ µ : and for each µ define a map Φ µ : Here we denoted max{x, 0} with [x] + . In words, between the origin and the supremum of ξ, Φ µ (ξ)(s) is the first passage time of ξ crossing the level s; from there to the final point γ, Φ µ (ξ) increases linearly from γ/µ at rate 1/µ (instead of jumping to ∞ and staying there). DefineĀ n ∈ D[0, γ/EA] asĀ n (t) A(nt)/n for t ∈ [0, γ/EA] andS  and then solve the quasi-variational problem to establish the asymptotic bound.

Main results
In this section we will state the most important contribution of this paper.
We make some comments that are meant to provide some physical insight, and highlight differences with the well studied case where the job sizes follow a regularly varying distribution.
If γ < 1/λ, no finite number of large jobs suffice, and we conjecture that the large deviations behavior is driven by a combination of light and heavy tailed phenomena in which the light tailed dynamics involve pushing the arrival rate by exponential tilting to the critical value 1/γ, followed by the heavy-tailed contribution evaluated as we explain in the following development.
If γ > 1/λ the following features are contrasting with the case of regularly varying service-time tails: 1. The large deviations behavior is not driven by the smallest number of jumps which drives the queueing system to instability (i.e. d − λ ). In other words, in the Weibull setting, it might be cheaper to block more servers. 2. The amount by which the servers are blocked may not be the same among all of the servers which are blocked.
To illustrate the first point, assume γ > b/ (λ − λ ), in which case and the optimal solution of c * reduces to Let us use l * to denote an optimizer for the previous expression; intuitively, d − l * represents the optimal number of servers to be blocked (observe that d − λ = d − λ corresponds to the number of servers blocked in the regularly varying case). Note that if we define Hence,ḟ . This observation allows to conclude that whenever γ > b/ (λ − λ ) we can distinguish two cases. The first one occurs if in which case l * = λ (this case is qualitatively consistent with the way in which large deviations occur in the regularly varying case). On the other hand, if this case is the one which we highlighted in feature i) in which we may obtain d − l * > d − λ and therefore more servers are blocked relative to the large deviations behavior observed in the regularly varying case. Still, however, the blocked servers are symmetric in the sense that they are treated in exactly the same way.
In contrast, the second feature indicates that the most likely path to overflow may be obtained by blocking not only a specific amount to drive the system to instability, but also by blocking the corresponding servers by different loads in the large deviations scaling. To appreciate this we must assume that In this case, the contribution of the infimum in (16) becomes relevant. In order to see that we can obtain mixed solutions, it suffices to consider the case d = 2, and 1 < λ < 2 and concluding that for δ small enough and therefore we can have mixed solutions. For example, consider the case d = 2, λ = 1.49, α = 0.1 and γ = 1 λ−1 − 0.1. For these values, λ α , and the most likely scenario leading to a large queue length is two big jobs arriving at the beginning and blocking both servers with different loads. On the other hand, if γ = 1 λ−1 the most likely scenario is a single big job blocking one server. These two scenarios are illustrated in Figure 1.
We conclude this section by presenting future directions of research. It is worth mentioning that we provide asymptotics only for the transient model of the queue length process Q. For the obverse i.e; when the queue is in steady state, more elaborate work is needed to overcome the technicalities arising with the large deviations framework. Specifically, one has to prove that, the interchange of limits as γ and n tend to infinity, is valid. We conjecture that the optimal value, similar to (7), of the variational problem associated with the steady state model will consist solely of the term, min λ l=0 (d − l) 1 λ−l α , obtained by taking γ = ∞ in (2). Lastly, it is possible to obtain Theorem 1 under greater generality by relaxing the assumption that A is light-tailed. For this, all one has to notice is, We expect that the logarithmic asymptotics of (II) will lead to a variational problem parameterized by δ, whose optimal value is denoted as c * δ, . Evaluating the limit of c * δ, as , δ tend to zero and showing its equality with c * , will lead to the same logaritmic asymptotics seen in (7). By exploiting the light-tailed traits of (I), one can reach at the same conclusion seen in Theorem 1.

Proof of Theorem 1
Let D ↑ p [0, γ/µ] be the subspace of D[0, γ/µ] consisting of non-decreasing pure jump functions that assume non-negative values at the origin, and define Proposition 1Ā n satisfies the LDP on D[0, γ/EA], d M 1 with speed L(n)n α and rate function n satisfies the LDP on D[0, γ/ES], d M 1 with speed L(n)n α and the rate function Proof Firstly, note that Mogulski's sample path large deviation principle implies that 1 n nt j=1 (A j − EA · t) satisfies the LDP with speed L(n)n α and with the rate function On the other hand, due to Result 2, 1 n nt j=1 S (i) j − ES · t satisfies the LDP with the rate function Consider the map Υ µ : As a result, there exist parameterizations (u n (s), t n (s)) of ξ n and (u(s), t(s)) of ξ so that, This implies that max{sup t≤γ/EA |u n (s) − u(s)|, sup t≤γ/EA |t n (s) − t(s)|} → 0 as n → ∞. Observe that, if (u(s), t(s)) is a parameterization for ξ, then (u(s) + EA · t(s), t(s)) is a parameterization for Υ EA (ξ). Consequently, Thus, Υ EA (ξ n ) → Υ EA (ξ) in the M 1 topology, proving that the map is continuous. The same argument holds for Υ ES . By the contraction principle,Ā n obeys the LDP with the rate function I 0 (ζ) inf{I A (ξ) : Similarly, I S (i) (ξ) = ∞ when ξ is not a non-decreasing pure jump function. Note that ξ ∈ D ↑ s implies that Taking into account the form of I S (i) and that To see this, remember that by construction,N  n )(γ) has to be strictly smaller than γ . That is, and we are done for the exponential equivalence betweenN Observe that for every ξ ∈ D ES [0, γ/ES], it holds that, ξ(t) ≥ ES · t for every t ∈ [0, γ/ES]. That is, where the second inequality is due to the LDP upper bound forS (i) n in Proposition 1. This concludes the proof for the exponential equivalence betweenN (i) n and Φ ES (S (i) n ). The exponential equivalence betweenM n and Φ EA (Ā n ) is essentially identical, and hence, omitted.
We conclude with the proof for the matching lower bound in case γ > 1/λ. Considering the obvious coupling between Q and (M, N (1) , · · · , N (d) ), one can see that M (s) − d i=1 N (i) (s) can be interpreted as (a lower bound of) the length of an imaginary queue at time s where the servers can start working on the jobs that have not arrived yet. Therefore, P(Q((a + s)n) > n) ≥ P(Q((a + s)n) > n|Q(a) = 0) ≥ (i) n (s) > 1) for any a ≥ 0. Let s * be the level crossing time of the optimal solution of (14). Then, for any > 0, . . , ξ d ) obtained by increasing one of the job size of (ξ * 1 , . . . , ξ * d ) by δ > 0. One can always find a small enough such δ since γ > 1/λ. Note that there exists > 0 such that where the second inequality is from the subadditivity of x → x α . Since δ can be chosen arbitrarily small, letting δ → 0, we arrive at the matching lower bound.
To obtain more insight in the way the rare event {Q(γn) ≥ n} occurs, we now simplify the expression of c * given in Proposition 5. To ease notation, we assume from now on that E[S] = µ −1 = 1.
which in turn equals min inf Proof We want to show that c * is equal to After a simple transformation, we might assume that µ = 1. For simplicity in the exposition we will assume the existence of an optimizer. The argument that we present can be carried out withε-optimizers. In the end, the representation that we will provide will show the existence of an optimizer. First, we will argue that without loss of generality we may assume that if (ζ 1 , ..., ζ d ) is an optimal solution then the corresponding functions ξ 1 , ..., ξ d have at most one jump which occurs at time zero. To see this suppose that (ζ 1 , ..., ζ d ) is an optimal solution and consider the corresponding functions (ξ 1 , ..., ξ d ) such that ζ i = ϕ µ (ξ i ). By feasibility, we must have that at least one of the ξ i 's exhibit at least one jump in [0, γ]. Assume that ξ i exhibits two or more jumps and select two jump times, say 0 ≤ u 0 < u 1 ≤ γ, with corresponding jump sizes x 0 and x 1 , respectively. Letξ in simple words,ξ i (·) is obtained by merging the jump at time u 1 with the jump at time u 0 . It is immediate (since x 0 , x 1 > 0) that for each tξ i (t) ≥ ξ i (t) and, therefore, lettingζ i = ϕ µ ξ i we obtain (directly from the definition of the functionalζ i as a generalized inverse) that for every sζ i (s) ≤ ζ i (s) . Therefore, we conclude that the collection ζ 1 , ...,ζ i , ..., ζ d is feasible. Moreover, since and, by strict concavity, (x 0 + x 1 ) α < x α 0 + x α 1 , we conclude that ζ 1 , ...,ζ i , ..., ζ d improves the objective function, thus violating the optimality of ζ 1 , ..., ζ d . So, we may assume that ξ i (·) has a single jump of size x i > 0 at some time u i and therefore and we must have that t ≥ x i + u i ; otherwise, if x i + u i > t then we might reduce the value of the objective function while preserving feasibility (this can be seen from the form of ζ i (·)), thus contradicting optimality. Now, suppose that u i > 0, choose ε ∈ (0, min(u i , x i )) and definē In simple words, we just moved the first jump slightly earlier (by an amount ε). Once again, letζ i = ϕ µ ξ i , and we have thatζ Therefore, we preserve feasibility without altering the objective function. As a consequence, we may assume that u i = 0 and using expression (18) we then obtain that (14) takes the form Let x = (x 1 , ..., x d ) be any optimal solution, we may assume without loss of generality that 0 ≤ x 1 ≤ ... ≤ x d . We claim that x satisfies the following features. First, x d ≤ γ, this is immediate from the fact that we are minimizing over the x i 's and if x d > γ we can reduce the value of x d without affecting the feasibility of x, thereby improving the value of (20). The same reasoning allows us to conclude that inf{s : l (s; This problem can be simplified to In turn, we know that 0 < λ < d, then it suffices to consider (21) because (λ − m) < 0 implies a λ−m+1 = 0 (otherwise we can reduce the value of the objective function). We first consider the case λ > λ . Moreover, observe that if γ ≥ 1/(λ − λ )) then any solution satisfying (22) and (24) automatically satisfies (23), so we can ignore the constraint (23) if assume that γ ≥ 1/(λ − λ ). If λ is an integer we will simply conclude that a λ +1 = 0 and if we only assume γ > 1/λ we will need to evaluate certain extreme points, as we shall explain later. Now, the objective function is clearly concave and lower bounded inside the feasible region, which in turn is a compact polyhedron. Therefore, the optimizer is achieved at some extreme point in the feasible region (see [10]). Under our simplifying assumptions, we only need to characterize the extreme points of (22), (24), which are given by a i = 1/(λ − i + 1) for i = 1, ..., λ + 1.
So, for the solution to be both basic and feasible we must have that 1/ (λ − l) ≤ γ ≤ 1/ (λ − k) (with strict inequality holding on one side).