Large deviations for the total queue size in non-Markovian tandem queues

We consider a d-node tandem queue with i.i.d. inter-arrival times and i.i.d. light-tailed service times at each queue, all independent of each other. We consider three variations of the probability that the number of customers in the system reaches some high level N, namely during a busy cycle, in steady state, and upon arrival of a new customer. We show that their decay rates for large N have the same value and give an expression for this value.


Introduction
Large deviations for the total queue size in (networks of) queues are of interest since they provide insight into how the probability of overflow decays as the overflow level increases. Such results are well known for Markovian tandem queues (see, for example, [4]), but not for non-Markovian tandem queues. Thus, in this short paper, our interest is in the probability that the number of customers in a non-Markovian tandem queue reaches some high level N during a busy cycle, and the related probabilities that this number exceeds N in stationarity and upon arrival of a customer. In Sadowsky [5] the probability in a busy cycle has been considered for a single G|G|m queue. In Bertsimas et al. [1] the Palm probability of a single queue in a network reaching some high level N upon arrival of a customer is considered; the associated decay rate is characterized using the sojourn time of a specific customer. Closely related to this work is Ganesh [3], in which the large deviations behavior of the sojourn time for queues in series is considered. The exact asymptotics of the sojourn time for tandem queues have been determined by Foss [2].
In this short paper we will consider a d-node G|G|1 tandem queue with renewal input and independent, i.i.d. service processes. We characterize the decay rate for the probability of reaching a total of N customers during a busy cycle of the system. Also we show that the stationary probability of having N customers in the system, as well as the probability of having N customers in the system upon arrival, have the same decay rate.
In Sect. 2 we provide the model and introduce our notation. Section 3 presents the main result of this paper, together with proofs.

Model and preliminaries
In this paper we consider $d$ G|G|1 queues in tandem. Customers arrive at queue 1 according to a renewal process with inter-arrival times $A_k$ (between customers $k$ and $k+1$) distributed according to some positive random variable $A$. The service times at queue $j$, denoted by $B^{(j)}_k$ (for customer $k$), are independent and identically distributed according to some positive random variable $B^{(j)}$. Furthermore, we assume that all processes are independent and that customers are served on a first come, first served (FCFS) basis. After service completion at queue $j < d$, each customer enters queue $j+1$ immediately, and customers leave the system after service completion at queue $d$. For stability, we assume $E[B^{(j)}] < E[A]$ for all $j$. See Fig. 1 for a graphical illustration.
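The FCFS tandem dynamics described above can be sketched with the usual departure-time recursion: a customer starts service at a queue once it has left the previous queue and the server has finished the previous customer. This is only an illustrative sketch; the function name `tandem_departures` and the 0-based list layout are assumptions made here, not notation from the paper.

```python
def tandem_departures(interarrivals, services):
    """Return (arrival_times, departure_times) for a FCFS tandem queue.

    interarrivals[k] is the time between customers k and k+1 (0-based),
    services[j][k] is the service time of customer k at queue j.
    The system is assumed to start empty, with customer 0 arriving at time 0.
    """
    n = len(services[0])
    arrivals = [0.0]
    for a in interarrivals[: n - 1]:
        arrivals.append(arrivals[-1] + a)
    # 'ready' holds the time each customer becomes available to the next queue.
    ready = list(arrivals)
    for queue_services in services:
        done = []
        prev_done = 0.0  # departure time of the previous customer at this queue
        for k in range(n):
            start = max(ready[k], prev_done)  # wait for own arrival and for the server
            prev_done = start + queue_services[k]
            done.append(prev_done)
        ready = done
    return arrivals, ready


arrivals, departures = tandem_departures(
    [1.0, 1.0], [[0.5, 2.0, 0.5], [1.0, 0.5, 0.5]]
)
```

In this small two-queue example the second customer is held up by its long service at queue 1, which in turn delays the third customer at both queues.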
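Starting with customer 1 entering queue 1 and all other queues empty, we are interested in the probability of overflow during the busy cycle of the total queue. This can be written as $P(K_N < K_0)$, where $K_N$ is the index of the first customer who reaches the overflow level $N$ and $K_0$ is the index of the first customer to see an empty system upon arrival. The indices $K_N$ and $K_0$ can be expressed in terms of the inter-arrival times $A_k$ (at queue 1) and the inter-departure times $D_k$ (from queue $d$), as follows.
$$K_N = \inf\Big\{k \ge 1 : \sum_{i=1}^{k-1} A_i < \sum_{i=1}^{k-N+1} D_i\Big\}, \tag{1}$$
$$K_0 = \inf\Big\{k \ge 2 : \sum_{i=1}^{k-1} A_i \ge \sum_{i=1}^{k-1} D_i\Big\}. \tag{2}$$
Here $\sum_{i=1}^{k-1} A_i$ is the arrival time of customer $k$ and $\sum_{i=1}^{k} D_i$ is the departure time of customer $k$, so (1) expresses that customer $k-N+1$ is still present when customer $k$ arrives, and (2) that customer $k-1$ has already left.
For the inter-departure time $D_k$ (between customers $k-1$ and $k$, for $k \ge 2$), we can write $D_k = I^{(d)}_k + B^{(d)}_k$, where $I^{(d)}_k$ is the, possibly zero, idle time of queue $d$ after the departure of customer $k-1$, before customer $k$ enters queue $d$. Consistently with this, $D_1$ is simply defined as the sojourn time of customer 1.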
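As a rough illustration of how the two indices can be read off a sample path, the sketch below scans arrival and departure times customer by customer. It assumes that customer k sees an empty system when customer k-1 has departed no later than k arrives, and that customer k reaches level N when customer k-N+1 is still present at that moment; the function name and the treatment of ties are choices made here for illustration only.

```python
def first_overflow_and_empty(arrivals, departures, N):
    """Return (K_N, K_0) as 1-based customer indices; None if not observed.

    arrivals[k-1] and departures[k-1] are the arrival and departure times
    of customer k; customer 1 is assumed to start the busy cycle.
    """
    K_N = K_0 = None
    for k in range(2, len(arrivals) + 1):
        t = arrivals[k - 1]  # arrival time of customer k
        if K_0 is None and t >= departures[k - 2]:
            K_0 = k  # customer k-1 already left, so customer k sees an empty system
        if K_N is None and k >= N and departures[k - N] > t:
            K_N = k  # customer k-N+1 still present, so N customers are in the system
        if K_N is not None and K_0 is not None:
            break
    return K_N, K_0


def overflow_in_first_cycle(arrivals, departures, N):
    """True if level N is reached before the system first empties."""
    K_N, K_0 = first_overflow_and_empty(arrivals, departures, N)
    return K_N is not None and (K_0 is None or K_N < K_0)
```

On a finite sample path either index may be unobserved, which is why `None` is returned rather than infinity.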
Other probabilities of interest that are related to $P(K_N < K_0)$ are $P(L \ge N)$ and $P(L^{(a)} \ge N)$, where $L$ denotes the total number of customers in the system in stationarity, and $L^{(a)}$ denotes the same number but immediately after an arbitrary arrival (including the customer that just arrived).
To characterize the decay rate, we need the following. For any random variable $X$, let $\Lambda_X(\theta) = \log E[e^{\theta X}]$ denote its log moment generating function. For all $j = 1, \ldots, d$, we assume that $\Lambda_{B^{(j)}}(\theta)$ exists for some $\theta > 0$, and define $\theta_j$ as
$$\theta_j = \sup\{\theta > 0 : \Lambda_A(-\theta) + \Lambda_{B^{(j)}}(\theta) \le 0\}.$$
Note that we only consider $\Lambda_A(-\theta)$ for $\theta \ge 0$ and so it always exists. Furthermore, we [...] Finally, we define $\theta_{\min} = \min_j \theta_j$, and assume that $\theta_{\min} < \infty$, i.e., we do not have $P(B^{(j)} > A) = 0$ for all queues, so that the number of customers can grow arbitrarily large and the decay rates of the probabilities of interest will be in $(0, \infty)$. The queue(s) $j$ with $\theta_j = \theta_{\min}$ will be called the $\theta$-bottleneck queue(s). Note that this notion can be different from the $\rho$-bottleneck queue, which is the queue with the largest server utilization $\rho_j = E[B^{(j)}]/E[A]$.
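To make the quantity $\theta_j$ concrete, the following sketch computes it numerically by bisection. It takes $\theta_j$ to be the largest $\theta$ for which $\Lambda_A(-\theta) + \Lambda_{B^{(j)}}(\theta) \le 0$, and it uses exponential inter-arrival and service times purely as an illustrative assumption (the paper allows general light-tailed distributions). The helper name `theta_sup` and the rates `lam`, `mu` are hypothetical.

```python
import math


def theta_sup(log_mgf_a_neg, log_mgf_b, theta_max, iters=200):
    """Bisection for sup{theta in (0, theta_max): f(theta) <= 0}, where
    f(theta) = Lambda_A(-theta) + Lambda_B(theta) is convex with f(0) = 0
    and is assumed to become positive before theta_max."""
    lo, hi = 0.0, theta_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:  # interval can no longer be split
            break
        if log_mgf_a_neg(mid) + log_mgf_b(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return lo


# Illustrative assumption: A ~ exp(lam), B ~ exp(mu) with lam < mu.
lam, mu = 1.0, 3.0
theta = theta_sup(
    lambda th: math.log(lam / (lam + th)),  # Lambda_A(-theta)
    lambda th: math.log(mu / (mu - th)),    # Lambda_B(theta), valid for theta < mu
    theta_max=mu,
)
rate = math.log(lam / (lam + theta))        # Lambda_A(-theta) at the computed root
```

For these exponential distributions the root is available in closed form, $\theta = \mu - \lambda$, so the computed `rate` should agree with $\log(\lambda/\mu)$ up to numerical error.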

Main result
In this section we present the main result of this paper, namely the characterization of the decay rates of $P(K_N < K_0)$, $P(L \ge N)$ and $P(L^{(a)} \ge N)$. In order to achieve this result, we will prove both a lower bound and an upper bound for the decay of $P(K_N < K_0)$, which will also turn out to hold for the other decay rates. We will start with the lower bound, with a proof based on a coupling argument.

Lemma 1 (Lower bound) For the decay of $P(K_N < K_0)$ we have
$$\liminf_{N\to\infty} \frac{1}{N} \log P(K_N < K_0) \ge \Lambda_A(-\theta_{\min}).$$
Proof We compare the tandem queue to a single queue with the same arrival process $A_k$ and the service process of the $j$th queue in the tandem, $B^{(j)}_k$. (This is equivalent to comparing our tandem queue to a tandem queue with the same arrival process and all service times set to 0, except the service times of queue $j$.) The idea of the proof is to show that overflow is more likely in the tandem queue than in the single queue. Define $\tilde D_i$, $\tilde K_0$ and $\tilde K_N$ analogously to $D_i$, $K_0$ and $K_N$ but for the single queue. Denote the inter-departure time of customer $i$ at queue $j$ in the tandem queue by $D^{(j)}_i$; then $\tilde D_i = B^{(j)}_i \le D^{(j)}_i$, as the single queue does not have idle times during its busy cycle. Since a customer cannot leave the last queue in the tandem before having left queue $j$, we find
$$\sum_{i=1}^{k} \tilde D_i \le \sum_{i=1}^{k} D^{(j)}_i \le \sum_{i=1}^{k} D_i \tag{3}$$
for all $k = 1, \ldots, \min(K_0 - 1, \tilde K_0 - 1)$, meaning that a customer leaves the tandem queue not earlier than that same customer leaves the coupled single queue.
Based on this we first show, by contradiction, that $\tilde K_0 \le K_0$, i.e., the single queue empties not later than the tandem queue. Suppose that $\tilde K_0 > K_0$; then (3) still holds for $k$ up to $K_0 - 1$. By using (2) and (3) we have
$$\sum_{i=1}^{K_0-1} A_i \ge \sum_{i=1}^{K_0-1} D_i \ge \sum_{i=1}^{K_0-1} \tilde D_i,$$
which implies by definition of $\tilde K_0$ that $\tilde K_0 \le K_0$. Therefore, our assumption $\tilde K_0 > K_0$ is wrong and so we have shown $\tilde K_0 \le K_0$.
Next, we show that the tandem queue reaches the overflow level not later than the single queue. Suppose we have reached overflow in a busy cycle of the single queue, that is, $\tilde K_N < \tilde K_0$. Then we have, by using (1) and (3),
$$\sum_{i=1}^{\tilde K_N-1} A_i < \sum_{i=1}^{\tilde K_N-N+1} \tilde D_i \le \sum_{i=1}^{\tilde K_N-N+1} D_i,$$
so that $K_N \le \tilde K_N$. Hence $\tilde K_N < \tilde K_0$ implies $K_N < K_0$, which means that overflow during a busy period in the single queue implies overflow during a busy period in the tandem queue. So we have for any $j$ that
$$\liminf_{N\to\infty} \frac{1}{N} \log P(K_N < K_0) \ge \lim_{N\to\infty} \frac{1}{N} \log P(\tilde K_N < \tilde K_0) = \Lambda_A(-\theta_j),$$
where the second step follows by Theorem 1 in [5]. In particular, the above holds for $j$ such that $\theta_j = \theta_{\min}$, which completes the proof.
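The coupling used in this proof can be checked numerically on a sample path: feeding the same arrivals and the same queue-$j$ service times into both systems, every customer should leave the single queue no later than the tandem queue. The sketch below does this with exponential distributions and arbitrary rates, which are illustrative assumptions only; the helper `departures` is a hypothetical name.

```python
import random


def departures(arrivals, services):
    """Departure times from the last queue of a FCFS tandem started empty."""
    ready = list(arrivals)
    for svc in services:
        prev, done = 0.0, []
        for r, b in zip(ready, svc):
            prev = max(r, prev) + b  # start when both customer and server are free
            done.append(prev)
        ready = done
    return ready


rng = random.Random(0)                 # fixed seed for reproducibility
n, lam, mus = 1000, 1.0, [2.0, 1.5, 3.0]
arrivals = [0.0]
for _ in range(n - 1):
    arrivals.append(arrivals[-1] + rng.expovariate(lam))
services = [[rng.expovariate(mu) for _ in range(n)] for mu in mus]

tandem = departures(arrivals, services)
j = 1                                  # 0-based index of the queue kept in the single system
single = departures(arrivals, [services[j]])

# Each customer leaves the coupled single queue no later than the tandem queue.
coupling_holds = all(s <= t for s, t in zip(single, tandem))
```

The comparison holds path by path because the departure recursion is monotone in the service times, so removing the other queues can only bring departures forward.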
The next step is to prove an upper bound. We will use a regenerative argument, for which we need that the expected total time spent at or above level N during a busy cycle in which level N is reached, is bounded from below, independently of N .
Even though this sounds very plausible, we could not find a reference. Hence we state the next lemma, whose proof is based on first principles and uses the technical assumption $P(B^{(d)} > A) > 0$ (which will not be a limitation for the main result).
Let $L(t)$ be the total number of customers in the system at time $t$, and let $T$ be the length of the first busy cycle; then, we define the total time $\tau_N$ spent at or above level $N$ during a busy cycle as $\tau_N = \int_0^T 1\{L(t) \ge N\}\,dt$.
Lemma 2 Suppose that $P(B^{(d)} > A) > 0$. Then some $c > 0$ exists such that for all $N = 1, 2, \ldots$,
$$E[\tau_N \mid K_N < K_0] \ge c.$$
Proof Consider a busy cycle in which the overflow level $N$ is reached and denote the moment that $N$ is reached for the first time by $t$. Then the first arrival after $t$ occurs at time $t_1 = t + A_{K_N}$, while the second departure after $t$ occurs at some time $t_2 \ge t + B^{(d)}_{K_N-N+2}$. (To see this, note that at time $t$, when customer $K_N$ enters, customer $K_N - N + 1$ is the first to depart from the system, so the service of customer $K_N - N + 2$ at queue $d$ cannot start earlier than at time $t$.) It is not difficult to check that if $t_1 < t_2$, there will be at least $N$ customers in the system between $t_1$ and $t_2$. Thus, for any $N$ we have
$$E[\tau_N \mid K_N < K_0] \ge E[(t_2 - t_1)^+ \mid K_N < K_0] \ge E[(B^{(d)} - A)^+] =: c > 0,$$
since $A_{K_N}$ and $B^{(d)}_{K_N-N+2}$ are independent of the history of the system up to time $t$, and $P(B^{(d)} > A) > 0$.
We are now ready to prove the upper bound, based on a regenerative argument and a Chernoff bound.

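Lemma 3 (Upper bound) Suppose that $P(B^{(d)} > A) > 0$. Then
$$\limsup_{N\to\infty} \frac{1}{N} \log P(K_N < K_0) \le \limsup_{N\to\infty} \frac{1}{N} \log P(L \ge N) \le \limsup_{N\to\infty} \frac{1}{N} \log P(L^{(a)} \ge N) \le \Lambda_A(-\theta_{\min}),$$
and a similar statement holds when we replace all limsups by liminfs.
Proof The proof for the liminfs and the limsups is similar; we only give it explicitly for the limsups. The same steps apply to prove the liminfs, in which the supremum has to be replaced by the infimum at the appropriate places. The first inequality follows from a regenerative argument, as in [4], by which we have
$$P(L \ge N) = \frac{E[\tau_N]}{E[T]} = \frac{E[\tau_N \mid K_N < K_0]\, P(K_N < K_0)}{E[T]},$$
where $T$ is the length of a busy cycle, which has a finite, constant expectation due to stability of the system, and $\tau_N$ is the total time spent at or above level $N$ during a busy cycle, whose conditional expectation is bounded from below independently of $N$; see Lemma 2.
The remainder of the proof considers the system in stationarity, so time 0 and customer 0 are not necessarily related to the start of a busy cycle. For the second inequality then, fix some arbitrary time $t$ in stationarity, and consider the last customer to arrive before time $t$; call this customer $k$. If the number of customers at time $t$ is $\ge N$, then the queue length $L^{(a)}_k$ immediately after the arrival of customer $k$ is also $\ge N$, which means that customer $k - N + 1$ was still in the system when customer $k$ arrived. Writing $S_{k-N+1}$ for the sojourn time of customer $k - N + 1$, we thus have
$$P(L \ge N) \le P(L^{(a)}_k \ge N) = P\Big(S_{k-N+1} > \sum_{i=k-N+1}^{k-1} A_i\Big).$$
Note that this probability is independent of the age of $A_k$ at time $t$, as the inter-arrival times are independent, so in fact $L^{(a)}_k$ has the same distribution as $L^{(a)}$, i.e., customer $k$ cannot be distinguished from an arbitrary customer in stationarity, which proves the second inequality.
For the last inequality, we analyze the right-hand side of the equation above (keeping customer index $k - N + 1$ for convenience). We have for any $\theta > 0$, using the Chernoff bound, and the independence of $S_{k-N+1}$ and $\sum_{i=k-N+1}^{k-1} A_i$,
$$P\Big(S_{k-N+1} > \sum_{i=k-N+1}^{k-1} A_i\Big) \le E\big[e^{\theta S_{k-N+1}}\big]\, E\big[e^{-\theta \sum_{i=k-N+1}^{k-1} A_i}\big].$$
In [3] it is shown that $E[e^{\theta S_{k-N+1}}]$ is upper bounded by some constant $C$ for all $\theta \in (0, \theta_{\min})$ (see just after equation (27) in the proof of Theorem 1). Note that the assumptions in [3] are more general than ours, so we can use this result. Hence, we have for any $\theta \in (0, \theta_{\min})$
$$\limsup_{N\to\infty} \frac{1}{N} \log P(L^{(a)} \ge N) \le \limsup_{N\to\infty} \frac{1}{N} \log\Big(C\, E\big[e^{-\theta \sum_{i=k-N+1}^{k-1} A_i}\big]\Big) = \Lambda_A(-\theta),$$
where the last step follows by independence of the inter-arrival times. Letting $\theta \uparrow \theta_{\min}$ to achieve the best possible bound proves the statement.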

Theorem 1 Consider a stable FCFS d-node G|G|1 tandem queue with arrival process and light-tailed service processes at all queues i.i.d. and independent of each other. If $\theta_{\min} < \infty$, then
$$\lim_{N\to\infty} \frac{1}{N} \log P(K_N < K_0) = \lim_{N\to\infty} \frac{1}{N} \log P(L \ge N) = \lim_{N\to\infty} \frac{1}{N} \log P(L^{(a)} \ge N) = \Lambda_A(-\theta_{\min}). \tag{4}$$
Proof When $P(B^{(d)} > A) > 0$, statement (4) follows immediately from Lemmas 1 and 3 since all liminfs and limsups (with respect to each of the three probabilities) are equal to $\Lambda_A(-\theta_{\min})$.
To show that (4) also holds in general, we consider a tandem queue where $P(B^{(d)} > A) = 0$, and two corresponding systems, fed by the same arrival process. One is a queue in isolation as introduced in the proof of Lemma 1. More specifically, we consider a $\theta$-bottleneck queue, i.e., some queue $j$ for which $\theta_j = \theta_{\min}$. In this single queue we define $\tilde K_0$, $\tilde K_N$, $\tilde L$ and $\tilde L^{(a)}$ analogously to $K_0$, $K_N$, $L$ and $L^{(a)}$ in the tandem queue. Note that $P(B^{(j)} > A) > 0$ (otherwise we would have $\theta_{\min} = \theta_j = \infty$), and hence (4) holds for this single queue system.
The other system we consider is the original tandem queue augmented with a suitably chosen additional queue $d + 1$, for example, letting $B^{(d+1)} \sim B^{(j)}$ where queue $j$ is a $\theta$-bottleneck queue (another option is to choose $B^{(d+1)} \sim \exp(\mu)$ for some sufficiently large $\mu$). In this system we analogously define $\hat K_0$, $\hat K_N$, $\hat L$ and $\hat L^{(a)}$. Clearly we then have $E[B^{(d+1)}] < E[A]$ and $\theta_{d+1} \ge \theta_{\min}$, while we also have $P(B^{(d+1)} > A) > 0$. As a result, for this system (4) also holds.
All three probabilities for the original tandem queue can now be bounded by the corresponding probabilities in the two other systems, as follows:
$$P(\tilde K_N < \tilde K_0) \le P(K_N < K_0) \le P(\hat K_N < \hat K_0),$$
$$P(\tilde L \ge N) \le P(L \ge N) \le P(\hat L \ge N),$$
$$P(\tilde L^{(a)} \ge N) \le P(L^{(a)} \ge N) \le P(\hat L^{(a)} \ge N).$$
Each of these inequalities follows similarly to the proof of Lemma 1 by coupling arguments; note that setting $B^{(d+1)} \equiv 0$ in the augmented tandem queue leads to the original tandem, and setting the service times of all but one queue in the original tandem queue to 0 leads to the single queue. Thus, the first inequality is straightforward from the proof of Lemma 1, and the second can be shown similarly. For the other two lines, we just need to consider the departure times in the three systems for the same customer to show that $\tilde L(t) \le L(t) \le \hat L(t)$ at any time $t$, and hence also in stationarity and upon arrivals.
Finally, we take logarithms above, then divide by $N$, and take limits.
Note that when $\theta_{\min} = \infty$, the total number of customers cannot grow arbitrarily large (see Sect. 2), and hence the decay rates in (4) are not properly defined (or are equal to $-\infty$).
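As a sanity check of the decay rate $\Lambda_A(-\theta_{\min})$, one can work it out for a Markovian tandem queue. The computation below assumes Poisson arrivals with rate $\lambda$ and exponential service times with rates $\mu_j > \lambda$ (symbols introduced here for illustration), and takes $\theta_j$ to be the largest $\theta$ with $\Lambda_A(-\theta) + \Lambda_{B^{(j)}}(\theta) \le 0$.

```latex
% Illustrative assumptions: A ~ exp(\lambda), B^{(j)} ~ exp(\mu_j), \lambda < \mu_j for all j.
\begin{align*}
\Lambda_A(-\theta) &= \log\frac{\lambda}{\lambda+\theta}, \qquad
\Lambda_{B^{(j)}}(\theta) = \log\frac{\mu_j}{\mu_j-\theta} \quad (0 \le \theta < \mu_j),\\
\Lambda_A(-\theta) + \Lambda_{B^{(j)}}(\theta) \le 0
  &\iff \lambda\mu_j \le (\lambda+\theta)(\mu_j-\theta)
   \iff \theta \le \mu_j - \lambda,\\
\theta_j &= \mu_j - \lambda, \qquad \theta_{\min} = \min_j \mu_j - \lambda,\\
\Lambda_A(-\theta_{\min}) &= \log\frac{\lambda}{\min_j \mu_j} = \log \max_j \rho_j,
  \qquad \rho_j = \frac{\lambda}{\mu_j}.
\end{align*}
```

Under these assumptions the three probabilities decay like $(\max_j \rho_j)^N$, and the $\theta$-bottleneck and the $\rho$-bottleneck coincide; for general distributions the two notions need not agree.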

Remark 1
As mentioned in the introduction, Bertsimas et al. [1] and Ganesh [3] consider the decay of related overflow probabilities in a more general setting, where certain types of dependence for the arrival and service processes are allowed. We expect that the bounds in our current work can be extended to this case as well, but this will take different techniques and additional effort, in particular to relate $P(K_N < K_0)$, $P(L \ge N)$ and $P(L^{(a)} \ge N)$ in the more general setting.