Abstract
We study systems with two classes of impatient customers who differ across the classes in their distribution of service times and patience times. The customers are served on a firstcome, firstserved basis (FCFS) regardless of their class. Such systems are common in customer call centers, which often segment their arrivals into classes of callers whose requests differ in their complexity and criticality. We first consider an M/G/1 + M queue and then analyze the M/M/k + M case. Analyzing these systems using a queue length process proves intractable as it would require us to keep track of the class of each customer at each position in the queue. Consequently, we introduce a virtual waiting time process where the service times of customers who will eventually abandon the system are not considered. We analyze this process to obtain performance measures such as the percentage of customers receiving service in each class, the expected waiting times of customers in each class, and the average number of customers waiting in queue. We use our characterization to perform a numerical analysis of the M/M/k + M system and find several managerial implications of administering a FCFS system with multiple classes of impatient customers. Finally, we compare the performance a system based on data from a call center with the steadystate performance measures of a comparable M/M/k + M system. We find that the performance measures of the M/M/k + M system serve as good approximations of the system based on real data.
1 Introduction
In this paper, we analyze queueing systems with different classes of customers who may abandon (renege from) the system if their waiting time exceeds their patience time, i.e., the maximum amount of time they are willing to wait before abandoning the system. This work is motivated primarily by customer call centers, which often segment their callers into different classes. Since call centers are the prevalent customerfacing service channel of many organizations, they often receive a variety of caller requests which may differ significantly in their service requirements and criticality. For example, banking call centers receive requests as simple as balance inquiries and as complex and critical as dealing with fraudulent activity on a caller’s account. While a service representative can obtain an account balance relatively quickly, handling fraudulent activity takes longer as it involves a higher level of legal expertise and paperwork. Furthermore, because fraudulent activity is usually more critical than obtaining a balance, callers who are calling about fraud may be more patient than callers who are calling to obtain a balance. Because callers’ service requests vary so greatly, call centers often train subsets of their representatives to handle only certain types of service requests. Based on the service the callers request from the phone menu, the automatic call distributor (ACD) segments callers into classes and routes the callers within each class to the appropriately trained subset of representatives. Depending on which types of requests are included in each class, these classes may differ from each other with respect to their distribution of service times and patience times. Consequently, one subset of representatives may serve a queue that receives arrivals from multiple classes that differ from each other in their typical service requirements and their callers’ patience levels. Call centers sometimes give priority to certain classes based on criteria such as the callers’ value to the organization or the criticality of the callers’ service request. However, in an effort to be fair, call centers often serve their callers on a firstcome, firstserved basis (FCFS) regardless of class. Because the FCFS policy is such a common practice, it is important to describe the performance of such systems. In this study, we do this by characterizing the performance of FCFS systems with two customer classes that may differ from each other in both their distribution of service times and their distribution of patience times.
As queue abandonment is a common customer behavior in many service systems, it is not surprising that a large number of studies have been devoted to characterizing queueing systems with impatient customers. Approaches for describing the performance of systems with a single class of impatient customers have included analytical characterizations (Daley [11], Baccelli and Hebuterne [3], Baccelli et al. [2], Stanford [21]) and performance approximations (Garnett et al. [12], Zeltyn and Mandelbaum [25], Iravani and Balcıoğlu [16]). The literature also contains several studies of twoclass systems. In most of these studies, one customer class is prioritized over the other. Choi et al. [10] analyze the underlying Markov process of an M/M/1 queue where one class of customers with constant patience times receives preemptive priority over a second class of customers who have no impatience. The authors obtain the joint distribution of the system size, and the Laplace transform (LT) of the response time of the second class of customers. Brandt and Brandt [6] extend the approach in [10] to include generally distributed patience times for the first class of customers. Iravani and Balcıoğlu [17] use the levelcrossing technique proposed by Brill and Posner [8, 9] to study two M/GI/1 settings. They first consider a preemptiveresume discipline where customers in both classes have exponentially distributed patience times, and then consider a nonpreemptive discipline where customers in the first class have exponentially distributed patience times, but customers in the second class have no impatience. Iravani and Balcıoğlu obtain the waiting time distributions for each class and the probability that customers in each class will abandon.
A handful of papers have dealt with priority queues with two classes of impatient customers in a multiserver setting. Motivated by call centers where callers may leave a voicemail, Brandt and Brandt [5] consider a multiserver system where callers from the first class are impatient and receive priority over callers from the second class, who have no impatience. Callers from the first class who renege may join the second class by leaving a voicemail and are only contacted when the number of idle servers in the system exceeds some threshold. The authors obtain the exact distribution of the number of callers in service from the first class, and approximations of the moments of the number of callers in service from the second class. Jouini and Roubos [18] consider an M/M/s + M queueing system where all customers have the same mean service times and mean patience times, but callers in one of the classes receive nonpreemptive priority. Within each class, customers may be served according to a FCFS or lastcome, firstserved (LCFS) discipline. Jouini and Roubos obtain the mean unconditional waiting times, the mean waiting times conditional on receiving service, and the mean waiting times conditional on abandoning the system for both classes under several policy permutations.
The studies that are most pertinent to our work are those in which classes of impatient customers are served on a FCFS basis, regardless of their class. Gurvich and Whitt [13] approximate the performance of a multiclass call center with impatient callers under the qualityandefficiencydriven (QED) regime introduced by Halfin and Whitt [14]. They analyze how the call center performs under a class of asymptotically optimal routing policies, of which the FCFS policy is a special case. Talreja and Whitt [23] rely on a deterministic fluid model to approximate the performance of a multiserver, multiclass FCFS system with impatient customers in an overloaded, efficiencydriven (ED) regime. Adan et al. [1] design heuristics to determine the staffing levels required to meet target service levels in an overloaded FCFS multiclass system with impatient customers. Van Houdt [24] considers an MAP/PH/1 multiclass queue where customers in each of the classes have a general distribution of patience times. Van Houdt develops a numerical procedure for analyzing the performance characteristics of the system by reducing the joint workload and arrival processes to a fluid queue, and expresses the steadystate measures using matrix analytical methods. His method produces an exact characterization of the waiting time distribution and abandonment probability under a discrete distribution of patience times, and approximations of the same performance measures under a continuous distribution of patience times. Sakuma and Takine [19] study the M/PH/1 system and assume that customers within each class have the same deterministic patience time. In addition to obtaining the waiting time distribution and abandonment probabilities, they obtain the joint queue length distribution. Finally, Sarhangian and Balcıoğlu [20] study two multiclass FCFS systems. The first system is an M/G/1+M queue where customers across classes may differ in their service time distribution, but all customers share common exponentially distributed patience times. The second system is an M/M/c+M queue where customers across classes may differ in their exponentially distributed patience times, but all customers share the same exponentially distributed service times. For both systems, the authors obtain the LT of the virtual waiting time for each of the k classes by exploiting the levelcrossing technique in [8] and [9]. They then relate the virtual waiting time to the actual waiting time to compute steadystate performance measures such as the mean waiting times and the percentage of customers who renege from each class.
In this study, we analyze two systems in which two classes of impatient customers are served according to an FCFS policy. The distinction between our setting and the settings in previous studies is that in our setting customers across classes may differ in their distributions of their service times and patience times, while in previous settings customers across classes may only differ in one of the two distributions. This distinction is crucial in characterizing the performance of multiclass systems with customers whose service times and patience may vary based on the complexity and criticality of their requests. We first consider an M/G/1+M queue and then analyze an M/M/k+M queue. To characterize the performance of these systems, we introduce a virtual waiting time process as described in Benes [4] and Takács et al. [22] (see Heyman and Sobel [15], pp. 383–390 for details). In a virtual waiting time process, the service times of customers who will eventually abandon the system are not considered. By analyzing this process, we obtain performance characteristics such as the percentage of customers who receive service in each class, the expected waiting times of customers in each class, and the average number of customers waiting in queue from each class. Note that although a related formula for the virtual waiting time in a single class M/G/1+PH queue is reported in [7], it is not suitable for direct computation as it consists of an exponentially growing number of terms. We next perform a numerical analysis of the M/M/k+M system under various arrival rates, mean service times, and mean patience times. Our analysis demonstrates that accounting for differences across classes in the distribution of customers’ service times and patience times is critical, as the performance of our system differs considerably from a system where only the service time distribution varies across classes. The results of our numerical analysis have broad managerial implications including service level forecasting, revenue management, and the evaluation of server productivity. As a final exercise, we compare the simulated performance of a system based on data from a multiclass call center with the performance measures of a comparable M/M/k+M system. To construct our simulated system, we select two classes from the data that differ in their distribution of service times and caller patience times. We find that the performance measures from the M/M/k+M system serve as good approximations of the performance of the simulated system based on the call center data.
The remainder of the paper is organized as follows: In Sect. 2, we analyze the M/G/1+M queue. In Sect. 3, we analyze the M/M/k+M queue, including a special case where the two classes share a common mean service time. In Sect. 4, we derive steadystate performance measures. In Sect. 5, we present our numerical analysis, and in Sect. 6 we compare the performance of the simulated system based on real data and our analytical characterization of the M/M/k+M system.
2 M/G/1+M system
We begin with a singleserver queueing system serving two classes of customers; see Fig. 1. Assume that class i (\(i=1,2\)) customers arrive according to independent Poisson processes with rate \(\lambda _i\) and need independent and identically distributed (iid) service times with cdf \(G_i(\cdot )\) and mean \(\tau _i\). The customers are impatient, and customers from class i leave the system after an exponential amount of time with parameter \(\theta _i\) (called the patience time) if their queueing time is longer than their patience time. The patience times of customers are independent of each other. For this system, we are interested in performance characteristics such as the longrun fraction of customers entering service, server utilization, and the expected waiting time for service. Note that due to the impatience of the customers in each class, the system will always be stable even if the total arrival rate exceeds the service rate.
It seems natural to study this model through its queue length process. However, keeping track of the number of customers from each class who are in the queue is not sufficient. We would also need to keep track of the class of each customer at each position in queue, since patience times depend on customer class. This renders the Markov process intractable. Thus, we introduce the virtual queueing time process below. This process appears to be tractable.
Let W(t) be the virtual queueing time at time t. We know that W(t) decreases with rate 1 at all times, while it is positive. If an arrival from class i occurs at time t and \(W(t) = w\), the arrival leaves without service with probability \(1\mathrm{{e}}^{\theta _i w}\), or enters service with probability \(\mathrm{{e}}^{\theta _i w}\), and needs a random amount of service with cdf \(G_i(\cdot )\). Hence, the distribution of \(S_i(t)\), the size of the upward jump at time t due to a class i arrival, given \(W(t) = w\), is given by
and thus
where \({\tilde{G}}_i(\cdot )\) is the LST of \(G_i(\cdot )\). Let
and
where W is the limit (in distribution) of W(t) as \(t \rightarrow \infty \). In the interval \((t,t+h],\) an arrival of type i occurs with probability \(\lambda _ih + o(h)\), and no event occurs with probability \(1  (\lambda _1 + \lambda _2)h + o(h)\) as \(h \rightarrow 0\). Then, we get
Inserting \(\mathrm{{e}}^{sh} = 1 + sh + o(h)\), rearranging terms and dividing by h, and then letting \(h \rightarrow 0\), we obtain
Now let \(t \rightarrow \infty \). Then, \(\frac{d}{dt}\psi (s,t) \rightarrow 0\), \(\psi (s,t) \rightarrow \psi (s)\) and \(\phi (s,t) \rightarrow \phi (s)\), so
Dividing by s and using
we finally get
Note that \({H}_i(s)/(\lambda _i \tau _i)\) is the LST of the equilibrium distribution of the service times of customers from class i.
Sarhangian and Balcıoğlu [20] have derived this equation by a different method; however, they could solve it only in the special case \(\theta _1 = \theta _2\). Here, we develop an efficient method to solve Eq. (2) even when \(\theta _1 \ne \theta _2\). We have included the detailed derivation of this equation, since the same steps can be used to derive the equations in the multiserver setup studied in the next section.
Repeated application of Eq. (2) shows that its solution can be written as
where
The terms \(c_{i,j}(s)\) satisfy the recursion
with initially \(c_{0,0} (s)= 1\) and \(c_{i,j} (s)= 0\) if \(i<0\) or \(j<0\). The recursive procedure to obtain (3) as well as convergence properties of the series c(s) will be explained in more detail in Sect. 3, where we analyze the multiserver system. Finally, to determine \(p_0\), we use \(\psi (0)=1\) in (2), yielding
so
We shall see in Sect. 4 that many performance measures can be computed in terms of \(\psi (\theta _i)\).
Remark 1
(HyperExponential impatience) For HyperExp(\(\theta _{ij}, p_{ij}\)) impatience of class i customers, it is straightforward to derive the equation
where
One can also show that our results reduce to known results (see Daley [11] for example) when specialized to a system with a single class of customers.
3 M/M/k+M system
3.1 Model
Now we consider a FCFS multiserver system with k servers serving two classes of customers; see Fig. 2. However, unlike in Sect. 2, we now assume that the service times are exponentially distributed. To be precise, we assume that customers from class i arrive according to a PP(\(\lambda _i\)), need iid Exp(\(\mu _i\)) service times, and exhibit iid Exp(\(\theta _i\)) patience times (\(i=1,2\)). If service does not start before the patience time expires, the customer leaves without service.
3.2 Virtual queueing time process
As in the singleserver case, we study the virtual queueing time process, augmented by a supplementary variable to construct a Markov process. Let W(t) be the virtual queueing time at time t in this system. This is the queueing time that would be experienced by a virtual customer arriving at time t. Let \(N_i(t)\) be the number of servers serving a class i customer just after time \(t+W(t)\) but before the next customer (if there is one) entering service at time \(t+W(t)\). This means that \(N_i(t)\) is the number of servers busy with a class i customer just before a customer arriving at time t enters service. So \(N_1 (t) + N_2 (t)\) is always at most \(k1\).
This unusual definition enables us to determine the size of the upward jump of the virtual queueing time if an arriving customer at time t decides to join the queue (since his patience exceeds W(t)). The jump is the minimum of the service time of the arriving customer and the residual service times of the customers in service at the moment he enters service at time \(t+W(t)\). As we will explain below, \(\{(W(t),N_1(t),N_2(t)), t \ge 0\}\) is a Markov process with upward jumps, the size of which depend on W(t), and a continuous downward deterministic drift of rate 1 between jumps.
Suppose \(W(t)=0\) and \((N_1(t),N_2(t)) = (i,j)\). Then, (i, j) is the number of busy servers of classes 1 and 2 at time t. For \(0 \le i+j \le k1\), the transition rates of services in state (0, i, j) are given by
and for \(0 \le i+j < k1\) the transition rates of arrivals are
This accounts for all transitions from states \((0,i,j), \; 0 \le i+j \le k1\), except for the transition rates of arrivals in states with \(i+j=k1\). These transition rates are described below.
Now suppose the state of the trivariate process at time t is (w, i, j) with \(w \ge 0\) and \(i+j=k1\). This means that just after time \(t+w\) we will have i busy servers of class 1 and j servers of class 2. Consider an arrival from class 1 at time t. This customer has to wait an amount w for service to begin. He reneges before his service starts with probability \(1  \mathrm{{e}}^{\theta _1 w}\), in which case the state does not change, and he enters service at time \(t+w\) with probability \(\mathrm{{e}}^{\theta _1 w}\). Then, the next departure occurs after an Exp\(((i+1)\mu _1 + j\mu _2)\) time X and the departure at time \(t+w+X\) is from class 1 with probability \((i+1) \mu _1 /((i+1) \mu _1 + j \mu _2)\) and from class 2 with probability \(j \mu _2 /((i+1) \mu _1 + j \mu _2 )\). In the first case, the state jumps at time t from (w, i, j) to \((w+X,i,j)\). In the second case, the state jumps to \((w+X,i+1,j1)\). The process is similar in the case of a class 2 arrival. Hence, we get the following transition rates from states (w, i, j) with \(w \ge 0\) and \(i+j=k1\):
Between upward jumps, the W process decreases continuously and deterministically at rate 1, while it is positive. When W reaches 0 in state (0, i, j), the process will stay in this state until either an arrival or service completion occurs.
A sample path of the Markov process \(\{(W(t),N_1(t),N_2(t)), t \ge 0\}\) is shown in Fig. 3. It describes the following events: At time \(t_0 = 0\), the system is in state (0, 1, 1), which means that the virtual queueing time is 0, two servers are busy, one with a class 1 customer and the other with a class 2 customer. Then, a class 2 customer arrives at time \(t_1\), so the system states change to state (0, 1, 2). At time \(t_2\), a class 1 customer arrives and W(t) jumps an Exp(\(2\mu _1+2\mu _2\)) amount. At time \(t_2+W(t_2)\), a class 2 customer will complete service (from the two class 1 and two class 2 customers in service), so \((N_1(t),N_2(t))\) jumps to (2, 1) at time \(t_2\). The next customer arrives at time \(t_3\). He is from class 1, and his patience exceeds \(W(t_3)\). So this customer joins the queue and W(t) jumps an Exp(\(3\mu _1+\mu _2\)) amount. At time \(t_3+W(t_3)\), the class 2 customer will complete service before one of the three class 1 customers in service, so \((N_1(t),N_2(t))\) jumps to (3, 0) at time \(t_3\). At time \(t_4\), a class 2 customer arrives, but his patience is less than \(W(t_4)\), and thus he leaves the system without receiving service. At time \(t_5 = t_3+W(t_3)\), the class 2 customer completes service and the virtual queueing time reaches 0. The last event in Fig. 3 occurs at time \(t_6\). A class 1 customer completes service, and the system state jumps to (0, 2, 0).
We next determine the Laplace–Stieltjes transform (LST) of the virtual queueing time.
3.3 Steadystate analysis
We first introduce the notation
and
where \((W,N_1,N_2)\) is the limit (in distribution) of \((W(t),N_1(t),N_2(t))\) as \(t \rightarrow \infty \).
We start with the balance equations for the steadystate probabilities \(p_{i,j}\). Let
Then, the balance equations can be written in vector–matrix form as
where the \((n+1) \times (n+1)\) matrix \(\varDelta _{n} = \mathrm{diag}(n\mu _2, \mu _1 + (n1)\mu _2, \ldots , n\mu _1)\) and the nonzero elements of the \((n+1) \times (n+2)\) matrix \(\Lambda _n = [\lambda _{n,i,j}]\) and the \((n+1) \times n\) matrix \(M_n = [\mu _{n,i,j}]\) are given by
The above equations can be simplified to
The \((n+1) \times n \) matrices \(R_n\) in (5) recursively follow from
where I denotes the identity matrix.
Now we proceed to derive differential equations for the timedependent LSTs \(\psi _i(s,t)\) and then take t to infinity to obtain the steadystate equations. For small \(h > 0\), we get
where, by convention, \(p_{i,j} (t) = 0\) if \(i < 0\) or \(j <0\). Inserting \(\mathrm{{e}}^{sh} = 1 + sh + o(h)\), rearranging terms and dividing by h, and then letting \(h \rightarrow 0\), we obtain
Now let \(t \rightarrow \infty \). Then, the system reaches steady state and \(\frac{d}{dt}\psi _i(s,t) \rightarrow 0\), \(\psi _i(s,t) \rightarrow \psi _i(s)\), \(\phi _i (s,t) \rightarrow \phi _i(s)\) and \(p_{i,j} (t) \rightarrow p_{i,j}\), so
It is useful to rewrite the above equations in vector–matrix form. Let
Then, (6) can be written as
and, by substituting (5),
where the nonzero elements of the \(k \times k\) matrices \(A_1(s) = [a_{1,i,j}(s)]\) and \(A_2(s) = [a_{2,i,j}(s)]\) are given by
For \(s > 0\), we can divide Eq. (7) by s and use the notation
to obtain
This equation is suitable to recursively determine \(\varvec{\psi }(s)\) for \(s >0\). Let
Then, Eq. (9) yields, for \(i, j \ge 0\),
To obtain \(\varvec{\psi }(s) = \varvec{\psi }_{0,0}(s)\), we can repeatedly apply the above equation:
and so on. This results after n iterations in the following expression for \(\varvec{\psi }(s)\):
The \(k \times k\) matrices \(C_{i,j}(s)\) are defined as follows: A sequence of grid points \({\varvec{p}}= \{(i_0,j_0), (i_1,j_1), \ldots , (i_n, j_n)\}\) is called a path from \((i_0,j_0)\) to \((i_n,j_n)\) if each of its steps \((i_{l+1}, j_{l+1})  (i_{l},j_{l})\) are either (1, 0) or (0, 1). For the path \({\varvec{p}}\), we introduce (see Fig. 4)
where, for \(l = 0, \ldots , n1\),
For the path \({\varvec{p}}= \{ (i_0,j_0)\}\), we set \(C_{\varvec{p}}(s) = I\). Let \({\mathcal {P}}(i,j)\) be the set of all paths from (0, 0) to (i, j). Then, \(C_{i,j} (s)\) is defined as
Note that, for \(i+j > 0\), the \(k \times k\) matrices \(C_{i,j}(s)\) can be recursively calculated from
where \(C_{0,0} (s) = I\) and \(C_{i,j} (s)\) is the all zero matrix if \(i<0\) or \(j<0\). The following lemma states that the series of \(C_{i,j} (s)\) is well defined for all \(s >0\).
Lemma 1
For each \(\delta >0\), the series \(\sum _{i=0}^\infty \sum _{j=0}^\infty C_{i,j}(s)\) is absolutely and uniformly convergent for all \(s>\delta \).
Proof
Fix \(\delta > 0\). It suffices to show that there are constants M and \(r < 1\) such that for all \(s>\delta \) and \(i+j \ge 0\),
where E is the allone matrix and the inequality is component wise. In this bound, s needs to stay away from zero, since \(H_1(s)\) and \(H_2 (s)\), and thus by (12) also \(C_{i,j} (s)\), are unbounded as s approaches zero. Now fix \(r < 1\). Since \(H_l (s) \le \lambda _l E / s\) (\(l = 1, 2\)), there is an \(N \ge 0\) such that, for \(s > \delta \) and \(i+j \ge N\),
Recursion (12) implies that, for each \(i,j \ge 0\), \(C_{i,j} (s)\) is bounded for \(s > \delta \). Hence, there is a (sufficiently large) M such that (13) is valid for \(s > \delta \) and the finitely many \(i+j \le N\). By induction we now prove that (13) is valid for all \(i+j \ge N\). Suppose it holds for all \(i+j = n\) (which is true for \(n=N\)). From (12) and (14), we get, for \(s > \delta \) and \(i+j = n+1\),
where the last inequality follows from the induction hypothesis.\(\square \)
Since \(D_{ij} (s)\) are uniformly bounded for all \(s>\delta >0\) and \(i+j \ge 0\), we immediately get the following.
Corollary 1
The series \(\sum _{i=0}^\infty \sum _{j=0}^\infty D_{i,j} (s) C_{i,j}(s)\) is absolutely and uniformly convergent for all \(s>\delta >0\).
Using that \(\psi _{i,j}(s) \le 1\), the second term in (11) is bounded by
So it vanishes as \(n \rightarrow \infty \) by virtue of the absolute convergence of the series of \(C_{i,j} (s)\). Hence, taking \(n \rightarrow \infty \) in (11), we get, from Corollary 1,
where
In particular, we have
To complete the LST of the virtual queueing time, we need to determine \({\varvec{p}}_{k1}\). First we set \(s=0\) in (7), yielding
where the second equality follows from (17). To uniquely determine \({\varvec{p}}_{k1}\), we finally need the normalizing equation
where \({\varvec{e}}\) is the allone vector and \({\varvec{p}}_n\) is given by (5) for \(0\le n < k1\). However, Eq. (19) requires the computation of \(\varvec{\phi }(0)\), which is the hard step. Taking the derivatives on both sides of (7) and setting \(s=0\), we get
Here, prime indicates derivative with respect to s. Thus, to calculate \(\varvec{\phi }(0)\) we need \(\varvec{\psi }'(s)\) at \(s=\theta _1\) and \(s=\theta _2\). For this we can use (15). Differentiating (15) yields
The terms \(C'_{i,j} (s)\) can be recursively computed by differentiating (12):
where \(C'_{i,j} (s)\) is the all zero matrix if \(i=j=0\) or if \(i<0\) or \(j<0\). Term by term differentiation of (16) is justified by the following two lemmas.
Lemma 2
For each \(\delta >0\), the series \(\sum _{i=0}^\infty \sum _{j=0}^\infty C'_{i,j}(s)\) is absolutely and uniformly convergent for all \(s>\delta \).
Proof
The proof is similar to the proof of Lemma 1. Fix \(\delta > 0\). It suffices to show that there are constants M and \(r < 1\) such that, for all \(s>\delta \) and \(i+j \ge 0\),
First fix \(r < 1\). Since \(H_l (s) \le \lambda _l E / s\) and \(H'_l (s) \le \lambda _l E / s^2\) (\(l = 1, 2\)), there is an \(N \ge 0\) such that, for \(s > \delta \) and \(i+j \ge N\),
Recursions (12) and (22) imply that, for each \(i,j \ge 0\), \(C_{i,j} (s)\) and \(C'_{i,j} (s)\) are bounded for \(s > \delta \). Hence, there is a (sufficiently large) M such that both (13) and (22) are valid for \(s > \delta \) and the finitely many \(i+j \le N\). Following the induction steps in the proof of Lemma 1, it follows that (13) is valid for all \(i+j \ge N\). We now show that (22) also holds for all \(i+j \ge N\). Suppose that (22) holds for all \(i+j = n\) (which is true for \(n=N\)). From (22) and (23), we get, for \(i+j = n+1\),
where the last inequality follows form the induction hypothesis. \(\square \)
Lemma 3
For each \(s >0\), the derivative of C(s) exists and is equal to
Proof
Fix \(s > 0\). First note that the series converges by Lemmas 1–2 and the fact that \(D_{i,j} (s)\) and \(D'_{i,j} (s)\) are uniformly bounded for all \(i+j \ge 0\). It suffices to prove that, for each sequence \(\{h_n\}\) converging to 0 such that \(s+h_n > 0\) for all n,
where \(B_{i,j} (s) = D_{i,j} (s) C_{i,j} (s)\). Let {\(h_n\)} be such a sequence. Thus, there is a \(\delta > 0\) such that \(s+h_n > \delta \) for all n. According to (13) and (22) and that the fact that \(D_{i,j} (s+h_n)\) and \(D'_{i,j} (s+h_n)\) are uniformly bounded for all n and \(i+j \ge 0\), there are constants M and \(r < 1\) such that, for all n and \(i+j \ge 0\),
We need to show that for each \(\epsilon >0\) there is an N such that, for \(n > N\),
Let \(\epsilon > 0\). For given n and \(i,j \ge 0\), it follows from the mean value theorem that there is an \(0<\eta <1\) such that \( (B_{i,j}(s+h_n)B_{i,j}(s))/h_n = B'_{i,j} (s + \eta h_n) \). Hence, by (24),
Note that M and r do not depend on n, i, j. So there is a constant K such that, for all n,
Further, for given \(i, j\ge 0\), \((B_{i,j}(s+h_n)B_{i,j}(s))/h_n\) converges to \(B'_{i,j} (s)\) as n tends to infinity. Hence, there is an N such that, for \(n > N\),
Combining (26) and (27) yields (25). \(\square \)
Substitution of (17) and (21) with \(s = \theta _i\) in (20) yields
and the normalization equation (19) can be rewritten as
The above findings are summarized in the following theorem.
Theorem 1
The steadystate LST \(\varvec{\psi }(s)\) of the virtual queueing time satisfies
where C(s) is defined by (16) and the probability vectors \({\varvec{p}}_n\) for \(0 \le n \le k1\) are the solution to the system of linear equations (5), (18) and (28).
3.4 Special case \(\mu _1 = \mu _2 = \mu \)
We now assume \(\mu _1=\mu _2=\mu \). This case has also been studied by Sarhangian and Balcıoğlu [20]. The problem simplifies considerably, since we do not need to keep track of \(N_1(t)\) and \(N_2(t)\) separately, only \(N(t) = N_1(t)+N_2(t)\). Define
Then, the balance equations (4) can be simplified to
and (9) reduces to
The solution of this equation is given by (cf. (15))
where
For \(i + j > 0\), the terms \(c_{i,j}(s)\) are determined from the recursion
with \(c_{0,0} (s) = 1\) and \(c_{i,j} (s) = 0\) if \(i<0\) or \(j<0\). The normalization equation becomes
Together with (29), this yields, for \(n = 0, \ldots , k1\),
where \(\rho = \frac{\lambda _1+\lambda _2}{\mu }\).
4 Performance measures
Now we show how many useful performance measures in steady state can be computed in terms of the LST evaluated at \(\theta _1\) and \(\theta _2\). Suppose the \(M/M/k+M\) system is in steady state. An arrival faces a queuing time of W. If the arrival is from class i, he will enter service if his impatience time \(T_i\) is longer than W. For class 1, this probability is given by the following (and similarly for class 2, by replacing \(\theta _1\) by \(\theta _2\)):
Here, we are using that \(W=0\) and \(E(\mathrm{{e}}^{\theta W};N_1 = i, N_2 = j) = p_{ij}\) when \(i+j < k1\). Next, using Little’s law, we see that the expected number of servers busy serving class 1 customers is given by
and the steady state throughput is equal to
Next we compute the expected time of a class 1 customer waiting for service. This is given by
By Little’s law, we obtain the following for the expected number of class i customers waiting for service:
Finally, the expected conditional waiting time of class i customers entering service follows from
where
These formulas simplify considerably when applied to the \(M/G/1+M\) system. In particular, the probability that the server is busy serving a class i customer is given by
where \(\psi (s)\) is as defined in Eq. (2). The probability that the server is busy is given by
In steady state, the throughput is given by
and the reneging rate by
The expected number of class i customers waiting for service in steady state is given by
The expected number of class i customers in the system in steady state is given by
This implies that, in the special case when \(\tau _i = 1/\theta _i\),
This is expected, since in this case the system behaves like an infinite server queue for each class of customers.
5 Numerical analysis
Recall that the previous analytical models of multiclass FCFS systems have allowed for customers across classes to differ in either their service time distributions or their patience time distributions, but not in both distributions [19, 20, 24]. A contribution of our work is that we allow customers across classes to differ in both distributions. This enables us to model service systems such as call centers that segment their arrivals into classes of callers whose requests may differ greatly in their complexity and criticality. We are therefore interested in using our characterization to compare the performance of a system where customers across classes differ in only one of the distributions with the performance of a system where the customers across classes differ in both distributions. To do this, we conduct a numerical analysis by using the performance measures that we derived for the M/M/k+M system in Sect. 4.
We compare the performance of three systems over a range of system loads. In all of the systems, requests from class 1 have a mean service time of 1 unit (\(\mu _1=1\)) and requests from class 2 have a mean service time of .5 units (\(\mu _2=2\)). In the first system, customers in both classes are equally patient, with a mean patience time of 2/3 units (\(\theta _1,\theta _2=1.5\)). We call this system the base system as customers from the two classes differ in their distribution of service times but not in their distribution of patience times. In the second system, the mean patience time of class 1 customers is 1 unit (\(\theta _1=1\)), while the mean patience time of class 2 customers is .5 units (\(\theta _2=2\)). We call this system the positive system as there is a positive correlation across classes between customers’ service times and patience times, i.e., class 1 customers have longer average service times and longer mean patience times. In the third system, class 1 customers have a mean patience time of .5 units (\(\theta _1=2\)), while class 2 customers have a mean patience time of 1 unit (\(\theta _2=1\)). Holding with our naming convention, we call this system the negative system. In each scenario in our analysis, the arrival rate of the class 1 and the class 2 customers are equal (\(\lambda _1=\lambda _2\)). To vary the system load, we hold the number of servers in the system at 5, while varying the total arrival rate (\(\lambda =\lambda _1+\lambda _2\)) between 6 and 20.
In Fig. 5, we present four steadystate performance measures from the three systems. The measures include the percentage of all customers who receive service, the mean waiting time of all customers, the system throughput, and the average service time of customers who receive service. We discuss each of these measures:

% All customers receiving service (Fig. 5a) The percentage of all customers who receive service is lowest in the positive system and highest in the negative system over all arrival rates. Recall that class 1 customers are more patient than class 2 customers in the positive system and vice versa in the negative system. Consequently, a greater (smaller) proportion of the customers who receive service are from class 1 in the positive (negative) system. Also recall that class 1 customers have longer average service times than class 2 customers. Since servers in the positive (negative) system spend a greater (smaller) percentage of their time serving the customers with longer service times, their aggregate service rate decreases (increases), which decreases (increases) the percentage of all customers who receive service. This result has ramifications for service level forecasting as one of the common measures of service level is the percentage of customers who receive service. This demonstrates that managers who do not account for differences in the distribution of customers’ service times and patience times across classes may produce inaccurate service level forecasts.

Mean waiting time of all customers (Fig. 5b) Irrespective of the arrival rate, customers have the highest mean waiting times in the positive system. However, the mean waiting times of customers in the base system and the negative system are nearly the same. This result was surprising as we expected the average waiting times of the customers in the negative system to be lowest since the average waiting times of customers are highest in the positive system. Because this trend was puzzling to us, we calculate the mean waiting times conditional on receiving service and on reneging for both systems. We find two trends. First, in both of the systems the average waiting times of customers who receive service are higher than the average waiting times of customers who renege. Second, both of these waiting time measures are lower in the negative system than in the base system. So, if both of these measures are lower in the negative system, why are the average waiting times of all customers nearly equal between the two systems? The answer lies in the fact that a higher percentage of customers receive service in the negative system. One can think of the average waiting time of all customers as a weighted average of the waiting times of the customers who receive service and of the customers who renege, where the respective weights are the percentage of customers who receive service and the percentage of customers who renege. Recall from Fig. 5a that the percentage of customers who receive service is highest in the negative system. This means that in the negative system there is a greater weight from a subset of customers who tend to wait longer, since the average waiting time for customers who receive service is greater than the average waiting time of customers who renege. The result is that the average waiting times of all customers in the base system and the negative system are nearly equal even though the average waiting times of customers who receive service and the average waiting times of customers who renege are both lower in the negative system.^{Footnote 1} This result again has ramifications for service level forecasting as the average waiting time of all customers is another common measure of service level.

System throughput (Fig. 5c) We observed in Fig. 5a that the percentage of customers receiving service is lowest (highest) under the positive (negative) system. It is therefore not surprising that system throughput is lowest (highest) under the positive (negative) system. However, what is surprising is that in the positive system, throughput is first increasing in the arrival rate, but is then decreasing after some threshold arrival rate. Initially, increasing the arrival rate increases server utilization and hence system throughput. However, the gains in throughput due to the increase in server utilization diminish and are eventually offset by a reduction in the effective service rate of the system. The reason that the service rate is decreasing in system load is that the percentage of time the servers spend with customers from class 1 (who take longer to serve) is increasing in load. This is because a higher proportion of class 2 customers renege as load increases since they are less patient than class 1 customers. The interesting takeaway is that in a system where customers’ service times and patience times are positively correlated, increasing traffic can actually decrease throughput. This result has potential implications for systems with limited service capacity that generate revenue based on system throughput, for example, restaurants. Mangers of such systems may attempt to increase traffic through marketing efforts in order to generate additional revenue but may instead reduce their revenue if the customers in their system who take longer to serve are also more patient.

Average service time Fig. 5d In the base system, the average service time of customers who receive service remains the same regardless of the arrival rate. This is because customers in each class are equally patient, which makes the proportion of customers receiving service who are from each class invariant to the system load. However, in the positive (negative) system, the average service time increases (decreases) as the arrival rate increases. In the positive system, the class 1 customers are more patient. Thus, as the load on the system increases, the proportion of customers receiving service who are from class 1 increases. Because class 1 customers take longer to serve, the average service time increases as the arrival rate increases. The reverse is true in the negative system, where class 2 customers are more patient but take less time to serve. This relationship between system load and average service time has managerial implications. Servers in customerfacing systems are often evaluated and incentivized based on their average service times. For example, a common practice in call centers is to provide financial rewards for agents to keep their average service time under some threshold. Our analysis shows that managers may reach false conclusions regarding their servers’ productivity if they evaluate their servers based solely on their average service times. In the negative system, managers may receive the impression that servers are speeding up when the arrival rate is higher. Consequently, managers may wrongly reward their servers for their supposed efforts to work faster under heavy loads. Conversely, in the positive system, managers may falsely reprimand their servers for allegedly slowing down as arrival rates increase. Our model demonstrates than an observed correlation between average service times and system load, such as we see in Fig. 5d, may have no correlation with the servers’ efforts. Rather, the observed correlation may be entirely due to a correlation across classes between customers’ service times and patience times.
We make one final observation regarding the performance of these systems as the arrival rate tends to infinity. We have shown that of the customers who receive service, the proportion who are from the more patient class increases as the arrival rate increases. Intuitively, we would expect this proportion to tend to one as the arrival rate tends to infinity due to the stochastic dominance of the patience times of one class over the other. Indeed, as we increase the arrival rate in the negative and positive systems from our numerical examples, we observe this phenomenon. For example, when we set the arrival rate to 1000 for each class in the positive system, of the callers who receive service, 95.9% are from class 2. An outcome of this phenomenon is that as the arrival rate tends to infinity the average service time and throughput of the system tends to the same performance measures of an overloaded system comprised entirely of customers from the more patient class. Hence, in the positive system the average service time tends to 1 and system throughput tends to 5, while in the negative system average service time tends to .5 and system throughput tends to 10.
Overall, our numerical analysis demonstrates the importance of accounting for differences across classes in the distribution of customers’ service times and patience times. We see that differences in these distributions may substantially affect key performance measures, which have a wide range of managerial implications, including service level forecasting, revenue management, and the evaluation of server performance. Consequently, a contribution of our work is that our analytical characterizations may be used to demonstrate to managers some of the implications of administering a multiclass FCFS queue.
6 Model as performance approximation
Managers may be interested in knowing whether they may use the performance measures from our model to approximate the performance of their real world systems. Thus, as a final exercise we compare the simulated performance of a system based on real data with the performance of a comparable M/M/k+M system. Our data come from a multiclass call center of a small USbased bank. To construct our simulated system, we select two classes from the data which differ in their distribution of service times and caller patience times. The first class is comprised of general banking calls such as balance inquiries, and the second class is comprised of technical support calls such as password reset requests. In Fig. 6 we display the estimated density function of the service times from each class, and in Fig. 7 we display the estimated distribution function of the callers’ patience times from each class.^{Footnote 2} Note that class 1 callers tend to have shorter service times and shorter patience times than class 2 callers,^{Footnote 3} i.e., there is a positive correlation across the classes between service times and patience times. To make the systems comparable, we set the mean service time and mean patience time of the two classes in the M/M/k+M system equal to the estimated means of the two classes in the simulated system. Finally, we are interested in understanding to what extent the accuracy of the approximation increases when service rates are allowed to differ across classes. Thus, we also obtain the performance measures of this system under the restriction that the service rates across classes are equal. To do this, we calculate the average service rate across both classes and use the formulas from Sect. 3.4 to obtain the performance measures.
In all of the systems, we assume that calls from each class arrive according to independent Poisson processes with equal arrival rates. To compare the performance of the systems across different loads, we hold the number of servers in the system at 5 and vary the total arrival rate to be 36, 45, 60, and 120 calls per hour. In Table 1, we compare the performance of the three systems, where “Simulation” corresponds to the simulated system, “\(\mu _1\ne \mu _2\)” corresponds to the analytical characterization of the system where service rates are allowed to differ across classes and “\(\mu _1=\mu _2\)” corresponds to the analytical characterization where we restrict the service rates to be the same across classes. For each class, we present the average waiting time in seconds (AWT), the percentage of callers who receive service (%RS), and the average number of callers waiting in queue (AQ). We also present the server utilization (Util %), and the average service time of callers who receive service (AST(All)). We measure how close our analytical characterizations come to the simulated system using relative error, which is given by \(\text {simulated}  \text {analytical}/\text {simulated}.\) Overall, we find that the performance measures of the \(\mu _1\ne \mu _2\) system serve as good approximations of the simulated system, with relative errors of no greater than 3.76%. Of particular note is the high accuracy in predicting the percentage of callers who receive service (%RS) and the average service time of callers who receive service (AST(All)), with relative errors typically less than 1%. Under the lowest arrival rate (\(\lambda =36\)), the \(\mu _1=\mu _2\) characterization provides nearly the same approximation accuracy as the \(\mu _1\ne \mu _2\) characterization. However, as the arrival rate increases, the accuracy of the \(\mu _1 = \mu _2\) characterization decreases relative to the \(\mu _1\ne \mu _2\) characterization. In particular, the \(\mu _1 = \mu _2\) characterization underforecasts AWT, AQ, Util% and AST(All) while overforecasting %RS. This is due to the fact that, in this system, callers who tend to be more patient also tend to have longer service times. While the \(\mu _1\ne \mu _2\) characterization accounts for this correlation, the \(\mu _1 = \mu _2\) characterization does not. Overall, these results demonstrate that managers of twoclass FCFS service systems may use our analytical characterization of the \(\mu _1\ne \mu _2\) system to produce good approximations of the performance of their systems by collecting the mean service time and the mean patience time of customers in each class.
Notes
For example, in the base system with \(\lambda =20\), the average waiting time of customers who receive service is 0.654 and the average waiting time of customers who renege is 0.337. In the negative system, the respective averages are 0.641 and 0.324. However, in the base system the percentage of customers receiving service is 33.4%, while it is 37.2% in the negative system. Hence, the total average waiting time in the base system is \(0.334\times 0.654 + (10.334)\times 0.337 = 0.443\), and the total average waiting time in the negative system is \(0.372\times 0.641 + (10.372)\times 0.324 = 0.442\), which are nearly equal.
Because we do not know the patience times of the callers in our data who received service, the patience time data are rightcensored. Thus, we estimate the patience time distribution using the Kaplan–Meier estimator, which accounts for this form of censoring. However, since waiting times in this call center were not long enough to reveal the entire distribution of patience times, the Kaplan–Meier estimator only estimates a portion of the distribution. To fill in the remainder of the distribution, we assume that callers’ patience times are exponentially distributed with rate parameters equal to the callers’ reneging rate over the estimated portion of the distribution.
The mean service times of callers from class 1 and class 2 are 223.97 and 448.82 s, respectively. The mean patience times are 394.08 and 946.53 s, respectively.
References
Adan, I., Boon, M., Weiss, G.: Design and evaluation of overloaded service systems with skill based routing, under FCFS policies. Perform. Eval. 70(10), 873–888 (2013)
Baccelli, F., Boyer, P., Hebuterne, G.: Singleserver queues with impatient customers. Adv. Appl. Probab. 16(4), 887–905 (1984)
Baccelli, F., Hebuterne, G.: On queues with impatient customers. [Research Report] RR0094, HAL Id: inria00076467, INRIA (1981)
Benes, V.E.: General Stochastic Processes in the Theory of Queues. AddisonWesley Publishing Company, Inc., Reading, MA (1963)
Brandt, A., Brandt, M.: On a twoqueue priority system with impatience and its application to a call center. Methodol. Comput. Appl. Probab. 1(2), 191–210 (1999)
Brandt, A., Brandt, M.: On the twoclass M/M/1 system under preemptive resume and impatience of the prioritized customers. Queueing Syst. 47(1), 147–168 (2004)
Brandt, A., Brandt, M.: Workload and busy period for M/GI/1 with a general impatience mechanism. Queueing Syst. 75(2–4), 189–209 (2013)
Brill, P., Posner, M.J.: Level crossings in point processes applied to queues: singleserver case. Oper. Res. 25(4), 662–674 (1977)
Brill, P., Posner, M.J.: The system point method in exponential queues: a level crossing approach. Math. Oper. Res. 6(1), 31–49 (1981)
Choi, B.D., Kim, B., Chung, J.: M/M/1 queue with impatient customers of higher priority. Queueing Syst. 38(1), 49–66 (2001)
Daley, D.: General customer impatience in the queue GI/G/1. J. Appl. Probab. 2(1), 186–205 (1965)
Garnett, O., Mandelbaum, A., Reiman, M.: Designing a call center with impatient customers. Manuf. Serv. Oper. Manag. 4(3), 208–227 (2002)
Gurvich, I., Whitt, W.: Servicelevel differentiation in manyserver service systems: a solution based on fixedqueueratio routing. Oper. Res. 29, 567–588 (2007)
Halfin, S., Whitt, W.: Heavytraffic limits for queues with many exponential servers. Oper. Res. 29(3), 567–588 (1981)
Heyman, D.P., Sobel, M.J.: Stochastic Models in Operations Research: Stochastic Optimization, vol. 2. Courier Corporation, Chelmsford (1982)
Iravani, F., Balcıogğlu, B.: Approximations for the M/GI/n+GI type call center. Queueing Syst. 58(2), 137–153 (2008)
Iravani, F., Balcıoğlu, B.: On priority queues with impatient customers. Queueing Syst. 58(4), 239–260 (2008)
Jouini, O., Roubos, A.: On multiple priority multiserver queues with impatience. J. Oper. Res. Soc. 65(5), 616–632 (2014)
Sakuma, Y., Takine, T.: Multiclass M/PH/1 queues with deterministic impatience times. Stoch. Models 33(1), 1–29 (2017)
Sarhangian, V., Balcioğlu, B.: Waiting time analysis of multiclass queues with impatient customers. Probab. Eng. Inf. Sci. 27(3), 333–352 (2013)
Stanford, R.E.: Reneging phenomena in single channel queues. Math. Oper. Res. 4(2), 162–178 (1979)
Takács, M., et al.: Introduction to the Theory of Queues. Oxford University Press, New York (1962)
Talreja, R., Whitt, W.: Fluid models for overloaded multiclass manyserver queueing systems with firstcome, firstserved routing. Management Science 54(8), 1513–1527 (2008)
Van Houdt, B.: Analysis of the adaptive MMAP[k]/PH[k]/1 queue: a multitype queue with adaptive arrivals and general impatience. European Journal of Operational Research 220(3), 695–704 (2012)
Zeltyn, S., Mandelbaum, A.: Call centers with impatient customers: manyserver asymptotics of the M/M/n+G queue. Queueing Syst. 51(3), 361–402 (2005)
Acknowledgements
The authors would like to thank Marko Boon for carefully checking all calculations with Mathematica.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Adan, I., Hathaway, B. & Kulkarni, V.G. On firstcome, firstserved queues with two classes of impatient customers. Queueing Syst 91, 113–142 (2019). https://doi.org/10.1007/s111340189592z
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s111340189592z
Keywords
 Call centers
 Impatient customers
 Virtual queueing time process
 M/M/k + M queue
 M/G/1 + M queue
Mathematics Subject Classification
 60K25
 68M20
 90B22