On First Come, First Served Queues with Two Classes of Impatient Customers

We study systems with two classes of impatient customers who differ across the classes in their distribution of service times and patience times. The customers are served on a first-come, first served basis (FCFS) regardless of their class. Such systems are common in customer call centers, which often segment their arrivals into classes of callers whose requests differ in their complexity and criticality. We first consider an $M$/$G$/1 + $M$ queue and then analyze the $M$/$M$/$k$ + $M$ case. Analyzing these systems using a queue length process proves intractable as it would require us to keep track of the class of each customer at each position in queue. Consequently, we introduce a virtual waiting time process where the service times of customers who will eventually abandon the system are not considered. We analyze this process to obtain performance measures such as the percentage of customers receiving service in each class, the expected waiting times of customers in each class, and the average number of customers waiting in queue. We use our characterization to perform a numerical analysis of the $M$/$M$/$k$ + $M$ system, and find several managerial implications of administering a FCFS system with multiple classes of impatient customers. Finally, we compare the performance a system based on data from a call center with the steady-state performance measures of a comparable $M$/$M$/$k$ + $M$ system. We find that the performance measures of the $M$/$M$/$k$ + $M$ system serve as good approximations of the system based on real data.


Introduction
In this paper we analyze queueing systems with different classes of customers who may abandon (renege from) the system if their waiting time exceeds their patience time, i.e., the maximum amount of time they are willing to wait before abandoning the system. This work is motivated primarily by customer call centers, which often segment their callers into different classes. Since call centers are the prevalent customer-facing service channel of many organizations, they often receive a variety of caller requests which may differ significantly in their service requirements and criticality. For example, banking call centers receive requests as simple as balance inquiries and as complex and critical as dealing with fraudulent activity on a caller's account. While a service representative can obtain an account balance relatively quickly, handling fraudulent activity takes longer as it involves a higher level of legal expertise and paperwork. Furthermore, because fraudulent activity is usually more critical than obtaining a balance, callers who are calling about fraud may be more patient than callers who are calling to obtain a balance. Because callers' service requests vary so greatly, call centers often train subsets of their representatives to handle only certain types of service requests. Based on the service the callers request from the phone menu, the Automatic Call Distributor (ACD) segments callers into classes and routes the callers within each class to the appropriately trained subset of representatives. Depending on which types of requests are included in each class, these classes may differ from each other with respect to their distribution of service times and patience times. Consequently, one subset of representatives may serve a queue that receives arrivals from multiple classes that differ from each other in their typical service requirements and their callers' patience levels. Call centers sometimes give priority to certain classes based on criteria such as the callers' value to the organization or the criticality of the callers' service request. However, in an effort to be fair, call centers often serve their callers on a first-come, first-served basis (FCFS) regardless of class. Because the FCFS policy is such a common practice, it is important to describe the performance of such systems. In this study we do this by characterizing the performance of FCFS systems with two customer classes that may differ from each other in both their distribution of service times and their distribution of patience times.
As queue abandonment is a common customer behavior in many service systems, it is not surprising that a large number of studies have been devoted to characterizing queueing systems with impatient customers. Approaches for describing the performance of systems with a single class of impatient customers have included analytical characterizations (Daley [11], Baccelli and Hebuterne [3], Baccelli et al. [2], Stanford [21]), and performance approximations (Garnett et al. [12], Zeltyn and Mandelbaum [25], Iravani and Balcıoglu [16]). The literature also contains several studies of two-class systems. In most of these studies, one customer class is prioritized over the other. Choi et al. [10] analyze the underlying Markov process of an M /M /1 queue where one class of customers with constant patience times receives preemptive priority over a second class of customers who have no impatience. The authors obtain the joint distribution of the system size, and the Laplace transform (LT) of the response time of the second class of customers. Brandt and Brandt [6] extend the approach in [10] to include generally distributed patience times for the first class of customers. Iravani and Balcıoglu [17] use the level-crossing technique proposed in Brill and Posner [8] and Brill and Posner [9] to study two M /GI/1 settings. They first consider a preemptive-resume discipline where customers in both classes have exponentially distributed patience times, and then consider a non-preemptive discipline where customers in the first class have exponentially distributed patience times, but customers in the second class have no impatience. Iravani and Balcıoglu obtain the waiting time distributions for each class and the probability that customers in each class will abandon.
A handful of papers have dealt with priority queues with two classes of impatient customers in a multiserver setting. Motivated by call centers where callers may leave a voicemail, Brandt and Brandt [5] consider a multi-server system where callers from the first class are impatient and receive priority over callers from the second class, who have no impatience. Callers from the first class who renege may join the second class by leaving a voicemail and are only contacted when the number of idle servers in the system exceeds some threshold. The authors obtain the exact distribution of the number of callers in service from the first class, and approximations of the moments of the number of callers in service from the second class. Jouini and Roubos [18] consider an M /M /s + M queueing system where all customers have the same mean service times and mean patience times, but callers in one of the classes receive non-preemptive priority. Within each class, customers may be served according to a FCFS or last-come, first-served (LCFS) discipline. Jouini and Roubos obtain the mean unconditional waiting times, the mean waiting times conditional on receiving service, and the mean waiting times conditional on abandoning the system for both classes under several policy permutations.
The studies that are most pertinent to our work are those in which classes of impatient customers are served on a FCFS basis, regardless of their class. Gurvich and Whitt [13] approximate the performance of a multiclass call center with impatient callers under the quality-and-efficiency-driven (QED) regime introduced by Halfin and Whitt [14]. They analyze how the call center performs under a class of asymptotically optimal routing policies, of which the FCFS policy is a special case. Talreja and Whitt [23] rely on a deterministic fluid model to approximate the performance of a multiserver, multiclass FCFS system with impatient customers in an overloaded, efficiency-driven (ED) regime. Adan et al. [1] design heuristics to determine the staffing levels required to meet target service levels in an overloaded FCFS multiclass system with impatient customers. Van Houdt [24] considers an M AP /P H/1 multiclass queue where customers in each of the classes have a general distribution of patience times. Van Houdt develops a numerical procedure for analyzing the performance characteristics of the system by reducing the joint workload and arrival processes into a fluid queue, and expresses the steady state measures using matrix analytical methods. His method produces an exact characterization of the waiting time distribution and abandonment probability under a discrete distribution of patience times, and approximations of the same performance measures under a continuous distribution of patience times. Sakuma and Takine [19] study the M /P H/1 system and assume that customers within each class have the same deterministic patience time. In addition to obtaining the waiting time distribution and abandonment probabilities, they obtain the joint queue-length distribution. Finally, Sarhangian and Balcıoglu [20] study two multiclass FCFS systems. The first system is an M /G/1+M queue where customers across classes may differ in their service time distribution, but all customers share common exponentially distributed patience times. The second system is an M /M /c+M queue where customers across classes may differ in their exponentially distributed patience times, but all customers share the same exponentially distributed service times. For both systems, the authors obtain the LT of the virtual waiting time for each of the k classes by exploiting the level crossing technique in [8] and [9]. They then relate the virtual waiting time to the actual waiting time to compute steady-state performance measures such as the mean waiting times and the percentage of customers who renege from each class.
In this study we analyze two systems in which two classes of impatient customers are served according to an FCFS policy. The distinction between our setting and the settings in previous studies is that in our setting customers across classes may differ in their distributions of their service times and patience times, while in previous settings customers across classes may only differ in one of the two distributions. This distinction is crucial in characterizing the performance of multiclass systems with customers whose service times and patience may vary based on the complexity and criticality of their requests. We first consider an M /G/1+M queue, and then analyze an M /M /k+M queue. To characterize the performance of these systems, we introduce a virtual waiting time process as described in Benes [4] and Takács, et al. [22] (see Heyman and Sobel [15], pp. 383-390 for details). In a virtual waiting time process the service times of customers who will eventually abandon the system are not considered. By analyzing this process we obtain performance characteristics such as the percentage of customers who receive service in each class, the expected waiting times of customers in each class, and the average number of customers waiting in queue from each class. Note that although a related formula for the virtual waiting time in a single class M /G/1+M queue is reported in [7], it is not suitable for direct computation as it consists of an exponentially growing number of terms. We next perform a numerical analysis of the M /M /k+M system under various arrival rates, mean service times, and mean patience times. Our analysis demonstrates that accounting for differences across classes in the distribution of customers' service times and patience times is critical as the performance of our system differs considerably from a system where only the service time distribution varies across classes. The results of our numerical analysis have broad managerial implications including service level forecasting, revenue management, and the evaluation of server productivity. As a final exercise, we compare the simulated performance of a system based on data from a multiclass call center with the performance measures of a comparable M /M /k+M system. To construct our simulated system, we select two classes from the data that differ in their distribution of service times and caller patience times. We find that the performance measures from the M /M /k+M system serve as good approximations of the performance of the simulated system based on the call center data.
The remainder of the paper is organized as follows. In Sect. 2 we analyze the M /G/1+M queue. In Sect. 3 we analyze the M /M /k+M queue, including a special case where the two classes share a common mean service time. In Sect. 4 we derive steady-state performance measures. In Sect. 5 we present our numerical analysis, and in Sect. 6 we compare the performance of the simulated system based on real data and our analytical characterization of the M /M /k+M system.

M/G/1+M System
We begin with a single server queueing system serving two classes of customers, see Figure 1. Assume that class i (i = 1, 2) customers arrive according to independent Poisson processes with rate λ i and need independent and identically distributed (iid) service times with cdf G i (·) and mean τ i . The customers are impatient, and customers from class i leave the system after an exponential amount of time with parameter θ i (called the patience time) if their queueing time is longer than their patience time. The patience times of customers are independent of each other. For this system we are interested in performance characteristics such as the long-run fraction of customers entering service, server utilization, and the expected waiting time for service. Note that due to the impatience of the customers in each class, the system will always be stable even if the total arrival rate exceeds the service rate. Fig. 1 Single server model.
It seems natural to study this model through its queue length process. However, keeping track of the number of customers from each class who are in the queue is not sufficient. We would also need to keep track of the class of each customer at each position in queue, since patience times depend on customer class. This renders the Markov process intractable. Thus, we introduce the virtual queueing time process below. This process appears to be tractable.
Let W (t) be the virtual queueing time at time t. We know that W (t) decreases with rate 1 at all times while it is positive. If an arrival from class i occurs at time t and W (t) = w, the arrival leaves without service with probability 1 − e −θ i w , or enters service with probability e −θ i w , and needs a random amount of service with cdf G i (·). Hence the distribution of S i (t), the size of the upward jump at time t due to a class i arrival, given W (t) = w, is given by and thus and where W is the limit (in distribution) of W (t) as t → ∞. In the interval (t, t + h], an arrival of type i occurs with probability λ i h + o(h), and no event occurs with probability 1 − (λ 1 + λ 2 )h + o(h) as h → 0. Then we get Inserting e sh = 1 + sh + o(h), rearranging terms and dividing by h and then letting h → 0, we obtain Dividing by s and using we finally get Note that H i (s)/(λ i τ i ) is the LST of the equilibrium distribution of the service times of customers from class i. Sarhangian and Balcıoglu [20] have derived this equation by a different method, however they could solve it only in the special case θ 1 = θ 2 . Here we develop an efficient method to solve Eq. 2 even when θ 1 = θ 2 . We have included the detailed derivation of this equation, since the same steps can be used to derive the equations in the multi-server set up studied in the next section.
Repeated application of Eq. (2) shows that its solution can be written as where The terms c i,j (s) satisfy the recursion with initially c 0,0 (s) = 1 and c i,j (s) = 0 if i < 0 or j < 0. The recursive procedure to obtain (3) as well as convergence properties of the series c(s) will be explained in more detail in Sect. 3, where we analyze the multi server system. Finally, to determine p 0 , we use ψ(0) = 1 in (2), yielding We shall see in Sect. 4 that many performance measures can be computed in terms of ψ(θ i ).
Remark 1 (Hyper-Exponential impatience) For Hyper-Exp(θ ij , p ij ) impatience of class i customers, it is straightforward to derive the equation One can also show that our results reduce to known results (see Daley [11] for example) when specialized to a system with a single class of customers.

Model
Now we consider a FCFS multi server system with k servers serving two classes of customers, see Fig. 2. However, unlike in Sect. 2, we now assume that the service times are exponentially distributed. To be precise, we assume that customers from class i arrive according to a PP(λ i ), need iid Exp(µ i ) service times, and exhibit iid Exp(θ i ) patience times (i = 1, 2). If service does not start before the patience time expires, the customer leaves without service.

Virtual queueing time process
As in the single server case, we study the virtual queueing time process, augmented by a supplementary variable to construct a Markov process. Let W (t) be the virtual queueing time at time t in this system. This is the queueing time that would be experienced by a virtual customer arriving at time t. Let N i (t) be the number of servers serving a class i customer just after time t + W (t) but before the next customer (if there is one) entering service at time t + W (t). This means that N i (t) is the number of servers busy with a class i customer just before a customer arriving at time t enters service. So N 1 (t) + N 2 (t) is always at most k − 1. This unusual definition enables us to determine the size of the upward jump of the virtual queueing time if an arriving customer at time t decides to join the queue (since his patience exceeds W (t)). The jump is the minimum of the service time of the arriving customer and the residual service times of the customers in service at the moment he enters service at time t + W (t). As we will explain below, {(W (t), N 1 (t), N 2 (t)), t ≥ 0} is a Markov process with upward jumps, the size of which depend on W (t), and a continuous downward deterministic drift of rate 1 between jumps.
Suppose W (t) = 0 and (N 1 (t), N 2 (t)) = (i, j). Then (i, j) is the number of busy servers of class 1 and 2 at time t. For 0 ≤ i + j ≤ k − 1 the transition rates of services in state (0, i, j) are given by and for 0 ≤ i + j < k − 1 the transition rates of arrivals are This accounts for all transitions from states (0, i, j), 0 ≤ i + j ≤ k − 1, except for the transition rates of arrivals in states with i + j = k − 1. These transition rates are described below.
Now suppose the state of the tri-variate process at time t is (w, i, j) with w ≥ 0 and i + j = k − 1. This means just after time t + w we will have i busy servers of class 1 and j servers of class 2. Consider an arrival from class 1 at time t. This customer has to wait an amount w for service to begin. He reneges before his service starts with probability 1 − e −θ 1 w , in which case the state does not change, and he enters service at time t + w with probability e −θ 1 w . Then the next departure occurs after an Exp((i + 1)µ 1 + jµ 2 ) time X and the departure at time t + w + X is from class 1 with probability (i + 1)µ 1 /((i + 1)µ 1 + jµ 2 ) and from class 2 with probability jµ 2 /((i + 1)µ 1 + jµ 2 ). In the first case the state jumps at time t from (w, i, j) to (w + X, i, j). In the second case the state jumps to (w + X, i + 1, j − 1). The process is similar in case of a class 2 arrival. Hence we get the following transition rates from states (w, i, j) with w ≥ 0 and i + j = k − 1: Between upward jumps, the W process decreases continuously and deterministically at rate 1 while it is positive. When W reaches 0 in state (0, i, j), the process will stay in this state until either an arrival or service completion occurs. A sample path of the Markov process {(W (t), N 1 (t), N 2 (t)), t ≥ 0} is shown in Fig. 3. It describes the following events. At time t 0 = 0 the system is in state (0, 1, 1), which means that the virtual queueing time is 0, 2 servers are busy, one with a class 1 customer and the other with a class 2 customer. Then a class 2 customer arrives at time t 1 , so the system states changes to state (0, 1, 2). At time t 2 a class 1 customer arrives and W (t) jumps an Exp(2µ 1 + 2µ 2 ) amount. At time t 2 + W (t 2 ) a class 2 customer will complete service (from the two class 1 and two class 2 customers in service), so (N 1 (t), N 2 (t)) jumps to (2, 1) at time t 2 . The next customer arrives at time t 3 . He is from class 1 and his patience exceeds W (t 3 ). So this customer joins the queue and W (t) jumps an Exp(3µ 1 + µ 2 ) amount. At time t 3 + W (t 3 ) the class 2 customer will complete service before one of the three class 1 customers in service, so (N 1 (t), N 2 (t)) jumps to (3, 0) at time t 3 . At time t 4 a class 2 customer arrives, but his patience is less than W (t 4 ), and thus he leaves the system without receiving service. At time t 5 = t 3 + W (t 3 ) the class 2 customer completes service and the virtual queueing time reaches 0. The last event in Fig. 3 occurs at time t 6 . A class 1 customer completes service and the system state jumps to (0, 2, 0). We next determine the Laplace-Stieltjes transform (LST) of the virtual queueing time.

Steady-state analysis
We first introduce the notation We start with the balance equations for the steady-state probabilities p i,j . Let Then the balance equations can be written in vector-matrix form as where the (n + 1) × (n + 1) matrix ∆ µ,n = diag(nµ 2 , µ 1 + (n − 1)µ 2 , . . . , nµ 1 ) and the non-zero elements of the (n + 1) × (n + 2) matrix Λ n = [λ n,i,j ] and (n + 1) × n matrix M n = [µ n,i,j ] are given by The above equations can be simplified to The (n + 1) × n matrices R n in (5) recursively follow from where I denotes the identity matrix. Now we proceed to derive differential equations for the time-dependent LSTs ψ i (s, t) and then take t to infinity to obtain the steady-state equations. For small h > 0 we get where, by convention, p i,j (t) = 0 if i < 0 or j < 0. Inserting e sh = 1 + sh + o(h), rearranging terms and dividing by h and then letting h → 0, we obtain Now let t → ∞. Then the system reaches steady state and d It is useful to rewrite the above equations in vector-matrix form. Let Then (6) can be written as and by substituting (5), where the non-zero elements of the k × k matrices A 1 (s) = [a 1,i,j (s)] and A 2 (s) = [a 2,i,j (s)] are given by For s > 0, we can divide Eq. (7) by s and use the notation to obtain ψ(s) = p k−1 D(s) + ψ(s + θ 1 )H 1 (s) + ψ(s + θ 2 )H 2 (s), This equation is suitable to recursively determine ψ(s) for s > 0. Let Then Eq. (9) yields for i, j ≥ 0, To obtain ψ(s) = ψ 0,0 (s) we can repeatedly apply the above equation: and so on. This results after n iterations in the following expression for ψ(s): The k × k matrices C i,j (s) are defined as follows. A sequence of grid points p = {(i 0 , j 0 ), (i 1 , j 1 ), . . . , (i n , j n )} is called a path from (i 0 , j 0 ) to (i n , j n ) if each of its steps (i l+1 , j l+1 ) − (i l , j l ) are either (1, 0) or (0, 1). For path p, we introduce (see Fig. 4) where for l = 0, . . . , n − 1, For path p = {(i 0 , j 0 )}, we set C p (s) = I. Let P(i, j) be the set of all paths from (0, 0) to (i, j). Then C i,j (s) is defined as C p (s).
Note that for i + j > 0, the k × k matrices C i,j (s) can be recursively calculated from where C 0,0 (s) = I and C i,j (s) is the all zero matrix if i < 0 or j < 0. The following lemma states that the series of C i,j (s) is well defined for all s > 0.
Lemma 1 For each δ > 0, the series ∞ i=0 ∞ j=0 C i,j (s) is absolutely and uniformly convergent for all s > δ.
Proof Fix δ > 0. It suffices to show that there are constants M and r < 1 such that for all s > δ and i + j ≥ 0, where E is the all one matrix and the inequality is component wise. In this bound, s needs to stay away from zero, since H 1 (s) and H 2 (s), and thus by (12), also C i,j (s) are unbounded as s approaches zero. Now fix r < 1.
Since |H l (s)| ≤ λ l E/s (l = 1, 2), there is an N ≥ 0 such that for s > δ and i + j ≥ N , Recursion (12) implies that for each i, j ≥ 0, C i,j (s) is bounded for s > δ. Hence, there is a (sufficiently large) M such that (13) is valid for s > δ and the finitely many i + j ≤ N . By induction we now prove that (13) is valid for all i + j ≥ N . Suppose it holds for all i + j = n (which is true for n = N ). From (12) and (14) we get for s > δ and i + j = n + 1, where the last inequality follows from the induction hypothesis.
Since D ij (s) are uniformly bounded for all s > δ > 0 and i + j ≥ 0, we immediately get the following.

Corollary 1
The series ∞ i=0 ∞ j=0 D i,j (s)C i,j (s) is absolutely and uniformly convergent for all s > δ > 0.
Using that |ψ i,j (s)| ≤ 1, the second term in (11) is bounded by So it vanishes as n → ∞ by virtue of the absolute convergence of the series of C i,j (s). Hence, taking n → ∞ in (11), we get from Corollary 1, where In particular, we have ψ(θ i ) = p k−1 C(θ i ), i = 1, 2.
To complete the LST of the virtual queueing time, we need to determine p k−1 . First we set s = 0 in (7) yielding where the second equality follows from (17). To uniquely determine p k−1 , we finally need the normalizing equation where e is the all one vector and p n is given by (5) for 0 ≤ n < k−1. However, Eq. (19) requires the computation of φ(0), which is the hard step. Taking the derivatives on both sides of (7) and setting s = 0, we get Here prime indicates derivative with respect to s. Thus, to calculate φ(0) we need ψ (s) at s = θ 1 and s = θ 2 . For this we can use (15). Differentiating (15) yields The terms C i,j (s) can be recursively computed by differentiating (12), where C i,j (s) is the all zero matrix if i = j = 0 or if i < 0 or j < 0. Term by term differentiation of (16) is justified by the following two lemmas.

s) is absolutely and uniformly convergent for all s > δ.
Proof The proof is similar to the proof of Lemma 1. Fix δ > 0. It suffices to show that there are constants M and r < 1 such that for all s > δ and i + j ≥ 0, First fix r < 1. Since H l (s) ≤ λ l E/s and H l (s) ≤ λ l E/s 2 (l = 1, 2), there is an N ≥ 0 such that for s > δ and i + j ≥ N , Recursions (12) and (22) imply that for each i, j ≥ 0, C i,j (s) and C i,j (s) are bounded for s > δ. Hence, there is a (sufficiently large) M such that both (13) and (22) are valid for s > δ and the finitely many i + j ≤ N . Following the induction steps in the proof of Lemma 1 it follows that (13) is valid for all i + j ≥ N . We now show that also (22) holds for all i + j ≥ N . Suppose that (22) holds for all i + j = n (which is true for n = N ). From (22) and (23) we get for i + j = n + 1, where the last inequality follows form the induction hypothesis.
Lemma 3 For each s > 0, the derivative of C(s) exists and is equal to Proof Fix s > 0. First note that the series converges by Lemmas 1-2 and the fact that D i,j (s) and D i,j (s) are uniformly bounded for all i + j ≥ 0. It suffices to prove that for each sequence {h n } converging to 0, such that s + h n > 0 for all n, Let {h n } be such sequence. Thus there is a δ > 0 such that s + h n > δ for all n. According to (13) and (22) and that the fact that D i,j (s + h n ) and D i,j (s + h n ) are uniformly bounded for all n and i + j ≥ 0, there are constants M and r < 1 such that for all n and i + j ≥ 0, We need to show that for each > 0 there is an N such that for n > N Let > 0. For given n and i, j ≥ 0, it follows from the mean value theorem that there is an 0 < η < 1 such that (B i,j (s + h n ) − B i,j (s))/h n = B i,j (s + ηh n ). Hence, by (24), Note that M and r do not depend on n, i, j. So there is a constant K such that for all n, Further, for given i, j ≥ 0, (B i,j (s + h n ) − B i,j (s))/h n converges to B i,j (s) as n tends to infinity. Hence, there is an N such that for n > N 0≤i+j<K Combining (26) and (27) yields (25).
Substitution of (17) and (21) with s = θ i in (20) yields and normalization Eq. (19) can be rewritten as The above findings are summarized in the following theorem.

Theorem 1
The steady-state LST ψ(s) of the virtual queueing time satisfies where C(s) is defined by (16) and the probability vectors p n for 0 ≤ n ≤ k − 1 are the solution to the system of linear equations (5), (18) and (28).

Special Case µ 1 = µ 2 = µ
We now assume µ 1 = µ 2 = µ. This case has also been studied by Sarhangian and Balcıoglu [20]. The problem simplifies considerably, since we do not need to keep track of N 1 (t) and N 2 (t) separately, only N (t) = N 1 (t) + N 2 (t). Define Then the balance equations (4) can be simplified to and (9) reduces to The solution of this equation is given by (cf. (15)) For i + j > 0 the terms c i,j (s) are determined from the recursion with c 0,0 (s) = 1 and c i,j (s) = 0 if i < 0 or j < 0. The normalization equation becomes Together with (29) this yields for n = 0, . . . , k − 1,

Performance Measures
Now we show how many useful performance measures in steady state can be computed in terms of the LST evaluated at θ 1 and θ 2 . Suppose the M/M/k + M system is in steady state. An arrival faces a queuing time of W . If the arrival is from class i, he will enter service if his impatience time T i is longer than W . For class 1, this probability is given by (and similarly for class 2, by replacing θ 1 by θ 2 ) Here we are using that W = 0 and E(e −θW ; N 1 = i, N 2 = j) = p ij when i + j < k − 1. Next, using Little's law, we see that the expected number of servers busy serving class 1 customers is given by and the steady state throughput is equal to Next we compute the expected time of a class 1 customer waiting for service. This is given by By Little's law, we obtain for the expected number of class i customers waiting for service Finally, the expected conditional waiting time of class i customers entering service follows from These formulas simplify considerably when applied to M/G/1 + M system. In particular, the probability that the server is busy serving a class i customer is given by where ψ(s) is as defined in Eq. (2). The probability that the server is busy is given by In steady state the throughput is given by and the reneging rate by The expected number of class i customers waiting for service in steady state is given by The expected number of class i customers in the system in steady state is given by This implies that, in the special case when τ i = 1/θ i , This is expected, since in this case the system behaves like an infinite server queue for each class of customers. 16 Ivo Adan et al.

Numerical Analysis
Recall that the previous analytical models of multiclass FCFS systems have allowed for customers across classes to differ in either their service time distributions or their patience time distributions, but not in both distributions ( [24], [19], [20]). A contribution of our work is that we allow customers across classes to differ in both distributions. This enables us to model service systems such as call centers that segment their arrivals into classes of callers whose requests may differ greatly in their complexity and criticality. We are therefore interested in using our characterization to compare the performance of a system where customers across classes differ in only one of the distributions with the performance of a system where the customers across classes differ in both distributions. To do this, we conduct a numerical analysis by using the performance measures that we derived for the M /M /k+M system in Sect. 4. We compare the performance of three systems over a range of system loads. In all of the systems, requests from class 1 have a mean service time of 1 unit (µ 1 = 1) and requests from class 2 have a mean service time of .5 units (µ 2 = 2). In the first system, customers in both classes are equally patient, with a mean patience time of 2/3 units (θ 1 , θ 2 = 1.5). We call this system the base system as customers from the two classes differ in their distribution of service times but not in their distribution of patience times. In the second system, the mean patience time of class 1 customers is 1 unit (θ 1 = 1), while the mean patience time of class 2 customers is .5 units (θ 2 = 2). We call this system the positive system as there is a positive correlation across classes between customers' service times and patience times, i.e., class 1 customers have longer average service times and longer mean patience times. In the third system, class 1 customers have a mean patience time of .5 units (θ 1 = 2), while class 2 customers have a mean patience time of 1 unit (θ 2 = 1). Holding with our naming convention, we call this system the negative system. In each scenario in our analysis, the arrival rate of the class 1 and the class 2 customers are equal (λ 1 = λ 2 ). To vary the system load, we hold the number of servers in the system at 5, while varying the total arrival rate (λ = λ 1 + λ 2 ) between 6 and 20.
In Figure 5 we present four steady-state performance measures from the three systems. The measures include the percentage of all customers who receive service, the mean waiting time of all customers, the system throughput, and the average service time of customers who receive service. We discuss each of these measures: -% All Customers Receiving Service (4a): The percentage of all customers who receive service is lowest in the positive system and highest in the negative system over all arrival rates. Recall that class 1 customers are more patient than class 2 customers in the positive system and vice versa in the negative system. Consequently, a greater (smaller) proportion of the customers who receive service are from class 1 in the positive (negative) system. Also recall that class 1 customers have longer average service times than class 2 customers. Since servers in the positive (negative) system spend a greater (smaller) percentage of their time serving the customers with longer service times, their aggregate service rate decreases (increases), which decreases (increases) the percentage of all customers who receive service. This result has ramifications for service level forecasting as one of the common measures of service level is the percentage of customers who receive service. This demonstrates that managers who do not account for differences in the distribution of customers' service times and patience times across classes may produce inaccurate service level forecasts. -Mean Waiting Time of All Customers (4b): Irrespective of the arrival rate, customers have the highest mean waiting times in the positive system. However, the mean waiting times of customers in the base system and the negative system are nearly the same. This result was surprising as we expected the average waiting times of the customers in the negative system to be lowest since the average waiting times of customers is highest in the positive system. Because this trend was puzzling to us, we calculate the mean waiting times conditional on receiving service and on reneging for both systems. We find two trends. First, in both of the systems the average waiting times of customers who receive service is higher than the average waiting times of customers who renege. Second, both of these waiting time measures are lower in the negative system than in the base system. So, if both of these measures are lower in the negative system, why are the average waiting times of all customers nearly equal between the two systems? The answer lies in the fact that a higher percentage of customers receive service in the negative system. One can think of the average waiting time of all customers as a weighted average of the waiting times of the customers who receive service and of the customers who renege, where the respective weights are the percentage of customers who receive service and the percentage of customers who renege. Recall from Figure 4a that the percentage of customers who receive service is highest in the negative system. This means that in the negative system there is a greater weight from a subset of customers who tend to wait longer, since the average waiting time for customers who receive service is greater than the average waiting time of customers who renege. The result is that the average waiting times of all customers in the base system and the negative system are nearly equal  Figure 4a that the percentage of customers receiving service is lowest (highest) under the positive (negative) system. It is therefore not surprising that system throughput is lowest (highest) under the positive (negative) system. However, what is surprising is that in the positive system, throughput is first increasing in the arrival rate, but is then decreasing after some threshold arrival rate. Initially, increasing the arrival rate increases server utilization and hence system throughput. However, the gains in throughput due to the increase in server utilization diminish and are eventually offset by a reduction in the effective service rate of the system. The reason that the service rate is decreasing in system load is that the percentage of time the servers spend with customers from class 1 (who take longer to serve) is increasing in load. This is because a higher proportion of class 2 customers renege as load increases since they are less patient than class 1 customers. The interesting takeaway is that in a system where customers' service times and patience times are positively correlated, increasing traffic can actually decrease throughput. This result has potential implications for systems with limited service capacity that generate revenue based on system throughput, e.g., restaurants. Mangers of such systems may attempt to increase traffic through marketing efforts in order to generate additional revenue but may instead reduce their revenue if the customers in their system who take longer to serve are also more patient. -Average Service Time (4d): In the base system the average service time of customers who receive service remains the same regardless of the arrival rate. This is because customers in each class are equally patient, which makes the proportion of customers receiving service who are from each class invariant to the system load. However, in the positive (negative) system, the average service time increases (decreases) as the arrival rate increases. In the positive system, the class 1 customers are more patient. Thus, as the load on the system increases, the proportion of customers receiving service who are from class 1 increases. Because class 1 customers take longer to serve, the average service time increases as the arrival rate increases. The reverse is true in the negative system, where class 2 customers are more patient but take less time to serve. This relationship between system load and average service time has managerial implications. Servers in customer-facing systems are often evaluated and incentivized based on their average service times. For example, a common practice in call centers is to provide financial rewards for agents to keep their average service time under some threshold. Our analysis shows that managers may reach false conclusions regarding their servers' productivity if they evaluate their servers based solely on their average service times. In the negative system, managers may receive the impression that servers are speeding up when the arrival rate is higher. Consequently, managers may wrongly reward their servers for their supposed efforts to work faster under heavy loads. Conversely, in the positive system, managers may falsely reprimand their servers for allegedly slowing down as arrival rates increase. Our model demonstrates than an observed correlation between average service times and system load, such as what we see in Figure 4d, may have no correlation with the servers' efforts. Rather, the observed correlation may be entirely due to a correlation across classes between customers' service times and patience times.
We make one final observation regarding the performance of these systems as the arrival rate tends to infinity. We've shown that of the customers who receive service, the proportion who are from the more patient class increases as the arrival rate increases. Intuitively, we would expect this proportion to tend to one as the arrival rate tends to infinity due to the stochastic dominance of the patience times of one class over the other. Indeed, as we increase the arrival rate in the negative and positive systems from our numerical examples, we observe this phenomenon. For example, when we set the arrival rate to 1,000 for each class in the positive system, of the callers who receive service, 95.9% are from class 2. An outcome of this phenomenon is that as the arrival rate tends to infinity the average service time and throughput of the system tends to the same performance measures of an overloaded system comprised entirely of customers from the more patient class. Hence, in the positive system the average service time tends to 1 and system throughput tends to 5, while in the negative system average service time tends to .5 and system throughput tends to 10.
Overall, our numerical analysis demonstrates the importance of accounting for differences across classes in the distribution of customers' service times and patience times. We see that differences in these distributions may substantially affect key performance measures, which have a wide range of managerial implications, including service level forecasting, revenue management, and the evaluation of server performance. Consequently, a contribution of our work is that our analytical characterizations may be used to demonstrate to managers some of the implications of administering a multiclass FCFS queue.

Model as Performance Approximation
Managers may be interested in knowing whether they may use the performance measures from our model to approximate the performance of their real world systems. Thus, as a final exercise we compare the simulated performance of a system based on real data with the performance of a comparable M /M /k+M system. Our data comes from a multiclass call center of a small US-based bank. To construct our simulated system, we select two classes from the data which differ in their distribution of service times and caller patience times. The first class is comprised of general banking calls such as balance inquiries and the second class is comprised of technical support calls such as password reset requests. In Figure 6 we display the estimated density function of the service times from each class and in Figure 7 we display the estimated distribution function of the callers' patience times from each class. 2 Note that class 1 callers tend to have shorter service times and shorter patience times than class 2 callers 3 , i.e., there is a positive correlation across the classes between service times and patience times. To make the systems comparable, we set the mean service time and mean patience time of the two classes in the M /M /k+M system equal to the estimated means of the two classes in the simulated system. Finally, we are interested in understanding to what extent the accuracy of the approximation increases when service rates are allowed to differ across classes. Thus, we also obtain the performance measures of this system under the restriction that the service rates across classes are equal. To do this, we calculate the average service rate across both classes and use the formulas from Sect. 3.4 to obtain the performance measures. In all of the systems we assume that calls from each class arrive according to independent Poisson processes with equal arrival rates. To compare the performance of the systems across different loads, we hold the number of servers in the system at 5, and vary the total arrival rate to be 36, 45, 60, and 120 calls per hour. In Table  1 we compare the performance of the three systems, where "Simulation" corresponds to the simulated system, "µ 1 = µ 2 " corresponds to the analytical characterization of the system where service rates are allowed to differ across classes and "µ 1 = µ 2 " corresponds to the analytical characterization where we restrict the service rates to be the same across classes. For each class, we present the average waiting time in seconds (AWT), the percentage of callers who receive service (%RS), and the average number of callers waiting in queue (AQ). We also present the server utilization (Util %), and the average service time of callers who receive service (AST(All)). We measure how close our analytical characterizations come to the simulated system using relative error, which is given by |simulated − analytical|/simulated. Overall, we find that the performance measures of the µ 1 = µ 2 system serve as good approximations of the simulated system, with relative errors of no greater than 3.76%. Of particular note is the high accuracy in predicting the percentage of callers who receive service (%RS) and the average service time of callers who receive service (AST(All)), with relative errors typically less than 1%. Under the lowest arrival rate (λ = 36), the µ 1 = µ 2 characterization provides nearly the same approximation accuracy as the µ 1 = µ 2 characterization. However, as the arrival rate increases, the accuracy of the µ 1 = µ 2 characterization decreases relative to the µ 1 = µ 2 characterization. In particular the µ 1 = µ 2 characterization underforecasts AWT, AQ, Util% and AST(All) while overforecasting %RS. This is due to the fact that in this system, callers who tend to be more patient also tend to have longer service times. While the µ 1 = µ 2 characterization accounts for this correlation, the µ 1 = µ 2 characterization does not. Overall, these results demonstrate that managers of two-class FCFS service systems may use our analytical characterization of the µ 1 = µ 2 system to produce good approximations of the performance of their systems by collecting the mean service time and the mean patience time of customers in each class.