Batch service systems with heterogeneous servers

Bulk-service multi-server queues with heterogeneous server capacity and thresholds are commonly seen in several situations such as passenger transport or package delivery services. In this paper, we develop a novel decomposition-based solution approach for such queues using arguments from renewal theory. We then obtain the distribution of the waiting time measure for multi-type server systems. We also obtain other useful performance measures such as utilization, expected throughput time, and expected queue lengths.

geneous servers with bulk service have been found in container terminal systems [11], passenger transport systems, and parcel distribution networks. In a passenger transport system, the servers can vary in their carrying capacity (for example, a 20-seater bus vs. a 40-seater bus) and serves (transports) the passengers in a batch. Hence, the servers may have heterogeneous capacity. Likewise, we can model a parcel distribution network using heterogeneous capacity servers. The freight movers own or rent trucks from the market of different capacities. The trucks vary in their body length and carrying capacity. For instance, Light Duty Box Trucks typically have a carrying capacity of 8600-14,050 lbs, whereas Heavy-duty Flatbed Trucks typically have a carrying capacity of 26,000-52,000 lbs. The trucks have a threshold load limit before they leave the origin dispatch unit. The performance analysis problem (such as the number and quantity of loads waiting at a depot before being dispatched) can be modeled using heterogeneous capacity resources (multiple types of trucks). However, the literature on performance analysis of bulk-service queues with heterogeneous servers and threshold service is scarce.
In Arora [3], a heterogeneous two-server queueing process fed by Poisson arrivals and exponential service time distributions has been considered under the bulk-service discipline. Time-dependent probabilities for the queue length have been obtained in terms of Laplace transforms, from which different measures associated with the queueing process, like the mean queue-length, could be determined. Goswami and Samanta [8] analyzed a discrete-time bulk-service queueing system with two heterogeneous servers, i.e., two batch servers working with different service rates. They assumed the interarrival times of customers and service times of batches to be independent and geometrically distributed. They obtain closed-form expressions for the steady-state probabilities at an arbitrary epoch with the help of the displacement operator method and derive the outside observer's observation epoch probabilities and waiting time distribution measured in slots. Chakka and Van Do [4] proposed a new HetSigma queue for performance analysis of wireless communication systems. They use negative customers to model server failures, packet losses, and load balancing in networks. They analyze joint Markov modulation of the arrival and service processes, superposition of K Compound Poisson Process (CPP) streams of positive customer arrivals and a CPP of negative customer arrivals in each modulating phase for a multi-server queue with c non-identical servers, and generalized exponential service times.
Using a matrix geometric method, Kumar and Madheswari [10] obtained the stationary queue length distribution and mean system size for a Markovian queue with two heterogeneous servers and multiple vacations. Using a generating function technique, Ammar [2] analyzed the transient behavior (exact time dependent solutions) of a two-processor heterogeneous system with catastrophes, server failures and repairs. The tasks arrive according to a Poisson process and service times are exponentially distributed. Each task requires exactly one processor for its execution and the scheduling policy is FCFS. Using the embedded method, Keaogile et al. [9] presented an exact analysis for finding the probability generating function of the steady state number of customers in a discrete time queue with two heterogeneous servers.
While there are several studies that analyze batch service queues with single or multiple homogeneous servers, studies on analysis of batch service queues with heterogeneous server systems are limited (for example, see Chang and Harn [5], Chen et al. [6], Gold and Tran-Gia [7], Aalto [1]). We contribute to the literature in the following ways: (1) we analyze batch service systems with heterogeneous servers and threshold service capacity. In many practical settings, the server may have a large service capacity; however, the service is initiated only when a threshold capacity of the server is utilized. Such settings are commonly observed in amusement parks, bus services, multi-trailer systems due to revenue or system functionality considerations.
(2) We perform an exact analysis of batch service systems using a combination of Markov chains and system state decomposition. We decompose the system using a free and busy periods state of the system, perform separate analysis, and then combine the results from the free and the busy periods to obtain the joint probability of the number of free servers and the number of customers waiting in the queue. This joint probability leads both to the waiting time distribution of a customer and to performance measures such as used server capacity, expected throughput time, expected waiting time, and expected queue lengths. Such measures are useful for batch service system design. Due to the complexity of the analysis, we show the results for exponential service times only. However, our work can be extended to batch service queues with heterogeneous servers and service times which have a phase-type distribution, albeit with significant numerical costs associated with a large system state space.
The rest of the paper is organized as follows: In Sect. 2, we describe the queueing model. In Sect. 3, we perform the analysis of the queueing system with two types of servers and batch service. We extend our analysis to more than two server types with batch services in Sect. 4. Results from numerical experiments are included in Sect. 5. Finally, we draw our conclusions in Sect. 6.

Model
We consider a system with S types of batch servers characterized by their rates and capacity, denoted as the Here λ denotes the Poisson customer arrival rate; for σ = 1, . . . , S, let N σ denote the number of servers of type σ , each with a bulk exponential service rate of μ σ , with a maximum batch size B σ and a minimum batch size T σ . The service times are assumed to be independent of the batch sizes in process. Furthermore, we assume that if a type σ server is free, there are fewer than T σ customers in the queue. We assume for 1 ≤ τ < σ ≤ S that T τ ≤ T σ ; if T τ = T σ then we assume that type τ servers have priority.
In the following section, we analyze the system with S = 2 types of servers. In Sect. 4, we give some results for the general case S ≥ 2. /N 1 + N 2 queue to find the waiting time distribution. We first concentrate on the joint probability function of the number of busy servers of type σ = 1, 2 and the queue length. We then find the residual waiting time distribution of a tagged customer, conditional on the free servers and the number of Table 1 Transition from state (n 1 , n 2 , n Q )

To state
Rates conditional on waiting customers who arrived either before or after the tagged customer. The waiting time of a customer equals the residual waiting time given the number of busy servers and the queue length at arrival and that there are no waiting customers who arrived after this customer. By the fact that 'Poisson arrivals see time averages', the joint probability function of the number of busy servers and the queue length at arrival was already determined, so we can find the unconditional waiting time distribution.

The joint probability function of the number of busy servers of each type and the queue length
The state space for this queue can be expressed using a three-tuple (n 1 , n 2 , n Q ), with n σ = 0, . . . , N σ , σ = 1, 2 and n Q = 0, 1, . . ., where n 1 and n 2 denote the number of busy servers of type 1 and type 2 respectively and n Q denotes the number of waiting customers. Note that if n σ < N σ then n Q < T σ . Due to our assumptions, we can find the transition rates from (n 1 , n 2 , n Q ); these transition rates are given in Table 1. The rate of service completion is proportional to the number of type 1 and type 2 busy servers, i.e., n 1 μ 1 + n 2 μ 2 . If there are fewer than T σ customers in the queue, a type σ server will not begin its service.
To analyze the continuous time Markov chain (CTMC), we split the state space into a free period and a busy period (see Fig. 1). During the free period (FP) at least one server is free. The states (n 1 , n 2 , n Q ) with n 1 + n 2 < N 1 + N 2 , and n Q < T σ and n σ < N σ , σ = 1, 2 correspond to the free period. During the busy period (BP), all servers are busy, i.e., the states (N 1 , N 2 , n Q ) with n Q = 0, 1, . . .. When a BP starts, we know that n Q = 0. When a FP starts, we know that either n 1 = N 1 −1 and n 2 = N 2 or n 1 = N 1 and n 2 = N 2 − 1. However, when we enter a FP, the distribution of n Q is unknown. We now discuss the free and the busy period analysis in the following sections.

Busy period analysis
During the BP, all the servers are busy. Therefore, we only have to focus on the queue length, which, together with the state '−1' representing the FP, can be described by a CTMC process.
The transition rate from '−1' (the state representing the FP) to '0' is arbitrarily chosen since the sojourn time in '−1' in combination with the probability that the process is in state '−1' is only used to compute the expected length of the BP. A BP starts always with N Q = 0 customers in the queue. We are interested in π BP (n), the conditional probability of being in state 'n' given that all servers are busy, and E(T B ), the expected length of a BP. The rate up from any state is λ and the rate down to or below state n = 0, 1, . . . equals N 1 μ 1 where we used that during a BP all servers are busy (see Fig. 2). The steady state probabilities for the queue length N Q are defined by the following balance equations: It is easily checked that the steady state probabilities for the states '−1' and 'n' (i.e., π(−1) and π(n)) are provided by with α equal to the unique solution in (0, 1) of the equation The length of the BP and its expectation can be found by considering the above described Markov chain as a regenerative process, which regenerates when it enters state '−1'. The number of consecutive times that the states are positive equals the length of a BP. The theory of regenerative processes now gives which, combined with (1), leads to The conditional distribution of the queue length during a BP is found by (2). This gives In the analysis of the FP, we need to know in which states this FP starts, or, equivalently, how the BP ends. The probability that the BP is ended by a type σ server becoming free which sees n Q < T σ customers at the end of the BP is denoted by P(type σ server, n Q ). Note that the end of a BP corresponds to entering state −1. By conditioning on entering state −1, we find that for n Q = 0, . . . , T σ − 1 and σ = 1, 2.

Free period analysis
In the free period, we not only need to keep track of how many customers are waiting, but also how many servers of each type are busy. The state space for the free period can be described as follows: The state (N 1 , N 2 , 0) represents the BP, with some arbitrary total rate out. As in the analysis of the BP, the sojourn time in this extra state is only used to find the length of the FP. The transition rates are given in Table 1, except for the extra state Note that the rates are proportional to the exit probability of the BP given in (6). The expected length of a FP can be found analogously to the derivation of the length of the BP, cf. (4), .
For the FP, however, no simple expression exists for π(N 1 , N 2 , 0) or for E(T FP ). We have to use numerical methods to compute these probabilities. Combining the results for the busy period and the free period leads to the steady state joint probability of the number of busy servers of each type and the number of waiting customers: where P(BP) and P(FP), the probabilities that the system is in a BP, resp. an FP, are given by and This joint probability distribution is used in the next section to find the waiting time distribution.

Waiting time
In this section, we find the steady-state waiting time distribution of a customer. First the special case T 2 = 1 is analyzed. In this case a server cannot be idle when a customer is waiting. It is clear that a customer can only wait when all the servers are busy and that a waiting time ends at a service completion. In case that T 2 > 1, it might happen that a waiting time ends at an arrival of a new customer. This requires a totally different analysis.

The special case T 2 = 1
For the special case where a server will not be free when there are customers waiting, that is, the minimal batch sizes T 1 = T 2 = 1, we can find the waiting time distribution as follows.
Consider N D , the number of arrivals during a waiting time. N D , given that the waiting time has length d, has a Poisson distribution with E N D = λd. Unconditioning gives us that the z-transform of N D is given by E z N D =Ŵ (λ (1 − z)), wherê W denotes the Laplace Stieltjes transform (LST) of the steady-state waiting time distribution. An up-down crossing argument gives that N D is identically distributed to N A , where N A denotes the number of customers waiting at an arrival epoch (here we assume that customers enter a batch one by one in order of arrival). By the PASTA property, N A and N Q are identically distributed. Using the time stationary probabilities π(n Q ) found in the previous subsection, we can find an expression for the LST of the waiting time:Ŵ

The general case T 2 ≥ 1
In the case T 2 > 1, it can happen that a type 2 server is free because there are fewer than T 2 customers waiting. Once the threshold is met by an arrival, the server starts working and the waiting of the customers in the queue ends. This phenomenon prevents us from employing the technique used for the case T 2 = 1.
The waiting time of an arriving customer depends not only on the number of waiting customers, but also whether there is a type 1 server free or (only) a type 2 server. Suppose there is a type 1 server free, then we know that the number of waiting customers at arrival n Q < T 1 . It is readily seen that if n Q = T 1 − 1, the arriving customer has no waiting time. If n Q = T 1 − 1 − k, with k = 1, . . . , T 1 − 1, the waiting time of the customer ends after k new arrivals and therefore has a Erlang-k distribution. If no type 1 server is free but a type 2 server is free, it becomes a bit more complicated. We can have the same reasoning as in the previous case, but now it can happen that a type 1 server becomes free, the number of waiting customers exceeds T 1 , and the type 1 server starts a new service. Even then it is possible that the waiting of our tagged customer does not end, because he did not fit in the type 1 batch. Thus, to decide whether its waiting time ends, we need to know both the number of waiting customers which arrived before him and after him. If no server is free at the arrival, the customer first has to wait until he fits in the batch of a server that becomes free. Even if that is the case, it might be possible that he still has to wait, because the threshold batch size of the server is not reached.
To analyze the waiting time of an arbitrary customer, we first concentrate on the remaining waiting time of this tagged customer at some time point. As observed, the remaining waiting time depends on whether a type 1 server is free, or only a type 2 server, or no server at all, and on the number of waiting customers which arrived before, respectively after, the tagged customer.

Remaining waiting time
Suppose we follow a customer during its waiting. We will decompose the remaining waiting time of this customer into two parts, namely the time until a new customer arrives (this event is denoted by A) or a type σ server ends its service (denoted by F σ ), and the remaining waiting time after this event. We denote the remaining waiting time of the customer when there are n − 1 waiting customers who arrived before him and who arrived after him by W n, if there is no server free, by W (1) n, if there is a type 1 server free and by W (2) n, if there is only a type 2 server free. When a customer has to wait while a type 1 server is free, it has to wait at least until a customer arrives (Exp(λ)) and then it either is taken into service or has to wait with one customer more behind him in the queue. This gives us the following relation: where Exp(λ) denotes a random variable with an exponential distribution and mean 1/λ. If only a type 2 server is free, there are more relevant events, namely an arrival of a customer or the end of a type 1 service. In the first event ( A), the number of customers behind him is increased (recorded until T 2 − 1 is reached). In the second event (F 1 ), the customer might be taken into service (when he is at the head of the line and there are enough customers in queue to fill the batch threshold), or the number of customers in front of him is decreased by B σ and the type 1 server is busy. This gives Finally, for the case where no server is free, the system is in a BP. When a customer waits during a BP, the first part of its waiting time ends either by an arrival, or the end of a service by a type 1 or type 2 server. In the first case, the number of customers behind him is increased. In the second case the customer is either taken into service (when he is at the head of the line and there are enough customers in the queue to fill the batch threshold), or the number of customers in front of him is decreased by B σ , or an FP starts. This gives where M λ = N 1 μ 1 + N 2 μ 2 + λ. Note that W n, = W n, +1 for ≥ T 2 − 1 since the threshold of any batch is always met when ≥ T 2 − 1. If a type 1 server is free we find If only a type 2 server is free we find Finally, we define the double generating function φ (z, s) for = 0, 1, . . . by . .. Therefore, we will use (10) only for = 0, . . . , T 2 − 1 and set φ T 2 (z, s) = φ T 2 −1 (z, s).

Waiting time
In this section, we concentrate on the waiting time of an arriving customer. On arrival of a customer, we see that this waiting time is the remaining waiting time of this customer with the same state of the servers, the same queue length and no waiting customers who arrived after him. So we can write n Q +1,0 if a type 1 server is free, W (2) n Q +1,0 if only a type 2 server is free, By these equations, we find the LST for W as π FP (n 1 , n 2 , n Q )φ (1) n Q +1,0 (s) π FP (N 1 , n 2 , n Q )φ (2) n Q +1,0 (s) π FP (n 1 , n 2 , n Q )φ (1) n Q +1,0 (s) π FP (N 1 , n 2 , n Q )φ (2) n Q +1,0 (s) , where the double generating function φ 0 (α, s) is defined in (10). With this expression for the LST of the waiting time and (8,9), it is easy to show the next theorem.

Other useful performance measures
Apart from the waiting time, there are also other interesting performance measures such as the throughput of servers per type per period, the throughput of customers per type of server per period, the used capacity per type of server per period, the fraction of used capacity per type of server, and the expected waiting and throughput time. In this section, we give an expression for these performance measures in terms of the steady state probabilities π FP (n 1 , n 2 , n Q ), π BP (n Q ), P(FP) and P(BP).
To find the throughput of servers per type in the free period (T H σ (FP)), we use the interpretation that the steady state distribution also represents the fraction of time the system is in a certain state. For any state we know the departure rate of the servers. This gives For the busy period we do the same; here the server only starts a service when at least his threshold is met and we find for the throughput of servers per type in the busy period T H σ (BP) = N σ μ σ α T σ for σ = 1, 2.
The throughput of customers per type of server per period (T H C σ (P), P = BP, FP) can be found in a similar way as the throughput of servers, but here we also have to count the number of customers being served simultaneously by a server. This gives and Now we have the throughput of both servers and customers, we can look at the average used capacity per type of server per period: for σ = 1, 2 and P = BP, FP, and the fraction of used capacity per type of server: Finally, we also find the expected waiting time and the expected throughput time without computing their respective distribution. For the expected waiting time, we use Little's Law to find n Q π FP (n 1 , n 2 , n Q ) To find the expected throughput time, the sum of the waiting time (W ) and the service time (T S ), we now only need to find the expected service time

Analysis of queue with multi-type of servers
In this section, we give some results for the case where S ≥ 2. We refer to the corresponding formulas and the given arguments in the previous sections. The state space for the queue can be expressed using a vector (n 1 , . . . , n S , n Q ), with n σ = 0, . . . , N σ , σ = 1, . . . , S and n Q = 0, 1, . . ., where n σ denote the number of busy servers of type σ and n Q denotes the number of waiting customers. Note that if n σ < N σ then n Q < T σ . The rates between these states can be found analogously to the rates given in Table 1.
For the busy period we can derive that for n = 0, 1, . . ., with α equal to the unique solution in (0, 1) of the equation for λ < S σ =1 N σ μ σ B σ . We then find, similarly to the derivation in the case S = 2, that For the transition rates used in the analysis of the FP, we need to know the probability that a busy period ends with the end of service of a type σ server, finding n Q < T σ customers in the queue, given by For the free period we describe the state space by The transition rates from the extra state (N 1 , . . . , N S , 0) are again proportional to the exit probabilities from the busy period given in Eq. (14). Again, similarly to the previous section, we have .
Finally, we mention some relations for W (σ ) n, , the remaining waiting time of a customer who has position n in the queue, with customers behind him and the lowest type of free servers σ . In this case the remaining waiting time might be influenced by arriving customers or by the end of a higher priority type of service. We find that If no servers are free, we get Eventually we find that the waiting time in this system has a phase type distribution, as stated in the following theorem: where '0' is the absorbing state, with initial probabilities for σ 1 = 1, . . . , σ − 1 and σ = 1, . . . , S.

Remark 1
We can also prove that, for an arbitrary customer, the sojourn time in the system has a phase type distribution by adding extra states '1',. . .,'S' to the state space of the previous theorem, where 'σ ' indicates that the tagged customer is being served by a type σ server. Then we split the rate at which the original Markov chain enters the absorbing states over these new states. Furthermore, P I (0), the initial probability that the waiting time is 0, splits over the states 'σ ' as . . . From state 'σ ', the rate to the absorbing state '0' is μ σ . The absorption time of this newly defined Markov chain has the same distribution as the sojourn time of an arbitrary customer.

Numerical experiments
To give some results of the method presented in this paper, we consider the system depicted in Fig. 3. In this system, there are two types of servers, say type A and B. The number of type A servers is N A = 4, with processing rate μ A = 0.4 and capacity B A = 5; the number of type B servers is N B = 2, with characteristics μ B = 0.2 and B B = 9. The arrival rate λ = 6. In Table 2 we provide the expected queue length for different threshold values. Note that the server type with the lowest threshold has priority. When the thresholds of both server types are equal, we have a choice to decide which server has priority over the other. In Table 3, we provide the probability of a zero waiting time in the queue. For three combinations of T A and T B , we show the graphs of the probability density function and the cumulative distribution function (cdf) of the waiting times in Figs. 4 and 5, respectively. Note the jump of the cdf at t = 0 with height the probability of a zero waiting time in the queue.    To obtain the results, we carried out several experiments. For low and high arrival rates, both the expected queue length and the zero waiting time probability were monotone with increase in threshold. In the case where λ = 6, we see that there is a threshold setting that minimizes the expected queue length (T A = 4 and T B = 6) and another setting that maximizes the probability of zero waiting time (T A = T B = 4, type A servers have priority).

Conclusion
In this paper, we have shown that the waiting time in a queue with Poisson arrivals and exponential servers of different types has a phase type distribution. To show this result, we split the state space into two parts, the BP and FP, and analyzed these parts separately. This splitting works so well since entering the BP is always at the same state. To find the waiting time distribution, we have to analyze the FP numerically. A possible numerical problem may arise since there are many states corresponding to the FP (in the order of T 1 * S σ =1 N σ states). Future work could include numerical investigation of the threshold quantity for batch service that can trade-off waiting time vs. used resource capacity. We also hope that this study will find applications in analysis of container terminal systems where there are different types of vehicles for internal container transport, and container handling responsiveness is a key performance measure for the terminal.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.