1 Introduction

The subject of this paper is the approximate analysis of single-server tandem queues with general service times and finite buffers. The blocking protocol is Blocking-After-Service (BAS): if the downstream buffer is full upon service completion, the server is blocked and has to wait until space becomes available before starting to serve the next job (if there is any). Networks of queues (and in particular, tandem queues) with blocking have been extensively investigated in the literature; see e.g. Buzacott et al. (1995), Colledani and Tolio (2011), Dallery and Gershwin (1992), Perros and Altiok (1986), Perros (1989, 1994). In most cases, however, queueing networks with finite buffers are analytically intractable, and therefore the majority of the literature is devoted to approximate analytical methods. The approximation developed in this paper is based on decomposition, following the pioneering work of Gershwin (1987): the tandem queue is decomposed into single-buffer subsystems, the parameters of which are determined iteratively. In each subsystem, the “actual” service time, starvation and blocking are aggregated into a single service time, and successive aggregate service times are typically assumed to be independent. In reality, however, they are not independent. For instance, knowledge that the server is blocked after a service completion (resulting in a long aggregate service time) makes it more likely that the server will also be blocked after the next service. Especially in longer tandem queues with small buffers, and in tandem queues with highly variable service times, dependencies between successive aggregate service times may have a strong impact on the performance. In this paper, an approach is proposed to include such dependencies in the aggregate service times.

The model considered in the current paper is a tandem queue L consisting of N servers and N−1 buffers in between. The servers (or machines) are labeled M i , i=0,1,…,N−1. The first server M 0 acts as a source for the tandem queue, i.e., there is always a new job available for servicing. The service times of server M i are independent and identically distributed, and they are also independent of the service times of the other servers; S i denotes the generic service time of server M i , with rate μ i and squared coefficient of variation \(c^{2}_{S_{i}}\). The buffers are labeled B i and the size of buffer B i is b i (i.e., b i jobs can be stored in B i ). We assume that each server employs the BAS blocking protocol. An example of a tandem queue with 4 machines is illustrated in Fig. 1.
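To fix notation for what follows, the input of the approximation can be summarized in a small data structure: the mean and squared coefficient of variation of each service time and the buffer sizes. The sketch below is ours and only serves as an illustration; the class and field names are not part of the paper.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TandemLine:
    """Input data of the tandem queue L (illustrative names)."""
    mean_service: List[float]   # E[S_i] for servers M_0, ..., M_{N-1}
    scv_service: List[float]    # squared coefficients of variation c^2_{S_i}
    buffer_sizes: List[int]     # b_1, ..., b_{N-1}; buffer B_i sits between M_{i-1} and M_i

    def __post_init__(self):
        # N servers come with N-1 intermediate buffers.
        assert len(self.buffer_sizes) == len(self.mean_service) - 1

# The 4-server line of Fig. 1, with hypothetical parameter values:
line = TandemLine(mean_service=[1.0] * 4, scv_service=[2.0] * 4, buffer_sizes=[2, 2, 2])
```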

Fig. 1  A tandem queue with 4 servers

The approximation is based on decomposition of the tandem queue into subsystems, each one consisting of a single buffer. To take into account the relation of buffer B i with the upstream and downstream part of the tandem queue, the service times of the server in front of buffer B i and the one after buffer B i are adapted by aggregating the “real” service times S i−1 and possible starvation of M i−1 before service, and S i and possible blocking of M i after service. The aggregate service processes of M i−1 and M i are described by employing the concept of Markovian Arrival Processes (MAPs; see e.g. Neuts 1989), the parameters of which are determined iteratively. It is important to note that Markovian Arrival Processes can be used to describe dependencies between successive service times. Although decomposition techniques for single-server queueing networks have also been widely used in the literature, see e.g. Gershwin (1987), Helber (2005), Kerbache and MacGregor Smith (1987), Perros (1994), van Vuuren et al. (2005), van Vuuren and Adan (2009), the distinguishing feature of the current approximation is the inclusion of dependencies between successive (aggregate) service times by employing Markovian Arrival Processes.

The paper is organized as follows. In Sect. 2 we describe the decomposition of the tandem queue in subsystems. Section 3 presents the iterative algorithm. The service processes of each subsystem are explained in detail in Sects. 4 and 5, after which the subsystem is analyzed in Sect. 6. Numerical results can be found in Sect. 7 and they are compared to simulation and other approximation methods. Finally, Sect. 8 contains some concluding remarks and gives suggestions for further research.

2 Decomposition

The original tandem queue L is decomposed into N−1 subsystems L 1,L 2,…,L N−1. Subsystem L i consists of buffer B i of size b i , an arrival server \(M_{i}^{a}\) in front of the buffer, and a departure server \(M_{i}^{d}\) after the buffer. Figure 2 displays the decomposition of line L of Fig. 1.

Fig. 2  Decomposition of the tandem queue of Fig. 1 into 3 subsystems

The arrival server \(M_{i}^{a}\) of subsystem L i is, of course, server M i−1, but to account for the connection with the upstream part of L, its service times are different from S i−1. The random variable A i denotes the service time of the arrival server \(M_{i}^{a}\) in subsystem L i . This random variable aggregates S i−1 and possible starvation of M i−1 before service because of an empty upstream buffer B i−1. Accordingly, the random variable D i represents the service time of the departure server \(M_{i}^{d}\) in subsystem L i ; it aggregates S i and possible blocking of M i after service completion, because the downstream buffer B i+1 is full. Note that successive service times D i of departure server \(M_{i}^{d}\) are not independent: a long D i induced by blocking is more likely to be followed by again a long one. The same holds for long service times A i induced by starvation. We try to include dependencies between successive aggregate service times in the modeling of D i , but they will be ignored in A i . The reason for modeling A i and D i differently is that starvation occurs before the service start and blocking after service completion, so there is an “asymmetry” in the available information at the end of A i and D i , respectively. In the subsequent sections we construct an algorithm to iteratively determine the characteristics of A i and D i for each i=1,…,N−1.

3 Iterative method

This section is devoted to the description of the iterative algorithm to approximate the performance of tandem queue L. The algorithm is based on decomposition of L in N−1 subsystems L 1,L 2,…,L N−1 as explained in the previous section.

Step 0: Initialization

The first step of the algorithm is to initially assume that there is no blocking. This means that the random variables D i are initially assumed to be equal to S i .

Step 1: Evaluation of subsystems

We evaluate the subsystems one by one, starting from L 1 and moving up to L N−1. For each subsystem we first determine new estimates for the first two moments of A i , and then calculate the equilibrium distribution of L i .

(a) Service process of the arrival server

For the first subsystem L 1, the service time A 1 is equal to S 0, because server M 0 cannot be starved. For the other subsystems we proceed as follows in order to determine the first two moments of A i . Define \(p_{i,b_{i}+2}\) as the long-run fraction of time arrival server \(M_{i}^{a}\) of subsystem L i is blocked, i.e., buffer B i is full, \(M_{i}^{d}\) is busy, and \(M_{i}^{a}\) has completed service and is waiting to move the completed job into B i . By Little’s law we have for the throughput T i of subsystem L i ,

$$T_i = \frac{1 - p_{i,b_i+2}}{\mathbb{E}[A_i]}.$$
(1)

By substituting in (1) the estimate \(T_{i-1}^{(k)}\) for T i (the principle of conservation of flow) and \(p^{(k-1)}_{i,b_{i}+2}\) for \(p_{i,b_{i}+2}\), we get as new estimate for \(\mathbb{E}[A_{i}]\),

$$\mathbb{E}\bigl[A_{i}\bigr]^{(k)} = \frac{1 - p^{(k-1)}_{i,b_{i}+2}}{T_{i-1}^{(k)}},$$
(2)

where the superscripts indicate in which iteration the quantities have been calculated. The second moment of A i cannot be obtained by using Little’s law. Instead we calculate the second moment using (3).

(b) Analysis of subsystem L i

Based on the new estimates for the first two moments of A i , we translate subsystem L i to a Markov process and calculate its steady-state distribution as described in Sect. 6.

(c) Determination of the throughput of L i

Once the steady-state distribution is known, we determine the new throughput \(T^{(k)}_{i}\) according to (7).

Step 2: Service process of the departure server

From subsystem L N−2 down to L 1, we adjust the parameters to construct the distribution of D i , as will be explained in Sect. 5. Note that D N−1=S N−1, because server M N−1 can never be blocked.

Step 3: Convergence

After Steps 1 and 2 we verify whether the iterative algorithm has converged or not by comparing the throughputs in the (k−1)-th and k-th iteration. When

$$\sum_{i=1}^{N-1} | T^{(k)}_{i}- T^{(k-1)}_{i} | < \varepsilon,$$

we stop; otherwise we repeat Steps 1 and 2.
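The steps above can be summarized in the following control-flow sketch. The helper functions (initial_departure_process, update_arrival_moments, evaluate_subsystem, throughput_of, update_departure_process) are placeholders for the computations of Sects. 4–6; their names are ours, and the sketch only illustrates the order of the computations, assuming a TandemLine object as sketched in Sect. 1.

```python
def approximate_tandem(line, eps=1e-6, max_iter=100):
    """Iterative decomposition algorithm of Sect. 3 (control flow only).

    The helper functions are placeholders for the computations of Sects. 4-6;
    they are not part of the paper's text.
    """
    N = len(line.mean_service)
    n_sub = N - 1                                  # subsystems L_1, ..., L_{N-1}
    # Step 0: no blocking initially, so D_i = S_i for every subsystem.
    departure = [initial_departure_process(line, i) for i in range(n_sub)]
    steady_state = [None] * n_sub                  # keeps the previous iteration's results
    throughput = [0.0] * n_sub

    for _ in range(max_iter):
        old_throughput = throughput.copy()

        # Step 1: evaluate subsystems L_1, ..., L_{N-1} from left to right.
        for i in range(n_sub):
            # (a) new estimates of the first two moments of A_i (Sect. 4, (2)-(3));
            #     uses the current results of L_{i-1} and last iteration's results of L_i.
            arrival = update_arrival_moments(line, i, steady_state, throughput)
            # (b) steady-state distribution of the QBD describing L_i (Sect. 6).
            steady_state[i] = evaluate_subsystem(arrival, departure[i], line.buffer_sizes[i])
            # (c) new throughput estimate, cf. (7).
            throughput[i] = throughput_of(steady_state[i])

        # Step 2: update the departure processes from L_{N-2} down to L_1
        #         (D_{N-1} = S_{N-1} stays fixed, since M_{N-1} is never blocked).
        for i in range(n_sub - 2, -1, -1):
            departure[i] = update_departure_process(line, i, steady_state[i + 1])   # Sect. 5

        # Step 3: convergence check on the throughputs.
        if sum(abs(t - s) for t, s in zip(throughput, old_throughput)) < eps:
            break
    return throughput
```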

4 Service process of the arrival server

In this section, we model the service process of arrival server \(M_{i}^{a}\) of subsystem L i (cf. Step 1(a) in Sect. 3). As an approximation, we act as if the service times A i are independent and identically distributed, thus ignoring dependencies between successive service times A i .

Note that an arrival at buffer B i (i.e., the moment at which a job served by \(M_{i}^{a}\) moves into buffer B i , possibly after waiting for space to become available) corresponds to a departure from \(M_{i-1}^{d}\) in the upstream subsystem L i−1. Just after this departure, two situations may occur: subsystem L i−1 is empty with probability (w.p.) \(q^{e}_{i-1}\), or it is not empty w.p. \(1-q^{e}_{i-1}\). By convention, we do not count the job at \(M_{i-1}^{a}\) as being in L i−1, so subsystem L i−1 is empty whenever there are no jobs in B i−1 and \(M_{i-1}^{d}\). In the former situation, M i−1 has to wait for a residual service time of arrival server \(M_{i-2}^{a}\) of subsystem L i−1, denoted by RA i−1, before the actual service S i−1 can start. In the latter situation, the actual service S i−1 can start immediately. Hence, since the service time A i of arrival server \(M_{i}^{a}\) includes possible starvation of M i−1 before the actual service S i−1, we have

$$A_i = \left\{ \begin{array}{l@{\quad}l}RA_{i-1} + S_{i-1} & \mbox{with probability } q^e_{i-1},\\S_{i-1} & \mbox{otherwise}.\end{array}\right.$$

This representation is used to determine the second moment of A i . Based on \(q_{i-1}^{e}\) and the first two moments of RA i−1, the determination of which is deferred to Sect. 6 (cf. (8)), we obtain the second moment \(\mathbb{E}[A_{i}^{2}]\) as

$$ \mathbb{E}\bigl[A_i^2\bigr]=q_{i-1}^e\mathbb{E}\bigl[RA_{i-1}^2\bigr] + 2 q_{i-1}^e\mathbb{E}[RA_{i-1}] \mathbb{E}[S_{i-1}] + \mathbb{E}\bigl[S_{i-1}^2\bigr].$$
(3)

The first moment \(\mathbb{E}[A_{i}]\) follows from (2), which expresses conservation of flow.
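As a small numerical illustration, the two moments of A i can be computed as follows. The function and argument names are ours; the first line uses (2) in the reconstructed form \(\mathbb{E}[A_i]=(1-p_{i,b_i+2})/T_{i-1}\) given above, and the second line is (3).

```python
def arrival_moments(T_upstream, p_blocked_prev, q_empty, m1_RA, m2_RA, m1_S, m2_S):
    """First two moments of the aggregate service time A_i of the arrival server.

    Sketch only; implements (2) (conservation of flow) and (3), with illustrative
    argument names: T_upstream = T_{i-1}^{(k)}, p_blocked_prev = p_{i,b_i+2}^{(k-1)},
    q_empty = q^e_{i-1}, and (m1_RA, m2_RA), (m1_S, m2_S) the moments of RA_{i-1}, S_{i-1}.
    """
    # (2): E[A_i] = (1 - p_{i,b_i+2}) / T_{i-1}
    m1_A = (1.0 - p_blocked_prev) / T_upstream
    # (3): E[A_i^2] = q^e E[RA^2] + 2 q^e E[RA] E[S] + E[S^2]
    m2_A = q_empty * m2_RA + 2.0 * q_empty * m1_RA * m1_S + m2_S
    scv_A = m2_A / m1_A**2 - 1.0
    return m1_A, m2_A, scv_A
```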

5 Service process of the departure server

In this section, we describe the service process of the departure server \(M_{i}^{d}\) of subsystem L i in detail (cf. Step 2 in Sect. 3). To describe D i we take into account the occupation of the last position in buffer B i+1 (or server \(M_{i+1}^{d}\) if b i+1=0). A job served by \(M_{i}^{d}\) may encounter three situations in downstream subsystem L i+1 on departure from L i , or equivalently, on arrival at L i+1; see Fig. 3. The situation encountered on arrival has implications for possible blocking of the next job served by \(M_{i}^{d}\), as will be explained below.

(i) The arrival is triggered by a service completion of departure server \(M_{i+1}^{d}\) of L i+1, i.e., server \(M_{i+1}^{a}\) was blocked because the last position in B i+1 was occupied, and was waiting for \(M_{i+1}^{d}\) to complete service. Then the next service of \(M_{i}^{d}\) (if there is a job) and the next service of \(M_{i+1}^{d}\) start simultaneously, and buffer B i+1 is full. We denote the time elapsing until the next service completion of departure server \(M_{i+1}^{d}\) by \(D^{b}_{i+1}\); this is the time during which the last position in B i+1 remains occupied before it becomes available again. Hence, in this situation, the next service time D i of \(M_{i}^{d}\) is equal to the maximum of S i and \(D^{b}_{i+1}\), provided \(M_{i}^{d}\) can immediately start the next service. Otherwise, if \(M_{i}^{d}\) is starved just after the departure, D i is equal to the maximum of S i and the residual time of \(D^{b}_{i+1}\) at the service start of \(M_{i}^{d}\).

Fig. 3  Possible situations in downstream subsystem L i+1 encountered on departure from L i

(ii) Just before the arrival there is only one position left in buffer B i+1, so right after this arrival B i+1 is full. Now we denote the time elapsing until the next service completion of departure server \(M_{i+1}^{d}\) by \(D^{f}_{i+1}\), which is again the time the last position in B i+1 stays occupied. Thus D i is equal to the maximum of S i and the residual time of \(D^{f}_{i+1}\) at the service start of \(M_{i}^{d}\).

(iii) Finally, when neither of the above situations occurs, the arrival does not fill up buffer B i+1, because there are at least two positions available in B i+1. Hence, the last position in B i+1 stays empty and the next service time D i is equal to S i .

Note that only in situations (i) and (ii) can the next job to be served by \(M_{i}^{d}\) possibly be blocked at completion of S i . If a departure from L i encounters situation (i), (ii), or (iii) in L i+1, what is then the probability that the next departure from L i encounters each of these situations? We are not going to act as if this probability is independent of the past; that would imply that successive service times D i are independent (which they are not). Instead, we introduce transition probabilities between the above three situations, i.e., the probability that a departure encounters situation (i), (ii) or (iii) depends on the situation encountered by the previous one. Hence, the service process of \(M_{i}^{d}\) will be described by a Markov chain.

If a departure from L i sees situation (i), and \(M_{i}^{d}\) can immediately start with the next S i and finishes before \(M^{d}_{i+1}\) finishes \(D^{b}_{i+1}\), then the next departure from L i again sees (i). However, if \(M_{i+1}^{d}\) finishes first, then on completion of S i by \(M_{i}^{d}\), either (ii) or (iii) may be seen. We denote by \(p^{b,nf}_{i+1}\) the probability that \(M_{i+1}^{d}\) completes at least two services before the next arrival at L i+1, given that \(M_{i+1}^{d}\) completes at least one service before the next arrival. So, if \(M_{i+1}^{d}\) finishes first, then the next departure from L i sees (iii) with probability \(p^{b,nf}_{i+1}\), and (ii) otherwise. Above we assumed that \(M_{i}^{d}\) can immediately start service after a departure. If, on the other hand, \(M_{i}^{d}\) is starved and has to wait for the next job to arrive, then \(D^{b}_{i+1}\) should be replaced by the residual time of \(D^{b}_{i+1}\) at the service start of \(M_{i}^{d}\).

The transitions from situation (ii) are the same, except that \(D^{b}_{i+1}\) should be replaced by \(D^{f}_{i+1}\). So, if a departure from L i sees situation (ii), and \(M_{i}^{d}\) finishes before \(M_{i+1}^{d}\) (i.e., \(S_{i} < D^{f}_{i+1}\)), then the next departure from L i certainly sees (i). If \(S_{i} > D^{f}_{i+1}\), then the next departure from L i sees (iii) with probability \(p^{f,nf}_{i+1}\) (defined analogously to \(p^{b,nf}_{i+1}\)) and (ii) otherwise.

Finally, in situation (iii), the next departure from L i will never see (i). It will see (ii) with probability \(p^{nf,f}_{i+1}\) and (iii) otherwise, where \(p^{nf,f}_{i+1}\) is defined as the probability that, on an arrival at L i+1, there is exactly one position left in the buffer of L i+1. The different situations and possible transitions are summarized in Table 1, where we assume that \(M_{i}^{d}\) can immediately start with the next service after a departure. If this is not the case, then \(D^{b}_{i+1}\) and \(D^{f}_{i+1}\) should be replaced by their residual times at the start of the next service of \(M_{i}^{d}\) (since \(D^{b}_{i+1}\) and \(D^{f}_{i+1}\) will always start at the moment of a departure).

Table 1 Different situations and possible transitions of the service process of departure server \(M_{i}^{d}\)
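Purely to illustrate the structure of Table 1, the sketch below assembles the transition matrix between the situations (i), (ii) and (iii), under the simplifying assumption that \(M_{i}^{d}\) can always start the next service immediately after a departure (so that S i races against the full \(D^{b}_{i+1}\) or \(D^{f}_{i+1}\) rather than their residual times). In the actual method these transitions are embedded in the MAP of Sect. 6 rather than used as a standalone chain; the argument names are ours, with p_S_before_Db standing for \(\Pr[S_{i} < D^{b}_{i+1}]\) and p_S_before_Df for \(\Pr[S_{i} < D^{f}_{i+1}]\).

```python
def situation_transition_matrix(p_S_before_Db, p_S_before_Df, p_b_nf, p_f_nf, p_nf_f):
    """3x3 transition matrix over the situations (i), (ii), (iii) of Table 1.

    Illustrative only: it assumes M_i^d starts the next service immediately after a
    departure, so the full D^b or D^f (not their residuals) race against S_i.
    """
    P = [
        #  to (i)          to (ii)                               to (iii)
        [p_S_before_Db, (1 - p_S_before_Db) * (1 - p_b_nf), (1 - p_S_before_Db) * p_b_nf],  # from (i)
        [p_S_before_Df, (1 - p_S_before_Df) * (1 - p_f_nf), (1 - p_S_before_Df) * p_f_nf],  # from (ii)
        [0.0,           p_nf_f,                             1.0 - p_nf_f],                  # from (iii)
    ]
    assert all(abs(sum(row) - 1.0) < 1e-12 for row in P)
    return P
```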

This completes the description of the service processes of the arrival and departure servers of L i . In the next section, we translate subsystem L i to a Quasi-Birth-Death (QBD) process; see Latouche and Ramaswami (1999).

6 Subsystem

In this section, we describe the analysis of a subsystem L i (cf. Steps 1(b) and 1(c) in Sect. 3). For ease of notation, we drop the subscript i in the sequel of this section. In order to translate L to a Markov process, we will describe the random variables introduced in the foregoing sections in terms of exponential phases, commonly referred to as phase-type distributed random variables (see e.g. Tijms 1994). In Sect. 6.1, we first explain how to fit phase-type distributions on the first and second moment. By employing this concept, we translate subsystem L to a Quasi-Birth-and-Death process (QBD) in Sect. 6.2. Based on the steady-state distribution of this QBD, we derive performance measures, which can be used to model the service process of the arrival server succeeding the subsystem and the service process of the departure server preceding the subsystem.

6.1 Fitting phase-type distributions on the first two moments

Consider a random variable X with mean \(\mathbb{E}[X]\) and second moment \(\mathbb{E}[X^{2}]\). The squared coefficient of variation \(c^{2}_{X}\) is defined as

$$c^2_X = \frac{\operatorname{var} (X)}{\mathbb{E}^2[X]} = \frac{\mathbb {E}[X^2]}{\mathbb{E}^2[X]} - 1.$$

We adopt the following recipe to fit a phase-type distribution on \(\mathbb{E}[X]\) and \(c^{2}_{X}\), see Tijms (1994). If \(1/k \leq c_{X}^{2} \leq1/(k-1)\) for some k=2,3,…, then the mean and squared coefficient of variation of the Erlang k−1,k distribution with density

$$ f(x)=p \mu^{k-1} \frac{x^{k-2}}{(k-2)!}e^{-\mu x}+(1-p)\mu^{k} \frac{x^{k-1}}{(k-1)!}e^{-\mu x}, \quad x \ge0,$$
(4)

matches \(\mathbb{E}[X]\) and \(c^{2}_{X}\), provided the parameters p and μ are chosen as

$$p= \frac{1}{1+c^{2}_{X}}\bigl(kc_{X}^{2}-\bigl(k\bigl(1+c^{2}_{X}\bigr)-k^2c^{2}_{X}\bigr)^{1/2}\bigr), \qquad\mu= \frac{k-p}{\mathbb{E}[X]}.$$

Hence, in this case we may describe X as a sum of k−1 exponential phases (with probability p) or k exponential phases (with probability 1−p), each phase having rate μ. The phase diagram of the Erlang k−1,k distribution is illustrated in the left part of Fig. 4. If, on the other hand, \(c^{2}_{X}>1\), then the Hyper-exponential2 distribution with density

$$ f(x)=p \mu_1 e^{-\mu_1 x} + (1-p)\mu_2e^{-\mu_2 x}, \quad x \ge0,$$
(5)

matches \(\mathbb{E}[X]\) and \(c^{2}_{X}\), provided the parameters p, μ 1 and μ 2 are chosen as

$$p= \frac{1}{2} \biggl(1+\sqrt{ \frac{c^{2}_{X}-1}{c^{2}_{X}+1}}\biggr),\qquad\mu_1= \frac{2p}{\mathbb{E}[X]}, \qquad\mu_2= \frac {2(1-p)}{\mathbb{E}[X]}.$$

This means that X can be represented in terms of a probabilistic mixture of two exponential phases with rates μ 1 and μ 2, respectively. The phase diagram of the Hyper-exponential2 distribution is illustrated in the right part of Fig. 4.

Fig. 4  Phase diagram of the Erlang k−1,k distribution (left) and the Hyper-exponential2 distribution (right)
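The fitting recipe can be written as a small routine. This is a sketch; the return convention (a tag plus the fitted parameters) is ours.

```python
import math

def fit_phase_type(mean, scv):
    """Fit a phase-type distribution on a mean and squared coefficient of variation.

    Returns ('erlang', k, p, mu) for the Erlang_{k-1,k} mixture (scv <= 1), see (4),
    or ('h2', p, mu1, mu2) for the Hyper-exponential2 distribution (scv > 1), see (5).
    """
    if mean <= 0 or scv <= 0:
        raise ValueError("mean and scv must be positive")
    if scv > 1.0:
        # Hyper-exponential2 fit.
        p = 0.5 * (1.0 + math.sqrt((scv - 1.0) / (scv + 1.0)))
        mu1 = 2.0 * p / mean
        mu2 = 2.0 * (1.0 - p) / mean
        return ('h2', p, mu1, mu2)
    # Erlang_{k-1,k} fit: choose k such that 1/k <= scv <= 1/(k-1).
    k = max(2, math.ceil(1.0 / scv))
    p = (k * scv - math.sqrt(k * (1.0 + scv) - k**2 * scv)) / (1.0 + scv)
    mu = (k - p) / mean
    return ('erlang', k, p, mu)
```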

A unified representation of Erlang and Hyper-exponential distributions is provided by the family of Coxian distributions (cf. Cumani 1982). A random variable X is said to have a Coxian k distribution if it has to go through at most k exponential phases, where phase i has rate ν i , i=1,…,k. It starts in phase 1, and after phase i, i=1,…,k−1, it enters phase i+1 with probability p i and exits with probability 1−p i . Phase k is the last phase, so p k =0. Clearly, the Erlang k−1,k distribution is a Coxian k distribution with ν i =μ for all i, p i =1 for i=1,…,k−2 and p k−1=1−p. The Hyper-exponential2 distribution is a Coxian2 distribution with

$$\nu_1 = \mu_1 , \qquad\nu_2 =\mu_2, \qquad p_1 = (1-p) \frac{\mu_1-\mu_2}{\mu_1},$$

where, without loss of generality, μ 1μ 2. This representation of Erlang and Hyper-exponential distributions in terms of Coxians will be convenient for the description of the service processes of the arrival and departure server in Appendices A and B.
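For completeness, the conversion of a fitted distribution to this Coxian form can be sketched as follows, building on the (hypothetical) fit_phase_type routine above.

```python
def to_coxian(fit):
    """Convert the output of fit_phase_type to Coxian parameters (sketch).

    Returns (rates nu_1..nu_k, continuation probabilities p_1..p_k with p_k = 0).
    """
    if fit[0] == 'erlang':
        _, k, p, mu = fit
        rates = [mu] * k
        cont = [1.0] * (k - 2) + [1.0 - p, 0.0]   # p_i = 1 for i <= k-2, p_{k-1} = 1 - p
        return rates, cont
    _, p, mu1, mu2 = fit                           # Hyper-exponential2 with mu1 >= mu2 by construction
    return [mu1, mu2], [(1.0 - p) * (mu1 - mu2) / mu1, 0.0]
```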

It is also possible to use phase-type distributions matching the first three (or even higher) moments; see e.g. van der Heijden (1993), Osogami and Harchol-Balter (2003). Obviously, there exist many phase-type distributions matching the first two moments. However, numerical experiments suggest that the use of other distributions does not essentially affect the results, cf. Johnson (1993).

6.2 Subsystem analysis

We apply the recipe of Sect. 6.1 to represent each of the random variables A, S, D b and D f in terms of exponential phases. The status of the service process of the arrival server M a can be easily described by the service phase of A. The description of the service process of the departure server M d is more complicated. Here we need to keep track of the phase of S and the phase of D b or D f, depending on situation (i), (ii) or (iii). The description of this service process is illustrated in the following example.

Example

Suppose that S can be represented by two successive exponential phases, D b by three phases and D f by a single phase, where each phase possibly has a different rate. Then the phase-diagram for each situation (i), (ii), and (iii) is sketched in Fig. 5. States a, b and c are the initial states for each situation. The gray states indicate that either S, D b or D f has completed all phases. A transition from one of the states d, e, f, g and h corresponds to a service completion of departure server M d (i.e., a departure from subsystem L); the other transitions correspond to a phase completion, and do not trigger a departure. The probability that a transition from state e is directed to initial state a is equal to 1; the probability that a transition from state d is directed to initial state a, b and c is equal to 0, 1−p b,nf and p b,nf, respectively. The transition probabilities from the other states f, g and h can be found similarly.

Fig. 5  Phase diagram for the service process of the departure server

In Fig. 5 it is assumed that M d can immediately start with the next service S after a departure. If, however, M d is starved, then S does not start immediately but has to wait for the next arrival at L (i.e., a service completion of the arrival server M a). Meanwhile, D b or D f immediately continues completing its phases, and may even have completed all of them by the start of S.

From the example above, it will be clear that the service process of M d can be described by a Markovian Arrival Process (MAP): a finite-state Markov process with generator Q d . This generator can be decomposed as Q d =Q d0+Q d1, where the transitions of Q d1 correspond to service completions (i.e., departures from L) and the ones of Q d0 correspond to transitions not leading to departures. The dimension n d of Q d can be large, depending on the number of phases required for S, D b and D f. Similarly, the service process of M a can be described by a Markovian Arrival Process with generator Q a =Q a0+Q a1 of dimension n a . For an extensive treatment of MAPs, we refer the reader to Neuts (1989). The specification of the generators Q a and Q d is deferred to Appendices A and B, respectively.
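The point of using a MAP is precisely that it captures correlation between successive service times. As a generic illustration (not the construction of the specific generators Q a and Q d , which is deferred to the appendices), the following sketch computes the stationary mean, SCV and lag-1 autocorrelation of the inter-event times of an arbitrary MAP (Q 0,Q 1), using standard MAP formulas; the example matrices at the bottom are hypothetical.

```python
import numpy as np

def map_interevent_stats(Q0, Q1):
    """Stationary mean, SCV and lag-1 correlation of the inter-event times of a MAP (Q0, Q1)."""
    Q0, Q1 = np.asarray(Q0, float), np.asarray(Q1, float)
    n = Q0.shape[0]
    e = np.ones(n)
    invQ0 = np.linalg.inv(-Q0)                   # (-Q0)^{-1}
    P = invQ0 @ Q1                               # phase transition matrix at event epochs
    # Stationary distribution phi of P: solve phi (P - I) = 0 with normalization.
    A = np.vstack([(P - np.eye(n)).T, np.ones((1, n))])
    b = np.zeros(n + 1); b[-1] = 1.0
    phi, *_ = np.linalg.lstsq(A, b, rcond=None)
    m1 = phi @ invQ0 @ e                         # E[T]
    m2 = 2.0 * phi @ invQ0 @ invQ0 @ e           # E[T^2]
    joint = phi @ invQ0 @ invQ0 @ Q1 @ invQ0 @ e # E[T_n T_{n+1}]
    rho1 = (joint - m1**2) / (m2 - m1**2)        # lag-1 autocorrelation
    return m1, m2 / m1**2 - 1.0, rho1

# Example with two phases (hypothetical numbers); a Poisson process
# (Q0 = [[-lam]], Q1 = [[lam]]) would give lag-1 correlation 0.
Q0 = [[-3.0, 1.0], [0.5, -1.5]]
Q1 = [[2.0, 0.0], [0.0, 1.0]]
print(map_interevent_stats(Q0, Q1))
```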

Subsystem L can be described by a QBD with states (i,j,l), where i denotes the number of jobs in subsystem L, excluding the one at the arrival server M a. Clearly, i=0,…,b+2, where i=b+2 indicates that the arrival server is blocked because buffer B is full. The state variables j and l denote the state of the arrival and departure process, respectively. To specify the generator Q of the QBD we use the Kronecker product: If A is an n 1×n 2 matrix and B is an n 3×n 4 matrix, the Kronecker product AB is defined as

$$A \otimes B = \left( \begin{array}{c@{\quad}c@{\quad}c}A(1,1) B & \cdots& A(1,n_2) B\\\vdots&& \vdots\\A(n_1,1) B &\cdots& A(n_1,n_2) B\end{array}\right).$$

We order the states lexicographically and partition the state space into levels, where level i=0,1,…,b+2 is the set of all states with i jobs in the system. Then Q takes the form:

$$\mathbf{Q} = \left( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c@{\quad}c}B_{00} & B_{01} & & & & \\B_{10} & A_{1} & A_{0} & & & \\& A_{2} & \ddots& \ddots& & \\& & \ddots& \ddots& A_{0} & \\& & & A_{2} & A_1 & C_{10} \\& & & & C_{01} & C_{00} \\\end{array}\right).$$

Below we specify the submatrices in Q. The transition rates from levels 1≤i≤b are given by

$$A_0 = Q_{a1} \otimes I_{n_d}, \qquad A_1 = Q_{a0} \otimes I_{n_d} + I_{n_a} \otimes Q_{d0}, \qquad A_2 = I_{n_a} \otimes Q_{d1},$$

where I n is the identity matrix of size n. The transition rates are different for the levels i=0 and i=b+2. At level b+2 the arrival server M a is blocked, so

where P(x,:) is the x-th row of matrix P and P(:,y) is the y-th column of P. To specify the transition rates to level 0, we introduce the transition rate matrix Q s of dimension n s , describing the progress of the phases of D b or D f while the departure server M d is starved. Further, the n d ×n s matrix \(\bar{Q}_{d1}\) contains the transition rates from states in Q d , that correspond to a departure, to the initial states in Q s . Finally, \(\bar{I}_{n_{s},n_{d}}\) is the 0-1 matrix of size n s ×n d that preserves the phase of Q s (i.e., the phase of D b or D f) when the departure server M d starts serving the next job after having been starved. Then we obtain

This concludes the specification of Q.
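To make the Kronecker construction concrete, the sketch below assembles the repeating blocks and the block-tridiagonal generator Q. The forms of A 0, A 1 and A 2 are the usual MAP/MAP/1-type blocks, consistent with the display reconstructed above; the boundary blocks B and C are taken as given (as numpy arrays), since their exact form depends on the bookkeeping for blocking and starvation described in the text.

```python
import numpy as np

def qbd_repeating_blocks(Qa0, Qa1, Qd0, Qd1):
    """Repeating blocks A0, A1, A2 of the QBD for the levels 1 <= i <= b (sketch).

    A0: service completion of the arrival server (level up);
    A2: service completion of the departure server (level down);
    A1: phase transitions without a level change.
    """
    Qa0, Qa1, Qd0, Qd1 = (np.asarray(M, float) for M in (Qa0, Qa1, Qd0, Qd1))
    na, nd = Qa0.shape[0], Qd0.shape[0]
    A0 = np.kron(Qa1, np.eye(nd))
    A2 = np.kron(np.eye(na), Qd1)
    A1 = np.kron(Qa0, np.eye(nd)) + np.kron(np.eye(na), Qd0)
    return A0, A1, A2

def assemble_Q(A0, A1, A2, B00, B01, B10, C00, C01, C10, b):
    """Dense block-tridiagonal generator Q with levels 0, 1, ..., b+2 (sketch);
    the boundary blocks B and C must be supplied as numpy arrays."""
    n0, n, n2 = B00.shape[0], A1.shape[0], C00.shape[0]
    dims = [n0] + [n] * (b + 1) + [n2]
    offs = np.concatenate(([0], np.cumsum(dims)))
    Q = np.zeros((offs[-1], offs[-1]))

    def put(i, j, M):
        Q[offs[i]:offs[i] + M.shape[0], offs[j]:offs[j] + M.shape[1]] = M

    put(0, 0, B00); put(0, 1, B01); put(1, 0, B10)
    for lvl in range(1, b + 2):           # levels 1, ..., b+1 share the A blocks
        put(lvl, lvl, A1)
        if lvl < b + 1:
            put(lvl, lvl + 1, A0)
        if lvl > 1:
            put(lvl, lvl - 1, A2)
    put(b + 1, b + 2, C10); put(b + 2, b + 1, C01); put(b + 2, b + 2, C00)
    return Q
```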

The steady-state distribution of the QBD can be determined by the matrix-geometric method; see e.g. Latouche and Ramaswami (1999), Naoumov et al. (1997), van Vuuren and Adan (2009). We denote the equilibrium probability vector of level i by π i . Then π i has the matrix-geometric form

$$ \pi_i = x_1 R^{i-1} +x_{b+1} \hat{R}^{b+1-i}, \quad i=1,\ldots,b+1,$$
(6)

where R is the minimal nonnegative solution of the matrix-quadratic equation

$$A_0 +R A_1 +R^2 A_2 = 0,$$

and \(\hat{R}\) is the minimal nonnegative solution of

$$A_2 + \hat{R} A_1 + \hat{R}^2A_0 = 0.$$

The matrices R and \(\hat{R}\) can be efficiently determined by using an iterative algorithm developed in Naoumov et al. (1997). The vectors π 0, x 1, x b+1 and π b+2 follow from the balance equations at the boundary levels 0, 1, b+1 and b+2,

$$\begin{aligned}&\pi_0 B_{00} + \pi_1 B_{10} = 0,\\&\pi_0 B_{01} + \pi_1 A_1 + \pi_2 A_2 = 0,\\&\pi_b A_0 + \pi_{b+1} A_1 + \pi_{b+2} C_{01} = 0,\\&\pi_{b+1} C_{10} + \pi_{b+2} C_{00} = 0.\end{aligned}$$

Substitution of (6) for π 1, π 2, π b and π b+1 in the above equations yields a set of linear equations for π 0, x 1, x b+1 and π b+2, which, together with the normalization equation, has a unique solution. This completes the determination of the equilibrium probability vectors π i . Once these probability vectors are known, we can easily derive performance measures and the quantities required to describe the service times of the arrival and departure servers.
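A minimal sketch of computing R (and, by symmetry, \(\hat{R}\)) by successive substitution; this is the simplest, linearly convergent scheme, not the faster algorithm of Naoumov et al. (1997) referenced in the text.

```python
import numpy as np

def minimal_R(A0, A1, A2, tol=1e-12, max_iter=100000):
    """Minimal nonnegative solution of A0 + R A1 + R^2 A2 = 0 by successive substitution."""
    A1_inv = np.linalg.inv(np.asarray(A1, float))
    A0, A2 = np.asarray(A0, float), np.asarray(A2, float)
    R = np.zeros_like(A0)
    for _ in range(max_iter):
        R_new = -(A0 + R @ R @ A2) @ A1_inv
        if np.max(np.abs(R_new - R)) < tol:
            return R_new
        R = R_new
    raise RuntimeError("R iteration did not converge")

# R_hat solves A2 + R_hat A1 + R_hat^2 A0 = 0, i.e. the same recursion with A0 and A2 swapped:
# R_hat = minimal_R(A2, A1, A0)
```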

Throughput:

The throughput T satisfies

$$T = \pi_1 B_{10} e + \sum_{i=2}^{b+1} \pi_i A_2 e + \pi_{b+2} C_{01} e,$$
(7)

where e is the all-one vector.

Service process of the arrival server:

To specify the service time of the arrival server we need the probability q e that the system is empty just after a departure and the first two moments of the residual service time RA of the arrival server at the time of such an event. The probability q e is equal to the mean number of departures per time unit leaving behind an empty system divided by the mean total number of departures per time unit. So

$$ q^e=\pi_1 B_{10} e / T.$$
(8)

The moments of RA can be easily obtained once the distribution of the phase of the service time of the arrival server, just after a departure leaving behind an empty system, is known. Note that component (j,k) of the vector π 1 B 10 is the mean number of transitions per time unit from level 1 entering state (j,k) at level 0. By adding all components (j,k) with j=l (i.e., summing over k) and dividing by π 1 B 10 e, i.e., the mean total number of transitions per time unit from level 1 to 0, we obtain the probability that the arrival server is in phase l just after a departure leaving behind an empty system. Further, if the service time A of the arrival server is represented by a Coxian distribution with n a phases, where phase j has rate ω j and exit probability 1−p j , j=1,…,n a , then the first two moments of the residual service time RA, given that the service time A is in phase l, follow from the backward recursion

$$\mathbb{E}[RA \mid l] = \frac{1}{\omega_l} + p_l\,\mathbb{E}[RA \mid l+1],\qquad\mathbb{E}\bigl[RA^2 \mid l\bigr] = \frac{2}{\omega_l^2} + \frac{2 p_l}{\omega_l}\,\mathbb{E}[RA \mid l+1] + p_l\,\mathbb{E}\bigl[RA^2 \mid l+1\bigr],$$

for l=n a ,…,1, with the convention \(\mathbb{E}[RA \mid n_a+1]=\mathbb{E}[RA^2 \mid n_a+1]=0\). Summation of the conditional moments multiplied by the probability of being in phase l yields the moments of RA.
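The backward recursion above translates directly into a small routine; the function name and argument convention are ours.

```python
def residual_coxian_moments(rates, cont):
    """First two moments of the residual time of a Coxian, conditional on the current phase.

    rates[l] is the rate of phase l+1, cont[l] the probability of continuing to the
    next phase (cont[-1] = 0). Returns lists m1, m2 with m1[l] = E[RA | phase l+1].
    """
    n = len(rates)
    m1, m2 = [0.0] * n, [0.0] * n
    next_m1 = next_m2 = 0.0
    for l in range(n - 1, -1, -1):        # backward recursion over the phases
        nu, p = rates[l], cont[l]
        m1[l] = 1.0 / nu + p * next_m1
        m2[l] = 2.0 / nu**2 + 2.0 * p * next_m1 / nu + p * next_m2
        next_m1, next_m2 = m1[l], m2[l]
    return m1, m2
```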

Service process of the departure server:

We need to calculate the first two moments of D b and D f and the transition probabilities p b,nf, p f,nf and p nf,f. This requires the distribution of the initial phase upon entering level b+1 due to a departure (or arrival). Clearly, component (j,k) of π b+2 C 01 is equal to the number of transitions per time unit from level b+2 entering state (j,k) at level b+1. Hence, π b+2 C 01/π b+2 C 01 e yields the distribution of the initial phase upon entering level b+1 due to a departure. Defining D b(1) and D b(2) as the times until the first and second departure, respectively, and A b(1) as the time until the first arrival, measured from the moment of entering level b+1, it is straightforward to calculate the moments of D b(1)≡D b and the probabilities Pr[D b(1)<A b(1)] and Pr[D b(2)<A b(1)]. Transition probability p b,nf now follows from

$$p^{b,nf} = \Pr\bigl[D^b (2) < A^b (1) |D^b (1) < A^b (1)\bigr] = \frac{\Pr[D^b(2) < A^b (1)]}{\Pr[D^b (1) < A^b (1)]} .$$

Calculation of the moments of D f and transition probability p f,nf proceeds along the same lines, where the distribution of the initial phase upon entering level b+1 due to an arrival is given by π b A 0/π b A 0 e. Finally, p nf,f satisfies

$$p^{nf,f} = \frac{\pi_b A_0 e}{\pi_0 B_{01} e + \sum_{i=1}^{b} \pi _i A_0e} .$$

7 Numerical results

In order to investigate the quality of the current method we evaluate a large set of examples and compare the results with discrete-event simulation. We also compare the results with the approximation of van Vuuren and Adan (2009), a recent and accurate method. The crucial difference between the two methods lies in the modeling of the departure service process: the current method attempts to take into account dependencies between successive service times. In each example we assume that only the mean and squared coefficient of variation of the service times at each server are known, and we match, both in the approximation and in the discrete-event simulation, mixed Erlang or Hyper-exponential distributions to the first two moments of the service times, depending on whether the squared coefficient of variation is at most 1 or greater than 1; see (4) and (5) in Sect. 6. Then we compare the throughput and the mean sojourn time (i.e., the mean time that elapses from the service start at server M 0 until service completion at server M N−1) produced by the current approximation and by van Vuuren and Adan (2009) with the ones produced by discrete-event simulation. Each simulation run is sufficiently long that the widths of the 95% confidence intervals of the throughput and mean sojourn time are smaller than 1%.

We use the following set of parameters for the tests. The mean service times of the servers are all set to 1. The number of servers in the tandem queue is varied over 4, 8, 16, 24 and 32. The squared coefficient of variation (SCV) of the service times is the same for each server and is varied over 0.5, 1, 2, 3 and 5. The buffer sizes between the servers are the same and are varied over 0, 1, 3 and 5. We also test three kinds of imbalance in the tandem queue. We test imbalance in the mean service times by increasing the mean service time of the ‘even’ servers from 1 to 1.2. The effect of imbalance in the SCV is tested by increasing the SCV of the service times of the ‘even’ servers by 0.5. Finally, imbalance in the buffer sizes is tested by increasing the buffer size of the ‘even’ buffers by 2. This leads to a total of 800 test cases.

The results for each category are summarized in Tables 2, 3, 4 and 5. Each table lists the average error in the throughput and the mean sojourn time compared with simulation results. Each table also gives, for three error ranges, the percentage of the cases that fall in each range, as well as the average error of the approximation of van Vuuren and Adan (2009), denoted by VA.

Table 2 Overall results for tandem queues with different buffer sizes
Table 3 Overall results for tandem queues with different SCVs of the service times
Table 4 Overall results for tandem queues with different mean service times
Table 5 Overall results for tandem queues of different length

From the tables we can conclude that the current method performs well and better than van Vuuren and Adan (2009). The overall average error in the throughput is 2.56% and the overall average error in the mean sojourn time is 2.54%, while the corresponding percentages for van Vuuren and Adan (2009) are 4.40% and 5.82%, respectively.

In Table 2 it is striking that in case of zero buffers the current method produces the most accurate estimates, while the method of van Vuuren and Adan (2009) produces the least accurate results. A possible explanation is that for each subsystem the current method keeps track of the status of the downstream server while its departure server is starved; this is not done in van Vuuren and Adan (2009). Both methods seem to be robust to variations in buffer sizes along the line. Table 3 convincingly demonstrates that especially in case of service times with high variability the current approximation performs much better than van Vuuren and Adan (2009). Remarkably, Table 5 shows that, while van Vuuren and Adan (2009) performs better for short lines, the average error in the throughput of the current method does not seem to increase for longer lines, a feature not shared by the approximation of van Vuuren and Adan (2009).

Lastly, we compare the current method to other approaches reported in the literature. In Table 6, results are listed for tandem lines with four servers and exponential service times (used in Buzacott et al. 1995). The columns Exact, App, Buz, and Per list the exact results, results of the current approximation, the approximation of Buzacott et al. (1995), and the one of Perros and Altiok (1986). Table 7 lists results for tandem lines with three servers and non-exponential service times. In this table, the columns Sim, App, Buz, and Alt show results of simulation, the current approximation, the approximation of Buzacott et al. (1995), and the one of Altiok (1989). Both tables show that the methods perform well on these cases.

Table 6 Throughput for four-server lines with exponential service times
Table 7 Throughput for three-server lines with general service times

8 Conclusions

In this paper, we developed an approximate analysis of single-server tandem queues with finite buffers, based on decomposition into single-buffer subsystems. The distinguishing feature of the analysis is that dependencies between successive aggregate service times (including starvation and blocking) are taken into account. Numerical results convincingly demonstrated that it pays to include such dependencies, especially in case of longer tandem queues and service times with a high variability. The price to be paid, of course, is that the resulting subsystems are more complex and computationally more demanding.

We conclude with a remark on the subsystems. There seems to be an asymmetry in the modeling of the service processes of the arrival and departure server. The service times of the arrival server are assumed to be independent and identically distributed, whereas the service times of the departure server are modeled by a Markovian arrival process, carefully taking into account dependencies between successive service times. Investigating whether a similar Markovian description of the service process of the arrival server is also feasible (and rewarding) seems to be an interesting direction for future research.