1 Introduction

This paper is devoted to polling systems. The basic polling system is a queueing model in which customers arrive at n queues according to independent Poisson processes, and in which a single server visits those n queues in cyclic order to serve the customers. When \(n=1\), this system reduces to the classical M / G / 1 queue. For general n, the basic polling system may be viewed as an M / G / 1 queue with n customer classes and dynamically changing priority—in contrast to queueing models with multiple customer classes which have fixed priority levels. In many applications, the switchover times of the server, when moving from one queue to another, are nonnegligible and should be included in the model.

Applications of polling systems abound, because a service facility that can serve the needs of n different types of customers is such a natural setting in everyday life. Indeed, polling systems have been used to model a plethora of congestion situations, such as (i) a patrolling repairman with n types of repair jobs, (ii) a machine producing n types of products on demand, (iii) protocols in computer-communication systems, allocating resources to n stations, job types or traffic sources, and (iv) a signalized road traffic intersection with n different traffic streams. These and other application areas have given rise to a huge range of variants and extensions of the basic polling system. Several overviews of the applicability of polling systems have been published, cf. Grillo (1990), Levy and Sidi (1990), Takagi (1991) and Boon et al. (2011). We, therefore, refrain from an extensive discussion of polling applications. When it comes to polling surveys, one should of course mention that, until 2000, Takagi maintained a quite complete bibliography on polling models, which included more than 700 publications (Takagi 1997, 2000). A more recent survey is Vishnevskii and Semenova (2006).

The main goals of the present paper are threefold: first, to discuss a number of the key methodologies in analyzing polling models; second, to give an overview of recent polling developments; and finally, to present a number of challenging open problems, which hopefully promote the interest of the reader in this fascinating field.

As a disclaimer, we would like to emphasize that we do not aim for completeness. Since the publication of the survey (Takagi 2000), several hundred polling papers have appeared. When discussing recent developments, we mainly focus on contributions which we believe to be methodologically important or which give rise to interesting open problems, and undoubtedly there is a bias towards publications which are in some way related to the authors.

Polling models are closely related to queueing models with vacations. One could naively model one queue of a polling model as a queue in isolation, in which the intervisit time (composed of the switchover times and of the visit times at the other queues, a visit time being the period the server spends at a queue) is viewed as a server vacation. Unfortunately, the intervisit times depend on the visit times in an intricate way.

In this paper, we do not give much consideration to queues with vacations; we refer the reader to the surveys of Doshi (1986, 1990) and the books of Takagi (1991) and Tian and Zhang (2006).

The remainder of this paper is organized in the following way. Section 2 presents a detailed model description. Section 3 reviews some properties and results of very general validity, including the so-called pseudo-conservation law. Section 4 focuses on waiting times and (mainly) joint queue-length distributions, for the important class of disciplines which satisfy a so-called branching property. Section 5 is devoted to polling models which do not satisfy that property.

The next few sections consider some special topics: polling models with arrival processes that generalize the above-mentioned Poisson processes (Sect. 6), scheduling in polling models (Sect. 7) and two types of asymptotics: many-queue asymptotics and heavy-traffic asymptotics (Sect. 8). Section 9 contains a collection of interesting isolated polling models and results. Finally, Sect. 10 presents some suggestions for further research.

2 Model description

We are interested in situations in which a service facility offers services to n classes of customers, in some prescribed order. We present the model description via 10 assumptions. Some of these assumptions will be relaxed in later sections.

Assumption 1

The service facility has a single server and that server works at unit speed when it is working.

Assumption 2

The number of customer classes, n, is finite.

Assumption 3

Customers in the various classes arrive at the service facility according to n independent Poisson arrival processes, with intensity \(\lambda _i\) for class i, joining a queue \(Q_i\), \(i=1,2,\ldots ,n\). Customers of class i have service requirements which are independent, identically distributed (i.i.d.) random variables, generically denoted by \(\mathbf{B}_i\), with distribution \(B_i(\cdot )\) and Laplace–Stieltjes transform (LST) \(\beta _i(\cdot )\), \(i=1,2,\ldots ,n\). Service requirements of customers of different classes are also independent of each other and of the arrival processes.

Assumption 4

Each queue has an infinite buffer capacity. Furthermore, all customers have infinite patience; hence, no customer is lost.

Assumption 5

The routing policy of the server is cyclic: the server successively visits the queues in order \(Q_1,Q_2,\ldots ,Q_n,Q_1,Q_2,\ldots ,Q_n\) etc. Another option that we will briefly touch upon is a polling table, i.e., a fixed visit pattern which is cyclically repeated (like star polling with \(Q_1\) as center of the star: \(Q_1,Q_2,Q_1,Q_3,\ldots ,Q_1,Q_n\)). Yet, another option is random polling, in which the server visits the queues according to a probabilistic visit scheme. Markovian polling refers to the case in which the transitions between queues follow a Markov chain.

Assumption 6

The service policy, describing the behavior of the server while visiting a queue, can be one of many policies which have been considered in the literature. The most popular ones are the following: (i) exhaustive: the server keeps serving a queue until it has become empty; (ii) gated: the server keeps serving a queue until all those customers have been served that were already present when the server arrived at that queue; (iii) k-limited: the server keeps working at a queue until a predefined number of k customers has been served, or the queue has become empty, whichever occurs first. Other policies include decrementing service: the server serves a queue until the number in that queue has decreased to one less than the number present upon arrival of the server; time-limited service: the server serves customers at \(Q_i\) until a time limit \(T_i\) has been reached, or until the queue has become empty, whichever occurs first; and binomial-gated: the server restricts service to the customers present upon its arrival, but each of those is only served with a fixed probability \(p_i\) (in \(Q_i\), \(i=1,2,\ldots ,n\)). Another well-studied policy is globally gated: when the server arrives at \(Q_1\) at some time \(t_1\), it starts a cycle of the n queues in which it only serves the customers that are already present at \(t_1\).

Finally, we assume that a server does not stay at an empty queue if other queues are not empty (non-idling assumption); however, in Sect. 9.2, we briefly consider an idling service policy.

Assumption 7

The service order within each queue is First-Come First-Served (FCFS). This assumption was almost universally made in the polling literature until the work of Wierman et al. (2007). In Sect. 7, we will discuss non-FCFS service orders.

Assumption 8

The times to switch from \(Q_i\), \(i=1,2,\ldots ,n\), to the next queue are assumed to be i.i.d. random variables, generically denoted by \(\mathbf{S}_i\), with distribution \(S_i(\cdot )\) and LST \(\sigma _i(\cdot )\). All switchover times are assumed to be independent of each other and of the interarrival and service times. When the switchover times between successive queues are all zero, a special situation arises. If the system has become empty after a visit to, say, \(Q_i\) in the case of zero switchover times, then the server is assumed to visit queues \(Q_{i+1},\ldots ,Q_n\) (which now takes zero time) and stay in front of \(Q_1\) (see Sect. 4). In the case of non-zero switchover times, the server is assumed to keep switching in an empty system.

Assumption 9

As soon as a customer has been served, it leaves the system. At some places, we briefly mention the case of customer routing; a served customer might rejoin the same queue, or join another one.

Assumption 10

The total traffic load is such that the key stochastic processes (queue lengths and waiting times) reach steady state. A necessary condition for this is that the total offered load \(\rho := \sum _{i=1}^n \rho _i < 1\); here, \(\rho _i := \lambda _i {E}\mathbf{B}_i\) is the mean offered load at \(Q_i\) per time unit, \(i=1,2,\ldots ,n\). When all switchover times are zero, this condition is also sufficient. Otherwise, the situation may be much more complicated, and in particular, the service policies may influence the stability condition; e.g., in 1-limited service, the server is forced to spend a switchover time after each service, see Fricker and Jaïbi (1994) for an extensive discussion of these stability issues. We refer to Foss and Chernova (1996a), Foss and Chernova (1996b), Foss et al. (1996), Foss and Last (1996), Foss and Last (1998), Foss and Kovalevskii (1999), and Kovalevskii et al. (2005) for stability results for various polling systems (not necessarily satisfying all of the above assumptions), along with related dominance theorems and fluid limits.

When a polling system satisfies all 10 assumptions, we denote it by PS.

3 General results

In this section, we discuss a number of results which hold for basically all PS, i.e., polling systems that satisfy Assumptions 1–10 of Sect. 2. These are cycle-time and visit-time results (Sect. 3.1), workload decompositions (Sect. 3.2), pseudo-conservation laws for mean waiting times (Sect. 3.3), Eisenberg’s relations between queue lengths at visit beginnings, visit completions, service beginnings and service completions (Sect. 3.4), and a general relation between the joint queue-length distribution at an arbitrary epoch and the joint queue-length distributions at visit beginnings and visit completions (Sect. 3.5).

3.1 Mean cycle and visit times

In a polling model of type PS, let us define the cycle time \(\mathbf{C}_i\) of \(Q_i\) as the time between two successive visit beginnings of the server to \(Q_i\). If the mean total switchover time in a polling model of type PS is positive, i.e., \(s := \sum _{i=1}^n {E}\mathbf{S}_i > 0\), then the mean cycle time for \(Q_i\) satisfies the following balance equation:

$$\begin{aligned} {E}\mathbf{C}_i - s = \rho {E}\mathbf{C}_i, \quad i=1,2,\ldots ,n. \end{aligned}$$

Indeed, the left-hand side gives the mean length of time the server is working during an arbitrary cycle of \(Q_i\), and the right-hand side gives the mean amount of work arriving in PS during an arbitrary cycle \(\mathbf{C}_i\). In steady state, these two quantities should be equal. Hence, we find

$$\begin{aligned} {E}\mathbf{C}_i = \frac{s}{1-\rho }, ~~~i=1,2,\ldots ,n. \end{aligned}$$
(1)

Apparently, each queue has the same mean cycle time \({E}\mathbf{C}\). It is important to notice, though, that the distributions of the cycle times of different queues, and even the variances, may not be the same (unless the system is completely symmetric).

The balance argument used above also immediately implies that the mean visit time \({E}\mathbf{V}_i\) of \(Q_i\) is given by

$$\begin{aligned} {E}\mathbf{V}_i = \rho _i {E}\mathbf{C}_i = \frac{\rho _i s}{1-\rho } , ~~~i=1,2,\ldots ,n. \end{aligned}$$
(2)

In a system with zero switchover times, viz., \(s=0\), Formulas (1) and (2) still hold if the server is assumed to keep cycling when the system has become empty (indeed, in an empty system, there will be an infinite number of zero-length cycles); however, these formulas are meaningless then.
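Formulas (1) and (2) are easy to check numerically. The following is a minimal Python sketch, not taken from the literature: the function names are hypothetical, and the simulation assumes exponential service and switchover times and cyclic service with either gated or exhaustive service at all queues. It evaluates (1) and (2) and compares the mean cycle time with a crude simulation that follows the server from queue to queue.

```python
import random


def mean_cycle_and_visit_times(lam, mean_service, mean_switch):
    """Evaluate (1) and (2): E C = s / (1 - rho) and E V_i = rho_i * E C."""
    rho_i = [l * b for l, b in zip(lam, mean_service)]
    rho = sum(rho_i)
    s = sum(mean_switch)
    if rho >= 1.0:
        raise ValueError("the necessary stability condition rho < 1 is violated")
    cycle = s / (1.0 - rho)
    return cycle, [r * cycle for r in rho_i]


def poisson_count(rate, duration, rng):
    """Number of arrivals of a Poisson process with the given rate in `duration`."""
    count, t = 0, rng.expovariate(rate)
    while t < duration:
        count += 1
        t += rng.expovariate(rate)
    return count


def simulate_mean_cycle(lam, mean_service, mean_switch,
                        policy="gated", cycles=20000, seed=1):
    """Estimate E C by following the server (exponential times are an assumption)."""
    rng = random.Random(seed)
    n = len(lam)
    queue = [0] * n
    clock = 0.0

    def advance(duration):
        # Let time pass and add the Poisson arrivals at every queue.
        nonlocal clock
        clock += duration
        for j in range(n):
            queue[j] += poisson_count(lam[j], duration, rng)

    for _ in range(cycles):
        for i in range(n):
            # Gated: serve only the customers present at the visit beginning.
            # Exhaustive: keep serving until the queue is empty.
            to_serve = queue[i] if policy == "gated" else None
            while queue[i] > 0 and (to_serve is None or to_serve > 0):
                queue[i] -= 1
                if to_serve is not None:
                    to_serve -= 1
                advance(rng.expovariate(1.0 / mean_service[i]))
            advance(rng.expovariate(1.0 / mean_switch[i]))
    return clock / cycles
```

For instance, with `lam = [0.2, 0.3]`, unit mean service times and mean switchover times `[0.5, 0.5]`, formula (1) gives a mean cycle time of 2 for both policies, and the simulated value should be close to it; the service policy affects the cycle-time distribution but not its mean.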

3.2 Workload decompositions

Again, consider the polling system PS, and assume in addition that all switchover times are zero. The server is then always working as long as there are customers in the system (cf. Assumption 6). Since the server is working at unit speed when it is working (Assumption 1), a sample path consideration reveals that the amount of work in the system evolves in a way that does not depend on the order of service of the queues, or within the queues, nor on the service policies at the queues. This is the principle of work conservation (cf. Heyman and Sobel 1982, p. 418). In particular, the amount of work evolves exactly as in an M / G / 1 queue in which the arrival rate is \(\Lambda := \sum _{i=1}^n \lambda _i\) and in which the service time distribution is \(\sum _{i=1}^n \frac{\lambda _i}{\Lambda } B_i(\cdot )\). We refer to this queue as the ‘corresponding M / G / 1 queue’.

If the switchover times are positive, then the principle of work conservation is violated: the server is sometimes switching (not working), although there is work present in the system. It was proven in Boxma and Groenendijk (1987) that, for a cyclic polling system PS, a principle of work decomposition holds: the steady-state amount of work \(\mathbf{V}_\mathrm{{with}}\) in PS with switchover times is, in distribution, the sum of the steady-state amount of work \(\mathbf{V}_\mathrm{{without}}\) in the corresponding PS without switchover times (hence the corresponding M / G / 1 queue) plus the steady-state amount of work \(\mathbf{Y}\) present in the system at an epoch in which the server is not working:

$$\begin{aligned} \mathbf{V}_\mathrm{{with}} {\mathop {=}\limits ^{d}} \mathbf{V}_\mathrm{{without}} + \mathbf{Y}, \end{aligned}$$
(3)

and \(\mathbf{V}_\mathrm{{without}}\) and \(\mathbf{Y}\) are independent. This decomposition result was generalized in Boxma (1989) to a large class of single-server queues with multiple customer classes and various forms of work interruptions. These decompositions fit in a line of decomposition results for queueing models with vacations/interruptions which goes back to the ground-breaking paper of Fuhrmann and Cooper (1985a), which concentrates on queue-length decompositions. It should be noticed that queue lengths are much more sensitive to distributional assumptions than workload, and hence, the conditions for queue-length decompositions to hold are also more stringent than those for workload decompositions. Most of the decomposition proofs rely on sample path considerations, on the fact that the workload evolves in exactly the same way under FCFS and Last-Come First-Served (LCFS), and on nice properties of the LCFS Preemptive-Resume discipline. See also the insightful discussion in Ivanovs and Kella (2013), and the workload decomposition for polling models with multi-dimensional Lévy input in Boxma and Kella (2014).

3.3 Pseudo-conservation laws

For the PS model, one can express the mean workload \({E}\mathbf{V}_\mathrm{{with}}\) in terms of the mean numbers \({E}\mathbf{N}_i\) of waiting customers at the various queues of PS, and hence, via Little’s formula, in terms of the mean waiting times \({E}\mathbf{W}_i\). This is sometimes referred to as Brumelle’s formula (Brumelle 1971):

$$\begin{aligned} {E}\mathbf{V}_\mathrm{{with}} = \sum _{i=1}^n {E}\mathbf{B}_i {E}\mathbf{N}_i + \sum _{i=1}^n \rho _i \frac{{E}\mathbf{B}_i^2}{2{E}\mathbf{B}_i} = \sum _{i=1}^n \rho _i {E}\mathbf{W}_i + \frac{1}{2} \sum _{i=1}^n \lambda _i {E}\mathbf{B}_i^2. \end{aligned}$$
(4)

Indeed, \({E}\mathbf{B}_i {E}\mathbf{N}_i\) is the mean amount of work of waiting customers at \(Q_i\) (we use here the fact that service at each queue is non-preemptive; hence, we have to exclude a discipline like time-limited), and \(\rho _i \frac{{E}\mathbf{B}_i^2}{2 {E}\mathbf{B}_i}\) is the product of the probability that \(Q_i\) is being served at an arbitrary epoch, and the mean length of the residual service time of a customer at \(Q_i\).

Using (3) and the fact that, in the case of zero switchover times, one has (using a well-known result for the ‘corresponding M / G / 1 queue’):

$$\begin{aligned} {E}\mathbf{V}_\mathrm{{without}} = \sum _{i=1}^n \frac{\lambda _i {E}\mathbf{B}_i^2}{2(1-\rho )}, \end{aligned}$$
(5)

the following so-called pseudo-conservation law (PCL) for the mean waiting times is obtained (Boxma and Groenendijk 1987):

$$\begin{aligned} \sum _{i=1}^n \rho _i {E}\mathbf{W}_i = \rho \sum _{i=1}^n \frac{\lambda _i {E}\mathbf{B}_i^2}{2(1-\rho )} + {E}\mathbf{Y}. \end{aligned}$$
(6)
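
Spelled out, the step from (3), (4) and (5) to (6) is a short computation; the following is only a sketch that uses the formulas above. Taking means in (3), and substituting (4) for \({E}\mathbf{V}_\mathrm{{with}}\) and (5) for \({E}\mathbf{V}_\mathrm{{without}}\), gives

$$\begin{aligned} \sum _{i=1}^n \rho _i {E}\mathbf{W}_i + \frac{1}{2} \sum _{i=1}^n \lambda _i {E}\mathbf{B}_i^2 = \sum _{i=1}^n \frac{\lambda _i {E}\mathbf{B}_i^2}{2(1-\rho )} + {E}\mathbf{Y}, \end{aligned}$$

and (6) follows because \(\frac{1}{2(1-\rho )} - \frac{1}{2} = \frac{\rho }{2(1-\rho )}\).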

In Boxma and Groenendijk (1987), \({E}\mathbf{Y}\) is subsequently split in three terms:

$$\begin{aligned} {E}\mathbf{Y}= \rho \frac{s^{(2)}}{2s} + \frac{s}{2(1-\rho )} \left[ \rho ^2 - \sum _{i=1}^n \rho _i^2 \right] + \sum _{i=1}^n {E}\mathbf{Z}_{ii}, \end{aligned}$$
(7)

where s and \(s^{(2)}\) are the mean and second moment of the total switchover time in one cycle of the server. The three terms reflect the influence of the presence of switchover times. All three terms have an easy probabilistic interpretation. Focussing on the contributions from \(Q_i\), one has \({E}\mathbf{Z}_{ii}\) in the last term, which denotes the mean amount of work left behind by the server in \(Q_i\) after a visit to that queue. In the first term, one has a contribution \(\rho _i \frac{s^{(2)}}{2s}\), which is the mean amount of work which has arrived in \(Q_i\) (after the server visit to \(Q_i\)) during the past part of the total switchover time in a cycle. Finally, the contribution of \(Q_i\) to the second term of (7), \(\rho _i \sum _{j=i+1}^n \frac{\rho _j s}{1-\rho }\), is the mean total workload which has arrived in \(Q_i\) during the visit times at \(Q_{i+1},\ldots ,Q_n\) of the server (cf. (2)).

The term \({E}\mathbf{Z}_{ii}\) is the only term that depends on the service policy at the queues (and in fact only on the service policy at that particular queue). For many service policies, it is easy to determine \({E}\mathbf{Z}_{ii}\). For exhaustive service, it equals zero, and for gated service \({E}\mathbf{Z}_{ii} = \rho _i^2 \frac{s}{1-\rho }\); indeed, \(\rho _i {E}\mathbf{V}_i\) arrives on average at \(Q_i\) per visit, and \({E}\mathbf{V}_i = \frac{\rho _i s}{1-\rho }\) according to (2).

The PCL has been generalized in several directions, including batch Poisson arrivals, polling tables, and Markovian polling. The simplicity, quite general validity, and robustness of the PCL make it suitable for several purposes. These include the development of approximations for mean waiting times and/or a check of such approximations and optimizations as will be discussed in Sects. 9.2 and 9.3.
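
As a numerical illustration, (6) and (7) can be evaluated directly when every queue has exhaustive or gated service. The sketch below is a minimal Python example under stated assumptions: the function name `pcl_weighted_wait` and its argument layout are hypothetical, and the second moment \(s^{(2)}\) of the total switchover time is computed from the moments of the individual switchover times using their independence (Assumption 8).

```python
def pcl_weighted_wait(lam, b1, b2, s1, s2, policy):
    """Right-hand side of the PCL (6), with E Y taken from (7).

    lam[i]       arrival rate at Q_i
    b1[i], b2[i] first and second moment of the service time at Q_i
    s1[i], s2[i] first and second moment of the switchover time after Q_i
    policy[i]    "exhaustive" or "gated"
    Returns sum_i rho_i * E W_i.
    """
    rho_i = [l * m for l, m in zip(lam, b1)]
    rho = sum(rho_i)
    if rho >= 1.0:
        raise ValueError("the necessary stability condition rho < 1 is violated")
    s = sum(s1)
    # Second moment of the total switchover time: variance of the sum of
    # independent switchover times plus the squared mean.
    s_second = s * s + sum(m2 - m1 * m1 for m1, m2 in zip(s1, s2))

    # First term of (6): rho times the mean workload (5) of the
    # corresponding M/G/1 queue.
    mg1_part = rho * sum(l * m2 for l, m2 in zip(lam, b2)) / (2.0 * (1.0 - rho))

    # E Z_ii: zero for exhaustive service, rho_i^2 * s / (1 - rho) for gated.
    ez = sum(r * r * s / (1.0 - rho) if p == "gated" else 0.0
             for r, p in zip(rho_i, policy))

    ey = (rho * s_second / (2.0 * s)
          + s / (2.0 * (1.0 - rho)) * (rho ** 2 - sum(r * r for r in rho_i))
          + ez)
    return mg1_part + ey
```

In a completely symmetric system all \({E}\mathbf{W}_i\) coincide, so dividing the returned value by \(\rho \) gives the mean waiting time itself. For a single queue with \(\lambda = 0.5\) and exponential service and switchover times with mean 1, this yields a mean waiting time of 2 under exhaustive service and 3 under gated service.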

3.4 Eisenberg’s relation

In this section, following Borst and Boxma (1997), we discuss a beautiful relation of Eisenberg (1972), which in our view would have deserved greater attention in the polling literature. Eisenberg relates the probability generating functions of queue lengths at various instants: visit beginnings and endings, and service beginnings and endings. Eisenberg (1972) studies a polling model with non-zero switchover times and the exhaustive service discipline at all queues (while briefly discussing the case of gated service at all queues). He considers the following four quantities, with \(\mathbf{N}\) denoting a vector of numbers of customers at \(Q_1, \ldots , Q_n\) and N a realization:

  • \(\mathbf{S}_{b_i}(t, N)\) := number of service beginnings at \(Q_i\) in (0, t) for which \(\mathbf{N}= N\);

  • \(\mathbf{S}_{c_i}(t, N)\) := number of service completions at \(Q_i\) in (0, t) for which \(\mathbf{N}= N\);

  • \(\mathbf{V}_{b_i}(t, N)\) := number of visit beginnings at \(Q_i\) in (0, t) for which \(\mathbf{N}= N\);

  • \(\mathbf{V}_{c_i}(t, N)\) := number of visit completions at \(Q_i\) in (0, t) for which \(\mathbf{N}= N\).

In the case of a service or visit completion, the state is defined as what exists immediately after the departure of the customer.

Eisenberg (1972) now makes the crucial observation that each time a visit beginning or a service completion occurs, this coincides with either a service beginning or a visit completion. Hence

$$\begin{aligned} \mathbf{V}_{b_i}(t, N) + \mathbf{S}_{c_i}(t, N) = \mathbf{S}_{b_i}(t, N) + \mathbf{V}_{c_i}(t, N). \end{aligned}$$
(8)

As observed in Borst and Boxma (1997), (8) not only holds for the case of non-zero switchover times and exhaustive or gated service, but for any service discipline, and also for the case of zero switchover times. Define the following equilibrium state probabilities for this polling model:

  • \(\tilde{S}_{b_i}(N) := Pr (\mathbf{N}= N\), S is at \(Q_i \mid \) service beginning instant);

  • \(\tilde{S}_{c_i}(N) := Pr(\mathbf{N}= N\), S is at \(Q_i \mid \) service completion instant);

  • \(\tilde{V}_{b_i}(N) := Pr( \mathbf{N}= N\) \(\mid \) visit beginning at \(Q_i\));

  • \(\tilde{V}_{c_i}(N) := Pr( \mathbf{N}= N\) \(\mid \) visit completion at \(Q_i\)).

Eisenberg (1972) divides all four terms in (8) by the total number of service completions at all queues in (0, t), and takes the limit for \(t \rightarrow \infty \). He thus relates those four equilibrium state probabilities:

$$\begin{aligned} \gamma _i \tilde{V}_{b_i}(N) + \tilde{S}_{c_i}(N) = \tilde{S}_{b_i}(N) + \gamma _i \tilde{V}_{c_i}(N). \end{aligned}$$

Here, \(\gamma _i\) is the long-term ratio of the number of visit completions at \(Q_i\) to the number of customers that are handled by the system; in this cyclic polling model \(\gamma _i \equiv \gamma \), \(i = 1, \ldots , n\). Written in terms of PGFs (probability generating functions)

$$\begin{aligned} \gamma V_{b_i}(z) + S_{c_i}(z) = S_{b_i}(z) + \gamma V_{c_i}(z), \end{aligned}$$
(9)

for \(z = (z_1, \ldots , z_n)\), \(\mid z_j \mid \, \le 1\), \(j = 1, \ldots , n\); here, \(V_{b_i}(z)\) and \(V_{c_i}(z)\) denote the PGF of the joint queue-length distribution at visit beginnings and visit completions of \(Q_i\), respectively, while \(S_{b_i}(z)\) and \(S_{c_i}(z)\) denote the PGF of the joint distribution of queue-length vector and server position at service beginnings and service completions, respectively.

Now, Eisenberg observes that \(S_{c_i}(z)\) and \(S_{b_i}(z)\) are related via

$$\begin{aligned} S_{c_i}(z) = S_{b_i}(z) \beta _i\left( \sum \limits _{j = 1}^{n}\lambda _j (1 - z_j)\right) / z_i, \end{aligned}$$
(10)

for \(\mid z_j \mid \, \le 1\), \(j = 1, \ldots , n\). It follows from (9) and (10) that

$$\begin{aligned} S_{c_i}(z) = \frac{\gamma \beta _i\left( {\sum \nolimits _{j = 1}^{n}} \lambda _j (1 - z_j)\right) }{z_i - \beta _i\left( {\sum \nolimits _{j = 1}^{n}} \lambda _j (1 - z_j)\right) } [V_{b_i}(z) - V_{c_i}(z)]. \end{aligned}$$
(11)
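
For completeness, the elimination behind (11) can be sketched as follows, writing \(\beta _i\) as shorthand for \(\beta _i\left( \sum _{j=1}^n \lambda _j (1-z_j)\right) \) (a shorthand used only in this step). By (10), \(S_{b_i}(z) = z_i S_{c_i}(z) / \beta _i\); substituting this into (9) gives

$$\begin{aligned} \gamma [V_{b_i}(z) - V_{c_i}(z)] = S_{b_i}(z) - S_{c_i}(z) = \frac{z_i - \beta _i}{\beta _i} S_{c_i}(z), \end{aligned}$$

and solving for \(S_{c_i}(z)\) yields (11).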

Eisenberg, considering the variant with switchover times and exhaustive service, subsequently expresses \(V_{b_i}(z)\) into \(V_{c_{i - 1}}(z)\). For the moment we refrain from that [see (15)], but we observe that Formula (11) is generally valid for the polling systems PS described in Sect. 2 (with and without switchover times).

Taking \(z = (1, \ldots , 1, y, 1, \ldots , 1)\) in (11), with y as ith argument, and dividing by the probability \(\lambda _i / \Lambda \) that an arbitrary service completion is at \(Q_i\) (with \(\Lambda = \sum _{j=1}^n \lambda _j\) the total arrival rate, cf. Sect. 3.2), gives the queue-length PGF at \(Q_i\) at a service completion instant at \(Q_i\). PASTA, in combination with a standard up- and down-crossing argument, shows that the queue-length distribution at \(Q_i\) at its service completion instants, at its customer arrival instants, and in steady state, are all the same. Hence, with \(\mathbf{N}_i\) the steady-state queue length at \(Q_i\) and with \(\mathbf{X}_i\) and \(\mathbf{Y}_i\) the steady-state queue lengths at \(Q_i\) at the beginning and end of a visit at that queue (or, equivalently, at the end and beginning of an intervisit time of \(Q_i\)), one obtains after some rewriting (see Borst and Boxma 1997 for the details):

$$\begin{aligned} {E}(y^{\mathbf{N}_i}) = \frac{(1 - \rho _i) (1 - y) \beta _i(\lambda _i (1 - y))}{\beta _i(\lambda _i (1 - y)) - y} \frac{{E}(y^{\mathbf{Y}_i}) - {E}(y^{\mathbf{X}_i})}{(1 - y) ({E}\mathbf{X}_i - {E}\mathbf{Y}_i)}, ~~~|y| \le 1. \end{aligned}$$
(12)

The first factor in the right-hand side is the PGF \({E}(y^{\mathbf{N}_{i \mid M / G / 1}})\) of the queue-length distribution in a ‘corresponding’ isolated M / G / 1 queue of \(Q_i\) with arrival rate \(\lambda _i\) and service time LST \(\beta _i(\cdot )\). The second factor turns out to be the PGF of the number of customers \(\mathbf{N}_{i \mid I}\) at \(Q_i\) at an arbitrary epoch during an intervisit time of \(Q_i\). Formula (12) implies that

$$\begin{aligned} \mathbf{N}_i {\mathop {=}\limits ^{d}} \mathbf{N}_{i \mid M / G / 1} + \mathbf{N}_{i \mid I}, \end{aligned}$$
(13)

the two terms in the right-hand side being independent. This is the well-known Fuhrmann–Cooper queue-length decomposition (Fuhrmann and Cooper 1985a).

Remark 3.1

Fuhrmann and Cooper (1985a) state five conditions under which their decomposition holds; these conditions are contained in the 10 assumptions of Sect. 2, except that it is explicitly assumed in Fuhrmann and Cooper (1985a) that service is non-preemptive, a condition that is violated when the service discipline is time-limited, for example.

Using the distributional form of Little’s law, cf. Keilson and Servi (1990), the above Fuhrmann–Cooper queue-length decomposition (13) immediately translates into a waiting-time decomposition. In Sect. 4.1, we will return to this relation, for the case of polling models that satisfy Property 4.1. \(\square \)

3.5 The joint queue-length distribution at an arbitrary epoch

In Sect. 3.4, we focused on queue-length vectors at visit beginnings and visit completions, and at service beginnings and service completions. Throughout the polling literature, the attention has always been on those epochs, as far as joint queue-length distributions are concerned. However, in Boxma et al. (2011), it was shown that, for the general PS model, one can express the PGF L(z) of the steady-state joint queue-length distribution at an arbitrary epoch in terms of those at visit beginnings and visit completions, in the following way (with \(z = (z_1,\ldots ,z_n)\)):

$$\begin{aligned} L(z)= \frac{1}{{E}\mathbf{C}}\sum _{i=1}^n\left( \frac{V_{b_i}(z)-V_{c_i}(z)}{\Sigma (z)} \frac{z_i\left( 1-\beta _i(\Sigma (z))\right) }{z_i-\beta _i(\Sigma (z))} + \frac{V_{c_i}(z)-V_{b_{i+1}}(z)}{\Sigma (z)} \right) , \end{aligned}$$
(14)

with \(\Sigma (z):=\sum _{j=1}^n \lambda _j(1-z_j)\). Its proof in Boxma et al. (2011) is based on the following relations:

  (i) Eisenberg’s (1972) relation (11) as generalized to PS polling models in Borst and Boxma (1997).

  (ii) Relation (10) between queue-length PGFs at the beginning and end of a service.

  (iii) An obvious relation between queue lengths at the beginning and end of a switchover:

    $$\begin{aligned} V_{b_{i+1}}(z)=V_{c_i}(z) \sigma _i\left( \Sigma (z)\right) , \quad i=1,2,\ldots ,n. \end{aligned}$$
    (15)

  (iv) A stochastic mean value theorem, expressing L(z) as an average over the PGFs of the joint queue-length distribution at an arbitrary moment during a visit to \(Q_i\) (\(X_i(z)\)) and during a switchover period between \(Q_i\) and \(Q_{i+1}\) (\(Y_i(z)\)):

    $$\begin{aligned} L(z) = \frac{1}{{E}\mathbf{C}}\sum _{i=1}^n\left( \frac{{E}\mathbf{B}_i}{\gamma _i} X_i(z) + s_iY_i(z) \right) , \end{aligned}$$
    (16)

    where \(s_i := {E}\mathbf{S}_i\) and, for \(i=1,2,\ldots ,n\),

    $$\begin{aligned} X_i(z)= & {} S_{b_i}(z) \beta _i^\mathrm{{past}}(\Sigma (z)), \end{aligned}$$
    (17)
    $$\begin{aligned} Y_i(z)= & {} V_{c_i}(z) \sigma _i^\mathrm{{past}}(\Sigma (z)), \end{aligned}$$
    (18)

    where \(\beta _i^\mathrm{{past}}(\cdot )\) and \(\sigma _i^\mathrm{{past}}(\cdot )\) are the LSTs of the past parts of \(\mathbf{B}_i\) and \(\mathbf{S}_i\), respectively, and therefore

    $$\begin{aligned} \beta _i^\mathrm{{past}}(\Sigma (z)) = \frac{1 - \beta _i(\Sigma (z))}{{E}\mathbf{B}_{i} \Sigma (z)}, \qquad \sigma _i^\mathrm{{past}}(\Sigma (z)) = \frac{1 - \sigma _i(\Sigma (z))}{{E}\mathbf{S}_{i} \Sigma (z)}. \end{aligned}$$
    (19)

Starting from (16), substituting (17) and (18), and using (10) and (11) to eliminate all \(S_{b_i}(z)\) and \(S_{c_i}(z)\), yields (14).

Remark 3.2

Zero switchover times are also allowed in Boxma et al. (2011); the same result (14) is shown to hold.

In Theorem 1 of Boxma et al. (2011), it was subsequently observed that one may simplify (14) as follows, using the fact that \(\sum _{i=1}^n (V_{c_i}(z) - V_{b_{i+1}}(z)) = \sum _{i=1}^n (V_{c_i}(z) - V_{b_{i}}(z))\) and (11):

$$\begin{aligned} L(z) = \frac{\sum _{i=1}^n \lambda _i (1-z_i) S_{c_i}(z)}{\sum _{i=1}^n \lambda _i(1-z_i)}. \end{aligned}$$
(20)

This formula is remarkably simple; notice that it does not involve the service time distributions and that the service disciplines at the various queues do not play a role either, which confirms that (14) is based on very general principles. A short proof of this formula was subsequently presented in Boon et al. (2017). That proof is based on a very simple, yet very general, balance equation for n-dimensional queue-length processes just before arrivals and just after departures, and on PASTA. For marginal queue lengths, it reduces to a classical up- and downcrossing identity.

4 The joint queue-length distribution at polling epochs

In Sect. 3.4, we have seen that Eisenberg’s results (Eisenberg 1972) yield simple relations between the PGF \(S_{c_i}(z)\) of the joint queue-length vector at service completion epochs (or \(S_{b_i}(z)\), at service beginning epochs) and the PGFs \(V_{b_i}(z)\) and \(V_{c_i}(z)\) of the joint queue-length vector at visit beginning and visit completion epochs. Here, again, \(z = (z_1,\ldots ,z_n)\). We now restrict ourselves to polling models for which the service discipline at each queue satisfies the following property:

Property 4.1

If there are \(k_i\) customers present at \(Q_i\) at the start of a visit, then during the course of the visit, each of these \(k_i\) customers will effectively be replaced in an i.i.d. manner by a random population having PGF \(h_i(z_1 ,\ldots , z_n )\), which may be any n-dimensional PGF.

Resing (1993) (see also Fuhrmann 1981) has studied polling systems that satisfy this property; this includes the case of exhaustive or gated service at all queues, but it excludes the case of 1-limited service at any queue. Resing (1993) has pointed out that, for this class of polling systems, the joint queue-length process at visit instants of a fixed queue is a so-called multi-type branching process with immigration. The theory of multi-type branching processes (cf. Athreya and Ney 1972; Resing 1990) thus leads to an expression for the PGF of the joint steady-state queue-length process at visit beginning (polling) instants (which exists if \(\rho < 1\) and \(s_i < \infty \) for all i). Property 4.1 prescribes how each of the customers present at \(Q_i\) at the visit beginning is replaced by independent families of customers at its visit completion. This enables one to express \(V_{c_i}(\cdot )\) nicely into \(V_{b_i}(\cdot )\):

$$\begin{aligned} V_{c_i}(z) = V_{b_i}(z_1, \ldots , z_{i - 1}, h_i(z), z_{i + 1}, \ldots , z_n). \end{aligned}$$
(21)

Next, we relate \(V_{b_i}(z)\) to \(V_{c_{i - 1}}(z)\). That will allow us—after n steps—to express, say, \(V_{b_1}(\cdot )\) into itself, and finally to obtain an explicit expression for \(V_{b_1}(z)\). The PGFs \(V_{c_i}(\cdot )\), \(S_{b_i}(\cdot )\), and \(S_{c_i}(\cdot )\) then also follow.

In our analysis, we follow Resing (1993). We distinguish the two cases of non-zero and zero switchover times. In both cases, the following branching functions play a crucial role, thus establishing the link between both cases.

Define

$$\begin{aligned} f(z) := (f_1(z), \ldots , f_n(z)), \end{aligned}$$
(22)

with

$$\begin{aligned} f_i(z) := h_i(z_1, \ldots , z_i, f_{i + 1}(z), \ldots , f_n(z)) \end{aligned}$$
(23)

for \(\mid z_j \mid \, \le 1\), \(j = 1, \ldots , n\). This is the offspring PGF, the PGF of the joint distribution of the numbers of customers at the end of a cycle with respect to \(Q_1\) that are descendants of a type-i customer. In this branching process setting, a descendant of some customer K is a customer that has arrived during the service time of K or of one of its descendants.

For \(\mid z_j \mid \, \le 1\), \(j = 1, \ldots , n\), define

$$\begin{aligned} f^{(0)}(z) := z, ~~~ f^{(k)}(z) := f(f^{(k - 1)}(z)), \quad \quad k \ge 1. \end{aligned}$$
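The backward recursion in (23) and the iterates \(f^{(k)}\) can be sketched numerically. The following is a minimal illustration, assuming two queues with gated service, for which \(h_i(z) = \beta _i(\sum _j \lambda _j (1 - z_j))\), and exponential service times; the parameter values and function names are hypothetical choices made only for this example.

```python
# Illustrative two-queue parameters (assumptions, not taken from the paper).
lam = [0.3, 0.2]   # arrival rates lambda_j
mu = [1.0, 1.0]    # exponential service rates, so beta_i(s) = mu_i / (mu_i + s)
n = len(lam)

def beta(i, s):
    """LST of an exponential service time at Q_i."""
    return mu[i] / (mu[i] + s)

def h_gated(i, z):
    """Gated service: a customer is replaced by the arrivals during its service."""
    s = sum(lam[j] * (1.0 - z[j]) for j in range(n))
    return beta(i, s)

def f(z, h=h_gated):
    """Offspring PGF vector of (22)-(23), built backwards from f_n down to f_1."""
    out = [0.0] * n
    for i in reversed(range(n)):
        # Arguments of h_i: z_1..z_i as given, followed by f_{i+1}(z)..f_n(z).
        arg = list(z[: i + 1]) + out[i + 1:]
        out[i] = h(i, arg)
    return out

def f_iter(z, k, h=h_gated):
    """k-fold iterate f^{(k)}(z), with f^{(0)}(z) = z."""
    z = list(z)
    for _ in range(k):
        z = f(z, h)
    return z
```

For a stable system the iterates started from any point of the unit hypercube should approach the vector of ones; other branching-type disciplines can be tried by passing a different `h`.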

Case I: Non-zero switchover times

Observe that

$$\begin{aligned} V_{b_i}(z) = V_{c_{i - 1}}(z) \sigma _{i - 1}\left( \sum \limits _{j = 1}^{n}\lambda _j (1 - z_j)\right) . \end{aligned}$$
(24)

Substituting (21) into (24) yields

$$\begin{aligned} V_{b_i}(z) = V_{b_{i - 1}}\left( z_1, \ldots , z_{i - 2}, h_{i - 1}(z), z_i, \ldots , z_n\right) \sigma _{i - 1}\left( \sum \limits _{j = 1}^{n}\lambda _j (1 - z_j)\right) . \end{aligned}$$
(25)

Applying (25) n times (which corresponds to following the server during one full cycle with respect to \(Q_1\)) yields

$$\begin{aligned} V_{b_1}(z) = V_{b_1}(f(z)) g(z), \end{aligned}$$
(26)

with

$$\begin{aligned} g(z) = \prod \limits _{i = 1}^{n}\sigma _i\left( \sum \limits _{j = 1}^{i} \lambda _j (1 - z_j) + \sum \limits _{j = i + 1}^{n} \lambda _j (1 - f_j(z))\right) . \end{aligned}$$

The function \(g(\cdot )\) represents the ‘immigration process’ of this multi-type branching process: it is the PGF of the vector of all customers that either have arrived in the switchover periods of the past cycle (measured with respect to \(Q_1\)), or are descendants of such customers.

Iterating (26) yields

$$\begin{aligned} V_{b_1}(z)= & {} \prod \limits _{k = 0}^{\infty }g(f^{(k)}(z)) \nonumber \\= & {} \prod \limits _{k = 0}^{\infty }\prod \limits _{i = 1}^{n}\sigma _i\left( \sum \limits _{j = 1}^{i} \lambda _j (1 - f_j^{(k)}(z)) + \sum \limits _{j = i + 1}^{n} \lambda _j (1 - f_j^{(k + 1)}(z))\right) , \end{aligned}$$
(27)

the infinite product being convergent when the ergodicity conditions are fulfilled.
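The infinite product (27) lends itself to direct numerical evaluation by truncation. The sketch below assumes two queues with gated service and exponential service and switchover times; all parameter values and names are illustrative assumptions. It stops multiplying once \(f^{(k)}(z)\) is numerically equal to the vector of ones.

```python
# Illustrative parameters (assumptions): gated service at both queues,
# exponential service times and exponential switchover times.
lam = [0.3, 0.2]   # arrival rates lambda_j
b = [1.0, 1.0]     # mean service times
s = [0.1, 0.1]     # mean switchover times (after Q_i)
n = len(lam)

def beta(i, w):
    """Service-time LST at Q_i."""
    return 1.0 / (1.0 + b[i] * w)

def sigma(i, w):
    """Switchover-time LST after Q_i."""
    return 1.0 / (1.0 + s[i] * w)

def f(z):
    """Offspring PGFs (23) for gated service."""
    out = [0.0] * n
    for i in reversed(range(n)):
        arg = list(z[: i + 1]) + out[i + 1:]
        out[i] = beta(i, sum(lam[j] * (1.0 - arg[j]) for j in range(n)))
    return out

def g(z):
    """Immigration PGF g(z) as defined below (26)."""
    fz = f(z)
    prod = 1.0
    for i in range(n):
        w = sum(lam[j] * (1.0 - z[j]) for j in range(i + 1))
        w += sum(lam[j] * (1.0 - fz[j]) for j in range(i + 1, n))
        prod *= sigma(i, w)
    return prod

def V_b1(z, tol=1e-14, max_iter=10_000):
    """Truncated version of the infinite product (27)."""
    prod, zk = 1.0, list(z)
    for _ in range(max_iter):
        prod *= g(zk)                      # factor g(f^{(k)}(z))
        if max(abs(1.0 - x) for x in zk) < tol:
            break
        zk = f(zk)
    return prod
```

Two checks are natural here: the result should satisfy the functional equation (26), and for gated service a finite-difference derivative at \(z = (1, \ldots , 1)\) should reproduce the mean number of customers at \(Q_1\) at a polling instant, \(\lambda _1 s / (1 - \rho )\), with s the total mean switchover time per cycle.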

Case II: Zero switchover times

In the case of zero switchover times (in the sequel, we add a superscript 0 for that case, to distinguish its quantities from those for non-zero switchover times):

$$\begin{aligned} V_{b_i}^0(z) = V_{c_{i - 1}}^0(z), \end{aligned}$$
(28)

for \(i = 2, \ldots , n\). The relation between \(V_{b_1}^0(z)\) and \(V_{c_n}^0(z)\) deserves special attention because of our convention concerning the behavior of the server when the system is empty. When all queues in the model with zero switchover times become empty, S makes a full cycle, and subsequently stops right before \(Q_1\) (all this requires zero time). When the first new customer arrives, S cycles along the queues to that customer. The consequence of this is that when the system is empty at the start of a visit to \(Q_1\), then the next visit to \(Q_1\) does not take place until a customer has arrived. We can write

$$\begin{aligned} V_{b_1}^0(z) = V_{c_n}^0(z) - V_{b_1}^0(0) [1 - g^0(z)], \end{aligned}$$
(29)

with

$$\begin{aligned} g^0(z) := \sum \limits _{i = 1}^{n}\frac{\lambda _i}{\lambda } z_i. \end{aligned}$$

The function \(g^0(\cdot )\) represents the ‘immigration process’ of the multi-type branching process: it is the PGF of the arrival process of customers during periods in which the system is empty.

Remark 4.1

Although we sometimes find it convenient to concentrate on \(Q_1\), it should be noted that our convention for the position of S in an empty system does not affect the waiting-time and queue-length distributions.

In fact, our convention slightly differs from that of Resing (1993), who assumes that when the system is empty, S immediately stops right behind \(Q_1\), and hence takes \({g^0(z) = {\sum \nolimits _{i = 1}^{n}} \frac{\lambda _i}{\lambda } f_i(z)}\). Our convention enables us to simultaneously apply the theory of multi-type branching processes and Eisenberg’s approach. \(\square \)

Substituting (21) into (28) yields

$$\begin{aligned} V_{b_i}^0(z) = V_{b_{i-1}}^0(z_1, \ldots , z_{i - 2}, h_{i - 1}(z), z_i, \ldots , z_n) \end{aligned}$$
(30)

for \(i = 2, \ldots , n\). Starting from (29) and (21) for \(i = n\), and subsequently using (30) for \(i = n, n - 1, \ldots , 2\), one obtains

$$\begin{aligned} V_{b_1}^0(z) = V_{b_1}^0(f(z)) - V_{b_1}^0(0) [1 - g^0(z)]. \end{aligned}$$
(31)

Iterating (31) yields

$$\begin{aligned} V_{b_1}^0(z) = 1 - V_{b_1}^0(0) \sum \limits _{k = 0}^{\infty }\left[ 1 - g^0(f^{(k)}(z))\right] = 1 - V_{b_1}^0(0) \sum _{k=0}^{\infty } \sum _{i=1}^n \frac{\lambda _i}{\lambda } \left( 1 - f_i^{(k)}(z)\right) , \end{aligned}$$
(32)

with

$$\begin{aligned} V_{b_1}^0(0) = \left[ 1 + \sum \limits _{k = 0}^{\infty }\left[ 1 - g^0(f^{(k)}(0))\right] \right] ^{- 1} = \left[ 1 + \sum \limits _{k = 0}^{\infty }\sum _{i=1}^n \frac{\lambda _i}{\lambda } \left( 1 - f_i^{(k)}(0)\right) \right] ^{- 1}, \end{aligned}$$

the infinite sum being convergent when the ergodicity conditions are fulfilled.
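The series in (32) and the normalizing constant \(V_{b_1}^0(0)\) can be approximated in the same way as the product in (27). The following sketch assumes gated service with exponential service times; parameters and names are illustrative assumptions.

```python
# Illustrative parameters (assumptions): gated service, exponential service times.
lam = [0.3, 0.2]   # arrival rates lambda_j
b = [1.0, 1.0]     # mean service times
n = len(lam)
lam_tot = sum(lam)

def f(z):
    """Offspring PGFs (23) for gated service."""
    out = [0.0] * n
    for i in reversed(range(n)):
        arg = list(z[: i + 1]) + out[i + 1:]
        w = sum(lam[j] * (1.0 - arg[j]) for j in range(n))
        out[i] = 1.0 / (1.0 + b[i] * w)
    return out

def g0(z):
    """Immigration PGF for zero switchover times: type of the first arrival."""
    return sum(lam[i] / lam_tot * z[i] for i in range(n))

def series(z, tol=1e-15, max_iter=10_000):
    """Truncated sum over k >= 0 of 1 - g0(f^{(k)}(z))."""
    total, zk = 0.0, list(z)
    for _ in range(max_iter):
        term = 1.0 - g0(zk)
        total += term
        if abs(term) < tol:
            break
        zk = f(zk)
    return total

# Normalizing constant given below (32).
V0_at_zero = 1.0 / (1.0 + series([0.0] * n))

def V0_b1(z):
    """Truncated evaluation of (32)."""
    return 1.0 - V0_at_zero * series(z)
```

By construction, `V0_b1([0.0] * n)` should coincide with `V0_at_zero`, and the computed values should satisfy the functional equation (31) up to truncation error.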

From (27) and (32), we see that \(V_{b_1}(z)\) as well as \(V_{b_1}^0(z)\) is determined by \(\sum \limits _{j = 1}^{n}\lambda _j (1 - f_j^{(k)}(z))\).

Remark 4.2

It is worth observing that the Globally gated service discipline (Boxma et al. 1992), as described in Sect. 2, does not satisfy Property 4.1. At the same time, the PGFs \(V_{b_i}(\cdot )\) and \(V_{c_i}(\cdot )\) can all be expressed in terms of the joint queue-length PGF \(V_{b_1}(\cdot )\) at the start of a cycle. Indeed, Globally gated is, arguably, the most tractable service discipline, providing a useful testing ground for novel concepts. Altman et al. (1992) consider the elevator variant of Globally gated, where the various queues are visited in alternating order. From an application perspective, it might be interesting to consider the concept of a reservation mechanism, which also underlies Globally gated, in more detail. For example, customers at some queue might have a certain window of opportunity to make a reservation for service in the next visit period of that queue, see Abidini et al. (2017a) for an application in optical switches.

4.1 Marginal queue lengths and waiting times

Above, the joint queue-length PGFs \(V_{b_i}(z)\) and \(V_{b_i}^0(z)\) at visit beginning instants have been determined for the class of cyclic polling models in which Property 4.1 holds for the service disciplines at all queues. In Sect. 3.4, we already obtained a decomposition for the PGF of the marginal queue-length distribution at \(Q_i\) into a corresponding M / G / 1 term and a term involving \({E}(y^{\mathbf{X}_i})\) and \({E}(y^{\mathbf{Y}_i})\) (via the PGF \({E}(y^{\mathbf{N}_{i \mid I}})\)). In particular, denoting

$$\begin{aligned} \tilde{h}_i(y):= & {} h_i(1, \ldots , 1, y, 1, \ldots , 1), \\ \tilde{V}_{b_i}(y):= & {} V_{b_i}(1, \ldots , 1, y, 1, \ldots , 1), \\ \tilde{V}_{b_i}^0(y):= & {} V_{b_i}^0(1, \ldots , 1, y, 1, \ldots , 1), \end{aligned}$$

with y as ith argument, it follows from (12) and (21) for the case of non-zero switchover times that

$$\begin{aligned} {E}(y^{\mathbf{N}_{i \mid I}}) = \frac{\tilde{V}_{b_i}(\tilde{h}_i(y)) - \tilde{V}_{b_i}(y)}{(1 - y) \tilde{V}_{b_i}'(1) (1 - \tilde{h}_i'(1))}; \end{aligned}$$
(33)

the same result holds for the case of zero switchover times, replacing \(\tilde{V}_{b_i}(\cdot )\) by \(\tilde{V}_{b_i}^0(\cdot )\) in (33). Similarly indicating queue lengths, and waiting times, by a superscript 0 in the case of zero switchover times, one finds (Borst and Boxma 1997)

$$\begin{aligned} {E}(y^{\mathbf{N}_i})= & {} {E}(y^{\mathbf{N}_i^0}) \frac{[\tilde{V}_{b_i}(\tilde{h}_i(y)) - \tilde{V}_{b_i}(y)] \tilde{V}_{b_i}^0{'}(1)}{[\tilde{V}_{b_i}^0(\tilde{h}_i(y)) - \tilde{V}_{b_i}^0(y)] \tilde{V}_{b_i}'(1)}, \end{aligned}$$
(34)
$$\begin{aligned} {E}({e}^{- \omega \mathbf{W}_i})= & {} {E}({e}^{- \omega \mathbf{W}_i^0}) \frac{[\tilde{V}_{b_i}(\tilde{h}_i(1 - \omega / \lambda _i)) - \tilde{V}_{b_i}(1 - \omega / \lambda _i)] \tilde{V}_{b_i}^0{'}(1)}{[\tilde{V}_{b_i}^0(\tilde{h}_i(1 - \omega / \lambda _i)) - \tilde{V}_{b_i}^0(1 - \omega / \lambda _i)] \tilde{V}_{b_i}'(1)}. \end{aligned}$$
(35)

For exhaustive service, \(\tilde{h}_i(\cdot ) \equiv 1\); for gated service, \(\tilde{h}_i(y) = \beta _i(\lambda _i(1 - y))\).
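As a small worked specialization of (33), using only the two expressions for \(\tilde{h}_i\) just given and writing \(\rho _i := - \lambda _i \beta _i'(0)\) for the load of \(Q_i\) (notation assumed here): for exhaustive service \(\tilde{h}_i'(1) = 0\) and \(\tilde{V}_{b_i}(\tilde{h}_i(y)) = \tilde{V}_{b_i}(1) = 1\), while for gated service \(\tilde{h}_i'(1) = \rho _i\), so that

$$\begin{aligned} \text{exhaustive:} \quad {E}(y^{\mathbf{N}_{i \mid I}})= & {} \frac{1 - \tilde{V}_{b_i}(y)}{(1 - y) \tilde{V}_{b_i}'(1)}, \\ \text{gated:} \quad {E}(y^{\mathbf{N}_{i \mid I}})= & {} \frac{\tilde{V}_{b_i}(\beta _i(\lambda _i(1 - y))) - \tilde{V}_{b_i}(y)}{(1 - y) \tilde{V}_{b_i}'(1) (1 - \rho _i)}. \end{aligned}$$

In both cases, letting \(y \rightarrow 1\) and applying l'Hôpital's rule gives the value 1, as it should for a PGF.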

Let us now (without loss of generality) concentrate on \(\mathbf{W}_1\) and \(\mathbf{W}_1^0\). After some calculations (Borst and Boxma 1997), one gets

$$\begin{aligned} {E}({e}^{- \omega \mathbf{W}_1}) = {E}({e}^{- \omega \mathbf{W}_1^0}) \frac{\tilde{V}_{b_1}(\tilde{h}_1(1 - \omega / \lambda _1)) - \tilde{V}_{b_1}(1 - \omega / \lambda _1)}{s [\tilde{H}(\tilde{h}_1(1 - \omega / \lambda _1)) - \tilde{H}(1 - \omega / \lambda _1)]}, \end{aligned}$$
(36)

which for exhaustive service (\(\tilde{h}_1(\cdot ) \equiv 1\)) and gated service (\(\tilde{h}_1(1 - \omega / \lambda _1) = \beta _1(\omega )\)), corresponds to Theorems 2 and 5 in Srinivasan et al. (1995), respectively.

Remark 4.3

The above results expose a close similarity between the cases with and without switchover times. Before Borst and Boxma (1997), models with switchover times and models without switchover times had usually been treated separately, often via different approaches; the problem with simply letting the switchover times tend to zero in a polling model with non-zero switchover times is that the number of polling epochs in an idle period tends to infinity, leading to degenerate distributions at such epochs, cf. Levy and Kleinrock (1991) and Eisenberg (1994). The relationship between the two models has further been exposed in Cooper et al. (1996), Fuhrmann (1992), and Srinivasan et al. (1995); in Borst and Boxma (1997), some of their results are unified and generalized.

4.2 Computational aspects

The above results provide a basis for a very efficient numerical calculation of the mean waiting times as well as higher order moments (Borst and Boxma 1997). The number of elementary operations (additions, multiplications) involved for calculating the mean waiting time at a single queue is \(O(n \log _\rho (\epsilon ))\), with \(\epsilon \) the desired level of accuracy. This is comparable to the computational complexity of the so-called descendant-set approach developed by Konheim and Levy (1992) and the so-called station time method of Ferguson and Aminetzah (1985) which entail solving a system of \(n^2\) equations for obtaining the mean waiting times at all n queues. These methods provided a significant reduction in computational complexity compared to the original buffer occupancy method described by Cooper (1970), Cooper and Murray (1969), and Eisenberg (1972) which required solving a system of \(n^3\) equations for determining the mean waiting times at all n queues. The mean value analysis developed by Winands et al. (2006), as further discussed in Sect. 7, also provides an efficient way to determine mean sojourn times, as demonstrated in Van der Gaast et al. (2017) for a model with batch arrivals. It additionally offers a basis for approximations of mean queue lengths and mean delays.
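To give a feeling for the \(O(n \log _\rho (\epsilon ))\) operation count, the sketch below computes the number of iterations needed under the assumption that the truncation error after k iterations decays like \(\rho ^k\); the function names are hypothetical and constant factors are ignored.

```python
import math

def iterations_needed(rho, eps):
    """Smallest integer k with rho**k <= eps, i.e. k = ceil(log_rho(eps))."""
    if not (0.0 < rho < 1.0) or not (0.0 < eps < 1.0):
        raise ValueError("need 0 < rho < 1 and 0 < eps < 1")
    return math.ceil(math.log(eps) / math.log(rho))

def operation_estimate(n, rho, eps):
    """Rough count: one pass over the n queues per iteration."""
    return n * iterations_needed(rho, eps)

print(iterations_needed(0.8, 1e-6))   # 62
```

The count grows without bound as \(\rho \uparrow 1\), which matches the intuition that heavily loaded systems need many more iterations for the same accuracy.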

We close this subsection by remarking that (i) Eq. (11) of Resing (1993) provides exact (non-numerical) moment expressions for branching-type polling models and (ii) Choudhury and Whitt (1996) present an elegant method to obtain moments and tail probabilities in polling models via numerical inversion of transform expressions.

5 Two-queue polling systems which are not of branching type

There appears to be a sharp division between ‘easy’ (branching-type) and ‘complicated’ polling models. Such a division is not uncommon in queueing theory; one also sees it, e.g., in queueing networks that do or do not satisfy the conditions to have a product form for their joint queue-length distribution. If a polling system does not satisfy the branching property, then an exact analysis of queue length and waiting-time distributions generally seems out of reach. Just like in queueing networks, there are a few two-queue exceptions; in the present section, we consider some of those. We restrict ourselves to the case of non-zero switchover times. Starting point is a relation between the two-dimensional queue-length generating functions \(V_{b_1}(z)= V_{b_1}(z_1,z_2)\) and \(V_{b_2}(z_1,z_2)\) at server visits to \(Q_1\) and \(Q_2\), respectively. When the branching property holds, this relation is given by (25), which could be iterated to yield an infinite product. Now, consider the case in which \(Q_1\) receives exhaustive service and \(Q_2\) receives 1-limited service. Then (26) is replaced by

$$\begin{aligned} V_{b_1}(z_1,z_2)= & {} \frac{\beta _2(z_1,z_2) \sigma _2(z_1,z_2)}{z_2} [\sigma _1(z_1,z_2) V_{b_1}(h_1(z_1,z_2),z_2) \nonumber \\&- \sigma _1(z_1,0) V_{b_1}(h_1(z_1,0),0)] \nonumber \\&+ \sigma _2(z_1,z_2) \sigma _1(z_1,0) V_{b_1}(h_1(z_1,0),0). \end{aligned}$$
(37)

Ibe (1990) has obtained the marginal queue-length transform for \(Q_1\) at polling instants of that queue; Groenendijk (1990b, Section 6.3) used (37) to obtain an explicit expression for \(V_{b_1}(z_1,z_2)\). The key to solving (37) is the observation that because service at \(Q_1\) is exhaustive, one has \(h_1(z_1,z_2) = \pi _1(\lambda _2(1-z_2))\) with \(\pi _1(\cdot )\) the LST of a busy period of M / G / 1 queue \(Q_1\) in isolation. Because this function does not depend on \(z_1\), \(V_{b_1}(h_1(z_1,0),0)\) is a constant, not depending on \(z_1\). Hence, the only unknown functions in (37) are \(V_{b_1}(z_1,z_2)\) and \(V_{b_1}(h_1(z_1,z_2),z_2)\), and the substitution \(z_1 = \pi _1(\lambda _2(1-z_2))\) (plus the normalization condition) solves the problem. For a study of the two-queue case with exhaustive service at \(Q_1\) and k-limited service at \(Q_2\), we refer to Ozawa (1990) and Winands et al. (2009).
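The busy-period LST \(\pi _1(\cdot )\) is in general only known implicitly, as the solution of the standard M / G / 1 functional equation \(\pi _1(\omega ) = \beta _1(\omega + \lambda _1(1 - \pi _1(\omega )))\). A minimal sketch of how \(h_1\) could be evaluated by successive substitution follows; the parameter values are assumptions, and the exponential case is used only because its closed form allows a check.

```python
import math

def busy_period_lst(w, lam, beta, tol=1e-13, max_iter=100_000):
    """Solve pi = beta(w + lam * (1 - pi)) by successive substitution from 0."""
    pi = 0.0
    for _ in range(max_iter):
        new = beta(w + lam * (1.0 - pi))
        if abs(new - pi) < tol:
            return new
        pi = new
    raise RuntimeError("fixed-point iteration did not converge")

def busy_period_lst_mm1(w, lam, mu):
    """Closed form for exponential service with rate mu (M/M/1)."""
    a = lam + mu + w
    return (a - math.sqrt(a * a - 4.0 * lam * mu)) / (2.0 * lam)

# Illustrative values (assumptions): lambda_1 = 0.4, mu_1 = 1, lambda_2 = 0.3.
lam1, mu1, lam2 = 0.4, 1.0, 0.3

def beta1(w):
    """Service-time LST at Q_1 (exponential, rate mu1)."""
    return mu1 / (mu1 + w)

def h1(z2):
    """h_1(z_1, z_2) = pi_1(lam2 * (1 - z2)); it does not depend on z_1."""
    return busy_period_lst(lam2 * (1.0 - z2), lam1, beta1)
```

The independence of `h1` from \(z_1\) is visible in its signature, which is exactly the feature that makes (37) solvable by a single substitution.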

It is perhaps not that surprising that the two-queue exhaustive/1-limited model is easy to analyze; in the case of zero switchover times, it reduces to a classical queueing model with two customer classes and non-preemptive priority for class 1. It is surprising, though, that the two-queue gated/1-limited model has not succumbed to an exact analysis; in Boon et al. (2011), it is suggested that determination of \(V_{b_1}(z_1,z_2)\) for that model might be accomplished by solving a so-called boundary value problem of a complicated type.

Several two-queue polling models have been solved via a formulation as a boundary value problem; we now turn to this line of research.

Eisenberg (1979) studies a two-queue polling model with 1-limited service at both queues and without switchover times. He transforms the problem of determining \(V_{b_1}(z_1,z_2)\) into the problem of solving a singular integral equation (a complex Fredholm integral equation of the second kind). As the author indicates, due to the difficult nature of the mathematics, some steps in the solution remain to be proven. In Cohen and Boxma (1981), a different approach for this same model is given. Below, we sketch that approach, for the more general case of non-zero switchover times (cf. Boxma and Groenendijk 1988). Starting point in Boxma and Groenendijk (1988) again is the functional equation for \(V_{b_i}(z_1,z_2)\):

$$\begin{aligned} K(z_1,z_2) V_{b_1}(z_1,z_2)= & {} V_{b_1}(0,z_2)[\beta _2(z_1,z_2) \sigma _1(z_1,z_2) \sigma _2(z_1,z_2) (z_1 - \beta _1(z_1,z_2))] \nonumber \\&+ V_{b_2}(z_1,0) [z_1 \sigma _2(z_1,z_2) (z_2 - \beta _2(z_1,z_2))], \end{aligned}$$
(38)

with \(K(z_1,z_2)\) the kernel of the functional equation, defined as

$$\begin{aligned} K(z_1,z_2) := z_1 z_2 - \beta _1(z_1,z_2) \beta _2(z_1,z_2) \sigma _1(z_1,z_2) \sigma _2(z_1,z_2) . \end{aligned}$$
(39)

The appearance of the functions \(V_{b_1}(0,z_2)\) and \(V_{b_2}(z_1,0)\) corresponds to a server arriving at an empty queue. Once they have been obtained, \(V_{b_1}(z_1,z_2)\) is also known. The key in the analysis in Boxma and Groenendijk (1988) is that, according to its definition as a probability generating function, \(V_{b_1}(z_1,z_2)\) should be analytic inside the product of unit circles \(|z_1|<1\), \(|z_2|<1\). Hence, every zero of \(K(z_1,z_2)\) in that region should also be a zero of the right-hand side of (38). The ensuing relation between \(V_{b_1}(0,z_2)\) and \(V_{b_2}(z_1,0)\) is thus translated into a Riemann boundary value problem—a problem in which two functions are related on a closed contour, while one function is analytic inside that contour (and continuous up to the boundary) and the other function is analytic outside that contour (and continuous up to the boundary). By solving such a Riemann problem, \(V_{b_1}(0,z_2)\) and \(V_{b_2}(z_1,0)\) are obtained. In Cohen and Boxma (1981), for the case of zero switchover times, a similar approach was followed, resulting in a (somewhat simpler) Dirichlet boundary value problem.

Cohen (1987) studies a two-queue polling model with semi-exhaustive (also called decrementing) service: the server stays in a non-empty queue until the number of customers present has become one smaller than the number found upon its arrival to the queue. The joint queue-length distribution at visit completion epochs is obtained by formulating and solving a Riemann boundary value problem.

Several studies consider two-queue polling models with Bernoulli service. Under this service discipline, if both queues are non-empty and the server is at \(Q_j\), a customer from \(Q_j\) is served with probability \(p_j\) and a customer from the other queue is served with probability \(1-p_j\). The case with \(p_1=1\) and \(0 \le p_2 < 1\) was solved by Weststrate and van der Mei (1994) via an iterative process. The case that \(p_1\), too, is less than one is harder. Both Lee (1997, zero switchover times) and Feng et al. (1998, non-zero switchover times) treat this model using boundary value techniques. Lee formulates a Riemann boundary value problem with a shift, and translates it to a Fredholm integral equation which he solves. Feng et al. (1998) also formulate and solve a Riemann boundary value problem.

Finally, we would like to observe that it seems unlikely that an exact analysis will be provided for an n-queue polling model, with \(n > 2\), in which none of the queues has a branching-type service discipline. This belief is based on the lack of a boundary value approach in dimensions higher than two. Analytic–numerical approaches like the power-series algorithm could be used in such cases, see Blanc (1991).

6 The input process

The polling literature focuses almost exclusively on the case of customers arriving according to independent Poisson processes, the service requirements at the various queues, moreover, being independent sequences; the resulting input processes hence are independent compound Poisson processes. In this section, we consider some generalizations of these assumptions.

(i) BMAP arrivals. Saffer and Telek (2010) consider a polling model with either exhaustive or gated service, in which the arrival processes at the n queues are independent Batch Markovian Arrival Processes (BMAP). They develop a generalization of the so-called buffer occupancy method, a classical method for analyzing queue lengths in polling systems, first presented by Cooper and Murray (1969).

(ii) Renewal arrivals. Bertsimas and Mourtzinou (2009) consider a polling model with independent renewal arrival processes at the various queues. For the case of gated service at all queues, they derive expressions for the mean delays in heavy traffic, expressing these in cycle time variances which can be obtained by solving a system of \(n \times n\) equations. Van der Mei and Winands (2008) build upon their result, allowing general switchover times and providing closed-form expressions for scaled mean delays in heavy traffic. Boon et al. (2014) combine light- and heavy-traffic approximations, via interpolation, to come up with accurate mean waiting-time approximations for polling systems with both gated and exhaustive service.

Another type of approximation is provided in a few papers of Tran-Gia, see in particular Tran-Gia (1992). He presents a discrete-time analysis of polling systems with finite buffers, 1-limited service, and general renewal input. His method is based on the use of efficient discrete convolution operations, using fast convolution algorithms like the Fast Fourier transform.
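The basic building block of such discrete-time methods, convolving two probability mass functions via the fast Fourier transform, can be sketched as follows. This is a generic textbook illustration in plain Python and is not claimed to be the algorithm of Tran-Gia (1992); all names are hypothetical.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1.0 if invert else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def convolve_fft(p, q):
    """PMF of the sum of two independent nonnegative integer random variables."""
    m = len(p) + len(q) - 1
    size = 1
    while size < m:
        size *= 2
    fp = fft([complex(x) for x in p] + [0j] * (size - len(p)))
    fq = fft([complex(x) for x in q] + [0j] * (size - len(q)))
    res = fft([x * y for x, y in zip(fp, fq)], invert=True)
    # The inverse transform above is unnormalized, hence the division by size.
    return [(x / size).real for x in res[:m]]

def convolve_direct(p, q):
    """Quadratic-time reference implementation used for comparison."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out
```

For short vectors the direct double loop is just as fast; the transform-based version pays off when the supports contain many points, as is typical for discretized service and cycle times.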

(iii) Correlated arrivals. Levy and Sidi (1989) study a polling system with correlated Poisson arrival streams. They consider gated and exhaustive service, and derive linear equations, whose solution yields the mean delays. They also derive a pseudo-conservation law for the mean delays. They extend their analysis in Levy and Sidi (1991) to the case of Poisson arrivals of customer batches with correlated numbers \((K_1,\ldots ,K_n)\), destined for queues \(Q_1,\ldots ,Q_n\). A workload decomposition and general pseudo-conservation law for a polling model with such a batch Poisson arrival process is presented in Boxma (1989). Van der Gaast et al. (2017) derive the sojourn time LST of a batch, for exhaustive, gated and Globally gated service; a batch here may contain customers of various queues.

(iv) Lévy input. Recently there has been a growing interest in queueing models with as input a Lévy process (‘Lévy-driven queues’, see Debicki and Mandjes 2015). Lévy processes are processes with stationary, independent increments. Compound Poisson processes, Brownian motion and linear increment processes are some special cases. The generalization from a compound Poisson input (as in an M / G / 1 queue) to a Lévy input implies that one can no longer speak of customers and queue lengths; the focus naturally shifts to workloads. There is hardly any literature on Lévy-driven polling systems. A pioneering paper is due to Eliazar (2005a), who studies Lévy-driven polling systems under the gated discipline, using a dynamical systems approach. Czerniack and Yechiali (2009) consider fluid input at all queues, which may be seen as a special case of Lévy input. In Boxma et al. (2011) a very general arrival process is allowed: the input process is an n-dimensional Lévy subordinator (i.e., non-decreasing sample paths, which is of course natural for an input process). Correlations between the inputs at the various queues are allowed. Moreover, the input process may change at polling and switchover instants, implying that one can have different input processes at different server positions. The transition from compound Poisson process to Lévy subordinator implies that one no longer has the branching Property 4.1, which is stated in terms of numbers of customers. Boxma et al. (2011) identify the analogous branching property in a continuous state space setting, that allows describing the multi-dimensional workload at successive polling instants at a fixed queue as a multi-type continuous state space, discrete-time, branching process. This is referred to as a multi-type Jirina branching process (Jirina 1958; MTJBP). The class of service disciplines that satisfy the new branching property is rich, and contains the exhaustive and gated disciplines. 
Altman and Fiems (2007) had also observed the relation between Lévy-driven polling models and MTJBPs, in a special case in which all the queues are fed by identical Lévy subordinators. Employing the Kella–Whitt martingale, the LST of the joint steady-state workload distribution at an arbitrary epoch is also obtained in Boxma et al. (2011). Martingales are also the main tool in proving a workload decomposition result for a polling system with multi-dimensional Lévy input (Boxma and Kella 2014).

7 Scheduling

Until 10 years ago, very few papers diverged from the FCFS assumption for service within a queue of a polling system. In this section, we pay attention to two lines of research which deviate from the FCFS convention: (i) polling systems with multiple classes of customers per queue, and fixed priorities, and (ii) polling systems in which there is only one class of customers per queue, but with a service discipline within a queue that is not FCFS but, e.g., Last-Come First-Served (LCFS), processor sharing, Random Order of Service (ROS), or Shortest Job First (SJF).

(i) Multiple customer classes with fixed priorities. Shimogawa and Takahashi (1992) derive a PCL for a polling system with fixed priorities within queues, and Fournier and Rosberg (1991) consider polling systems with local priorities and with global priorities (where the server moves to the next queue if some queue has a customer of higher priority than the ones in the presently visited queue). They develop a PCL for both model variants.

While most polling + priority studies originated from a computer-communication background, polling systems with multiple customer classes and fixed priorities also arise naturally in the Stochastic Economic Lot Scheduling Problem (SELSP), where multiple types of products have to be produced on a single machine with significant setup times. In the SELSP, orders for the same product type are being placed by customers of different priority levels, giving rise to polling models with not only several queues (corresponding to orders for the various product types) but also several customer classes per queue, see Winands (2007). This formed one of the motivations for a series of papers of Boon and Adan (2009), Boon et al. (2010a, b). They analyze the joint queue-length distribution for polling models which are of type PS, except for the additional assumption that within each queue there are several classes of customers with fixed priorities. That analysis relies on a relation to multi-type branching processes, cf. Sect. 4. They also determine the waiting-time distributions of the various classes of customers. This is done for exhaustive, gated and Globally gated service. A key step of the approach is to determine the joint distribution of the past and residual cycle time at the arrival epoch of the tagged customer. For gated service, the waiting time of a customer of priority level k in \(Q_i\) consists of that residual cycle time, plus the services of higher priority customers arriving during the cycle, plus the services of customers of equal priority arriving during the past part of the cycle. For exhaustive service, the procedure is somewhat similar, with a slightly different definition of the cycle time: for gated service, a cycle for \(Q_i\) starts at the beginning of a visit to \(Q_i\), whereas for exhaustive service it turns out to be convenient to let the cycle start at the completion of a visit to \(Q_i\).

(ii) One customer class per queue; non-FCFS service. There are quite a few real-world examples of polling situations in which non-FCFS scheduling might be required. In the computer science community, polling models are used to study the Bluetooth and 802.11 protocols, and scheduling policies at routers and I/O subsystems in web servers. The high workload variability in many of these settings makes non-FCFS scheduling appealing, see Wierman et al. (2007). In Wierman et al. (2007) it is argued that the lack of research on scheduling in polling systems is not due to a lack of applications, but rather due to the beliefs that the impact of within-queue scheduling will be small, and that the ensuing mathematical analysis will be very hard. Using the Mean Value Analysis (MVA) framework that was developed for polling systems in Winands et al. (2006), Wierman et al. (2007) determine mean response (= sojourn) times in polling systems with exhaustive or gated service for a wide array of service disciplines: LCFS, Processor Sharing, SJF and Shortest Remaining Processing Time First (SRPT). It turns out that, while varying the scheduling strategy at queues with gated service does not have a major effect, it does strongly affect mean delays in the case of exhaustive service. This holds in particular for SRPT, just as in an ordinary M / G / 1 queue. The reason that the effect is particularly pronounced for exhaustive service is that small jobs which arrive during a visit of their queue take advantage of preemption and thus have very small delays.

The above analysis is extended to sojourn time distributions in Boxma et al. (2009). The approach globally consists of the following steps: (i) determine the joint queue-length distribution at server visit epochs to a queue (restricting attention to polling models which satisfy the branching property); (ii) determine the LST of the cycle time distribution for some queue \(Q_i\); (iii) use this to determine the joint LST of the past and residual part of that cycle time, at the arrival epoch of a customer at \(Q_i\); (iv) for various service disciplines at \(Q_i\), and now focusing on gated and Globally gated, careful book-keeping yields the sojourn time LST at \(Q_i\). The analysis for exhaustive service seems more complicated; in Ayesta et al. (2012) the sojourn time LST is obtained for the case of an M / M / 1 processor-sharing queue in a polling system, under the constraint that all other queues also satisfy the branching property. See also Kim and Kim (2017) for the case of phase-type service at the processor-sharing queue.

8 Asymptotics

In this section, we consider two kinds of asymptotics: many-queue asymptotics and heavy-traffic asymptotics.

8.1 Many-queue asymptotics

Asymptotic regimes where the number of queues in a polling system grows large have received little attention so far. A few authors have studied the case in which the switchover times between successive queues go to zero when the number of queues grows large. In the limit, the polling system then behaves as a “continuous” spatial system with a single server which moves at constant speed along a circle, stopping to perform services when it encounters customers. These customers arrive uniformly on the circle, according to a Poisson process. Initial studies of such a continuous polling system were provided in Coffman and Gilbert (1986) and Fuhrmann and Cooper (1985b). Their model is generalized by Kroese and Schmidt (1992) via an approach that makes use of random measure theory and stochastic integration theory, and which thus also provides a rigorous mathematical basis for the study of continuous polling models.

An interesting model generalization is also proposed by Eliazar (2005b). He considers a polling system with gated service and n queues, with a Lévy input process and general interdependent switchover times. Letting \(n \rightarrow \infty \), he proves convergence in law to a limiting polling system on the circle. His proof is based on an asymptotic analysis of stochastic Poincaré maps. The obtained limit is identified as a so-called snowplowing system on the circle (a snowplower cycling along a track, clearing off snow while moving; cf. Knuth 1973, pp. 254–255 and 259–264).

Motivated by applications in ferry-assisted wireless local-area networks, Kavitha and Altman have studied several continuous polling variants, see, for instance, Kavitha and Altman (2012), in which nonclassical service disciplines are considered, and in which the continuous polling system is analyzed by discretizing the system in such a way that known pseudo-conservation laws (cf. Sect. 3.3) can be applied. Their results rely heavily on fixed-point analysis of infinite-dimensional operators.

Kroese and Schmidt (1994) consider a greedy service policy: after completion of a service, the server always moves in the direction of the nearest customer. The stability condition for this system and several interesting open problems are discussed in Rojas Nandayapa et al. (2011). Those open problems concern stability issues as well as characterization of the random measure describing the steady-state customer positions. This is done in a more general setting than polling on a circle; customers may arrive in some space, and are served by one or more servers roaming that space. We refer to Rojas Nandayapa et al. (2011) for an extensive set of references on continuous polling.

In Meyfroyt et al. (2018), another type of scaling with a large number of queues is studied. Motivated by token passing algorithms for communication channels with medium access control and a large number of nodes, Meyfroyt et al. (2018) consider the following scenario: the number of queues grows large, while the total arrival rate is kept fixed and the individual switchover time and service time distributions remain the same. This asymptotic regime leads to cycles of infinite length and queue lengths with non-trivial distributions. Explicit pre-limit expressions are derived for the covariance of queue lengths, the covariance of visit times and the variance of the cycle time for symmetric polling systems in which the server uses a branching-type discipline. This leads to explicit expressions for \(\lim _{n \rightarrow \infty } {E}[\mathbf{C}/n]\) and \(\lim _{n \rightarrow \infty } n \mathrm{Var}(\mathbf{C}/n)\). Those results reveal that, since \(\mathrm{Var}(\mathbf{C}/n)\) is of order 1 / n, the scaled cycle time \(\mathbf{C}/n\) converges in probability to a deterministic value. This suggests that the queue lengths at the various nodes become asymptotically independent. In the limit, the individual queues appear to behave as discrete-time bulk service queues. It is suggested in Meyfroyt et al. (2018) that these properties of \(\mathbf{C}/n\) and of the individual queues remain valid for symmetric polling systems with a large number of queues and more general non-idling service disciplines, which are not necessarily of the branching type.
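The step from the order of the variance to convergence in probability can be made explicit with Chebyshev's inequality. As a sketch, write c for the limit of \({E}[\mathbf{C}/n]\) and let K be a constant bounding \(n \mathrm{Var}(\mathbf{C}/n)\); both symbols are introduced here for illustration only. Then, for every \(\varepsilon > 0\),

\[
P\left( \left| \frac{\mathbf{C}}{n} - {E}\left[ \frac{\mathbf{C}}{n}\right] \right| > \varepsilon \right) \le \frac{\mathrm{Var}(\mathbf{C}/n)}{\varepsilon ^2} \le \frac{K}{n \varepsilon ^2} \rightarrow 0 \quad (n \rightarrow \infty ),
\]

and since \({E}[\mathbf{C}/n] \rightarrow c\), it follows that \(\mathbf{C}/n \rightarrow c\) in probability.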

8.2 Heavy-traffic asymptotics

Pioneering papers regarding the heavy-traffic behavior of polling systems were written by Coffman et al. (1995, 1998). In Coffman et al. (1995), the focus is on a two-queue polling model with renewal arrival processes and exhaustive service at both queues, and with zero switchover times. The authors first apply standard heavy-traffic assumptions and scalings; they let \(\sqrt{m}(1-\rho )\) approach a constant with m going to infinity, and show that the normalized total workload process \(W(mt)/\sqrt{m}\) weakly converges to reflected Brownian motion (RBM). For this, they can rely on a known G / G / 1 result because of work conservation. They subsequently show that the scaled workloads of individual queues change at a rate that becomes infinite in the limit. They then formulate an averaging principle for individual workloads, in which during one polling cycle, these scaled workloads linearly decrease to zero (during visit periods of the corresponding queue) and linearly increase (during the subsequent intervisit period), while the total scaled workload in the system during such a cycle basically stays the same. Individual workloads change a factor \(\sqrt{m}\) faster than the total workload. In the two-queue case, this means that when the total scaled workload equals x, the scaled workload at an individual queue is uniformly distributed on [0, x]. While in Coffman et al. (1995), a rigorous proof is only provided for the two-queue case with identical service time distributions, the authors convincingly argue that such an averaging principle should also hold in the n-queue case, with not necessarily identical service time distributions.
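The averaging principle described above can be illustrated with a small deterministic fluid sketch. It is not the construction used in Coffman et al. (1995); it merely assumes two queues, exhaustive service, zero switchover times and a critically loaded system (rho1 + rho2 = 1), so that the total workload stays at a level x during one cycle. The function names and parameters are hypothetical.

```python
def fluid_workload_q1(t, x, rho1):
    """Fluid workload at Q1 at time t of a critically loaded two-queue cycle.

    Assumptions (illustrative only): exhaustive service, zero switchover
    times and rho1 + rho2 = 1, so the total workload stays equal to x.
    The cycle starts when the server arrives at Q1, which then holds x.
    """
    rho2 = 1.0 - rho1
    visit1 = x / rho2            # Q1 drains from x to 0 at rate 1 - rho1 = rho2
    cycle = visit1 + x / rho1    # Q1 then refills from 0 to x at rate rho1
    s = t % cycle
    if s < visit1:
        return x - rho2 * s
    return rho1 * (s - visit1)


def fraction_of_cycle_below(level, x, rho1, samples=100_000):
    """Fraction of one cycle during which the Q1 workload is at most `level`."""
    rho2 = 1.0 - rho1
    cycle = x / rho2 + x / rho1
    step = cycle / samples
    hits = sum(
        1
        for k in range(samples)
        if fluid_workload_q1((k + 0.5) * step, x, rho1) <= level
    )
    return hits / samples
```

For x = 2 and rho1 = 0.3, `fraction_of_cycle_below(0.5, 2.0, 0.3)` is close to 0.5 / 2 = 0.25, in line with a uniform distribution on [0, x]: both the linear decrease and the linear increase spend equal time at every workload level.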

Coffman et al. (1998) prove that the averaging principle carries over to the case of non-zero switchover times. Because of those switchover times, they first have to replace the RBM heavy-traffic limit for the total workload by a Bessel-type diffusion limit. Two key elements of their subsequent approach are: (i) they first prove the averaging principle for a so-called threshold queue, a single queue in isolation with a server which only starts serving when the workload exceeds some value T and (ii) they strongly rely on a semi-martingale representation of the workload process, which allows them to use general convergence results for semi-martingales.

The Coffman–Puhalskii–Reiman papers have given rise to several lines of research. Olsen (2001) provides a heuristic refinement of the averaging principle, which improves the accuracy of the resulting approximation for waiting-time distributions under moderate load. In several studies, it is argued, without a rigorous proof, that the averaging principle of Coffman et al. (1995, 1998) holds in far greater generality. We refer to Section 2 of Markowitz et al. (2000) for an excellent discussion of the heavy-traffic averaging principle and further references, here only mentioning the interesting extensions to polling systems in tandem in Reiman and Wein (1999) and to the stochastic economic lot scheduling problem (Markowitz et al. 2000). Olsen and Van der Mei (2005), too, conjecture that the heavy-traffic averaging principle holds in considerable generality, and apply it to polling models with renewal arrivals, exhaustive or gated service at the queues, and service according to a polling table. They also use their heavy-traffic limiting result to provide accurate approximations for waiting-time distributions under moderate to heavy load. A similar approach is followed in Boon et al. (2016) for a network with a single roving server, leading to a heavy-traffic limiting result for the distribution of the total sojourn time of a customer in the network when following a specific path. In combination with a novel light-traffic approximation, this yields an approximation for the mean total sojourn time along a specific path, which is highly accurate for a wide range of traffic loads. Jennings (2010) uses a new technique to prove the validity of a heavy-traffic averaging principle for a vector of weighted queue lengths in a polling system with zero switchover times, and with a certain parameterized set of gated and exhaustive service disciplines. Each queue length is weighted by its mean processing time.

Finally, we mention three results of a different type. First, Van der Mei (2007) develops a heavy-traffic approach which is quite different from the one in Coffman et al. (1995, 1998). He restricts himself to branching-type polling systems, and then exploits Theorem 4 of Quine (1972) for multi-type Galton–Watson branching processes, in which the maximal eigenvalue of the so-called mean matrix (of numbers of descendants) approaches the critical value 1. Using Resing’s relation (Resing 1993) between the numbers of particles in multi-type branching processes and the numbers of customers in the various queues at server polling epochs, he is able to obtain the heavy-traffic limiting behavior of the queue lengths. See also Abidini et al. (2017b) for a related heavy-traffic result for a polling model with retrials and so-called glue periods that models the dynamics of optical switches; in that paper, heavy-traffic asymptotics for the joint queue-length process are derived. Interestingly, Kroese (1997) provides a heavy-traffic analysis of a continuous polling system on the circle (cf. Sect. 8.1), by exploiting the relationship between such systems and age-dependent branching processes.

Second, Boon and Winands (2014) consider a two-queue polling system with zero switchover times and \(k_i\)-limited service at \(Q_i\), \(i=1,2\), under Markovian assumptions. Applying a singular perturbation technique, they derive the heavy-traffic behavior of the joint queue-length vector. The queue length of the critically loaded queue (\(Q_2\)) appears to be exponentially distributed after an appropriate scaling, whereas the queue length of \(Q_1\) is distributed as that of a queue in isolation with Erlang-\(k_2\) distributed vacations. This reveals a heavy-traffic behavior that is quite different from the heavy-traffic behavior of the branching-type polling models studied in the papers mentioned above.

Third, Bekker et al. (2015) consider polling models with the gated or globally gated service policy and several non-FCFS service disciplines. They derive asymptotic closed-form expressions for the LST of scaled (by a factor \(1-\rho \)) waiting times and sojourn times in heavy traffic. For FCFS, it was already known that the scaled sojourn times are of the form \(\mathbf{U} {\varvec{\Gamma }}\), with \(\mathbf{U}\) and \({\varvec{\Gamma }}\) independent, \(\mathbf{U}\) uniformly distributed and \({\varvec{\Gamma }}\) Gamma distributed. In Bekker et al. (2015), this result is also shown to hold for LCFS, while one has \(\tilde{\mathbf{U}} {\varvec{\Gamma }}\) for ROS with \(\tilde{\mathbf{U}}\) having a trapezoidal distribution; for processor sharing and SJF one gets \(\tilde{\mathbf{U}}^* {\varvec{\Gamma }}\), with \(\tilde{\mathbf{U}}^*\) having a generalized trapezoidal distribution. These results lead to accurate waiting- and sojourn time approximations. Vis et al. (2015) consider the same heavy-traffic problem for the case of exhaustive service at all queues. In that case, the scaled sojourn times are of the form \({\varvec{\Theta }}{\varvec{\Gamma }}\), where \({\varvec{\Theta }}\) is related to a uniformly distributed random variable.

9 Some miscellaneous topics

In this section, we discuss some miscellaneous topics, which did not fit in the framework of the previous sections: (i) multiple-server polling systems; (ii) disciplines with service limits; (iii) optimization of polling systems; and (iv) queue-length-dependent server behavior. Unfortunately, we could not cover some interesting topics like the concept of the dormant server which stays at a queue when the system has become empty; the concept of the smart customer whose arrival rate is determined by the server location; and the concept of fairness. The latter concept may deserve more attention than it has so far received in the literature (Shapira and Levy 2015; Van Wijk et al. 2012), because it is closely related to the important question of which queue to serve next, and which service discipline to use at a queue.

9.1 Multiple-server polling systems

As reflected in Assumption 1, in Sect. 2, we have focused on polling systems with a single server. Although multiple-server polling systems find a wide range of applications, they have received relatively limited attention, and hardly any exact distributional results are available, since the combination of several queues and multiple servers yields highly complex behavior.

In multiple-server models, it is useful to distinguish between two scenarios with either coupled servers which always visit the various queues as a group or independent servers which each visit the queues individually. Browne and Weiss (1992) establish index-type rules for determining the visit order that myopically minimizes the expected length of individual cycles in systems with coupled servers and exhaustive or gated service. Browne et al. (1992) and Browne and Kella (1995) analyze two-queue models with an infinite number of coupled servers and deterministic service times at one of the two queues. Vlasiou and Yechiali (2008) analyze polling systems with an infinite number of coupled servers and random visit durations. Borst (1995) explores the class of models with multiple coupled servers that satisfy a slight generalization of branching Property 4.1, and allow an exact analysis of the joint queue-length distribution at polling epochs, the marginal queue-length distributions, and the waiting-time distributions.

Models with multiple independent servers arise in scenarios where several queues may be served concurrently, as is commonly the case in a wide range of applications, e.g., token ring and optical communication networks, elevator systems, and signalized traffic intersections. In a pioneering paper, Morris and Wang (1984) derive the mean cycle time of each server and the mean intervisit time to a queue, and present approximate expressions for the mean sojourn time for both a gated-type and a limited-type service discipline. An interesting phenomenon observed in Morris and Wang (1984) is that multiple independent servers have a tendency to cluster if they follow identical routes, especially in high load conditions. This phenomenon is somewhat reminiscent of the tendency for city buses and trams to bunch together on heavily traveled routes, and may be visualized and understood as follows. A server that is behind will tend to move fast, as it only encounters recently served queues, whereas a server at the front will tend to be slowed down by queues that have not been served for a while. Thus, the servers tend to form bunches while constantly leapfrogging over one another. Obviously, the bunching of servers is alleviated if they follow different routes, and Morris and Wang (1984), therefore, advocate the use of ‘dispersive’ policies to improve the system performance. Gamse and Newell (1982a, b) use multiple-server polling models to study elevator operations, where similar bunching behavior can occur.

Borst and Van der Mei (1998) provide mean waiting-time approximations which exploit pseudo-conservation-like concepts (which had proven to be valuable in the single-server case, cf. Boxma and Meister 1987a, b) and explicitly account for the tendency of the servers to cluster as a function of their visit orders. Van der Mei and Borst (1997) demonstrate how performance metrics in polling systems with multiple independent servers may be calculated using the so-called power-series algorithm.

In recent papers, Antunes et al. (2010) and Robert and Roberts (2010) propose mean-field approximations for the capacity of multiple-server polling systems with a large number of queues and a limit on the number of servers that can visit a queue simultaneously, motivated by applications in passive optical networks. Finally, it is worth observing that the analysis of optimal dynamic routing policies and service disciplines for polling systems with multiple independent servers is closely related to that of selecting an optimal service vector in ‘switched’ networks with several potential schedules and reconfiguration delays as considered in Armony and Bambos (2003), Brzezinski and Modiano (2005), Celik et al. (2016), Hung and Chang (2008), and Wang and Javidi (2017).

9.2 Disciplines with service limits

Disciplines with service limits, as described in Sect. 2, are commonly adopted in practice to regulate the amount of service provided to each of the queues during a visit. Such service limits can either be specified in terms of the maximum number of customers served during a visit or the maximum time duration of a visit, and can be leveraged to bound the cycle time. Moreover, these limits provide a mechanism to achieve prioritization, by assigning different service limits to different queues, according to their relative importance.
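How a customer-count limit bounds the cycle time, and what it costs in capacity, can be sketched as follows. The sketch assumes cyclic polling with k_i-limited service; all function and parameter names are hypothetical. A cycle contains at most k_i services at queue i plus the switchover times, which gives the bound on the mean cycle time. The capacity check uses a back-of-the-envelope argument: in a stable system the mean cycle time equals s / (1 - rho), with s the total mean switchover time and rho the total load, and the mean number of arrivals at queue i per cycle must stay below k_i.

```python
def mean_cycle_time_bound(mean_switchover_times, mean_service_times, service_limits):
    """Upper bound on the mean cycle time under k_i-limited service.

    Each cycle has at most k_i services at queue i, plus all switchovers.
    """
    total_switchover = sum(mean_switchover_times)
    max_service = sum(k * b for k, b in zip(service_limits, mean_service_times))
    return total_switchover + max_service


def k_limited_capacity_check(arrival_rates, mean_service_times,
                             mean_switchover_times, service_limits):
    """Check rho + lambda_i * s / k_i < 1 for every queue i.

    This follows from lambda_i * s / (1 - rho) < k_i, i.e., fewer than k_i
    arrivals at queue i per mean cycle (illustrative derivation, see lead-in).
    """
    rho = sum(lam * b for lam, b in zip(arrival_rates, mean_service_times))
    s = sum(mean_switchover_times)
    return all(rho + lam * s / k < 1 for lam, k in zip(arrival_rates, service_limits))
```

For example, with arrival rates (0.4, 0.1), mean service times (1, 2) and mean switchover times (1, 1), limits (1, 1) fail the check while limits (3, 1) pass it, at the price of a larger cycle-time bound.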

Although these disciplines are widely implemented, they are difficult to analyze and hence it is not well understood how to set the service limits so as to achieve target performance levels. Note that k-limited service disciplines do not satisfy Property 4.1 and exact results are only available for special cases, such as completely symmetric systems with 1-limited service and a few two-queue scenarios, as discussed in Sect. 5. Polling systems with time-limited service have not yielded to an exact analysis in any degree of generality either. Coffman, Fayolle and Mitrani (Coffman et al. 1988) and De Haan, Boucherie and Van Ommeren in a series of papers (see, e.g., De Haan et al. 2009) present interesting results for exponentially distributed time limits. Leung (1991) develops a numerical technique for analyzing systems with a probabilistically limited service discipline.

The fact that disciplines with service limits are widely deployed, yet extremely hard to analyze, has considerably added to the importance of the pseudo-conservation laws discussed in Sect. 3 in constructing and validating waiting-time approximations. Boxma and Meister (1987b) use the pseudo-conservation law to derive waiting-time approximations for 1-limited service. 1990 (b) presents a more refined procedure to compute such approximations. For the general case of k-limited service, the pseudo-conservation law still contains an unknown term. Fuhrmann and Wang (1988) obtain waiting-time approximations for k-limited service by bounding that term. Everitt (1986, 1989) derives such approximations by approximating that term. Chang and Sandhu (1992) present a more refined procedure to calculate waiting-time approximations for k-limited service.

9.3 Optimization of polling systems

Optimization in polling systems is a multi-faceted problem which has been actively pursued, though it remains somewhat under-explored compared to the analysis of polling systems. We refer to Boxma (1991) (static optimization) and Yechiali (1991) (semi-dynamic optimization) for surveys, and here only highlight a few of the main issues.

In the optimization of polling systems, there are two key factors that play a role: first, what is the performance metric to be optimized, and second, what is the class of feasible scheduling policies. As for the first factor, a commonly adopted optimization criterion is a weighted sum of the mean waiting times at the various queues. Concerning the second factor, usually the class of feasible scheduling policies consists of a family of strategies of similar structure that differ by some (vectorial) parameter. Typical examples include a routing vector (routing probabilities, or polling table), a service vector (service probabilities, or service limits), or a routing vector and a service vector simultaneously, which we now briefly discuss in succession.

Optimization of the routing policy for a given service policy

A considerable research effort has been devoted to static optimization, i.e., optimization of static routing policies, in which the routing decisions are made independently of the state of the system. Boxma et al. (1990) consider a system with random polling, and either exhaustive or gated service at each of the queues. They address the problem of finding the routing probabilities that minimize \({\sum \nolimits _{i = 1}^{n}} \rho _i {E}\mathbf{W}_i\), the latter quantity being explicitly known from the pseudo-conservation law for random polling, cf. Boxma and Weststrate (1989). They subsequently use this to determine a polling table that minimizes \({\sum \nolimits _{i = 1}^{n}} \rho _i {E}\mathbf{W}_i\), or, more generally (Boxma et al. 1993), to determine a polling table that minimizes a weighted sum of the mean waiting times, the latter quantity now being approximated in terms of the occurrence ratios of the queues in the polling table. Kruskal (1969) studies a similar problem with deterministic arrival, service, and switchover processes. In all cases, the optimal visit ratios are given by surprisingly simple square-root formulas.
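To see why square-root formulas arise in such problems, consider the following stylized version; the coefficients \(a_i, b_i > 0\), the visit ratios \(f_i\) and the multiplier \(\theta \) are illustrative symbols, and the objective is not the exact one of the cited papers. Minimizing \(\sum \nolimits _{i=1}^{n} a_i / f_i\) subject to \(\sum \nolimits _{i=1}^{n} b_i f_i = 1\) with a Lagrange multiplier gives

\[
-\frac{a_i}{f_i^2} + \theta b_i = 0 \quad \Longrightarrow \quad f_i = \frac{\sqrt{a_i / b_i}}{\sum \nolimits _{j=1}^{n} \sqrt{a_j b_j}}, \qquad i = 1, \ldots , n,
\]

where the normalization follows from substituting \(f_i = \sqrt{a_i / (\theta b_i)}\) into the constraint, which yields \(\sqrt{\theta } = \sum \nolimits _{j=1}^{n} \sqrt{a_j b_j}\). Any cost that is (approximately) a sum of terms inversely proportional to the visit ratios therefore leads to visit ratios proportional to square roots of the problem parameters.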

Also, a considerable amount of research effort has been devoted to semi-dynamic optimization, i.e., optimization of semi-dynamic routing policies, in which periodically the visit order for some future period is determined, based on some partial knowledge of the state of the system; see for instance Browne and Yechiali (1989) and Fabian and Levy (1994).

Optimization of the service policy for a given routing policy

Borst et al. (1995) consider a system with a k-limited service strategy at each of the queues, and address the problem of determining the vector of service limits \((k_1, \ldots , k_n)\) that minimizes a weighted sum of the mean waiting times. Blanc and Van der Mei (1995) study a similar optimization problem in a system with a Bernoulli service strategy at each of the queues.

Simultaneous optimization of routing policy and service policy

Borst et al. (1994) consider a system operated with a fixed time polling (ftp) scheme. An ftp scheme specifies which queue should be visited at what time, i.e., it specifies not only the order of the visits, but also the starting times of the visits. The authors address the problem of constructing an ftp scheme that minimizes a weighted sum of the mean waiting times.

For a model with zero switchover times, the optimal (non-preemptive) polling policy is known to be given by the \(c \mu \)-rule, cf. Meilijson and Yechiali (1977), and Buyukkoc et al. (1985). For a symmetric two-queue model with non-zero switchover times, Hofri and Ross (1987) show that the policy that minimizes the sum of discounted switchover times and the holding cost is exhaustive service in a non-empty system, and is of threshold type for switching from an empty queue to another. For an asymmetric two-queue model with switchover costs rather than switchover times, Koole (1998) shows that the policy that minimizes the sum of discounted switching cost and holding cost is not a threshold policy, but that the best threshold policy performs nearly as well as the optimal policy. See the next subsection for some further threshold policies.
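The \(c \mu \)-rule mentioned above for zero switchover times amounts to a static priority ordering: serve the non-empty queue with the largest product of holding cost rate and service rate. A minimal sketch, with hypothetical function and parameter names:

```python
def c_mu_priority_order(holding_costs, service_rates):
    """Return queue indices in decreasing order of c_i * mu_i.

    holding_costs[i] is the holding cost rate c_i per customer at queue i,
    service_rates[i] is the service rate mu_i (1 / mean service time).
    Ties keep their original index order.
    """
    if len(holding_costs) != len(service_rates):
        raise ValueError("need one holding cost and one service rate per queue")
    return sorted(
        range(len(holding_costs)),
        key=lambda i: holding_costs[i] * service_rates[i],
        reverse=True,
    )


def next_queue_to_serve(queue_lengths, holding_costs, service_rates):
    """Pick the highest-priority non-empty queue, or None if all are empty."""
    for i in c_mu_priority_order(holding_costs, service_rates):
        if queue_lengths[i] > 0:
            return i
    return None
```

With costs (1, 3, 2) and service rates (2, 1, 2.5), the products are 2, 3 and 5, so the order is queue 2, then queue 1, then queue 0 (zero-based indices).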

9.4 Queue-length-dependent server behavior

We briefly mention some studies which are devoted to the exact analysis of two-queue polling models with threshold switching. Lee and Sengupta (1993) consider a two-queue system without switchover times. If there are more than L customers at \(Q_1\) after a customer departure, then the server next serves a type-1 customer. If there are at most L customers at \(Q_1\) after a customer departure, then the server checks the type of the last served customer, and serves a customer of the other type (if there is one). Boxma et al. (1995) study a two-queue model with exponential service times and exhaustive service at \(Q_1\). Service at \(Q_2\) is also exhaustive, unless the size of \(Q_1\) reaches a threshold T during a service at \(Q_2\); in the latter case the server switches to \(Q_1\), either preemptively or at the end of the service. The same model, but with general service time distributions and without preemption, is considered in Boxma and Down (1997); that paper also contains a simple, yet accurate, mean queue-length approximation which is suitable for optimization purposes. The two-queue model with general service time distributions in Feng et al. (2001) has two thresholds M and \(N>M\) in \(Q_2\). After a service completion in \(Q_1\) that leaves \(Q_1\) non-empty, the server still moves to \(Q_2\) if its queue length exceeds N. After a service completion in \(Q_2\) that leaves \(Q_2\) with at most M customers, while \(Q_1\) is not empty, the server switches to \(Q_1\); otherwise, it stays at \(Q_2\). The analysis in each of these four papers (Boxma and Down 1997; Boxma et al. 1995; Feng et al. 2001; Lee and Sengupta 1993) focuses on queue-length PGFs, and relies on arguments from complex function theory. The two-queue model of Avrachenkov et al. (2016) also has a threshold-based policy, but the capacities of both queues are finite. They use a matrix-analytic approach, and expose an interesting oscillation phenomenon. We also refer to this paper for further references on threshold switching.
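The switching rule of Lee and Sengupta (1993) described earlier in this subsection can be written as a small decision function. This is a sketch with hypothetical names; the fallback to the same customer type when the other queue is empty is an assumption made here for completeness and is not taken from the paper.

```python
def next_customer_type(q1, q2, last_served, threshold):
    """Type (1 or 2) of the next customer to serve after a departure.

    q1, q2       -- queue lengths at Q1 and Q2 just after the departure
    last_served  -- type (1 or 2) of the customer that just departed
    threshold    -- the threshold L on the length of Q1
    Returns None when both queues are empty.
    """
    if q1 > threshold:
        return 1  # more than L customers at Q1: serve a type-1 customer next
    lengths = {1: q1, 2: q2}
    other = 2 if last_served == 1 else 1
    if lengths[other] > 0:
        return other  # at most L at Q1: alternate with the other type
    if lengths[last_served] > 0:
        return last_served  # assumption: stay with the same type if the other queue is empty
    return None
```

For instance, with L = 4, a departure that leaves five customers at \(Q_1\) is always followed by a type-1 service, whereas one that leaves two customers at \(Q_1\) after a type-1 service is followed by a type-2 service whenever \(Q_2\) is non-empty.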

Remark 9.1

Next to queue-length-dependent server behavior, one could also allow queue-length-dependent customer behavior. In Adan et al. (2016), a two-queue polling model is analyzed in which customers join the shortest queue. The joint queue-length distribution is determined both via the compensation approach and via reduction to a Riemann–Hilbert boundary value problem. Alternatively, one could allow customers to use some form of information about the server position. For example, the arrival rate of customers could depend on whether the server is visiting \(Q_i\), or switching to \(Q_j\). In the case of branching-type service disciplines, one can then still obtain joint queue-length distributions by exploiting properties of multi-type branching processes; a similar statement even holds in the case of Lévy-driven polling models (Boxma et al. 2011).

10 Suggestions for further research

Polling is a quite broad topic, and there are several ways of listing suggestions for further polling studies. One option is to link an open problem to each of the 10 assumptions of Sect. 2. Indeed, it would be interesting to obtain more results for multiple-server polling systems (Assumption 1); to devote more attention to spatial polling models (Assumption 2; see Altman and Foss 1997); to relax the assumption of Poisson arrivals (Assumption 3); to allow the loss of customers (Assumption 4); to consider non-cyclic routing, in particular Markovian routing (Assumption 5; this is, among others, relevant in the setting of random access in wireless communications, see Dorsman et al. 2015; it is interesting to notice (Resing 1993) that the joint queue-length process in Markovian polling models is not a multi-type branching process, even if the service policies at all queues are of branching type); to get a better grip on service policies which are not of branching type (Assumption 6); to obtain more results for polling systems with non-FCFS service order (Assumption 7); to study the effect of large switchover times, and also to allow the possibility that a switchover time is skipped when the corresponding queue is empty (Assumption 8; see Boon et al. 2011); to consider a network of queues with one or more polling servers (Assumption 9; Altman and Yechiali 1994; Armony and Yechiali 1999; Beekhuizen and Resing 2008; Boon and Winands 2014; Van Houdt 2010; Sidi 1992); and to study stability conditions (Foss and Last 1996) but also the transient behavior of polling systems (Assumption 10).

Rather than “polling” these 10 topics in an exhaustive manner, we prefer to focus on what in our opinion are a few particularly relevant directions for further research:

  1. (i)

    Exact results for non-branching models are quite scarce, and exact results for branching-type polling models are typically given in the form of sums of infinite products of generating functions. Hence, there is a strong need for readily applicable expressions, which give useful qualitative insights and reasonably accurate quantitative results, like those provided in Federgruen and Katalan (1994). In particular, there seems to be a need for more asymptotic analysis. Firstly, we need to improve our insight into the heavy-traffic behavior of the class of branching-type polling systems, possibly based on the theory of multi-type branching processes, see Van der Mei (2007), but it is even more important to develop a methodology to study the heavy-traffic behavior of those polling systems in which the branching property is violated. Secondly, large-deviation asymptotics for polling systems deserve attention. Finally, we have only begun to understand the asymptotics for n, the number of queues, growing large. The scaling limits which are developed via asymptotic analysis may subsequently provide the basis for useful approximations, see Bertsimas and Mourtzinou (2009) and Boon et al. (2014).

  2. (ii)

    Relatively few studies have been devoted to the dynamic optimization of polling systems: which queue to serve next, and for how long? From an application perspective, it seems important to develop a methodology, possibly based on Markov decision processes, to tackle such problems systematically, also covering scenarios with simultaneous service of several queues subject to certain constraints.