1 Introduction

In this paper, we consider a server assignment problem. There are K customer classes, and each customer class \(1\le i\le K\) incurs holding costs \(c_i\) per unit time, per customer. There is a single server that can serve class i at rate \(\mu _i\). Arrivals occur according to independent Poisson streams, independently of the service process. Each class i customer abandons the system at rate \(\beta _i\), \(1\le i\le K\), independently of whether it is being served or waiting in the queue. The question we address in this paper is: which service policy minimises the total expected discounted cost and the expected average cost?

In the K-competing queues model without abandonments, it is well known that the \(c \mu \)-rule is optimal. The \(c \mu \)-rule gives full priority to the queue with the highest index \(c_i\mu _i\), that is, the queue that gives the highest cost reduction per unit time. This result was proved in 1985 independently by Baras et al. (1985) and by Buyukkoc et al. (1985).
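
As a small illustration (the parameters below are made up), the rule amounts to sorting the classes by the index \(c_i\mu _i\) and always serving the highest-ranked non-empty queue:

```python
# Illustration of the c*mu-rule with made-up parameters.
c  = [3.0, 2.0, 1.0]   # holding cost per customer per unit time, classes 1..3
mu = [1.0, 2.5, 4.0]   # service rates

# Serve, among the non-empty queues, the class with the largest index c_i * mu_i.
order = sorted(range(len(c)), key=lambda i: -c[i] * mu[i])
print([i + 1 for i in order])          # priority order of the classes: [2, 3, 1]
print([c[i] * mu[i] for i in order])   # the corresponding indices: [5.0, 4.0, 3.0]
```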

Recently, there has been revived interest in the K-competing queues model with the additional feature of customer abandonment due to impatience. In this case the \(c \mu \)-rule is not always optimal. When we model this problem as a continuous-time Markov decision process (MDP), the abandonments make the transition rates unbounded as a function of the state. Hence, uniformisation is not possible and the standard (discrete-time) techniques are not available. In the literature several approaches have been proposed to deal with this difficulty. We may categorise them into three main approaches.

  1.

    Study of a relaxation or an approximate version of the original problem (see e.g. Atar et al. 2010; Ayesta et al. 2011; Larrañaga et al. 2013, 2015). The obtained policies may serve as heuristics.

  2.

    Application of specific coupling techniques to obtain an optimal policy. Typically these papers (see e.g. Salch et al. 2013; Down et al. 2011; see also Ertiningsih et al. 2015) are limited to special cases, as the coupling becomes more tedious in a more general setting. On the other hand, non-Markovian service time distributions and/or a non-Markovian arrival process may be handled.

  3.

    Truncation of the process to make it uniformisable, followed by discrete-time techniques to derive properties of the optimal policy (see e.g. Down et al. 2011; Bhulai et al. 2014; Blok and Spieksma 2015). This is the solution method that we follow in this paper.

The first approach is the most prominent in the literature. Atar et al. (2010) consider a K-competing queues problem with many servers and introduce the \(c \mu /\beta \)-rule, which prioritises the queue with the highest index \(c_i\mu _i/\beta _i\). They show that it is asymptotically optimal to follow the \(c \mu /\beta \)-rule in the overloaded regime, as the number of servers tends to infinity.

Ayesta et al. (2011) study the problem as well. They derive priority rules similar to the \(c \mu /\beta \)-rule by analytically solving the case with one or two customers initially present and without arrivals.

Larrañaga et al. (2013) study a fluid approximation of the multi-server variant of the competing queues problem. In this fluid approximation, the \(c \mu /\beta \)-rule is shown to be optimal in the overloaded regime, and for \(K=2\) a switching curve policy is shown to be optimal in the underloaded regime. In Larrañaga et al. (2015) the same authors study asymptotic optimality of the multi-server competing queues problem for the expected average cost criterion. They consider the problem as a restless multi-armed bandit problem, compute the Whittle index and show that it is asymptotically optimal for convex holding costs. The asymptotics concern large states, and light and heavy traffic regimes. The paper also connects the \(c \mu /\beta \)-rule to the Whittle index for fluid approximations.

Other papers do not focus on heuristics, but try to find a subset of the input parameters for which a strict priority rule can be proven to be optimal. Salch et al. (2013) study the competing queues system with a restriction to a maximum of K arrivals. Customers may become impatient, but they do not leave the system when this happens. Thus, the model is, in fact, a scheduling problem, and the criterion is to minimise the expected weighted number of impatient customers. Using a coupling and an interchange argument, optimality of a priority policy is proved, provided a set of three conditions on the service, impatience and cost rates holds.

The paper of Down et al. (2011) considers a two-competing queues reward system, where the two classes have equal service rates. A coupling argument is employed to show that if type 1 customers have the largest abandonment rate and reward per unit time, then prioritising these customers is optimal.

The approach that we carry out is the following. First, we model the problem as a continuous-time MDP. To make the MDP uniformisable, a truncation is necessary. After uniformisation, the truncated processes can be analysed by value iteration. To justify appropriate convergence of the truncated processes to the original model, a limit theorem is required. To our knowledge, such a theorem is so far available only for the discounted cost criterion, see Blok and Spieksma (2015). Via a vanishing discount approach, the results are transferred to the average cost criterion (see Blok and Spieksma 2017 for the justification). Therefore, we first show for the discounted cost criterion that prioritising type i customers is optimal if type i has maximum index with respect to c, \(c \mu \) and \(c\mu /\beta \). These conditions are similar to those of Salch et al. (2013); however, the conditions of Salch et al. (2013) are implied by ours. Since the resulting index policy is optimal for all small discount factors, even strong Blackwell optimality of this policy follows.

In the paper of Down et al. (2011) a similar approach is used. The limit argument relies on specific properties of the model and on a special truncation that does not affect optimality of the aforementioned priority policy. Due to the involved nature of the truncation, it seems unlikely that the analysis of Down et al. (2011) can be extended to more dimensions or to heterogeneous service rates. The results of our paper can therefore be viewed as an extension of Down et al. (2011). In this paper, we use a different truncation technique, called Smoothed Rate Truncation (SRT). This technique was introduced by Bhulai et al. (2014) and can be utilised to make a process uniformisable while keeping the structural properties intact.

The paper is organised as follows. In Sect. 2 we give a complete description of the model and present the main results. Section 3 contains the core of our analysis: first it describes the Smoothed Rate Truncation in more detail, then the structural properties of the value function are derived. In Sect. 4 we prove the main theorems by invoking the limit theorems of Blok and Spieksma (2017) and Blok and Spieksma (2015). Section 5 presents some numerical examples showing that none of the conditions used is redundant. In the “Appendix”, we provide the proofs of the propositions of Sect. 3.

2 Modelling and main result

2.1 Problem formulation

We consider K stations that are served by a single server. Customers arrive at the stations according to independent Poisson processes with rates \(\lambda _i> 0\), \(i=1,\ldots , K\), respectively. The service requirements of class i customers are exponentially distributed with parameter \(\mu _i>0\). Customers have limited patience: a class i customer is willing to wait an exponentially distributed time with parameter \(\beta _i>0\). We allow abandonment during service as well, resulting in an abandonment rate at station i of \(\beta _i x_i\) if there are \(x_i\) customers present at station i. In Sect. 2.2 we also discuss alternative modelling choices.

The service requirements, abandonments and arrivals are all stochastically independent of each other. Class i customers carry holding costs \(c_i\ge 0\) per unit time, \(i=1,\ldots , K\). The service regime is pre-emptive.

We will study this problem in the framework of Markov decision theory. To this end, let the state space be \({S}={\mathbb {N}}_0^K\). The action space is \({\mathcal {A}}=\{1,\ldots , K\}\), where action \(a=i\) corresponds to assigning the server to station i. Thus, we only allow idling if one or more queues are empty. Let \( {\varPi }=\{\pi :S\rightarrow {\mathcal {A}}\}\) denote the collection of stationary deterministic policies. For \(\pi \in {\varPi }\), a rate matrix \(Q(\pi )\) and cost rate \(c(\pi )\) are given by

$$\begin{aligned} q_{xy}(\pi )&= \left\{ \begin{array}{ll} \mu _j {\mathbf {1}}_{\{\pi (x)=j\}} &{} \text {if } y=x-e_j,\ x_j>0,\ j=1,\ldots , K, \\ \lambda _i &{} \text {if } y=x+e_i,\ i=1,\ldots ,K,\\ x_i\beta _i &{} \text {if } y=x-e_i,\ x_i>0,\ i=1,\ldots ,K,\\ -{\sum }_{r\ne x} q_{xr}(\pi ) &{} \text {if } y=x, \\ \end{array} \right. \\ c_x(\pi )&= \sum _i c_i x_i, \end{aligned}$$

where \(e_i\) stands for the i-th unit vector. One can then define a measurable space \(({\varOmega },{{{\mathcal {F}}}})\), a stochastic process \(X=\{X_t\}_{t\ge 0}\), a filtration \(\{{{{\mathcal {F}}}}_t\}_{t\ge 0}\) to which X is adapted, and a probability distribution \({\mathbb {P}}_\nu ^\pi \) on \(({\varOmega },{{{\mathcal {F}}}})\), such that X is the minimal Markov process with q-matrix \(Q(\pi )\), for each initial distribution \(\nu \) on S and each policy \(\pi \in {\varPi }\). By \(P^\pi _t\), \(t\ge 0\), we denote the corresponding minimal transition function and by \({\mathbb {E}}_\nu ^\pi \) the expectation operator corresponding to \({\mathbb {P}}_\nu ^\pi \). Notice that we will write \({\mathbb {P}}_x^\pi \), \({\mathbb {E}}_x^\pi \), when \(\nu =\delta _x\) is the Dirac measure at state x.
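
As an illustration (hypothetical parameters, not taken from the paper), the following sketch evaluates the off-diagonal rates \(q_{xy}(\pi )\) for the policy that serves the non-empty station with the smallest index; the state space remains the infinite grid, so no matrix is stored.

```python
# A sketch of the transition rates q_{xy}(pi), K = 2, hypothetical parameters.
lam  = [1.0, 1.5]   # arrival rates lambda_i
mu   = [2.0, 3.0]   # service rates mu_i
beta = [0.5, 0.7]   # abandonment rates beta_i
K = len(lam)

def smallest_index_policy(x):
    """Serve the non-empty station with the smallest index (station 1 if all are empty)."""
    for i in range(K):
        if x[i] > 0:
            return i
    return 0

def q(x, y, policy=smallest_index_policy):
    """Off-diagonal rate q_{xy}(pi); the diagonal entry is minus the row sum."""
    j = policy(x)
    rate = 0.0
    for i in range(K):
        up = tuple(x[k] + (k == i) for k in range(K))
        dn = tuple(x[k] - (k == i) for k in range(K))
        if y == up:                       # arrival at queue i
            rate += lam[i]
        if x[i] > 0 and y == dn:          # departure from queue i
            rate += x[i] * beta[i]        # abandonment
            if i == j:
                rate += mu[i]             # service completion under pi
    return rate

print(q((2, 1), (1, 1)))   # mu_1 + 2*beta_1 = 3.0
```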

The problem of interest is finding the policy \(\pi \in {\varPi }\) that minimises the total expected discounted cost and the expected average cost. Let \(\alpha >0\). To this end, define

$$\begin{aligned} V_\alpha ^\pi (x)={\mathbb {E}}_x^\pi \Big [\int _0^\infty e^{-\alpha t}c_{X_t}(\pi )\,\mathrm {d}t\Big ] \end{aligned}$$

to be the total expected \(\alpha \)-discounted cost under policy \(\pi \), given that the system is in state x initially, \(X_0=x\). Then the minimum total expected cost value function \(V_\alpha \) is defined by

$$\begin{aligned} V_\alpha (x)=\inf _{\pi \in {\varPi }}V_\alpha ^\pi (x),\quad x\in S. \end{aligned}$$

If \(V_\alpha ^\pi =V_\alpha \), then \(\pi \) is said to be an \(\alpha \)-discount optimal policy. If there exists \(\alpha _0>0\), such that \(\pi \) is \(\alpha \)-discount optimal for \(\alpha \in (0,\alpha _0)\), then \(\pi \) is called a strongly Blackwell optimal policy.

By \(g^\pi \), given by

$$\begin{aligned} g^\pi (x)=\limsup _{T\rightarrow \infty }\frac{1}{T}\,{\mathbb {E}}_x^\pi \Big [\int _0^T c_{X_t}(\pi )\,\mathrm {d}t\Big ], \end{aligned}$$

we denote the expected average cost under policy \(\pi \), given the initial state \(X_0=x\). Again, the minimum expected average cost is defined by

$$\begin{aligned} g(x)=\inf _{\pi \in {\varPi }}g^\pi (x),\quad x\in S, \end{aligned}$$

and if \(g^\pi =g\), then \(\pi \) is an average cost optimal policy.

It is not to be expected that the optimal policy has a simple description in general. In this paper, we restrict ourselves to providing sufficient conditions for optimality of an index policy.

2.2 Main result

The two main results of our paper are Theorems 1 and 2, providing sufficient conditions for optimality of the Smallest Index Policy.

Definition 1

The Smallest Index Policy assigns the server to the non-empty station with the smallest index. The policy only idles if no customers are present.

Theorem 1

Suppose that the stations can be ordered such that, for \(1\le i\le j\le K,\) the following three conditions hold

$$\begin{aligned} \begin{array}{cc|c} 1.&{}\ c_i\ge c_{j}&{} \quad \text {notation: } c \searrow \\ 2.&{}\ c_i\mu _i \ge c_{j}\mu _{j} &{} \quad c \mu \searrow \\ 3.&{}\ c_i\mu _i/\beta _i\ge c_{j}\mu _{j}/\beta _{j}&{}\quad c\mu /\beta \searrow \\ \end{array} \end{aligned}$$
(1)

then the Smallest Index Policy is \(\alpha \)-discount optimal for any \(\alpha >0\), and hence also strongly Blackwell optimal.

Theorem 2

Under the conditions of Theorem 1, the Smallest Index Policy is average cost optimal.

The proofs are postponed until Sect. 4. In Sect. 5 we give examples showing that if any of the three conditions in (1) is omitted, the Smallest Index Policy can fail to be optimal.
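
Whether a given instance admits an ordering satisfying (1) can be checked mechanically: sort the classes lexicographically by \((c_i,\,c_i\mu _i,\,c_i\mu _i/\beta _i)\) in decreasing order and verify that all three indices are non-increasing along the resulting order. A minimal sketch with hypothetical parameters:

```python
# A sketch (hypothetical parameters) checking whether the classes can be ordered so that
# c, c*mu and c*mu/beta are all non-increasing, as required in Theorem 1.
c    = [4.0, 3.0, 1.0]
mu   = [2.0, 2.0, 5.0]
beta = [1.0, 2.0, 4.0]

def theorem1_order(c, mu, beta):
    idx = lambda i: (c[i], c[i] * mu[i], c[i] * mu[i] / beta[i])
    order = sorted(range(len(c)), key=idx, reverse=True)
    # verify that every index is non-increasing along the candidate order
    for a, b in zip(order, order[1:]):
        if any(u < v for u, v in zip(idx(a), idx(b))):
            return None            # no ordering satisfies all three conditions
    return [i + 1 for i in order]  # 1-based class labels, in serving priority order

print(theorem1_order(c, mu, beta))  # [1, 2, 3] for the values above
```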

Alternative modelling choices In our model the cost function is a holding cost \(\sum _i c_ix_i\) per unit time, when the system is in state x. In many applications a penalty (say \(P_i\) for class i) is charged if a customer abandons the system due to impatience. Then the cost per unit time is given by \(\sum _i P_i\beta _i x_i\). Substituting \(c_i=P_i\beta _i\), \(i=1,\ldots , K\), shows that these cost structures are equivalent.

We modelled the system such that customers can leave the system while being in service. In some models, it may be more realistic that abandonment does not take place after service has started. However, if the abandonment rates are smaller than the service rates, i.e., \(\beta _i< \mu _i\) for all i, then our analysis is still valid after an appropriate parameter change. That is, we consider the system with service rates \({\hat{\mu }}_i=\mu _i-\beta _i>0\). Abandonments during service or service completions in the revised model correspond to a service completion in the original one.

If, for one or more classes, the abandonment rates are greater than or equal to the associated service rates, then this substitution is clearly not possible. However, serving such a customer class delays the process of emptying the system, so it can never be optimal to serve these classes of customers. Hence, when only customers of these types are present, the server should idle in order to minimise the expected average cost. Therefore, the optimal policy never serves class i if \(\mu _i\le \beta _i\). For the remaining customer classes with \(\mu _i>\beta _i\), the Smallest Index Policy is optimal whenever these classes can be ordered such that \(c\searrow ,\ c{\hat{\mu }}\searrow ,\ c{\hat{\mu }}/\beta \searrow \).

Finally, it is possible to allow idling at all times. However, it can easily be shown that it is not optimal to have unforced idling. Therefore, we ignore this option for the sake of notational convenience.

2.3 Structural properties

As mentioned in the introduction, Sect. 1, we will first study the \(\alpha \)-discounted cost problem. Crucial in establishing optimality of the Smallest Index Policy are certain properties of the value function. If \(V_\alpha \) is non-decreasing (I) and weighted Upstream Increasing (wUI), then optimality of the Smallest Index Policy can be directly deduced from the \(\alpha \)-discounted cost optimality equation under certain conditions on the Markov decision problem (cf. Blok and Spieksma 2015, 2017) that we will not discuss explicitly in this paper. We will next define the structural properties (I) and (wUI).

Definition 2

The function \(f:{S}\rightarrow {\mathbb {R}}\) is called weighted Upstream Increasing (wUI) if \(f\in wUI\), with wUI defined by

$$\begin{aligned}&wUI= \{f:{S}\rightarrow {\mathbb {R}}\ |\ \mu _i(f(x+e_i+e_{i+1})-f(x+e_{i+1}))\\&\quad -\mu _{i+1}(f(x+e_i+e_{i+1})-f(x+e_i)) \ge 0,\ \text { for all } x\in S,\ 1\le i< K\}. \end{aligned}$$

The function \(f:{S}\rightarrow {\mathbb {R}}\) is called non-decreasing (I) if \(f\in I\), with I defined by

$$\begin{aligned} I=\{f:{S}\rightarrow {\mathbb {R}}\ |\ f(x+e_i)-f(x)\ge 0,\ \text { for all } x\in S,\ 1\le i \le K\}. \end{aligned}$$

The following lemma makes the connection between the structural properties of the \(\alpha \)-discounted cost value function and optimality of the Smallest Index Policy.

Lemma 1

Let the discount factor \(\alpha >0\). Then the \(\alpha \)-discounted cost value function \(V_\alpha \) is well-defined and finite. If \(V_\alpha \in wUI \cap I\), then the Smallest Index Policy is \(\alpha \)-discount optimal.

Proof

One can view the MDP as a negative dynamic programming problem (cf. Strauch 1966), for which simple conditions allow us to draw the conclusions that we aim for. Since later on we will have to include perturbations, we will use (Blok and Spieksma 2015, Theorem 4.2). The conditions in that theorem are all easily verified, except for the following two:

P1:

There exist a function \(F:S\rightarrow (0,\infty )\) and a constant \(\gamma <\alpha \) with the properties that

$$\begin{aligned} \sum _y q_{xy}(a)F_y\le \gamma F_x\quad \text {and}\quad \sup _{x\in S}\frac{c_x(a)}{F_x}<\infty ,\qquad \text {for all } x\in S,\ a\in {\mathcal {A}}. \end{aligned}$$

If F satisfies the first property, then F is called a \(\gamma \)-drift function for the MDP.

P2:

There exist a function \(G:S\rightarrow (0,\infty )\) and a constant \(\xi \), such that the following properties are satisfied.

  • G is a \(\xi \)-drift function for the MDP.

  • G is an F-moment function, i.e. there exists an increasing sequence \(\{K_n\}_n\) of finite subsets of S, \(|K_n|<\infty \), with \(K_n\uparrow S\) as \(n\rightarrow \infty \), such that

    $$\begin{aligned} \inf _{x\not \in K_n}\frac{G_x}{F_x}\rightarrow \infty ,\quad n\rightarrow \infty . \end{aligned}$$

We check property P1. Take \(F_x=e^{\epsilon (x_1+\cdots +x_K)}\), with \(\epsilon \) to be determined. Then,

$$\begin{aligned} \sum _y q_{xy}(a)F_y= & {} F_x\Big \{\sum _i\lambda _i(e^\epsilon -1)+\sum _ix_i\beta _i(e^{-\epsilon }-1)+\mu _a{\mathbf{1}}_{\{x_a>0\}}(e^{-\epsilon }-1)\Big \}\\\le & {} F_x\sum _i\lambda _i(e^\epsilon -1). \end{aligned}$$

Clearly, one can choose \(\epsilon >0\) sufficiently small, so that

$$\begin{aligned} \gamma :=\sum _i\lambda _i(e^\epsilon -1)<\alpha . \end{aligned}$$

Since F increases exponentially fast in \(x_i\), \(1\le i\le K\), and c is linear in \(x_i\), the second condition in P1 is easily verified.
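
For instance, one admissible choice (given purely for illustration) is

$$\begin{aligned} \epsilon =\ln \Big (1+\frac{\alpha }{2\sum _i\lambda _i}\Big ),\qquad \text {so that}\qquad \gamma =\sum _i\lambda _i\big (e^{\epsilon }-1\big )=\frac{\alpha }{2}<\alpha . \end{aligned}$$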

Property P2 immediately follows by setting \(G_x=e^{\epsilon '(x_1+\cdots +x_K)}\), for any \(\epsilon '>\epsilon \).

Let \(V_\alpha \in wUI \cap I\), let \(1\le {j_1}\le j_2 \le K\). Suppose x is such that \(x_{j_1}, x_{j_2} >0\), then \(V_\alpha \in wUI\) implies

$$\begin{aligned} \mu _{j_1}[V_\alpha (x-e_{j_1}) -V_\alpha (x)]\le \mu _{j_2}[V_\alpha (x-e_{{j_2}})-V_\alpha (x)]. \end{aligned}$$

Now by virtue of (Blok and Spieksma 2015, Theorem 4.2), \(V_\alpha \) is a solution to the Discount Optimality Equation, i.e.

$$\begin{aligned} \alpha V_\alpha (x)&= \sum _{i=1}^K c_i x_i + \sum _{i=1}^K \lambda _i V_{\alpha }(x+e_i) +\sum _{i=1}^K x_i\beta _i V_{\alpha }(x-e_i)\\&\quad +\, \min _{1\le j \le K} \{ \mu _j [V_{\alpha }((x-e_j)^+) -V_{\alpha }(x)]\} - \sum _{i=1}^K (\lambda _i + x_i\beta _i ) V_{\alpha }(x). \end{aligned}$$

The DCOE yields that if class \({j_1}\) and \({j_2}\) customers are both present, then it is optimal to serve class \({j_1}\) rather than class \({j_2}\).

Further, since \(V_\alpha \) is non-decreasing we have for \(1\le j\le K\), and x with \(x_j>0\) that

$$\begin{aligned} \mu _j V_\alpha (x-e_j)- \mu _j V_\alpha (x)\le 0, \end{aligned}$$

with 0 corresponding to the cost if an empty queue is served. Hence idling is never optimal; it is optimal to serve a customer whenever possible. We conclude that the Smallest Index Policy is optimal. \(\square \)

3 Discrete time discounted cost analysis

3.1 Smoothed Rate Truncation

The abandonment rates increase linearly in the number of waiting customers. Hence the transition rates are unbounded as a function of the state. Thus, the system is not uniformisable and so there is no discrete-time equivalent to the continuous-time problem. To make discrete-time theory available, we approximate the MDP with a sequence of (essentially) finite state MDPs. Unfortunately, standard state space truncations generally destroy the structural properties of interest due to boundary effects.

To this end, we have developed the Smoothed Rate Truncation (SRT). This perturbation technique was first introduced in Bhulai et al. (2014). In that paper, SRT is applied to a Markov cost process, and properties of the value function are proven. The distinguishing feature of SRT is that the transition rates are decreased in all states, also close to the origin. This makes the jump rates highly state dependent and complicates the analysis, but it is the key feature of SRT that ensures that the properties are preserved.

The idea of SRT is as follows. Every transition that moves the system into a higher state in one or more dimensions is linearly decreased as a function of these coordinates. This naturally generates a finite subset of the state space that cannot be left with positive probability under any policy. As a consequence, recurrent classes under any policy are always finite. As we get closer to the boundary of the finite set, the rates are smoothly truncated to 0. On the finite state space, the transition rates are bounded. Outside the finite set, the rates can be chosen arbitrarily, since these states are inessential. In particular, they can be chosen such that the jump rates are uniformly bounded.

In our model, a truncation parameter \(N=(N_1,\ldots , N_K)\in {\mathcal {N}}=(\mathbb {N}\cup \{\infty \})^K\) defines the size of the state space. Since the empty state can always be reached, and there is a positive probability of an arrival in any queue within the finite set (clearly not on the boundary), the set of essential states is given by \(S^N=\{x\in S| x_i\le N_i, i=1,\ldots ,K\}\).

SRT prescribes a truncation of all transitions that move the system into a ‘larger’ state. In this model only arrivals move the system to a larger state, hence for all i the arrival rates \(\lambda _i\) are replaced by new rates \(\lambda _i^N(x)\) in state x. The smoothed arrival rates are given by

$$\begin{aligned} \lambda _i^N(x):= \left( 1-\frac{x_i}{N_i}\right) ^+\lambda _i . \end{aligned}$$
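
For example, with hypothetical values \(\lambda _1=2\) and \(N_1=4\), the smoothed rate decreases linearly from \(\lambda _1\) at \(x_1=0\) to 0 at \(x_1=N_1\) and stays 0 beyond the truncation level:

```python
# Illustration of the smoothed arrival rate lambda_1^N(x) = (1 - x_1/N_1)^+ * lambda_1
# for hypothetical values lambda_1 = 2.0 and N_1 = 4.
lam1, N1 = 2.0, 4

for x1 in range(7):
    rate = max(0.0, 1.0 - x1 / N1) * lam1
    print(x1, rate)   # prints 2.0, 1.5, 1.0, 0.5, 0.0, 0.0, 0.0
```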

The result is a uniformisable MDP for each \(N\in {\mathcal {N}}\), which leads to a collection of parametrised MDPs \(\{X^N\}_{N\in {\mathcal {N}}}\). Here \(N=\infty ^K\in {\mathcal {N}}\) corresponds to the original model. For \(\pi \in {\varPi },\ N\in {\mathcal {N}}\) the transition rate matrix \(Q^N(\pi )\) is given by

$$\begin{aligned} q_{xy}^N(\pi )= \left\{ \begin{array}{ll} \lambda _i^N(x) &{} \text {if } y=x+e_i,\ i=1,\ldots ,K,\\ \mu _j {\mathbf {1}}_{\{\pi (x)=j\}} &{} \text {if } y=x-e_j,\ x_j>0,\ j=1,\ldots , K,\\ \min \{x_i,N_i\}\beta _i &{} \text {if } y=x-e_i,\ x_i>0,\ i=1,\ldots ,K,\\ -{\sum }_{r\ne x} q^N_{xr}(\pi ), &{}\text {if } y=x. \\ \end{array} \right. \end{aligned}$$

As has been mentioned already, outside \(S^N\) it is possible to choose the rates as we like, for these states are inessential. In particular, we can choose the new abandonment rates of class i to be bounded by \(N_i\beta _i\).

Furthermore, the perturbed MDP is easily checked to satisfy the conditions of (Blok and Spieksma 2015, Theorems 4.2 and 5.1). The main ingredients of its verification are analogous to the proof of Lemma 1. The results in Blok and Spieksma (2015) guarantee that the value function \(V^{(N)}_\alpha \) of the N-perturbed MDP is well-defined with \(V^{(N)}_\alpha \rightarrow V_\alpha \), \(N\rightarrow \infty ^K\), and any limit point of \(\alpha \)-discount optimal policies for the N-perturbation, \(N\rightarrow \infty ^K\), is \(\alpha \)-discount optimal for the original MDP.

3.2 Dynamic programming

Apart from the parameter space \({\mathcal {N}}\), we will need to introduce a special subset \({\mathcal {N}}(\lambda )\), given by

$$\begin{aligned} {\mathcal {N}}(\lambda )=\Big \{ N\in {\mathcal {N}}\Big |\frac{\lambda _{i}}{N_{i}} \le \frac{\lambda _{i+1}}{N_{i+1}} \text{ for } 1\le i < K \Big \}. \end{aligned}$$

Throughout the rest of this section, we fix the truncation parameter \(N\in {\mathcal {N}}\) and discount factor \(\alpha >0\). Our goal is to show that \(V_\alpha ^{(N)}\in wUI\cap I\) for all \(\alpha >0\) and \(N\in {\mathcal {N}}(\lambda )\). We use the following short-hand notation

$$\begin{aligned} {\bar{\lambda }} := \sum _{i=1}^K \lambda _i,\qquad \mu :=\sum _{i=1}^K\mu _i,\qquad \beta _N := \sum _{i=1}^K \beta _i N_i. \end{aligned}$$

Without loss of generality, we may assume that \({\bar{\lambda }} + \beta _N +\mu = 1\). The discrete-time uniformised MDP is defined by

$$\begin{aligned} P^{(N,d)}(\pi )= I+ Q^{N}(\pi ),\ {\bar{c}}=\frac{c}{1+\alpha },\ {\bar{\alpha }}=\frac{1}{1+\alpha },\qquad \pi \in {\varPi }. \end{aligned}$$

Let \(V_{{\bar{\alpha }}}^{(N,d)}\) denote the expected discrete-time \({\bar{\alpha }}\)-discounted optimal cost:

$$\begin{aligned} V_{{\bar{\alpha }}}^{(N,d)}(x)=\inf _{\pi \in {\varPi }}\sum _{n=0}^\infty {\bar{\alpha }}^n\Big [\Big (P^{(N,d)}(\pi )\Big )^n{\bar{c}}\Big ](x). \end{aligned}$$

Then, \({V}_{{\bar{\alpha }}}^{(N,d)}=V_{\alpha }^{(N)}\) (cf. Serfozo 1979). Moreover, we can approximate \({V}_{{\bar{\alpha }}}^{(N,d)}\) by using the value iteration algorithm. Indeed, the uniformised N-perturbed MDP in discrete time satisfies the conditions from Wessels (1977) for value iteration to converge. This is easily deduced from the fact that the N-perturbed MDP in continuous time satisfies the conditions of (Blok and Spieksma 2015, Theorems 4.2 and 5.1), which are the continuous time versions of the conditions developed by Wessels in discrete time.

Let \(v_{n,{\bar{\alpha }}}^{(N,d)}: {S}\rightarrow {\mathbb {R}}\) for \(n\ge 0\) be given by the following iteration scheme. Put \(v_{0,{\bar{\alpha }}}^{(N,d)}\equiv 0\), and

$$\begin{aligned} v_{n+1,{\bar{\alpha }}}^{(N,d)}(x)&= \sum _{i=1}^K {\bar{c}}_i x_i +{\bar{\alpha }}\Big \{ \sum _{i=1}^K \Big (1-\frac{x_i}{N_i}\Big )^+\lambda _i\, v_{n,{\bar{\alpha }}}^{(N,d)}(x+e_i) +\sum _{i=1}^K \min \{x_i,N_i\}\beta _i\, v_{n,{\bar{\alpha }}}^{(N,d)}(x-e_i)\\&\quad +\, \min _{1\le j \le K} \big \{ \mu _j \big [v_{n,{\bar{\alpha }}}^{(N,d)}((x-e_j)^+)-v_{n,{\bar{\alpha }}}^{(N,d)}(x)\big ]\big \} \\&\quad +\,\Big (\sum _{i=1}^K \Big (\min \Big \{\frac{x_i}{N_i},1\Big \}\lambda _i + (N_i - x_i)^+\beta _i \Big )+ \mu \Big ) v_{n,{\bar{\alpha }}}^{(N,d)}(x)\Big \}\\&={\bar{\alpha }}\Big \{\sum _{i=1}^K c_i x_i + \sum _{i=1}^K \Big (1-\frac{x_i}{N_i}\Big )^+\lambda _i\, v_{n,{\bar{\alpha }}}^{(N,d)}(x+e_i) +\sum _{i=1}^K \min \{x_i,N_i\}\beta _i\, v_{n,{\bar{\alpha }}}^{(N,d)}(x-e_i)\\&\quad +\, \min _{1\le j \le K} \big \{ \mu _j \big [v_{n,{\bar{\alpha }}}^{(N,d)}((x-e_j)^+)-v_{n,{\bar{\alpha }}}^{(N,d)}(x)\big ]\big \} \\&\quad +\,\Big (\sum _{i=1}^K \Big (\min \Big \{\frac{x_i}{N_i},1\Big \}\lambda _i + (N_i - x_i)^+\beta _i \Big )+ \mu \Big ) v_{n,{\bar{\alpha }}}^{(N,d)}(x)\Big \}. \end{aligned}$$
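
The following sketch (hypothetical parameters, \(K=2\); not the authors' code) implements this recursion on \(S^N\). The rates and \(\alpha \) are first divided by \({\bar{\lambda }}+\beta _N+\mu \), which realises the normalisation assumed above; this rescaling of time multiplies the value function by a constant and does not change the minimising actions.

```python
import numpy as np

# One-step value iteration for the SRT-truncated MDP (a sketch, K = 2, hypothetical parameters).
lam  = np.array([1.0, 1.5])    # arrival rates lambda_i
mu   = np.array([3.0, 2.0])    # service rates mu_i
beta = np.array([0.5, 0.7])    # abandonment rates beta_i
c    = np.array([4.0, 3.0])    # holding costs c_i (c, c*mu and c*mu/beta are non-increasing)
N    = np.array([10, 15])      # truncation parameter; lam_1/N_1 <= lam_2/N_2, so N is in N(lambda)
alpha = 0.1                    # continuous-time discount rate

# Normalise so that bar_lambda + beta_N + mu_total = 1 (the assumption made above).
Lam = lam.sum() + (beta * N).sum() + mu.sum()
lam, mu, beta, alpha = lam / Lam, mu / Lam, beta / Lam, alpha / Lam
abar = 1.0 / (1.0 + alpha)     # discrete-time discount factor
cbar = c / (1.0 + alpha)       # discrete-time cost rate

def vi_step(v):
    """Apply the displayed recursion to v, given as an array on {0,...,N_1} x {0,...,N_2}."""
    w = np.zeros_like(v)
    for x1 in range(N[0] + 1):
        for x2 in range(N[1] + 1):
            x = np.array([x1, x2])
            val = 0.0
            for i in range(2):
                up = x.copy(); up[i] = min(up[i] + 1, N[i])   # smoothed arrival (rate 0 at x_i = N_i)
                dn = x.copy(); dn[i] = max(dn[i] - 1, 0)      # abandonment
                val += max(0.0, 1.0 - x[i] / N[i]) * lam[i] * v[up[0], up[1]]
                val += min(x[i], N[i]) * beta[i] * v[dn[0], dn[1]]
                val += (min(x[i] / N[i], 1.0) * lam[i] + max(N[i] - x[i], 0) * beta[i]) * v[x1, x2]
            # serving queue j contributes mu_j * (v((x - e_j)^+) - v(x)), minimised over j
            serve = []
            for j in range(2):
                dn = x.copy(); dn[j] = max(dn[j] - 1, 0)
                serve.append(mu[j] * (v[dn[0], dn[1]] - v[x1, x2]))
            val += min(serve) + mu.sum() * v[x1, x2]
            w[x1, x2] = cbar @ x + abar * val
    return w

v = np.zeros((N[0] + 1, N[1] + 1))
for _ in range(2000):              # rough approximation of the value function on S^N
    v = vi_step(v)
```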

We will prove by induction that \(v_{n,{\bar{\alpha }}}^{(N,d)}\in wUI \cap I\) on \({S}^N\), for all \(n\ge 0\).

To employ the induction argument, we need three additional structural properties: convexity, supermodularity and bounded increasingness, which we specify below. The initial function \(v_{0,{\bar{\alpha }}}^{(N,d)}\equiv 0\) trivially satisfies all of these properties, which settles the induction basis. For the induction step, we will use Event Based Dynamic Programming (EBDP). This method uses event operators, representing arrivals, departures or costs, as building blocks to construct the iteration step of the value iteration algorithm.

Definition 3

Let \(f:{S}\rightarrow {\mathbb {R}}\), then define

  1.
    (a)

      The total smoothed arrivals operator

      $$\begin{aligned} {\mathcal {T}}_{SA}^{N}f:={\bar{\lambda }}^{-1}\sum _{i=1}^K \lambda _i {\mathcal {T}}_{SA(i)}^Nf, \end{aligned}$$

      using

    (b)

      the smoothed arrivals operator given by

      $$\begin{aligned} {\mathcal {T}}_{SA(i)}^Nf(x):=\left\{ \begin{array}{ll} \left( 1-\frac{x_i}{N_i}\right) f(x+e_i) +\frac{x_i}{N_i}f(x), &{} \quad x_i\le N_i,\\ f(x), &{} \quad \text {else.} \end{array} \right. \end{aligned}$$
  2.
    (a)

      The total increasing departures operator

      $$\begin{aligned} {\mathcal {T}}_{ID}^{N}f:=\beta _N^{-1}\sum _{i=1}^K \beta _i N_i {\mathcal {T}}_{ID(i)}^Nf, \end{aligned}$$

      using

    (b)

      the increasing departures operator

      $$\begin{aligned} {\mathcal {T}}_{ID(i)}^Nf(x):=\left\{ \begin{array}{ll} \frac{x_i}{N_i}f(x-e_i)+ \left( 1-\frac{x_i}{N_i}\right) f(x), &{} \quad x_i \le N_i,\\ f(x-e_i), &{} \quad \text {else.} \end{array} \right. \end{aligned}$$
  3.

    The cost operator

    $$\begin{aligned} {\mathcal {T}}_Cf(x):= \sum _{i=1}^K c_ix_i +f(x). \end{aligned}$$
  4.

    The cost + increasing departures operator

    $$\begin{aligned} {\mathcal {T}}_{CID}^{N}:=\beta _N^{-1} {\mathcal {T}}_C(\beta _N {\mathcal {T}}_{ID}^{N}). \end{aligned}$$
  5.

    The movable server operator

    $$\begin{aligned} {\mathcal {T}}_{MS}f(x):= \min _{1\le j\le K} \Big \{\frac{\mu _j}{\mu } f((x-e_j)^+) + \Big (1-\frac{\mu _j}{\mu }\Big )f(x)\Big \}. \end{aligned}$$
  6.

    For \(f_1,f_2,f_3:{S}\rightarrow {\mathbb {R}}\), the uniformisation operator

    $$\begin{aligned} {\mathcal {T}}_{UNIF}^N(f_1,f_2,f_3):= {\bar{\lambda }} f_1 +\beta _N f_2 + \mu f_3. \end{aligned}$$
  7.

    The discount operator

    $$\begin{aligned} {\mathcal {T}}_{DISC}^{{\bar{\alpha }}}f:= {\bar{\alpha }}f. \end{aligned}$$

Now, \(v_{n+1,{\bar{\alpha }}}^{(N,d)}\) is constructed from \(v_{n,{\bar{\alpha }}}^{(N,d)}\) as follows

$$\begin{aligned} {\mathcal {T}}_{DISC}^{{\bar{\alpha }}}\left( {\mathcal {T}}_{UNIF}^N\left( {\mathcal {T}}_{SA}^{N}v_{n,{\bar{\alpha }}}^{(N,d)} ,{\mathcal {T}}_{CID}^{N}v_{n,{\bar{\alpha }}}^{(N,d)},{\mathcal {T}}_{MS}v_{n,{\bar{\alpha }}}^{(N,d)}\right) \right) = v_{n+1,{\bar{\alpha }}}^{(N,d)}. \end{aligned}$$

As has been mentioned, it is sufficient to verify that \( v_{n,{\bar{\alpha }}}^{(N,d)}\) has the desired structural properties on the finite set \({S}^N\). Therefore, we define the following collections of functions restricted to \({S}^N\).

Definition 4

(Properties on \({S}^N\))

  1.

    Weighted upstream increasing functions on \({S}^N\)

    $$\begin{aligned}&wUI_N= \{f:{S}\rightarrow {\mathbb {R}}\ |\quad \mu _i(f(x+e_i+e_{i+1})-f(x+e_{i+1}))\\&\quad -\mu _{i+1}(f(x+e_i+e_{i+1})-f(x+e_i)) \ge 0,\\&\qquad \text { for all } x,x+e_i+e_{i+1} \in {S}^N,\ 1\le i< K\}. \end{aligned}$$
  2.

    Increasing functions on \({S}^N\)

    $$\begin{aligned} I_N=\{f:{S}\rightarrow {\mathbb {R}}\ |\ f(x+e_i)-f(x)\ge 0,\text { for all } x,x+e_i \in {S}^N,\ 1\le i \le K\}. \end{aligned}$$
  3.

    Supermodular functions on \({S}^N\)

    $$\begin{aligned}&\,{Super}_N=\{f:{S}\rightarrow {\mathbb {R}}\ |\ f(x+e_i+e_j)-f(x+e_i)-f(x+e_j)+f(x)\ge 0,\\&\quad \text { for all } x,x+e_i+e_{j} \in {S}^N,\ 1\le i<j\le K\}. \end{aligned}$$
  4.

    Convex functions on \({S}^N\)

    $$\begin{aligned}&Cx_N=\{f:{S}\rightarrow {\mathbb {R}}\ |\ f(x+2e_i)\\&\quad -2f(x+e_i)+f(x)\ge 0,\text { for all } x,x+2e_i \in {S}^N,\ 1\le i \le K\}. \end{aligned}$$
  5.

    Bounded increasing functions on \({S}^N\)

    $$\begin{aligned} Bd_N=\{f:{S}\rightarrow {\mathbb {R}}\ |\ f(x+e_i)-f(x)\le \frac{c_i}{\beta _i},\text { for all } x,x+e_i \in {S}^N,\ 1\le i \le K\}. \end{aligned}$$
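
As a sanity check (a sketch, not part of the paper's argument), membership in these classes can be tested numerically for \(K=2\), for example on the iterates produced by the value-iteration sketch above:

```python
import numpy as np

# Numerical membership tests for the classes of Definition 4 (K = 2, a sketch).
# v is a value function given as an array on S^N = {0,...,N_1} x {0,...,N_2}.
TOL = 1e-9

def in_I(v):                      # non-decreasing in each coordinate
    return (np.diff(v, axis=0) >= -TOL).all() and (np.diff(v, axis=1) >= -TOL).all()

def in_Cx(v):                     # componentwise convex
    return (np.diff(v, 2, axis=0) >= -TOL).all() and (np.diff(v, 2, axis=1) >= -TOL).all()

def in_Super(v):                  # supermodular
    d = v[1:, 1:] - v[1:, :-1] - v[:-1, 1:] + v[:-1, :-1]
    return (d >= -TOL).all()

def in_wUI(v, mu):                # weighted upstream increasing (classes 1 and 2)
    d = mu[0] * (v[1:, 1:] - v[:-1, 1:]) - mu[1] * (v[1:, 1:] - v[1:, :-1])
    return (d >= -TOL).all()

def in_Bd(v, c, beta):            # increments bounded by c_i / beta_i
    return (np.diff(v, axis=0) <= c[0] / beta[0] + TOL).all() and \
           (np.diff(v, axis=1) <= c[1] / beta[1] + TOL).all()
```

When the parameters satisfy the conditions of Corollary 1 below and \(N\in {\mathcal {N}}(\lambda )\), the iterates should pass all five tests (up to numerical tolerance).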

The following propositions are sufficient for the desired structural properties to propagate through the induction step.

Proposition 1

The smoothed arrivals operator has the following propagation properties

  (i)
    $$\begin{aligned} {\mathcal {T}}_{SA}^{N}: I_N\rightarrow I_N, Cx_N\rightarrow Cx_N, \,{Super}_N\rightarrow \,{Super}_N, Bd_N\rightarrow Bd_N. \end{aligned}$$
  (ii)

    If moreover \(N\in {\mathcal {N}}(\lambda )\), then

    $$\begin{aligned} {\mathcal {T}}_{SA}^{N}: I_N\cap wUI_N\rightarrow wUI_N. \end{aligned}$$

Proposition 2

The increasing departure operator has the following propagation properties

$$\begin{aligned} {\mathcal {T}}_{ID}^{N}: I_N\rightarrow I_N, Cx_N\rightarrow Cx_N, \,{Super}_N\rightarrow \,{Super}_N. \end{aligned}$$

Proposition 3

The cost operator has the following propagation properties

$$\begin{aligned} {\mathcal {T}}_C: I_N\rightarrow I_N, Cx_N\rightarrow Cx_N, \,{Super}_N\rightarrow \,{Super}_N. \end{aligned}$$

Proposition 4

The cost + increasing departures operator has the following propagation properties

  (i)
    $$\begin{aligned} {\mathcal {T}}_{CID}^{N}: Bd_N\rightarrow Bd_N. \end{aligned}$$
  (ii)

    If moreover, for all \(1\le i< K\), it holds that \(c_i\ge c_{i+1},\ c_i\mu _i\ge c_{i+1}\mu _{i+1},\ {c_i\mu _i}/{\beta _i}\ge {c_{i+1}\mu _{i+1}}/{\beta _{i+1}}\), then

    $$\begin{aligned} {\mathcal {T}}_{CID}^{N}: I_N\cap wUI_N\cap \,{Super}_N\cap Bd_N\rightarrow wUI_N.\end{aligned}$$

Proposition 5

The movable server operator has the following propagation properties

  (i)
    $$\begin{aligned}&{\mathcal {T}}_{MS}: I_N\cap wUI_N\rightarrow I_N\cap wUI_N,\\&I_N\cap wUI_N\cap Cx_N\cap \,{Super}_N\rightarrow Cx_N\cap \,{Super}_N. \end{aligned}$$
  (ii)

    If moreover, for all \(1\le i< K,\ {c_i\mu _i}/{\beta _i}\ge {c_{i+1}\mu _{i+1}}/{\beta _{i+1}}\) then

    $$\begin{aligned} {\mathcal {T}}_{MS}:I_N\cap wUI_N\cap Bd_N\rightarrow Bd_N. \end{aligned}$$

Proposition 6

The uniformisation operator has the following propagation properties:

$$\begin{aligned} {\mathcal {T}}_{UNIF}^N: wUI_N^3 \rightarrow wUI_N, I_N^3 \rightarrow I_N, Cx_N^3 \rightarrow Cx_N, \,{Super}_N^3 \rightarrow \,{Super}_N, Bd_N^3 \rightarrow Bd_N. \end{aligned}$$

Proposition 7

The discount operator has the following propagation properties:

$$\begin{aligned} {\mathcal {T}}_{DISC}^{{\bar{\alpha }}}: wUI_N\rightarrow wUI_N, I_N\rightarrow I_N, Cx_N\rightarrow Cx_N, \,{Super}_N\rightarrow \,{Super}_N, Bd_N\rightarrow Bd_N. \end{aligned}$$

The proofs of the propositions are provided in the “Appendix”.

Corollary 1

Let \(N\in {\mathcal {N}}(\lambda )\), \(0<{\bar{\alpha }}<1\) and for \(1\le i <K \), \(c_i\ge c_{i+1},\ c_i\mu _i\ge c_{i+1}\mu _{i+1},\ {c_i\mu _i}/{\beta _i}\ge {c_{i+1}\mu _{i+1}}/{\beta _{i+1}}\).

  (i)

    Then, for all \(n\ge 0\)

    $$\begin{aligned} v_{n,{\bar{\alpha }}}^{(N,d)}\in wUI_N\cap I_N\cap Cx_N\cap \,{Super}_N\cap Bd_N; \end{aligned}$$
  (ii)

    consequently,

    $$\begin{aligned} V_{{\bar{\alpha }}}^{(N,d)}\in wUI_N\cap I_N\cap Cx_N\cap \,{Super}_N\cap Bd_N. \end{aligned}$$

Proof

Denote \({\mathcal {V}}= wUI_N\cap I_N\cap Cx_N\cap \,{Super}_N\cap Bd_N\). First notice that \(v_{0,{\bar{\alpha }}}^{(N,d)}\in {\mathcal {V}}\). Further, under the above conditions we have that \({\mathcal {T}}_{SA}^{N},\ {\mathcal {T}}_{CID}^{N},\ {\mathcal {T}}_{MS},\ {\mathcal {T}}_{DISC}^{{\bar{\alpha }}}: {\mathcal {V}} \rightarrow {\mathcal {V}}\) and \({\mathcal {T}}_{UNIF}^N: {\mathcal {V}}^3 \rightarrow {\mathcal {V}}\). This means that

$$\begin{aligned} {\mathcal {T}}_{DISC}^{{\bar{\alpha }}}({\mathcal {T}}_{UNIF}^N({\mathcal {T}}_{SA}^{N},{\mathcal {T}}_{CID}^{N},{\mathcal {T}}_{MS})): {\mathcal {V}} \rightarrow {\mathcal {V}}. \end{aligned}$$

Now suppose that \(v_{n,{\bar{\alpha }}}^{(N,d)}\in {\mathcal {V}}\). Then also \(v_{n+1,{\bar{\alpha }}}^{(N,d)} = {\mathcal {T}}_{DISC}^{{\bar{\alpha }}}({\mathcal {T}}_{UNIF}^N({\mathcal {T}}_{SA}^{N},{\mathcal {T}}_{CID}^{N},{\mathcal {T}}_{MS}))v_{n,{\bar{\alpha }}}^{(N,d)} \in {\mathcal {V}}\). Assertion (i) follows by induction.

Assertion (ii) follows immediately from (i) due to convergence of value iteration (see Wessels 1977; Blok and Spieksma 2017, Theorem 5.2). \(\square \)

4 Proof of main theorems

Proof of Theorem 1

Suppose that for \(1\le i <K \), \(c_i\ge c_{i+1}\), \( c_i\mu _i\ge c_{i+1}\mu _{i+1}\), \( {c_i\mu _i}/{\beta _i}\ge {c_{i+1}\mu _{i+1}}/{\beta _{i+1}}\). Let the continuous-time discount factor \(\alpha >0\); then the discrete-time discount factor \({\bar{\alpha }}= {1}/({\alpha +1})\) satisfies \(0<{\bar{\alpha }}<1\). Take \(N\in {\mathcal {N}}(\lambda )\); then Corollary 1 implies that \(V_{{\bar{\alpha }}}^{(N,d)}=V_{\alpha }^{(N)}\in wUI_N\cap I_N\). The model satisfies the conditions of the parametrised Markov processes theorem (Blok and Spieksma 2015, Theorem 5.1), which implies continuity in the truncation parameter, that is, \(V_{\alpha }^{(N)}\rightarrow V_{\alpha }\) as \(N\rightarrow \infty ^K\). Hence, \(V_{\alpha }\in wUI \cap I\) and, by virtue of Lemma 1, the Smallest Index Policy is \(\alpha \)-discount optimal. \(\square \)

Proof of Theorem 2

Suppose that for \(1\le i <K \), \(c_i\ge c_{i+1}\), \( c_i\mu _i\ge c_{i+1}\mu _{i+1}\), \( {c_i\mu _i}/{\beta _i}\ge {c_{i+1}\mu _{i+1}}/{\beta _{i+1}}\). By Theorem 1 the Smallest Index Policy, say \(\pi ^\alpha \), is \(\alpha \)-discount optimal for all \(\alpha >0\).

Notice that the model satisfies the assumptions of (Blok and Spieksma 2017, Theorem 5.7). This theorem implies the existence of a sequence \((\alpha _m)\) with \(\lim _{m\rightarrow \infty }\alpha _m=0\), such that the limit \(\lim _{m\rightarrow \infty } \pi ^{\alpha _m}\) is average cost optimal. Since \(\pi ^\alpha \) is the Smallest Index Policy for all \(\alpha >0\), so is the limit policy. Hence the Smallest Index Policy is average cost optimal. \(\square \)

5 Numerical results

The three inequalities on the input parameters that guarantee optimality of the Smallest Index Policy leave many parameter configurations outside the scope of the theorems. This naturally raises the question whether all three inequalities are necessary.

Numerical calculations show that none of the three conditions can be omitted: if any one of the three inequalities is violated, the examples below show that the Smallest Index Policy need not be optimal. We carried out the calculations for \(K=2\).
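
A sketch (not the authors' code) of how such pictures can be produced: run the value iteration of Sect. 3.2 on the truncated grid and record, for every state, the action attaining the minimum in the recursion.

```python
import numpy as np

# Extract the optimal action per state from a computed value function (a sketch).
# v and mu are assumed to come from the value-iteration sketch in Sect. 3.2.
def optimal_actions(v, mu):
    """Return, for each state (x1, x2), the 1-based index of the queue attaining the
    minimum in the service term of the recursion; plotting this array as a colour map
    gives pictures like Figs. 1-3."""
    n1, n2 = v.shape
    act = np.zeros((n1, n2), dtype=int)
    for x1 in range(n1):
        for x2 in range(n2):
            gains = [mu[0] * (v[max(x1 - 1, 0), x2] - v[x1, x2]),
                     mu[1] * (v[x1, max(x2 - 1, 0)] - v[x1, x2])]
            act[x1, x2] = 1 + int(np.argmin(gains))
    return act

# Example use, with v and mu from the earlier sketch:
# print(optimal_actions(v, mu))
```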

Fig. 1 Optimality of a switching curve if the first condition is violated

Fig. 2 Optimality of a switching curve if the second condition is violated

Fig. 3 Optimality of the highest index if the third condition is violated

  1.

    Consider the following parameter setting

    We see that the first condition is violated, \(c_1<c_2\), while the other conditions are satisfied. The optimal policy is a switching curve policy: for ‘small’ states action 2 is optimal and for ‘large’ states action 1 is optimal, see Fig. 1. Note that the colour green corresponds to action 1, i.e. serving queue 1, and the colour red to action 2, i.e. serving queue 2.

  2.

    The next parameter setting is given by

    Observe that the first and the third condition hold, but the second condition is violated. In Fig. 2 the optimal policy is displayed. We see that the Smallest Index Policy need not be optimal: there is a small region, with only a few customers present, where it is optimal to take action 2. In the larger states action 1, the action prescribed by the Smallest Index Policy, is optimal.

  3.

    The final parameter setting is given by

    Here only the first and the second condition are satisfied. Figure 3 shows that it can be optimal to serve the station with the highest index instead of the station with the smallest index.

Another observation can be made based on these examples: in all cases a switching curve policy is optimal. Since an index policy can be viewed as a degenerate switching curve policy, we conjecture that a switching curve policy is always optimal.