1 Introduction

1.1 Motivation

We investigate the performance of a multi-skilled queueing system formed by parallel servers with Processor Sharing (PS) queues. Jobs of different classes of customers arrive at the system following a Poisson process. There is one dedicated server for each class of customer and one multi-skilled server that can execute jobs of all classes. Furthermore, we assume that jobs are assigned to the servers according to the Bernoulli policy. Our goal is to find the optimal load balancing so as to minimize the weighted mean number of jobs in the system.

The main application of our model comes from wireless networks. Consider a region divided into different subregions. Each dedicated server models an antenna that provides service to a unique subregion and the multi-skilled server models a central antenna that provides service to all the subregions. Using the results of this article, one can determine how the traffic of each subregion must be shared between the antenna of that subregion and the central one in order to minimize the weighted mean number of jobs in the system. This architecture has been previously considered by Taboada et al. (2017) in a different context where dedicated servers (or microcells in their model) can be switched on and off so as to minimize the weighted sum of the mean delay and the mean power consumption in the system.

1.2 Related work

Load balancing has been widely investigated in different contexts. In data centers, for example, various policies depending upon the information available to the dispatcher have been proposed. In general, optimal policies for the typical performance measures, such as mean processing times, are not easy to determine except in some specific cases. For example, when no information on the state of the servers is available, the optimal Bernoulli routing policy was determined in Altman et al. (2011) for mono-skilled servers only. For FCFS servers, a policy based on Sturm sequences (Gaujal et al. 2006) is known to be optimal. With more information on the server state, a number of heuristics such as Join the Shortest of d Queues (Mitzenmacher 2001; Vvedenskaya et al. 1996) and Join the Shortest Queue (Graham 2000) have been analyzed in the large-server asymptotic regime. In addition, there are various pull-based policies such as Join the Idle Queue (Lu et al. 2011) that are known to work well in practice. Another important routing policy is Size Interval Task Assignment (Harchol-Balter et al. 1999), where jobs of different sizes are executed in different servers and, therefore, the service requirement of incoming tasks needs to be known. This policy has been further studied in Feng et al. (2005), and the author of Harchol-Balter (2000) presented a variation of this policy in which the size of jobs does not need to be known.

Load balancing has also been investigated for balancing energy costs in data centers using the Energy Packet Network model (Fourneau 2020), whereas Liu et al. (2015) consider data centers located in different geographical zones. The above works are mostly concerned with mono-skilled or homogeneous servers.

In networks with multi-skilled agents or servers, skill-based routing policies have been proposed and investigated (Koole et al. 2003; Wallace and Whitt 2005). These works are mainly oriented towards call-center architectures with Erlang-B or Erlang-C types of queues. An illustrative example is an overflow-type policy, where each incoming call has a list of agents ordered by priority, with the highest priority given to mono-skilled ones, and is routed to the first available agent in this list. If no agent is available, the call can be queued or blocked depending on the architecture. These routing policies are usually difficult to analyze and the cited works focus on approximations of various performance measures for a given policy. In these models, obtaining the optimal policy analytically is not easy. We refer to Chen et al. (2020) for a recent survey on multi-skilled systems.

Multi-skilled queues also appear in the analysis of redundancy systems (Gardner et al. 2015; Bonald et al. 2017), in which incoming requests can be sent simultaneously to a subset of queues. We do not investigate the redundancy aspect.

The network topology we consider makes our model different from that of Altman et al. (2011), in which all servers can execute all types of tasks.

1.3 Contributions

The main contributions of the article are summarized as follows:

  • We provide a necessary and sufficient condition for the stability of the system.

  • We show that the optimization problem in terms of probabilities can be reformulated in terms of the loads of the servers, which is a convex problem with linear constraints. We also show that if there are routing probabilities that satisfy the original problem (that we will call (PROB-OPT)), then it is possible to find loads that satisfy the reformulated problem (that we will call (LOAD-OPT)).

  • We fully characterize the optimal loads on the servers for two classes of customers. For more than two classes, we provide in Proposition 2 conditions under which each class of traffic satisfies one of the following: (i) it sends all its traffic to its dedicated server, (ii) it sends all its traffic to the multi-skilled server, or (iii) it shares its traffic between the multi-skilled server and its dedicated server.

  • Using the result of Proposition 2, we present a fixed-point algorithm whose convergence ensures that the optimal loads on the servers are achieved. This algorithm starts from an initial partition of the set of classes (according to the three possible traffic-sharing behaviors of Proposition 2) and its fixed point gives the optimal partition of the set of classes. Providing an analytical proof of this convergence over the partitions of the set of classes seems to be an extremely difficult task. However, we illustrate the convergence of this algorithm using numerical experiments.

  • We compare the performance of our model with the performance of two other models. The first model consists of a system where all the servers are multi-skilled, and we show the existence of a switching curve, i.e., when the arrival rate of one of the traffic classes increases, the model whose performance is better changes. The second model consists of a system with no sharing, that is, all the servers are dedicated or mono-skilled, and we provide conditions on the arrival rates under which the cost of the no-sharing model is larger than the cost of our model.

  • We delve into the comparison of the aforementioned models using numerical experiments. First, we show the uniqueness of the switching curve when we compare our model with a system where all the servers are multi-skilled. We also observe that, in a system formed by servers with equal capacity and different (but not extremely large) holding costs, the region where our model outperforms the all-sharing system is very large.

1.4 Organization

In the next section, we describe the network model and define the optimization problem. Section 3 gives the stability condition and presents an equivalent problem in terms of loads on the servers. In Sect. 4, the main results on the structure of the optimal policy are provided for the model under consideration. We compare in Sect. 5 the performance of our model with the performance of models with other network topologies. We present our numerical experiments in Sect. 6. Finally, we discuss our main conclusions in Sect. 7.

2 Model description

2.1 Notation

We consider a server farm with Processor Sharing (PS) queues and an input traffic of different classes. Let \({\mathcal {K}}=\{1,2,\dots ,C\}\) be the set of classes. We assume that jobs of class \(i\in {\mathcal {K}}\) arrive at the system according to a Poisson process and have generally distributed service times. Let \(\eta _i\) be the traffic intensity of jobs of class i. The class of a job defines the set of servers that can be assigned to this job (Fig. 1).

Fig. 1: The model under study in this article

We consider a system with \(C+1\) servers. Let \({\mathcal {S}}=\{0,1,\dots ,C\}\) be the set of servers. For a server \(j\in {\mathcal {S}}\), we denote by \(r_j\) the capacity or speed of Server j (i.e., the amount of traffic that can be served per unit of time) and by \(c_j\) its holding cost. We denote by \({\mathcal {S}}_i\) the set of servers that can execute jobs of class i and, for \(A\subset {\mathcal {K}}\), \({\mathcal {S}}_A=\cup _{i\in A}{\mathcal {S}}_i.\) For \(j=1,\dots ,C\), Server j executes jobs of class j, i.e., they are dedicated servers. On the other hand, Server 0 executes jobs of all the classes, i.e., it is a multi-skilled server.

For \(i=1,\dots ,C\), we denote by \(p_{i}\) the probability that a class i job is executed in its dedicated server, i.e., Server i. For \(j=1,\dots ,C\), the load of Server j is defined as follows

$$\begin{aligned} \rho _j({\mathbf {p}})=\frac{\eta _jp_{j}}{r_j}, \end{aligned}$$
(1)

whereas for Server 0 as

$$\begin{aligned} \rho _0({\mathbf {p}})=\frac{\sum _{j\in {\mathcal {K}}}\eta _j(1-p_{j})}{r_0}. \end{aligned}$$
(2)

2.2 Problem formulation

For a given routing strategy \({\mathbf {p}}=(p_{i})\), the mean number of jobs at Server j is denoted by \({\mathbb {E}}[N_j({\mathbf {p}})]\). In this article, we aim to find the routing probabilities that minimize the total cost of the system. More specifically, we analyze the following optimization problem:

$$\begin{aligned} {\min _{{\mathbf {p}}} \quad }&\sum _{j\in {\mathcal {S}}}c_j{\mathbb {E}}[N_j({\mathbf {p}})] \end{aligned}$$
(PROB-OPT)
$$\begin{aligned}&0\le p_{i}\le 1, \text{ for } \text{ all } i\in {\mathcal {K}} ; \end{aligned}$$
(3)
$$\begin{aligned}&\eta _ip_{i} < r_i, \text{ for } \text{ all } i\in {\mathcal {K}} ; \end{aligned}$$
(4)
$$\begin{aligned}&\sum _{i\in {\mathcal {K}}}\eta _i(1-p_{i})< r_0. \end{aligned}$$
(5)

The first constraint ensures that the \(p_i\)’s are probabilities. The second and third constraints ensure that all the servers are stable, that is, that the total incoming traffic into a server is smaller than its service capacity.
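For concreteness, the following minimal sketch (with a hypothetical two-class instance; all names and numerical values are illustrative only) computes the loads (1)–(2) induced by a routing vector \({\mathbf {p}}\) and checks the feasibility constraints (3)–(5).

```python
import numpy as np

def loads(p, eta, r):
    """Loads induced by the Bernoulli routing p: eq. (1) for the dedicated
    servers and eq. (2) for Server 0. p and eta have length C (one entry per
    class); r has length C+1, with r[0] the capacity of Server 0."""
    rho = np.empty(len(r))
    rho[1:] = eta * p / r[1:]                  # eq. (1)
    rho[0] = np.sum(eta * (1.0 - p)) / r[0]    # eq. (2)
    return rho

def feasible(p, eta, r):
    """Constraints (3)-(5) of (PROB-OPT)."""
    rho = loads(p, eta, r)
    return bool(np.all((0.0 <= p) & (p <= 1.0)) and np.all(rho < 1.0))

# Hypothetical instance with C = 2 classes: eta = (0.8, 0.5), r = (r0, r1, r2) = (1, 1, 1).
eta, r = np.array([0.8, 0.5]), np.array([1.0, 1.0, 1.0])
p = np.array([0.7, 1.0])
print(loads(p, eta, r), feasible(p, eta, r))
```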

3 Preliminary results

We first study the existence of a feasible solution of (PROB-OPT). This is the same as characterizing the conditions under which the system can be stabilized. In the following proposition, we provide this result.

Proposition 1

(Stability) The system under consideration can be stabilized if and only if

$$\begin{aligned} r_0+\sum _{i\in A}r_i>\sum _{i\in A}\eta _i, \ \forall A\subset {\mathcal {K}}. \end{aligned}$$
(6)

Proof

See “Appendix A”. \(\square \)
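As an illustration, condition (6) can be checked directly by enumerating the non-empty subsets of classes. The sketch below does this (the parameter values are illustrative); it is exponential in C, which is fine for moderate C.

```python
from itertools import combinations

def stabilizable(eta, r0, r):
    """Stability condition (6): r0 + sum_{i in A} r_i > sum_{i in A} eta_i
    for every non-empty subset A of classes (classes indexed 0..C-1 here)."""
    C = len(eta)
    for size in range(1, C + 1):
        for A in combinations(range(C), size):
            if r0 + sum(r[i] for i in A) <= sum(eta[i] for i in A):
                return False
    return True

# Illustrative values: two classes with r0 = r1 = r2 = 1.
print(stabilizable([0.8, 0.5], 1.0, [1.0, 1.0]))   # True
print(stabilizable([1.7, 0.5], 1.0, [1.0, 1.0]))   # True:  1.7 < r1 + r0 = 2
print(stabilizable([1.7, 1.5], 1.0, [1.0, 1.0]))   # False: 1.7 + 1.5 > r1 + r2 + r0 = 3
```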

Since servers are M/G/1-PS queues, we know from Theorems 3.8 and 3.9 of Kelly (1979) that the probability that there are n jobs in Server j is \(\rho _j^n(1-\rho _j)\). Therefore, it follows directly that \({\mathbb {E}}[N_j({\mathbf {p}})]=\frac{\rho _j}{1-\rho _j}\) and, as a consequence, we can reformulate (PROB-OPT) in terms of the loads on the servers as follows:

$$\begin{aligned} {\min _{\mathbf \rho } \quad }&\sum _{j\in {\mathcal {S}}}c_j\frac{\rho _j}{1-\rho _j} \end{aligned}$$
(LOAD-OPT)
$$\begin{aligned}&\sum _{i\in {\mathcal {K}}}\eta _i= r_0\rho _0+\sum _{j\in {\mathcal {K}}}r_j\rho _{j}; \end{aligned}$$
(7)
$$\begin{aligned}&0\le \rho _{j} < 1, \text{ for } \text{ all } j \in {\mathcal {S}}; \end{aligned}$$
(8)
$$\begin{aligned}&\sum _{i\in A}\eta _i\le r_0\rho _0+\sum _{j\in A}r_j\rho _{j}, \ \forall A\subset {\mathcal {K}}. \end{aligned}$$
(9)

We now show that the optimization problems we have considered so far are related. More precisely, we show that, if there are routing probabilities that satisfy (PROB-OPT), then it is possible to find loads that satisfy (LOAD-OPT).

Lemma 1

Let \( {\mathbf {p}}\) be a routing strategy that satisfies (3)–(5). Then, for all \(j\in {\mathcal {S}}\), \(\rho _j({\mathbf {p}})\) also satisfies the constraints of (LOAD-OPT).

Proof

First, we observe that, if (3)-(5) are satisfied, using (1) and (2), it follows that \(0\le \rho _j < 1\) for all \(j\in {\mathcal {S}}\).

We now show that \(\sum _{i\in {\mathcal {K}}}\eta _i= r_0\rho _0+\sum _{j\in {\mathcal {K}}}r_j\rho _{j}\) in the following way:

$$\begin{aligned} r_0\rho _0+\sum _{j\in {\mathcal {K}}}r_j\rho _{j}=\sum _{i\in {\mathcal {K}}}\eta _i(1-p_i)+\sum _{i\in {\mathcal {K}}}\eta _ip_i=\sum _{i\in {\mathcal {K}}}\eta _i, \end{aligned}$$

where the first equality follows from (1) and (2).

Finally, we focus on the constraint \(\sum _{i\in A}\eta _i\le r_0\rho _0+\sum _{j\in A}r_j\rho _{j}, \ \forall A\subset {\mathcal {K}}\). Using again (1) and (2), we have for all \(A\subset {\mathcal {K}}\) that

$$\begin{aligned} r_0\rho _0+\sum _{j\in A }r_j\rho _{j}=\sum _{i\in {\mathcal {K}}}\eta _i(1-p_i)+\sum _{i\in A}\eta _ip_i\ge \sum _{i\in A}\eta _i(1-p_i)+\sum _{i\in A}\eta _ip_i=\sum _{i\in A}\eta _i. \end{aligned}$$

And the desired result follows. \(\square \)

Note that (LOAD-OPT) is a convex problem with linear constraints and has a unique solution as long as the stability condition of Proposition 1 holds. Moreover, from the above lemma, the solution of (PROB-OPT) can be obtained by optimizing directly over the loads. The optimal routing probabilities can then be recovered from (1), once the optimal load on each server is determined.
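Since (LOAD-OPT) is convex with linear constraints, it can also be solved numerically with a generic solver. The following sketch, which feeds the problem to scipy's SLSQP routine on a hypothetical instance and then recovers the routing probabilities from (1), is meant only as an illustration, not as the procedure used in this article.

```python
import numpy as np
from itertools import combinations
from scipy.optimize import minimize

def solve_load_opt(eta, r, c):
    """Solve (LOAD-OPT) numerically. eta has length C; r and c have length C+1,
    with index 0 corresponding to Server 0. Returns the loads and the routing
    probabilities recovered from (1)."""
    C = len(eta)
    cost = lambda rho: float(np.sum(c * rho / (1.0 - rho)))
    cons = [{'type': 'eq',                                    # constraint (7)
             'fun': lambda rho: float(np.dot(r, rho) - np.sum(eta))}]
    for size in range(1, C + 1):                              # constraints (9)
        for A in combinations(range(1, C + 1), size):
            cons.append({'type': 'ineq',
                         'fun': lambda rho, A=A: float(
                             r[0] * rho[0] + sum(r[j] * rho[j] for j in A)
                             - sum(eta[j - 1] for j in A))})
    res = minimize(cost, np.full(C + 1, np.sum(eta) / np.sum(r)),
                   method='SLSQP', bounds=[(0.0, 1.0 - 1e-9)] * (C + 1),  # (8)
                   constraints=cons)
    rho = res.x
    return rho, rho[1:] * r[1:] / eta                         # p_i from (1)

# Hypothetical instance: C = 2, eta = (0.8, 0.5), r = (1, 1, 1), c = (20, 1, 2).
rho, p = solve_load_opt(np.array([0.8, 0.5]), np.ones(3), np.array([20.0, 1.0, 2.0]))
print(np.round(rho, 3), np.round(p, 3))
```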

Remark 1

Let us remark that we assume that servers are M/G/1-PS queues. However, the results of this article are also valid if we assume that servers are M/M/1 queues with any work-conserving queueing discipline (note that the mean number of customers of a server j in both cases is \(\rho _j/(1-\rho _j)\)).

4 Analysis of the solution of (LOAD-OPT)

Let \(\delta _j=\sqrt{\frac{c_j/r_j}{c_0/r_0}}\) for all \(j\in {\mathcal {K}}\). We denote by \(C_b\) the set of classes that route traffic to two servers, by \(C_0\) the set of classes that route all their traffic to Server 0 and by \(C_d\) the set of classes that send all their traffic to their dedicated server.

In the following proposition, we present the first result of this section. It gives the conditions under which a class of traffic belongs to \(C_b\), \(C_0\) or \(C_d\).

Proposition 2

Class i routes traffic to Server 0 (i.e., \(i\in C_b\cup C_0\)) if and only if

$$\begin{aligned} \delta _i>\frac{1-\frac{\eta _i}{r_i}}{1-\rho _0^*}, \end{aligned}$$

and all the traffic of class i is routed to Server 0 if and only if

$$\begin{aligned} \delta _i\ge \frac{1}{1-\rho _0^*}, \end{aligned}$$

where \(\rho _0^*\) is the optimal load at Server 0 and is given by

$$\begin{aligned} \rho _0^*=1-\frac{r_0+\sum _{j\in C_b}r_j-\sum _{j\in C_b\cup C_0}\eta _j}{r_0+\sum _{j\in C_b}\delta _jr_j}. \end{aligned}$$
(10)

Besides, if \(j\in C_d\), the optimal load of Server j is \(\frac{\eta _j}{r_j}\); if \(j\in C_b\), the optimal load of Server j is

$$\begin{aligned} \rho _j^*=1-\delta _j(1-\rho _0^*) \end{aligned}$$
(11)

and if \(j\in C_0\) the optimal load of Server j is zero.

Proof

See “Appendix B”. \(\square \)
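For later reference, the following short sketch evaluates (10) and (11) for a given partition \((C_d,C_b,C_0)\) of the classes; the function name and the numerical instance are illustrative.

```python
from math import sqrt

def loads_from_partition(C_d, C_b, C_0, eta, r, c):
    """Loads given by (10) and (11) for a candidate partition of the classes.
    eta, r, c are lists of length C+1; index 0 refers to Server 0 (eta[0] unused)."""
    delta = {j: sqrt((c[j] / r[j]) / (c[0] / r[0])) for j in range(1, len(r))}
    num = r[0] + sum(r[j] for j in C_b) - sum(eta[j] for j in C_b | C_0)
    den = r[0] + sum(delta[j] * r[j] for j in C_b)
    rho = {0: 1.0 - num / den}                                      # eq. (10)
    rho.update({j: eta[j] / r[j] for j in C_d})
    rho.update({j: 1.0 - delta[j] * (1.0 - rho[0]) for j in C_b})   # eq. (11)
    rho.update({j: 0.0 for j in C_0})
    return rho

# A candidate partition for an illustrative C = 2 instance: class 2 dedicated, class 1 split.
print(loads_from_partition({2}, {1}, set(), [None, 0.8, 0.5], [1, 1, 1], [20, 1, 2]))
```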

The above result leads to the following corollary, which gives a simple sufficient condition for determining when a given class will not send all its traffic to the multi-skilled server.

Corollary 1

Let \(j\in {\mathcal {K}}\). If \(\delta _j<1\), then \(j\notin C_0\).

Proof

Since \(\delta _j<1\), we have that the condition \( \delta _j<\frac{1}{1-\rho _0^*} \) is always satisfied and this implies that \(j\notin C_0\) according to Proposition 2. \(\square \)

From the above corollary, another interesting property follows: if \(\delta _j<1\) for all \(j\in {\mathcal {K}}\), then \(C_0=\emptyset \).

The next result gives an ordering which can help identify the classes that use both the dedicated and the multi-skilled server. This can be seen as a way to determine, for a given set of input parameters (arrival rates, server speeds, holding costs, etc.), the skills for which the multi-skilled server needs to be trained in order for the system to be optimal.

Proposition 3

Let \(\delta _i\le \delta _j\).

  (a) If \(i\in C_0\), then \(j\in C_0\).

  (b) If \(i\in C_b\cup C_0\) and \(\frac{\eta _i}{r_i}\le \frac{\eta _j}{r_j}\), then \(j\in C_b\cup C_0\).

Proof

We first show (a). We consider that \(i\in C_0\). Since \(\delta _j\ge \delta _i\), it follows that \( \delta _j\ge \delta _i\ge \frac{1}{1-\rho _0^*}. \)

Therefore, from Proposition 2, \(j\in C_0\).

We now show (b). We consider that \(i\in C_b\cup C_0\). Since \(\frac{\eta _i}{r_i}\le \frac{\eta _j}{r_j}\) and \(\delta _j\ge \delta _i\), it follows that

$$\begin{aligned} \delta _j\ge \delta _i\ge \frac{1-\frac{\eta _i}{r_i}}{1-\rho _0^*}\ge \frac{1-\frac{\eta _j}{r_j}}{1-\rho _0^*}. \end{aligned}$$

Therefore, from Proposition 2, \(j\in C_b\cup C_0\). \(\square \)

We note that (b) of the above result can be stated as follows: if class j routes all its traffic to Server j, then class i routes all its traffic to Server i whenever \(\frac{\eta _i}{r_i}\le \frac{\eta _j}{r_j}\) and \(\delta _i\le \delta _j\). In the following result, we show that, under similar conditions, the set of classes that send traffic to two servers can never be \(\{i,j\}\).

Proposition 4

If \(\delta _i\le \delta _j<1\) and \(\frac{\eta _i}{r_i}\le \frac{\eta _j}{r_0+r_j}\), then \(C_b\) cannot be \(\{i,j\}\).

Proof

We assume that \(C_b=\{i,j\}\). For this case, it follows from (10) that

$$\begin{aligned} \frac{1}{1-\rho _0^*}\ge \frac{r_0+r_j\delta _j+r_i\delta _i}{r_0+r_j+r_i-\eta _j-\eta _i}, \end{aligned}$$

where the above inequality is an equality if \(C_0=\emptyset \).

Since \(\frac{\eta _i}{r_i}\le \frac{\eta _j}{r_0+r_j}\), we have for class i that

$$\begin{aligned}&\delta _i>\frac{1-\frac{\eta _i}{r_i}}{1-\rho _0^*}\ge \left( 1-\tfrac{\eta _i}{r_i}\right) \frac{r_0+r_j\delta _j+r_i\delta _i}{r_0+r_j+r_i-\eta _j-\eta _i}\\&\quad \iff \delta _i(r_0+r_j+r_i-\eta _j-\eta _i)\\&\quad> \left( 1-\tfrac{\eta _i}{r_i}\right) (r_0+r_j\delta _j+r_i\delta _i)\\&\quad \iff \delta _i(r_0+r_j-\eta _j)> \left( 1-\tfrac{\eta _i}{r_i}\right) (r_0+r_j\delta _j)\\&\quad \iff \delta _i>\frac{1-\frac{\eta _i}{r_i}}{1-\frac{\eta _j}{r_0+r_j}} \frac{r_0+r_j\delta _j}{r_0+r_j} \ge \frac{r_0+r_j\delta _j}{r_0+r_j}\\&\quad \iff \delta _i>\delta _j+\frac{r_0(1-\delta _j)}{r_0+r_j}>\delta _j, \end{aligned}$$

which is in contradiction with \(\delta _i\le \delta _j\). \(\square \)

Let \(T_i\) denote the sojourn time of jobs of class i. We now provide an interesting result related to the sojourn time of jobs.

Proposition 5

If \(\delta _i\le \delta _j\), then \(c_i{\mathbb {E}}[T_i]\le c_j{\mathbb {E}}[T_j]\).

Proof

We know that the sojourn times of jobs of class i and of class j follow an exponential distribution with mean \(\tfrac{1}{r_i(1-\rho _i^*)}\) and \(\tfrac{1}{r_j(1-\rho _j^*)}\), respectively. Therefore,

$$\begin{aligned} c_i{\mathbb {E}}(T_i)&=\dfrac{c_i}{r_i(1-\rho _i^*)}\\&=\dfrac{c_i}{r_i\delta _i}\dfrac{1}{1-\rho _0^*}\\&=\sqrt{\frac{c_0}{r_0}}\,\sqrt{\frac{c_i}{r_i}}\dfrac{1}{1-\rho _0^*}\\&\le \sqrt{\frac{c_0}{r_0}}\,\sqrt{\frac{c_j}{r_j}}\dfrac{1}{1-\rho _0^*}\\&=c_{j}{\mathbb {E}}(T_{j}). \end{aligned}$$

And the desired result follows. \(\square \)

4.1 Characterization of the solution of (LOAD-OPT) with \(C=2\)

We now focus on the case \(C=2\). Throughout this article, we refer to this case as the M model. Without loss of generality, we assume that \(\delta _1\le \delta _2\). The goal of this section is to fully characterize the solution of (LOAD-OPT) with \(C=2\).

We first note that, from Proposition 3, the following cases can never occur: (i) \(C_0=\{1\}\) and \(C_d=\{2\}\), and (ii) \(C_0=\{1\}\) and \(C_b=\{2\}\). For the remaining cases, we have the following options:

  1. \(C_d=\{1,2\}\). In this case, each class sends all its traffic to its dedicated server. Therefore, \(\rho ^*_i=\frac{\eta _i}{r_i}\) for \(i=1,2\) and \(\rho ^*_0=0\). According to Proposition 2, this occurs when

    $$\begin{aligned} \delta _i\le 1-\tfrac{\eta _i}{r_i},\ \ i=1,2. \end{aligned}$$
  2. \(C_d=\{1\}\) and \(C_b=\{2\}\). In this case, all the traffic of class 1 is sent to Server 1 and the traffic of class 2 is sent to Server 0 and Server 2. As a result, \(\rho ^*_1=\frac{\eta _1}{r_1}\) and, from (10) and (11), we obtain that \(\rho _2^*=1-\delta _2\frac{r_0+r_2-\eta _2}{r_0+\delta _2r_2}\) and \(\rho ^*_0=1-\frac{r_0+r_2-\eta _2}{r_0+\delta _2r_2}\). According to Proposition 2, this case occurs when \(\delta _1\le \frac{1-\tfrac{\eta _1}{r_1}}{1-\rho ^*_0}\) and \(\frac{1}{1-\rho _0^*}>\delta _2>\frac{1-\frac{\eta _2}{r_2}}{1-\rho _0^*}\), i.e.,

    $$\begin{aligned} \delta _1&\le \left( 1-\tfrac{\eta _1}{r_1}\right) \frac{r_0+\delta _2r_2}{r_0+r_2-\eta _2}\quad \text { and }\\ \frac{r_0+\delta _2r_2}{r_0+r_2-\eta _2}>\delta _2&> \left( 1-\tfrac{\eta _2}{r_2}\right) \frac{r_0+\delta _2r_2}{r_0+r_2-\eta _2}, \end{aligned}$$

    which, after simplification, gives

    $$\begin{aligned} \delta _1\le \left( 1-\tfrac{\eta _1}{r_1}\right) \frac{r_0+\delta _2r_2}{r_0+r_2-\eta _2}\quad \text { and }\quad \frac{1}{1-\frac{\eta _2}{r_0}}>\delta _2>1-\frac{\eta _2}{r_2}. \end{aligned}$$
  3. \(C_d=\{1\}\) and \(C_0=\{2\}\). In this case, all the traffic of class 1 is sent to Server 1 and all the traffic of class 2 is sent to Server 0. As a result, \(\rho ^*_1=\frac{\eta _1}{r_1}\), \(\rho ^*_0=\frac{\eta _2}{r_0}\) and \(\rho _2^*=0\). According to Proposition 2, this case occurs when \(\delta _1\le \frac{1-\tfrac{\eta _1}{r_1}}{1-\rho _0^*}\) and \(\delta _2\ge \frac{1}{1-\rho _0^*},\) i.e.,

    $$\begin{aligned} \delta _1\le \frac{1-\tfrac{\eta _1}{r_1}}{1-\frac{\eta _2}{r_0}} \text { and } \delta _2\ge \frac{1}{1-\frac{\eta _2}{r_0}}. \end{aligned}$$
  4. \(C_0=\{1,2\}\). In this case, the traffic of both classes is sent to Server 0. Hence, \(\rho _i^*=0\) for \(i=1,2\) and, from (10), \( \rho ^*_0=1-\frac{r_0-\eta _1-\eta _2}{r_0}=\frac{\eta _1+\eta _2}{r_0}. \) According to Proposition 2 and using that \(\delta _1\le \delta _2\), this case occurs when \(\delta _1\ge \frac{1}{1-\rho _0^*},\) i.e.,

    $$\begin{aligned} \delta _1\ge \frac{r_0}{r_0-\eta _1-\eta _2}. \end{aligned}$$
  5. \(C_b=\{1,2\}\). In this case, the traffic of class i is sent to Server 0 and Server i, for \(i=1,2\). From (10) and (11), it follows that \(\rho _0^*=1-\frac{r_0+r_1+r_2-\eta _1-\eta _2}{r_0+\delta _1r_1+\delta _2r_2}\) and, for \(i=1,2\), \(\rho _i^*=1-\delta _i\frac{r_0+r_1+r_2-\eta _1-\eta _2}{r_0+\delta _1r_1+\delta _2r_2}\). Moreover, we conclude from Proposition 2 that this occurs when, for \(i=1,2\), \(\frac{1}{1-\rho _0^*}>\delta _i>\frac{1-\frac{\eta _i}{r_i}}{1-\rho _0^*}\), which, using that \(\delta _2\ge \delta _1\), gives

    $$\begin{aligned} \delta _1&>\left( 1-\tfrac{\eta _1}{r_1}\right) \frac{r_0+\delta _1r_1+\delta _2r_2}{r_0+r_1+r_2-\eta _1-\eta _2}\quad \text { and}\\ \frac{r_0+\delta _1r_1+\delta _2r_2}{r_0+r_1+r_2-\eta _1-\eta _2}>\delta _2&>\left( 1-\tfrac{\eta _2}{r_2}\right) \frac{r_0+\delta _1r_1+\delta _2r_2}{r_0+r_1+r_2-\eta _1-\eta _2}. \end{aligned}$$

    We simplify the above expressions and we obtain

    $$\begin{aligned} \delta _1&>\left( 1-\tfrac{\eta _1}{r_1}\right) \frac{r_0+\delta _2r_2}{r_0+r_2-\eta _2}\quad \text { and}\\ \frac{r_0+\delta _1r_1}{r_0+r_1-\eta _1-\eta _2}>\delta _2&>\left( 1-\tfrac{\eta _2}{r_2}\right) \frac{r_0+\delta _1r_1}{r_0+r_1-\eta _1}. \end{aligned}$$
  6. \(C_b=\{1\}\) and \(C_d=\{2\}\). We observe that this case is symmetric to case 2 (where \(C_b=\{2\}\) and \(C_d=\{1\}\)) and, using the same arguments, we get the following conditions

    $$\begin{aligned} \frac{1}{1-\frac{\eta _1}{r_0}}>\delta _1>1-\frac{\eta _1}{r_1}\quad \text { and}\quad \delta _2\le \left( 1-\tfrac{\eta _2}{r_2}\right) \frac{r_0+\delta _1r_1}{r_0+r_1-\eta _1}. \end{aligned}$$
  7. \(C_b=\{1\}\) and \(C_0=\{2\}\). In this case, the traffic of class 1 is sent to Server 0 and Server 1, whereas all the traffic of class 2 is sent to Server 0. As a result, we have that \(\rho _2^*=0\) and, from (10) and (11), we obtain that \(\rho _0^*=1-\frac{r_0+r_1-\eta _1-\eta _2}{r_0+\delta _1r_1}\) and \(\rho _1^*=1-\delta _1\frac{r_0+r_1-\eta _1-\eta _2}{r_0+\delta _1r_1}\). According to Proposition 2, we conclude that this case occurs when \(\frac{1}{1-\rho _0^*}>\delta _1>\frac{1-\frac{\eta _1}{r_1}}{1-\rho _0^*}\) and \(\delta _2 \ge \frac{1}{1-\rho _0^*}\), i.e.,

    $$\begin{aligned} \delta _2 \ge \frac{r_0+\delta _1r_1}{r_0+r_1-\eta _1-\eta _2}>\delta _1>\left( 1-\tfrac{\eta _1}{r_1}\right) \frac{r_0+\delta _1r_1}{r_0+r_1-\eta _1-\eta _2}, \end{aligned}$$

    which after some simplification results in

    $$\begin{aligned} \delta _2\ge \frac{r_0+\delta _1r_1}{r_0+r_1-\eta _1-\eta _2}>\delta _1>\frac{1-\frac{\eta _1}{r_1}}{1-\frac{\eta _2}{r_0}}. \end{aligned}$$

Fig. 2: Conditions 1–7 determine a partition of the half-plane \(0\le \delta _1\le \delta _2\)

The conditions described in items 1–7 split the feasible half-plane \(0\le \delta _1\le \delta _2\) into, at most, 7 disjoint regions. A detailed example of such a partition is shown in Fig. 2.
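The case analysis above can be checked by brute force: for \(C=2\) (and, more generally, for small C) one can enumerate all assignments of the classes to \((C_d,C_b,C_0)\), compute \(\rho _0^*\) from (10), and keep the assignment that is consistent with the conditions of Proposition 2. The sketch below does this; the parameter values are illustrative and the stability condition of Proposition 1 is assumed.

```python
from itertools import product
from math import sqrt

def optimal_partition(eta, r, c):
    """Enumerate the 3^C assignments of the classes to (C_d, C_b, C_0) and return
    the one consistent with Proposition 2, together with the optimal loads.
    eta, r, c are lists of length C+1; index 0 refers to Server 0 (eta[0] unused)."""
    C = len(eta) - 1
    delta = [None] + [sqrt((c[j] / r[j]) / (c[0] / r[0])) for j in range(1, C + 1)]
    for labels in product('dbo', repeat=C):          # 'd', 'b', 'o' stand for C_d, C_b, C_0
        C_d = {j for j in range(1, C + 1) if labels[j - 1] == 'd'}
        C_b = {j for j in range(1, C + 1) if labels[j - 1] == 'b'}
        C_0 = {j for j in range(1, C + 1) if labels[j - 1] == 'o'}
        num = r[0] + sum(r[j] for j in C_b) - sum(eta[j] for j in C_b | C_0)
        den = r[0] + sum(delta[j] * r[j] for j in C_b)
        rho0 = 1.0 - num / den                       # eq. (10)
        if not 0.0 <= rho0 < 1.0:
            continue
        consistent = (all(delta[j] <= (1 - eta[j] / r[j]) / (1 - rho0) for j in C_d)
                      and all((1 - eta[j] / r[j]) / (1 - rho0) < delta[j] < 1 / (1 - rho0)
                              for j in C_b)
                      and all(delta[j] >= 1 / (1 - rho0) for j in C_0)
                      and all(eta[j] < r[j] for j in C_d))
        if consistent:
            rho = {0: rho0}
            rho.update({j: eta[j] / r[j] for j in C_d})
            rho.update({j: 1 - delta[j] * (1 - rho0) for j in C_b})   # eq. (11)
            rho.update({j: 0.0 for j in C_0})
            return (C_d, C_b, C_0), rho
    return None

# Illustrative M-model instance: r = (1, 1, 1), c = (20, 1, 2), eta = (0.8, 0.5).
print(optimal_partition([None, 0.8, 0.5], [1, 1, 1], [20, 1, 2]))
```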

4.2 Computation of the solution of (LOAD-OPT) with \(C>2\)

Algorithm 1: Fixed-point algorithm for computing the solution of (LOAD-OPT)

As we saw before, the characterization of the solution of (LOAD-OPT) with \(C=2\) requires distinguishing seven different cases. This suggests that an explicit characterization of the solution of (LOAD-OPT) for an arbitrary number of classes is likely out of reach. However, we provide a fixed-point algorithm based on the result of Proposition 2. The pseudocode of this algorithm is shown in Algorithm 1. The main idea of this algorithm is that it starts from an initial partition \({\mathcal {C}}_0, {\mathcal {C}}_b\) and \({\mathcal {C}}_d\) that is used to compute \(\rho _0\) (see Lines 2 and 19). This value of \(\rho _0\) is then used to determine the set of classes that belong respectively to \({\mathcal {C}}_0\) (see Lines 9-10), to \({\mathcal {C}}_b\) (see Lines 12-13) and to \({\mathcal {C}}_d\) (see Lines 14-15). The algorithm stops at the first iteration in which \({\mathcal {C}}_0, {\mathcal {C}}_b\) and \({\mathcal {C}}_d\) do not change. When this occurs, according to Proposition 2, the optimal loads are obtained using (10) and (11) with the resulting partition. Unfortunately, we did not succeed in showing the convergence of this algorithm analytically. However, as we will see in the numerical section, in all the experiments we have carried out the algorithm converges in a very small number of steps.
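The following is a minimal sketch of this fixed-point iteration (not the exact pseudocode of Algorithm 1, whose line numbering does not apply here); it uses (10) and (11) as in Proposition 2 and assumes that the stability condition of Proposition 1 holds.

```python
from math import sqrt

def fixed_point_partition(eta, r, c, C_d, C_b, C_0, max_iter=100):
    """Sketch of the fixed-point iteration described above. eta, r, c are lists
    of length C+1, index 0 being Server 0 (eta[0] unused); (C_d, C_b, C_0) is the
    initial partition of the classes. The stability condition (6) is assumed."""
    C = len(eta) - 1
    delta = [None] + [sqrt((c[j] / r[j]) / (c[0] / r[0])) for j in range(1, C + 1)]
    for _ in range(max_iter):
        num = r[0] + sum(r[j] for j in C_b) - sum(eta[j] for j in C_b | C_0)
        den = r[0] + sum(delta[j] * r[j] for j in C_b)
        rho0 = 1.0 - num / den                                       # eq. (10)
        new_d, new_b, new_o = set(), set(), set()
        for j in range(1, C + 1):
            if delta[j] >= 1.0 / (1.0 - rho0):                       # all traffic to Server 0
                new_o.add(j)
            elif delta[j] > (1.0 - eta[j] / r[j]) / (1.0 - rho0):    # traffic split
                new_b.add(j)
            else:                                                    # all traffic to Server j
                new_d.add(j)
        if (new_d, new_b, new_o) == (C_d, C_b, C_0):                 # fixed point reached
            rho = {0: rho0}
            rho.update({j: eta[j] / r[j] for j in C_d})
            rho.update({j: 1.0 - delta[j] * (1.0 - rho0) for j in C_b})   # eq. (11)
            rho.update({j: 0.0 for j in C_0})
            return (C_d, C_b, C_0), rho
        C_d, C_b, C_0 = new_d, new_b, new_o
    raise RuntimeError("no convergence within max_iter iterations")
```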

We remark that this algorithm can also be used to analyze the economies of including a multi-skilled server in a system with C dedicated servers (see the snippet below). For this purpose, we initialize the algorithm with the partition \({\mathcal {C}}_d={\mathcal {K}}\) and \({\mathcal {C}}_0={\mathcal {C}}_b=\emptyset \) and with values of \(\eta _1,\dots ,\eta _C\) and \(r_1,\dots ,r_C\) such that the system with only dedicated servers is stable. The output of the algorithm is then one of the following: (i) the algorithm stops after the first iteration, or (ii) it does not. In the former case, we can conclude that it is not beneficial to add a multi-skilled server, whereas in the latter case we can compare the cost of the initial state with the cost when the algorithm stops in order to compare the performance of both systems.
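For instance, reusing fixed_point_partition from the sketch above on a hypothetical three-class instance, one can start from the all-dedicated partition and inspect whether the partition moves away from \({\mathcal {C}}_d={\mathcal {K}}\):

```python
# Hypothetical instance: C = 3, all capacities equal to 1, c = (2, 1, 1, 1).
eta, r, c = [None, 0.9, 0.3, 0.2], [1.0] * 4, [2.0, 1.0, 1.0, 1.0]
partition, rho = fixed_point_partition(eta, r, c, C_d={1, 2, 3}, C_b=set(), C_0=set())
print(partition)   # here class 1 ends up splitting its traffic with Server 0
print(sum(c[j] * rho[j] / (1 - rho[j]) for j in rho))   # cost with the multi-skilled server
```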

5 Performance analysis

We now determine the scenarios in which it is profitable to either train or hire multi-skilled agents. For this, we compare the value of the objective function of our model (C dedicated servers plus one multi-skilled server) with that of a system with only dedicated servers, as well as with that of a system in which all the servers are multi-skilled and can serve all the jobs.

5.1 Comparison with all full-skilled servers system

First, we compare the cost of the model with C dedicated servers and a single multi-skilled server with the cost of a system formed by \(C+1\) servers, each with the same holding cost and capacity as Server 0, in which every server can serve jobs of all classes. We call the latter model ASSAC (All Servers Serve All Classes).

Lemma 2

(Stability) Consider that \(\eta _j\rightarrow 0\) for all \(j>1\) and \(\frac{\delta _1}{1-\frac{\eta _1}{r_1}}<1\). Then, when \(\eta _1\) is small enough, the cost of the system with dedicated servers is \(\delta _1^2\) times the cost of the ASSAC model (and hence smaller, since \(\delta _1<1\)).

Proof

In the system with dedicated servers, when \(\frac{\delta _1}{1-\frac{\eta _1}{r_1}}<1\) and \(\eta _j\rightarrow 0\) for all \(j>1\), all the jobs of class 1 are executed in Server 1 and the load of the rest of the servers is zero. Hence, the cost of this system is

$$\begin{aligned} \sum _{j\in {\mathcal {S}}}c_j\frac{\rho _j}{1-\rho _j}= c_1\frac{\frac{\eta _1}{r_1}}{1-\frac{\eta _1}{r_1}}. \end{aligned}$$
(12)

In the ASSAC model, all the servers are identical and, therefore, the optimal Bernoulli routing shares the traffic uniformly among them. Hence, the cost of this system when \(\frac{\delta _1}{1-\frac{\eta _1}{r_1}}<1\) and \(\eta _j\rightarrow 0\) for all \(j>1\) is

$$\begin{aligned} \sum _{j\in {\mathcal {S}}}c_j\frac{\rho _j}{1-\rho _j}=c_0\frac{\frac{\eta _1}{r_0}}{1-\frac{\eta _1}{(C+1)r_0}}. \end{aligned}$$
(13)

When \(\eta _1\) is small enough, (12) and (13) are approximately \(\frac{c_1\eta _1}{r_1}\) and \(\frac{c_0\eta _1}{r_0}\), respectively. The desired result thus follows since the ratio of the former to the latter is \(\delta _1^2\). \(\square \)

From the above lemma, we have that the optimal cost of the system with dedicated servers is smaller than that of ASSAC in the considered regime.

Proposition 6

Consider that \(\eta _j\rightarrow 0\) for all \(j>1\) and \(\frac{\delta _1}{1-\frac{\eta _1}{r_1}}<1\). Then, the optimal cost of (LOAD-OPT) is smaller than the optimal cost of ASSAC when \(\eta _1\) is small enough.
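The regime of Lemma 2 and Proposition 6 is easy to check numerically; the following sketch (with illustrative parameters and \(\eta _j=0\) for \(j>1\)) compares the ratio of (12) to (13) with \(\delta _1^2\) as \(\eta _1\) decreases.

```python
# Illustrative parameters (C = 2): c0 = 20, r0 = 1, c1 = 1, r1 = 1, eta_2 = 0.
c0, r0, c1, r1, C = 20.0, 1.0, 1.0, 1.0, 2
delta1_sq = (c1 / r1) / (c0 / r0)

for eta1 in (0.1, 0.01, 0.001):
    cost_dedicated = c1 * (eta1 / r1) / (1 - eta1 / r1)      # eq. (12)
    x = eta1 / ((C + 1) * r0)                                # per-server load in ASSAC
    cost_assac = (C + 1) * c0 * x / (1 - x)                  # eq. (13)
    print(eta1, round(cost_dedicated / cost_assac, 4), delta1_sq)
```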

We now show that the optimal cost of (LOAD-OPT) can be larger than that of ASSAC. The intuition behind this result is that the stability region of the ASSAC model is wider than that of the model with dedicated servers. By taking the load close to the boundary of the stability region of the latter, its cost can be made to go to infinity. For the ASSAC model, however, the availability of spare capacity means that the cost remains finite.

Proposition 7

Consider that \(\eta _j\rightarrow 0\) for all \(j=2,\dots ,C\) and \(\eta _1\rightarrow r_0+r_1\). Then, the optimal cost of ASSAC is smaller than the optimal cost of (LOAD-OPT).

Proof

We first observe that the cost of ASSAC when \(\eta _j\rightarrow 0\) for all \(j>1\) and \(\eta _1\rightarrow r_0+r_1\) is given by

$$\begin{aligned} c_0(C+1)\frac{\frac{r_0+r_1}{(C+1)r_0}}{1-\frac{r_0+r_1}{(C+1)r_0}}=c_0\frac{\frac{r_0+r_1}{r_0}}{1-\frac{r_0+r_1}{(C+1)r_0}}, \end{aligned}$$

which is clearly finite.

However, for the model with dedicated servers, class-1 jobs are served by Server 0 and Server 1, whose loads tend to one when \(\eta _1\rightarrow r_0+r_1\). Therefore, its cost tends to infinity. \(\square \)

From the above propositions, the existence of a switching curve follows when \(\delta _1<1\). In Sect. 6, we study this curve numerically.

5.2 Comparison with no sharing system

We consider a system formed by \(C+1\) dedicated servers, in which Server 0 can only serve jobs of class 1, whereas Server i, for \(i\ge 1\), serves only jobs of class i. We refer to this model as the system without sharing, since jobs of different classes are never served by the same server. We compare the optimal cost of this system with the optimal cost of (LOAD-OPT).

In Proposition 7, we have shown that the optimal cost of (LOAD-OPT) tends to infinity when \(\eta _1\rightarrow r_0+r_1\) and \(\eta _j\rightarrow 0\) for \(j=2,\dots ,C\), whereas the optimal cost of ASSAC remains finite. In the following result, we show that there is a regime where the optimal cost of the system without sharing tends to infinity, whereas the optimal cost of (LOAD-OPT) remains finite. The intuition is similar: the stability region of the no-sharing model is included in that of the model with one shared server.

Proposition 8

Consider that \(\eta _j\rightarrow 0\) for all \(j=1,\dots ,C-1\) and \(\eta _C\rightarrow r_C\). Then, the optimal cost of the system without sharing is larger than the optimal cost of (LOAD-OPT).

6 Numerical experiments

In this section, we present the numerical experiments that complement the main theoretical findings of this article.

6.1 Comparison with ASSAC

We first focus on the performance comparison of Sect. 5.1, where we showed the existence of a switching curve when \(\delta _1<1\). For \(C=2\), we analyze the value of the objective function of the models under comparison in Sect. 5.1, which are the M model (i.e., the model we consider in Sect. 4.1) and the ASSAC model with three servers. In Fig. 3, we fix the values of the capacities and holding costs and we consider \(\eta _1\) and \(\eta _2\) such that both models are stable, that is, \(\eta _i<r_0+r_i\) for \(i=1,2\) and \(\eta _1+\eta _2<r_0+r_1+r_2\). We set \(r_1=r_2=r_0=1\) and \(c_0=20\), \(c_1=1\) and \(c_2=2\). We mark with 'x' the region where the cost of the ASSAC model is smaller and with a filled 'o' the region where the value of the objective function of the M model is smaller. As can be observed in Fig. 3, the switching curve is unique, that is, when we increase \(\eta _1\) (or \(\eta _2\)) there is a single value at which the model with the smaller cost changes. Another interesting conclusion of this experiment is that the region where the M model outperforms the ASSAC model is very large.

Fig. 3: Comparison of optimal costs of Sect. 5.1 when \(\delta _i<1\) for \(i=1,2\)
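A coarse version of this experiment can be reproduced with the sketch below, which solves (LOAD-OPT) with a generic convex solver for the M model and uses the closed-form cost of the ASSAC model (three identical copies of Server 0 with uniform splitting); the grid and the solver are illustrative and much coarser than the one used for Fig. 3.

```python
import numpy as np
from scipy.optimize import minimize

r = np.array([1.0, 1.0, 1.0])        # (r0, r1, r2)
c = np.array([20.0, 1.0, 2.0])       # (c0, c1, c2)

def cost_M(eta1, eta2):
    """Optimal cost of the M model, obtained by solving (LOAD-OPT) numerically."""
    obj = lambda rho: float(np.sum(c * rho / (1 - rho)))
    cons = [{'type': 'eq', 'fun': lambda rho: float(r @ rho - eta1 - eta2)},
            {'type': 'ineq', 'fun': lambda rho: float(r[0] * rho[0] + r[1] * rho[1] - eta1)},
            {'type': 'ineq', 'fun': lambda rho: float(r[0] * rho[0] + r[2] * rho[2] - eta2)}]
    start = np.full(3, (eta1 + eta2) / r.sum())
    res = minimize(obj, start, method='SLSQP',
                   bounds=[(0.0, 1 - 1e-9)] * 3, constraints=cons)
    return res.fun

def cost_ASSAC(eta1, eta2):
    """Optimal cost of ASSAC: three identical copies of Server 0, uniform split."""
    x = (eta1 + eta2) / (3 * r[0])
    return 3 * c[0] * x / (1 - x)

# 'o' where the M model is cheaper, 'x' where ASSAC is cheaper, ' ' where one is unstable.
for eta2 in np.arange(1.75, 0.0, -0.25):
    row = ''
    for eta1 in np.arange(0.25, 2.0, 0.25):
        stable = eta1 < r[0] + r[1] and eta2 < r[0] + r[2] and eta1 + eta2 < r.sum()
        row += ('o' if cost_M(eta1, eta2) < cost_ASSAC(eta1, eta2) else 'x') if stable else ' '
    print(row)
```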

6.2 Comparison with no-sharing system

In the next set of experiments, we concentrate on the performance comparison of Sect. 5.2. We compare the value of the objective function of both models for the values of \(\eta _1\) and \(\eta _2\) such that both systems are stable, i.e., \(\eta _1<r_0+r_1\) and \(\eta _2<r_2\), considering the same values of the parameters as in Fig. 3. As we said in Sect. 5.2, the stability region of the system without sharing is smaller than that of the M model. This implies that the M model outperforms the system without sharing when \(\eta _2\rightarrow 1\). This phenomenon can be clearly observed in Fig. 4. We are also interested in comparing these models away from the stability boundary. For this purpose, we present in Fig. 5 a zoomed version of Fig. 4. From this illustration, we conclude that the performance of both models is very similar when \(\eta _1\in (0,2)\) and \(\eta _2\in (0,0.8)\).

Fig. 4: Ratio of the optimal costs under comparison in Sect. 5.2 (\(\eta _1\in (0,2)\) and \(\eta _2\in (0,1)\))

Fig. 5: Ratio of the optimal costs under comparison in Sect. 5.2 (\(\eta _1\in (0,2)\) and \(\eta _2\in (0,0.8)\))
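The behavior described above can also be illustrated with a short computation: in the system without sharing, class 1 is Bernoulli-split between Servers 0 and 1 (a one-dimensional convex problem), class 2 uses Server 2 alone, and the cost blows up as \(\eta _2\rightarrow r_2\). The parameters below are the same illustrative ones as in the previous sketch; dividing by the cost of the M model (e.g., with cost_M above) gives ratios of the kind shown in Figs. 4 and 5.

```python
import numpy as np
from scipy.optimize import minimize_scalar

r = np.array([1.0, 1.0, 1.0])        # (r0, r1, r2)
c = np.array([20.0, 1.0, 2.0])       # (c0, c1, c2)

def cost_no_sharing(eta1, eta2):
    """Optimal cost when Server 0 serves only class 1 and Server i serves only class i."""
    def class1_cost(p):              # p = fraction of class-1 traffic sent to Server 1
        rho1, rho0 = eta1 * p / r[1], eta1 * (1 - p) / r[0]
        if rho0 >= 1 or rho1 >= 1:
            return np.inf
        return c[0] * rho0 / (1 - rho0) + c[1] * rho1 / (1 - rho1)
    res = minimize_scalar(class1_cost, bounds=(0.0, 1.0), method='bounded')
    rho2 = eta2 / r[2]
    return res.fun + c[2] * rho2 / (1 - rho2)

for eta2 in (0.5, 0.9, 0.99):        # the cost explodes as eta2 approaches r2 = 1
    print(eta2, round(cost_no_sharing(1.0, eta2), 2))
```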

6.3 The solution of (LOAD-OPT) for \(C>2\)

We now study Algorithm 1 since, as we said before, its convergence ensures that the solution of (LOAD-OPT) is obtained. We first consider a system with \(C=5\) classes of traffic and the values of the parameters presented in Table 1. We have chosen these parameters because the solution of (LOAD-OPT) for these values satisfies \( {\mathcal {C}}_d=\{3\}\), \( {\mathcal {C}}_0=\{4\}\) and \( {\mathcal {C}}_b=\{1,2,5\}\), i.e., all the sets of the partition are non-empty.

We consider three different initial conditions: first, all the classes belong to \(C_d\) (see solid line in Fig. 6); second, classes 1, 3 and 4 belong to \(C_d\), whereas classes 2 and 5 to \(C_b\) (see dashed line in Fig. 6); and finally, classes 1, 2, 3 and 4 belong to \(C_d\) and class 5 to \(C_b\) (see dotted line in Fig. 6).

Table 1: Parameters of the system considered in Sect. 6.3

Fig. 6: Convergence of the fixed-point algorithm presented in Sect. 4.2. The x-axis represents the iterations of the algorithm and the y-axis the load of each server

We illustrate in Fig. 6 the evolution of the load of each server over the iterations of the algorithm. In the upper row of Fig. 6 we show the loads of Server 0, Server 1 and Server 2, whereas in the bottom row we show the loads of Server 3, Server 4 and Server 5. We observe that, when the algorithm converges, the load of Server 4 is zero, which means that, for this case, \(\rho _4^*=0\). We also see that, for Server 3, the initial load in the scenario represented by the solid line (that is, the scenario where all the classes send all their traffic to their dedicated server) equals the load when the algorithm converges. This means that, for class 3, we have \( \rho _3^*=\frac{\eta _3}{r_3}\).

It is important to remark that, as we can also observe in Fig. 6, the algorithm converges to the same values for the three different initial partitions under consideration. We have also started the algorithm from other initial partitions and the obtained results confirmed that it always converges. Another interesting property of this algorithm is that the number of iterations required to reach convergence is very small. Indeed, when the initial partition of \({\mathcal {K}}\) is such that all the classes belong to \(C_d\), the algorithm converges after 12 iterations. Moreover, for the rest of the cases, the algorithm converges in fewer iterations. When the initial partition of \({\mathcal {K}}\) is such that classes 1, 3 and 4 belong to \(C_d\) and classes 2 and 5 to \(C_b\), it converges after 2 iterations, and when the initial partition of \({\mathcal {K}}\) is such that classes 1, 2, 3 and 4 belong to \(C_d\) and class 5 to \(C_b\), it converges after 3 iterations.

We now present further numerical work that we have performed to analyze the convergence of Algorithm 1 for larger systems. For this set of experiments, we consider that the number of dedicated servers, C, varies from 10 to 200 in steps of 10. For each case, we run our algorithm 10 times where, in each run, the parameters of the system are chosen at random (while satisfying the stability condition); the results are depicted in Fig. 7, where the blue bars represent the minimum number of iterations required for convergence and the yellow bars the difference up to the maximum. The main conclusions of these experiments are twofold: first, we observe that the algorithm converges in all the cases; and second, that the number of iterations required to converge varies between 2 and 6 in all the cases. This means that the convergence of this algorithm is very fast even for large systems with 200 dedicated servers.

Fig. 7: Minimum and maximum number of iterations until convergence for systems of different size
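To reproduce experiments of this kind, one only needs a way of drawing random instances that satisfy the stability condition (6); the sketch below uses the simple sufficient condition \(\eta _i<r_i+r_0/C\) for all i, which implies (6), and the drawn instance can then be fed to the fixed-point sketch of Sect. 4.2 (all distributions and ranges are illustrative).

```python
import random

def random_stable_instance(C, seed=None):
    """Draw capacities, costs and traffic intensities for C classes plus Server 0.
    Stability is enforced through the sufficient condition eta_i < r_i + r_0 / C,
    which implies (6) since |A| <= C for every subset A of classes."""
    rng = random.Random(seed)
    r = [rng.uniform(0.5, 2.0) for _ in range(C + 1)]
    c = [rng.uniform(0.5, 2.0) for _ in range(C + 1)]
    eta = [None] + [rng.uniform(0.0, 0.95) * (r[i] + r[0] / C) for i in range(1, C + 1)]
    return eta, r, c

eta, r, c = random_stable_instance(C=50, seed=1)
# e.g., starting from the all-dedicated partition with the sketch of Sect. 4.2:
# partition, rho = fixed_point_partition(eta, r, c, set(range(1, 51)), set(), set())
```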

7 Conclusions

We study the optimal Bernoulli routing in a system with C dedicated servers and a single multi-skilled server. We first provide a necessary and sufficient condition for the stability of the system. We then reformulate the problem as an optimization problem in terms of the loads of the servers and show the equivalence of both formulations. We provide structural properties of the solution of the derived problem, which allow us to fully characterize the optimal loads of the system when \(C=2\) and also to present a fixed-point algorithm whose convergence ensures that the optimal loads are obtained. We compare the performance of this system under the optimal loads with a system where all the servers are multi-skilled and also with a system where all the servers are dedicated. Finally, we explore numerically the convergence of the fixed-point algorithm and show that, in all the considered cases, the algorithm converges in very few steps.

For future work, we are interested in generalizing the results of this article to systems with a more complex topology. Besides, we think that an interesting extension of the performance analysis of this work would be to consider other popular load balancing policies, such as Power of Two Choices and Join the Shortest Queue.