Lost-customers approximation of semi-open queueing networks with backordering -- An application to minimise the number of robots in robotic mobile fulfilment systems

We consider a semi-open queueing network (SOQN), where a customer requires exactly one resource from the resource pool for service. If there is a resource available, the customer is immediately served and the resource enters an inner network. If there is no resource available, the new customer has to wait in an external queue until one becomes available ("backordering"). When a resource exits the inner network, it is returned to the resource pool and waits for another customer. In this paper, we present a new solution approach. To approximate the inner network with the resource pool of the SOQN, we consider a modification, where newly arriving customers will decide not to join the external queue and are lost if the resource pool is empty"lost customers". We prove that we can adjust the arrival rate of the modified system so that the throughputs in each node are pairwise identical to those in the original network. We also prove that the probabilities that the nodes with constant service rates are idling are pairwise identical too. Moreover, we provide a closed-form expression for these throughputs and probabilities of idle nodes. To approximate the external queue of the SOQN with backordering, we construct a reduced SOQN with backordering, where the inner network consists only of one node, by using Norton's theorem and results from the lost-customers modification. In a final step, we use the closed-form solution of this reduced SOQN, to estimate the performance of the original SOQN. We apply our results to robotic mobile fulfilment systems (RMFSs). Instead of sending pickers to the storage area to search for the ordered items and pick them, robots carry shelves with ordered items from the storage area to picking stations. We model the RMFS as an SOQN, analyse its stability and determine the minimal number of robots for such systems using the results from the first part.

In an OQN, customers arrive from an external source, request service at several nodes and then leave the system.
In contrast, in a CQN, there are no external arrivals and departures. The number of customers, which are cycling in the system, is fixed.
An SOQN has characteristics of both OQNs and CQNs, see e.g. Jia and Heragu [12]. It is similar to an OQN in the sense that customers arrive from an external source and leave the system after service. It is similar to a CQN in the sense that there is an overall capacity constraint for the inner network. A customer needs a resource from an associated resource 1.1. Contribution I: New solution approach to SOQN-BO 1. Introduction pool for service. If there is a resource available, the customer is immediately served and the resource enters an inner network. If there is no resource available, the new customer has to wait in an external queue until one becomes available -we call this waiting regime "backordering". When a resource exits the inner network, it returns to the resource pool and waits for the next customer.
There are several application areas of SOQNs. For example, they are adopted for performance analysis of manufacturing systems and service systems (logistics, communication, warehousing and health care), see Roy [25].

Contribution I: New solution approach to SOQN with backordering
In this paper we focus on semi-open queueing networks (SOQNs). The literature on SOQNs is overwhelming, so we point only to the most relevant sources for our present investigation. A detailed overview about SOQNs and their solution methods is presented in Jia and Heragu [12], Ekren, Heragu, Krishnamurthy, and Malmborg [11] and Roy [25]. Roy also compares the numerical accuracy of the solution methods. A recent article, which is not included in these reviews, is provided by Kim, Dudin, Dudin, and Kim [13].
The most common solution approaches are the matrix-geometric method, aggregation method, network decomposition approach, parametric decomposition method and performance measure bounds.
For SOQNs with backordering and an inner network, which consists of more than one node, closed-form expressions for the steady-state distributions are not available. In this paper we present a new solution approach, which is depicted in Figure 1.
To approximate the resource network (= inner network and resource pool) of the SOQN, we consider a modification, where new arriving customers will be lost if the resource pool is empty ("lost customers"). For such a modification, closed-form expressions for the steady-state distribution in product form are available. We prove that we can adjust the arrival rate in the modified network so that the throughputs in each node are pairwise identical to those in the original network. We also prove that the probabilities that the nodes with constant service rates are idling are also pairwise identical. Moreover, we provide a closed-form expression both for the throughputs and for these special idle probabilities.
To approximate the external queue of the SOQN, we use an indirect two-step approach. In the first step, we reduce the complexity of the modified SOQN by applying Norton's theorem. In the next step, we remove the lost-customer property of the reduced SOQN with lost customers to get the backordering property again.
The idea to approximate backordering systems by lost-customer systems follows Krenzler [14,Section 2.1.4,p. 34ff.]. This approximation method is interesting because in the literature there are already several results for queueing networks with lost customers, e.g. lost sales models in Otten [24], Schwarz, Sauer, Daduna, Kulik, and Szekli [26] and Krishnamoorthy, Lakshmy, and Manikandan [16]. For an overview of the literature on systems with lost sales, we refer to Bijvank and Vis [4].
1.3. Structure of the paper 1. Introduction robots carry shelves -here called pods -with ordered items from the storage area to picking stations. At every picking station there is a person -the picker -who takes items from the pods and packs them into boxes according to customers' orders. When the picker does not need the pod any more, the robot either transports the pod directly back to the storage area or first makes a stopover at a replenishment station. Such a fulfilment system poses many decision problems. An overview is presented in Merschformann, Lamballais, de Koster, and Suhl [23,Section 4]. They can be classified at strategic, tactical and operational levels. The literature on RMFS is, similar to that on SOQNs, already overwhelming, so we only point to some references closely related to our investigations.
RMFSs are modelled as SOQNs in several articles. For an overview of the literature, we refer to Azadeh, de Koster, and Roy [2, Section 7.2 and Table 4] and Azadeh et al. [3,Section 6]. They classify articles according to the decision problem of interest and the relevant methodology. Furthermore, a recent overview is presented in Lamballais, Merschformann, Roy, and Suhl [17]. In our paper, we focus on decisions on the optimal number of robots. Most of these articles analyse different decision problems from the ones we look at in this paper. Yuan and Gong [30] calculate the optimal number of robots. They compare two protocols: pooled and dedicated robots. Their investigations are based on OQNs, rather than SOQNs and they do not consider a replenishment station in their model. Zou, Xu, Gong, and de Koster [31] determine the number of robots for different battery recovery strategies. Their investigations are based on a nested SOQN.

Structure of the paper
In Section 2, we describe a general SOQN with backordering and analyse the stability. In Section 2.3, we calculate the throughputs and special idle probabilities in steady state. After that, we introduce our new approximation method for an SOQN with backordering. Our approach is depicted in Figure 1. In Section 3, we present an approximation of the resource network. For that, in Section 3.1, we analyse a modification, where newly arriving customers are lost, if the resource pool is empty. Then, in Section 3.2, we adjust this system. In Section 3.3, we calculate the throughputs and special idle probabilities in steady state. In Section 4, we present an approximation for the external queue. For that, in Section 4.1, we reduce the complexity of the modified SOQN with lost customers by applying Norton's theorem. Then, in Section 4.2, we remove the lost-customer property to get the backordering property again. In Section 5, we present an application of our results. We model the RMFS as an SOQN with backordering and apply our results to the problem: "What is the optimal number of robots?" We formulate an algorithm to calculate the minimum number of robots and present numerical examples. Finally, Section 6 concludes the paper.    Figure 1: Overview of the models. We use the same colour for parts, which we change in a single approximation step.

Notation and preliminaries
We call a "generator" a matrix M ∈ R K×K with countable index set K if all its off-diagonal elements are non-negative and all its row sums are equal to zero.
Throughout this paper it is assumed that all random variables are defined on a common probability space (Ω, F, P ). Furthermore, by "Markov process" we mean a time-homogeneous continuous-time strong Markov process with discrete state space -also known as a Markov jump process. All Markov processes are assumed to be regular and have cdlg paths, i.e. each path of a process is right-continuous and has left limits everywhere. We call a Markov process regular if it is non-explosive (i.e. the sequence of jump times of the process diverges almost surely) and its transition intensity matrix is a generator matrix.

Description of the model
An SOQN with backordering is shown in Figure 2. Henceforth, we call it an SOQN-BO. It represents a queueing network with an additional resource pool.
Customers arrive one by one according to a Poisson process with rate λ BO > 0. Every customer requires exactly one resource from resource pool for service. If there is a resource available, the customer is immediately served and the resource enters an inner network. If there is no resource available, the new customer has to wait in an external queue under the first-come, first-served (FCFS) regime until a resource becomes available ("backordering"). When the resource exits the inner network, it returns to the resource pool, which is henceforth referred to as node 0, and waits for the next customer. Whenever the external queue is not empty and a resource item is returned to the resource pool, this item is instantaneously synchronised with the customer at the head of the line. The resources therefore move in a closed network. We will call this network resource network. The maximal number of resources in the resource pool is N .
The inner network consists of J ≥ 1 numbered service stations (nodes), denoted by J := {1, . . . , J}. Each station j consists of a single server with infinite waiting room under the FCFS regime or the processor sharing regime. Customers in the network are indistinguishable. The service times are exponentially distributed random variables with mean 1. If there are n j > 0 customers present at node j, service at node j is provided with intensity ν j (n j ) > 0. All service and inter-arrival times constitute an independent family of random variables.
Movements of resources in the inner network are governed by a Markovian routing mechanism: After synchronisation with a customer, a resource visits node j with probability r(0, j) ≥ 0. A resource, when leaving node i, selects with probability r(i, j) ≥ 0 to visit node j next, and then enters node j immediately. It starts service if it finds the server idle, otherwise it joins the tail of the queue at node j. This resource can also leave the inner network with probability r(i, 0) ≥ 0. It holds J j=0 r(i, j) = 1 with r(0, 0) := 0 for all i ∈ J 0 := {0, 1, . . . , J}. Given the departure node i, the resource's routing decision is made independently of the network's history. We assume that the routing matrix R := r(i, j) : i, j ∈ J 0 is irreducible.
To obtain a Markovian process description, we denote by X ex (t) the number of customers in the external queue at time t ≥ 0, by Y 0 (t) the number of resources in the resource pool at time t ≥ 0 and by Y j (t), j ∈ J, the number of resources present at node j in the inner network at time t ≥ 0, either waiting or in service. We call this Y j (t) queue length at node j ∈ J at time t ≥ 0. Then Y(t) := Y j (t) : j ∈ J 0 is the queue length vector of the resource network at time t ≥ 0. We define the joint queue length process of the semi-open network with backordering by Then, due to the independence and memorylessness assumptions, Z BO is a homogeneous Markov process with state space E := (0, n 0 , n 1 , . . . , n J ) : n j ∈ {0, . . . , N } ∀j ∈ J 0 , j∈J 0 n j = N ∪ (n ex , 0, n 1 , . . . , n J ) : n ex ∈ N, n j ∈ {0, . . . , N } ∀j ∈ J, j∈J n j = N .
Z BO is irreducible on E.

Stability
In this section, we analyse the stability of our system. This is important from a practical point of view: It ensures that the system processes customers quickly enough. The Markov process Z BO has an infinitesimal generator Q := (q(z;z) : z,z ∈ E) with the following transition rates for (n ex , n) , (0, n) ∈ E, where n := n j : j ∈ J 0 : Furthermore, q(z;z) = 0 for any other pair z =z, and The Markov process Z BO is a level-independent quasi-birth-and-death process in the sense of Latouche and Ramaswami [20, Def. 1.3.1, p. 12]. The level is the length n ex of the external queue. The phase is the state of the resource network. For a level equal to zero, the phase space is E 0 := (n 0 , n 1 , . . . , n J ) : n j ∈ {0, . . . , N } ∀j ∈ J 0 , When the level is positive, there is at least one customer in the external queue. This is only possible if the resource pool is empty. Hence, for all positive levels the phase space is Arranging the states by level, the corresponding infinitesimal generator Q of Z BO can be written as A 1 is a non-negative matrix with the following positive elements: a 1 ((0, n 1 , . . . , n J ) ; (0, n 1 , . . . , n J )) = λ BO .
For n ex = 0 For n ex > 0 which implies n 0 = 0 There is no closed-form expression for the steady-state distribution for the case J > 1. Latouche and Ramaswami developed a logarithmic reduction algorithm for the level-independent quasi-birth-and-death process to compute the steady-state distribution (Latouche and Ramaswami [19], [20,Theorem 6.4.1 and Lemma 6.4.3, p. 142ff.]). For the case J = 1, we calculate a closedform expression for the steady-state distribution, which we present in Section 2.4.
Next we determine the stability condition of the system. For that we define the following traffic equation: In matrix notation, the above equation can be expressed as η·R = η with η := η j ∈ R + : j ∈ J 0 . Lavenberg [21] showed that the system with backordering is stable if the arrival rate λ BO is smaller than the maximal arrival rate λ BO,max where λ BO,max is the throughput through node 0 of a closed network in Figure 3. In this network every resource which goes through node 0, spends zero time at node 0 and goes immediately to the next node according to a branching vector (r(0, j) : j ∈ J). Lavenberg calls this network saturated -it is obtained when there are infinitely many customers in the external queue.
We will call this network in Figure 3 stability network. Lavenberg did not only prove that the system is stable for λ BO < λ BO,max , he also proved that it is unstable if λ BO > λ BO,max , but he did not show what will happen if λ BO = λ BO,max . In our stability analysis we use matrix geometrical methods which cover all the cases "<", "=" and ">". In order to simplify notation we define the following constant: .

Proposition 1. The system is stable if and only if
Proof. We use matrix-geometric methods to prove the stability condition. According to Latouche [18,Theorem 1]: Given the irreducible inter-level generator matrix A := A −1 + A 0 + A 1 of Z BO and the stochastic solution α := (α( n) : n ∈ E + ) of the equation α · A = 0, then the process Z BO is stable if and only if Because A 1 is a diagonal matrix with λ BO on its diagonal and α is a stochastic vector, we have immediately for the left-hand side of (4) Therefore, in a stable system, the arrival rate λ BO must be strictly less than the right-hand side of inequation (4). We define To calculate λ BO,max , we first need to calculate the stochastic vector α. The non-negative non-diagonal elements of the generator A are of the form a ((0, n 1 , . . . , n J ) ; (0, n 1 , . . . , n i − 1, . . . , n j + 1, . . . , n J )) We now solve for all n := (0, n 1 , . . . , node 0 inner network with J nodes Figure 3: Stability network -a closed network described by the inter-level generator matrix A from the proof of Proposition 1. Interpretation I: a generalised Gordon-Newell network with zero service time at node 0.
Interpretation II: a classical Gordon-Newell network obtained after rerouting at node 0.
Equation (6) is the global balance equation of a generalised Gordon-Newell network with node set J 0 , N customers and zero service time at node 0. We will call this network a stability network, see Figure 3.
Equation (6) also has another important interpretation, which allows us to use standard algorithms from performance analysis of Gordon-Newell networks. Note that the state of node 0 never changes, therefore we define α (n 1 , . . . , n J ) := α (0, n 1 , . . . , n J ) and we define the routing matrix R : Routing matrix R results from routing matrix R when every resource which wants to go to node 0 skips this node and goes immediately to the next node according to (r(0, j) : j ∈ J). Now equation (6) can be written as (9) is the global balance equation of a Gordon-Newell network with nodes J = {1, 2, . . . , J}, with N customers, and with routing matrix R . The steady-state distribution of this network is where η := η j : j ∈ J is a solution of the traffic equation η · R = η and . Now we can switch between both interpretations without recalculating η and C stb BO (J, N ): .
We now calculate λ BO,max explicitly.
The expression ( * ) is the throughput through node i in the Gordon-Newell network with routing matrix R : We can now write According to Bolch, Greiner, de Meer, and Trivedi [5, p. 374, (8.14)] Remark 2. The right-hand side of equation (11) is the throughput through node 0 in the stability network in Figure 3. Formally, this means The advantage of representation (12) is that it uses throughputs T H stb i (N ), i ∈ J, of a classical Gordon-Newell network with routing matrix R . We can calculate these throughputs very efficiently with standard methods, like, for example, mean value analysis (MVA).
With (3) we have another representation of T H stb 0 (N ): To calculate efficiently the constants on the right-hand side of (13), we can use, for example, the convolution algorithm.
From a practical point of view, it is important to know the long-term behaviour of the system. This behaviour is equal to the behaviour of the system in steady state. We analyse it in the following sections, 2.3 and 2.4.

Throughputs and idle times
We consider a stable SOQN-BO in steady state. Let η := (η j : j ∈ J 0 ) be a solution of x = x · R. Recall that η is uniquely defined up to a constant.
Note, formula (14) occurs as an approximation in a different setting in [10] as (26). We give a proof for correctness in our setting.
Proof. We define for j ∈ J 0 in steady state: • the mean number of departures from j per time unit is D j , and • the mean number of arrivals at j per time unit is V j .
From steady state assumption follows Therefore, the vector V := (V j : j ∈ J 0 ) fulfils the set of equations This implies that V j = η j · K for some constant K > 0.
and therefore, Note, that (14) shows that T H BO,j does not depend on normalisation of η.
denote a random vector which is distributed according to the stationary queue length at the nodes in J of the SOQN-BO. If the service rate at node j does not depend on the queue length, i.e. ν j (·) = ν j , j ∈ J, then the probability that node j is idling is Note, that the formula above also describes the proportion of time that node j is idle.
Proof. We define for j ∈ J in steady state: • the mean number of customers in service is B j , • the mean service time is S j , and • the arrival intensity is λ j .
According to Little's formula, for every node j it holds B j = λ j · S j .
In steady state the arrival rate λ j at node j is equal to its throughput T H BO,j . Hence, from Proposition 12, we know that In every node j, with constant service rate ν j , the mean service time S j is ν −1 j . Now we substitute these known results for λ j and S j in Little's formula and get for the mean number of customers in service Consequently, the probability that node j is idling is

Special case: J = 1
In this section, we consider the special case where the inner network consists only of one node (J = 1) as shown in Figure 4. For this special case, Avi-Itzhak and Heyman calculated the steady-state distribution in [1].
If Z BO is stable, the limiting and steady-state distribution π BO := (π BO ((n ex , n 0 , n 1 )) : (n ex , n 0 , n 1 ) ∈ E) of the process Z BO is with normalisation constant Proof. See [1, equations (20) and (21)]. In that paper, the authors calculated the steady-state probabilities for P (X ex + Y 1 = m). From these results we can easily obtain the probabilities of the states of all queues, because for m ≤ N it holds Proposition 6. Let ( X ex , Y 0 , Y 1 ) denote a random vector which is distributed according to the limiting and steady-state distribution of Z BO in the SOQN-BO with J = 1.
(i) For the marginal distributions it holds: and for n ex > 0 and for (ii) The average external queue length is and the average waiting time of customers in the external queue is Proof.
(i) For (17) we calculate For (18) we calculate For (19) we calculate For (20) we calculate (ii) We use the marginal distributions in (i) to calculate the average external queue length in steady state.
An important practical question about SOQN systems in this paper is: "Do more resources allow more system throughput?" Proposition 8 shows that, even for the simple system with J = 1, the surprising answer is: "It depends." We also know that in systems with J ≥ 1 whose service rates are non-decreasing in the number of customers, more resources lead to equal or higher throughput. Proposition 7. Let J ≥ 1 and all service rates be non-decreasing in the number of customers, i.e. is ν j (n + 1) ≥ ν j (n), n ∈ {0, . . . , N − 1}, for all j ∈ J. Then λ BO,max = T H stb 0 (·) is non-decreasing on N.
Proposition 8. Let J = 1. Then the following are equivalent: (ii) ν 1 is non-decreasing on N.
Proof. Because of (3) and (12), (i) is equivalent to We note that which is (ii).

Approximation of the resource network
Unfortunately, the steady-state distribution of the SOQN-BO for J > 1 is not known. However, for a modified system, we get closed-form expressions and product-form results for J ≥ 1, as shown in the next section.

SOQN with lost customers
In this section, we consider a modification of the model from Section 2.1, where newly arriving customers are lost if the resource pool is empty. Henceforth, we call it SOQN-LC. In the literature, this behaviour is called "lost customers", "lost sales", "lost arrivals" or "loss systems". These customers arrive one by one according to a Poisson process with rate λ LC > 0. But because of lost customers, the actual arrival rate to the synchronisation node is smaller. We call this effective arrival rate λ eff (λ LC ). The SOQN-LC is on the right of Figure 5, and the original SOQN-BO from Section 2.1 is on the left of Figure 5.
It is well known that such an SOQN-LC turns formally into a closed Gordon-Newell network which consists only of the resource network, see [8, p. 21].
To obtain a Markovian process description, we denote by Y 0 (t) the number of resources in the resource pool at time t ≥ 0 and by Y j (t), j ∈ J, the number of resources present at node with normalisation constant

Adjustment
We want to use the modified system (SOQN-LC) with known results to approximate the SOQN-BO. But before we do so, we have to ensure that both systems process in the mean the same number of external customers. That means they have the same throughput through synchronisation node 0. Our main idea is: In order to compensate customers' loss, we will adjust the input rate λ LC of the modified system until it reaches the desired throughput. But is it even possible? Yes, it is! We will prove it in the next Theorem 10. But first we need to calculate the throughput of both systems.
Lemma 9. The throughput of the SOQN-BO in steady state is λ BO . The throughput of the SOQN-LC in steady state is where π LC,0 (0) is the probability that there are no resources in the resource pool.
Proof. The throughput of the SOQN-BO in steady state is equal to the arrival rate λ BO because all customers pass through the system. In contrast, in the SOQN-LC, the proportion π LC,0 (0) of the customers is lost. From the steady-state distribution (24), we directly calculate for the SOQN-LC the probability π LC,0 (0) of the resource pool being empty as π LC,0 (0) := j∈J n j =N π LC (0, n 1 , . . . , n J ) Then the effective arrival rate λ eff (λ LC ) and the throughput T H LC,j of the system is Now we can adjust λ LC in such a way that both systems have the same throughput. We assume that both systems -with backordering and with lost customers -are stable. For the SOQN-BO, according to Proposition 1, stability is equivalent to λ BO ∈ (0, λ BO,max ). For the SOQN-LC, stability is granted for any arrival rate λ LC ∈ (0, ∞), because the state space E LC is finite. Proof. The idea of the proof is to show that for any λ BO ∈ (0, λ BO,max ), the function λ eff (λ LC ) from Lemma 9 can have values larger than a prescribed λ BO , and smaller than λ BO , and is continuous. Thus, by the intermediate value theorem, there exists λ LC such that λ eff (λ LC ) = λ BO .
We first show that λ eff (λ LC ) = λ LC · 1 − C stb (J,N ) C LC (J 0 ,N ) can be larger than any given λ BO . We analyse the normalisation constant C LC (J 0 , N ): To simplify the notation, we define constant b(n 0 ) := C stb (J, N − n 0 ). Then λ eff (λ LC ) can be expressed as Hence, it holds Therefore, λ eff (λ LC ) can be larger than any stable arrival rate λ BO ∈ (0, λ BO,max ). Now we show that λ eff (λ LC ) can be smaller than any stable λ BO ∈ (0, λ BO,max ). It follows from lim follows that λ eff is a continuous function in λ LC ∈ (0, ∞), which proves our claim by the intermediate value theorem.
The proof of Proposition 11 is presented in Appendix A. Interested readers will find the explicit results for adjusted λ LC in the special cases N = 1 and N = 2 in Remark 15 in Appendix A.
Some arguments for applying the resource network of the SOQN-LC as approximation for that of the SOQN-BO. The result of Theorem 10 only guarantees that for a stable SOQN-BO with a prescribed external arrival rate λ BO there exists an SOQN-LC with the same resource network where the resource pool has the same throughput. Because the SOQN-LC is a standard Gordon-Newell network, the local throughputs can be computed directly by standard algorithms. We can compare these local throughputs with those of the SOQN-BO. Surprisingly, not only the throughputs at the queues of the resource network are the same by construction. All throughput pairs in the respective nodes of the resource network are identical, too. This observation suggests to use the local characteristics of the queues in the resource network of the SOQN-LC as approximation for the respective performance measures of the SOQN-BO. The point is: The performance characteristics of the SOQN-BO are not directly accessible, while the performance characteristics of the SOQN-LC are explicitly known from product-form network theory, and even more, well-established algorithmic procedures are at hand to evaluate these.
In the next section, we prove the coincidence of the respective local throughputs. Thereafter, we show that in nodes with constant service rates even the probabilities for an empty queue in both resource networks are pairwise identical.

Throughputs and idle times
Proposition 12. The local throughput T H LC,j at nodes j ∈ J 0 in the SOQN-LC with adjusted arrival rate is pairwise the same as that of the respective nodes in the SOQN-BO given in Proposition 3. With it holds Proof. It was shown in the proof of Theorem 10 that it holds which is the standard formula for the throughput at node 0 in the resource network, which is a standard Gordon-Newell network. Therefore, and by using the standard formula for local throughputs in Gordon-Newell networks follows The explicit formula for the throughputs in Proposition 12 has a further interesting consequence. It allows us to determine efficiently the steady-state marginal distribution of the queue length at every node j ∈ J without knowing the adjusted value λ eff (λ LC ) of λ LC . Proposition 13. Let Y LC := (Y LC,j : j ∈ J) denote a random vector which is distributed according to the stationary queue length at the nodes in J of the SOQN-LC with adjusted arrival rate.
If the service rate at node j does not depend on the queue length, i.e. ν j (·) = ν j , j ∈ J, then the probabilities that the nodes j ∈ J 0 in the SOQN-LC with adjusted arrival rate are idling are pairwise the same as those of the respective nodes in the SOQN-BO given in Corollary 4: Proof. According to Proposition 12, the throughputs are equal. The rest of the proof is the same as the proof of Corollary 4.

Approximation of the external queue
Although, after adjusting λ LC , the behaviour of the resources in both systems -the original SOQN-BO and the modified SOQN-LC -is very similar, their external queues are still very different. No matter how highly we increase λ LC , the external queue length of the SOQN-LC is always zero, while the external queue of the SOQN-BO has strictly positive mean. Therefore, we cannot use the modified system directly to estimate the external queue of the original system. Instead, in the following two sections, we will use a two-step approach to approximate the external queue: step 1 In Section 4.1, we will reduce the modified system to a simple system with lost customers.
step 2 In Section 4.2, we will combine the results from Section 4.1 and the results from Section 2.4 for a simple system with backordering to approximate the external queue.

Reduced SOQN with lost customers
Because the SOQN-LC from Section 3.1 is a Gordon-Newell network, we can reduce complexity further by applying Norton's theorem proved by Chandy, Herzog, and Woo [7] to construct a two-node Gordon-Newell with the same throughput. The inner network is replaced by only one composite node (J := {1}), which consists of a single server with infinite waiting room under the FCFS regime. The service time is exponentially distributed with mean 1. The service speed is determined by a queue-length-dependent service intensity. According to Chandy et al. [7, p. 39, eq. (20)], the service intensity ϕ is given by Remarkably, ϕ(N ) is the same as λ BO,max in (3). We deduce from (13) that and it does not depend on λ LC .

Back to backordering
We can use the result of Theorem 10, that for every stable SOQN-BO, there exists an SOQN-LC with adjusted arrival rate λ LC such that both systems have the same throughput in the steady state. So we can remove the lost-customer property to get again the backordering property as shown in Figure 7.  Having done this, we can eventually approximate the external queue of the large SOQN-BO with J > 1 by a reduced SOQN-BO with J = 1. We calculate T H stb 0 (m), m ∈ {1, . . . , N }, of the large system and substitute these throughputs in the reduced system like in (30): ν 1 (m) := ϕ(m) = T H stb 0 (m). Then we insert these rates ν 1 (m) into the formulas for P ( X ex = n ex ), L ex and W ex in Proposition 6, which we know to be exact for J = 1. For J > 1 we expect that the results are close to the real values, but we cannot give error bounds at the present. .
We approximate the average number of customers in the external queue with (20) and (21) Note, with our approximation method we arrive to the same formula for L apprx ex as Dallery [10, eq. (22)] and thus our method produces the same formula as the aggregation technique of Dallery [10,Section 6].
We approximate the average waiting time of customers in the external queue with (22) (32)

Application to RMFS
In this section, we use our approximation algorithm to calculate the minimal number of robots for a robotic mobile fulfilment system (RMFS). In an RMFS, robots are expensive resources. Therefore, we want to keep their number small while still maintaining the necessary quality of service.
We will model an RMFS as SOQN-BO and evaluate its performance analytically because this is much faster than evaluating a simulation model. Because of the large state space, it is impractical to solve this SOQN-BO exactly with matrix-geometric methods, even for a small number of robots. For example, for 10 robots, we need to calculate ca. 9 · 10 4 2 entries of a special matrix. That is why we will use our approximation methods instead, in order to quickly estimate the main performance metrics.

Description of RMFS
Firstly, we define the components and the order fulfilment processes in an RMFS together with an illustrated example in Figure 8. The central components are: • movable shelves, called pods, on which items are stored, • storage area -the area where the pods are stored, • workstations, where the items are picked from pods by pickers (picking stations) or the items are stored to pods (replenishment stations), • mobile robots, which can move underneath pods and carry them to workstations. Figure 8 illustrates an example of order fulfilment processes in an RMFS. On the upper left hand we have three customers' orders. The orders contain different items, which are illustrated with different colours. To fulfil customers' orders, we send them or parts of them to picking stations. To the same station we send pods with all the necessary items. Each pod is carried by a robot. In this way, customers' orders generate tasks for robots. The robots, with their pods, queue up in front of the picking stations. A picker takes all the necessary items from the pod at the head of the queue. Then he sends it, with its robot, back to the storage area. As soon as the customer's order or part of it is fulfilled, we remove it from the picking station. The order in which we send the customers' orders, how we split them apart, and which pod we send, is a complex topic. See, for example, Xie, Thieme, Krenzler, and Li [29]. In the present paper, we focus on the generated robots' tasks, which we will call just tasks.
In this example, each customer's order is split into two parts. Three parts are sent to picking station 1, and three other parts are sent to picking station 2. To fulfil these partial orders, a robot transports one pod to picking station 1, and another robot transports one pod to picking station 2, from the storage area.
From time to time, we need to refill the pods. To do this, we send these pods to the replenishment station. There, employees will refill these pods and send them back to the storage area.
In this example, after picking, pod 2 is sent to the replenishment station to refill it with the blue items.

Modelling as SOQN
A robotic mobile fulfilment system can be modelled as an SOQN-BO. This is depicted in Figure  9.
An RMFS is open with respect to tasks and closed with respect to robots, which are the resources in this model. In this section, we consider only an RMFS with two picking stations and one replenishment station, but the results from Sections 2 and 3 about general SOQN can also be applied to an RMFS with more than two picking stations and more than one replenishment station.
Customers' orders arrive at the RMFS one by one with rate λ CO and generate tasks. The number of tasks, that a single customer order can generate depends on many parameters. In particular, it depends on the efficiency of the algorithm which tries to find an optimal match between customers' orders and pods. The matching problem is NP-hard, and, to the best of our knowledge, there are no formulas known which determine how many pods an order will require. Therefore, we assume that there exists some average pod/order ratio σ pod/order which we can find empirically for a particular RMFS. We assume that this ratio depends only on the pods' contents and customers' order contents, and it does not depend on the number of robots. The matching algorithm also adds some delay in order to assign pods to orders. We assume that this delay depends only on the pods' contents, customers' order contents and order input rate and it does not depend on the number of robots. We assume that we can find this delay empirically for our particular RMFS and it is on average W alg .
Thus, the customers' orders generate a stream of "bring a pod to a picking station" tasks. This stream has the rate λ BO = λ CO · σ pod/order > 0. The delay, introduced by the matching algorithm, does not change this rate.
We simplify all the complexity that occurs until each task is created and model the task stream as a Poisson arrival stream with rate λ BO = λ CO · σ pod/order . To be processed (= to enter the inner network), each such task requires exactly one idle robot from the robot pool (resource pool), which is henceforth referred to as node 0. If there is no idle robot available, the new task has to wait in an external queue until a robot becomes available. The maximal number of robots in the resource pool is N . The inner network in the example in Figure 9 consists of 11 nodes, denoted by J := {sp, pp 1 , pp 2 , p 1 , p 2 , p 1 s, p 2 s, p 1 r, p 2 r, r, rs} .
The notations of the nodes are presented in Table 1 on page 28. The robot with assigned task moves through the network.
The following processes occur from the perspective of a robot: • The idle robot waits to be assigned to a task (bring a particular pod).
• The robot moves with the assigned task to a pod.
• With this pod, the robot moves to one of the picking stations, more precisely, with probability q pp 1 ∈ (0, 1) to picking station 1 and with probability q pp 2 ∈ (0, 1) to picking station 2, whereby q pp 1 + q pp 2 = 1.
• The robot queues with the pod at the picking stations.
• After picking at picking station 1 resp. picking station 2, the robot: either * carries the pod directly back to the storage area with probability q p 1 s ∈ (0, 1) resp. q p 2 s ∈ (0, 1), or * moves to the replenishment station with probability q p 1 r ∈ (0, 1) resp. q p 2 r ∈ (0, 1), whereby q p 1 s + q p 1 r = 1 resp. q p 2 s + q p 2 r = 1, queues at the replenishment station and carries the pod back to the storage area and waits for the next task.
Each of these processes is modelled as a queue. All the movements of the robots are modelled by processor-sharing nodes with exponentially distributed service times. Their intensities ν j (n j ) := µ j ·φ j (n j ), j ∈ J \{p 1 , p 2 , r}, are presented in Table 1.
The two picking stations and the replenishment station, which are referred to as node p 1 , node p 2 resp. node r, consist of a single server with waiting room under the FCFS regime. The picking times and the replenishment times are exponentially distributed with rates ν p 1 , ν p 2 resp. ν r .
The robots travel among the nodes following a fixed routing matrix R := r(i, j) : i, j ∈ J 0 , whereby J 0 := {0} ∪ J, which is given by The routing matrix R is irreducible by construction.

Determine the minimal number of robots
We see that the throughput T H stb 0 (N ) depends on N , the number of robots. This raises the question: "How many robots do we need to stabilize a given system?" Consequently, to find the minimal number of robots, we first check the stability criterion from Proposition 1. The maximal number of robots N max is equal to the number of pods or is determined by financial restrictions. In the following Algorithm 1, we determine the set N * of feasible numbers of robots for a stable system.
Algorithm 1 Calculate set of feasible numbers of robots for a stable system 1: function StableRobotsSet 2: return N *

4: end function
Remark 14. If T H stb 0 (·) is non-decreasing in N, then the algorithm can be improved: the algorithm does not have to check all possible numbers of robots . It starts with one robot and adds a new robot in each new step until the stability criterion is satisfied for the first time. You will find some simple sufficient conditions for non-decreasing T H stb 0 (·) = λ BO,max in Proposition 7 and 8.
Unfortunately, stability does not say anything about the turnover time of an order. We do not know if only 2 minutes are required to satisfy a customer's order or maybe the order requires 2 years to be processed. In case of the latter turnover time, the system is stable, but customers may not be happy.
Now, in addition to stability, we want to consider the turnover time of a customer's order for the quality of service.
The turnover time of a customer's order can be split into three main parts: 1. Waiting time until the matching algorithm has assigned all required pods to that order. By assumption, this time does not depend on the number of robots and is on average W alg > 0.
2. Waiting time of the first matched pod for an idle robot, time for transport to the picking station, waiting time for the picker at the picking station. During all these times, the order is coupled with at least one task. We call this turnover time for the task T O task (λ LC , N ).
3. Time of an order between start of picking and its completion. This time is complex and depends on many factors. Like, for example: How many orders can a picker complete with the same pod? Will all completed orders wait until a pod leaves? Is the order's content in multiple pods? Will these pods arrive right after each other, or will there be many pods for other orders in between? Is there any complex merging procedure outside of the picking station? In our model, we use a simplifying assumption that the order needs on average W assembled > 0 from the time its first pod arrives at the picking station until the time picking for this order is completed.
With all these assumptions, we can assume that the turnover time of an order is Even in the case where W alg and W assembled are not known, we can still use T O task (λ LC , N ) as a lower bound for T O order (λ LC , N ). Because of the simplifying assumption about W alg and W assembled , only the turnover time T O task (λ LC , N ) of a task depends on N , and for the minimal number of robots we can focus on this.
The turnover time T O task (λ LC , N ) of a task is measured from the time the task is received to the time the picker starts to process it: W ex (N ) is the average time which a task spends waiting in the external queue until it enters the inner network. We can calculate it with (32).
W in (λ LC , N ) is the average time which a task spends in the inner network until a picker starts to process it at one of the picking stations. Given the average waiting times W j (λ LC , N ) at nodes j ∈ J = {sp, pp 1 , pp 2 , p 1 , p 2 , p 1 s, p 2 s, p 1 r, p 2 r, r, rs} from arrival until service completion, and constant service rates ν j at nodes j ∈ {p 1 , p 2 }, then We calculate W j (λ LC , N ), j ∈ J, with MVA.
So, the question is: "How many additional robots do we need, so that the turnover time of a task is also acceptable?" In the following Algorithm 2, we determine the minimum number of robots for an acceptable turnover time of a task. We will call this time T O max task .
Algorithm 2 Calculate the minimal number of robots for acceptable turnover time of a task return "no solution" 12: end function

Numerical experiments
We use parameters from Lamballais et al. [17, Table 5.3 and Table 5 = 30 s, and average travel time at node rs µ −1 rs = 34.5 s. For our numerical example, we assume that the robots do not interfere when they move. Hence, our processor-sharing queues are infinite server queues. That means φ j (n j ) = n j for all j ∈ J \ {p 1 , p 2 , r}.
We implemented our algorithm in R and used the package queueing, see Canadilla [6]. The minimal number of robots for this system to be stable is 18. The minimal number of robots with some additional waiting time requirements depends on this waiting time. In the worst case scenario -when we need to try all (550 − 18 + 1) of the robots -our implementation takes on average 83 seconds on a notebook with an i7-7600U CPU processor, 2.80GHz and 16GB RAM. We plotted some important parameters of the system in Figures 10 to 13. For better readability, we plotted data for a limited number of robots; due to the asymptotic behaviour, one can estimate how these parameters look with more robots. Figure 10 shows maximal arrival rates λ BO for given numbers of robots to keep the system stable. We see asymptotic behaviour of the rate λ BO for N → ∞ . In particular, after about 40 robots, additional robots do not allow significantly higher arrival rates. Figure 11 shows the throughputs for each node. In Proposition 3, we showed that these throughputs do not depend on the number of robots, and they are pairwise the same for the original system with backordering and for the adjusted lost-customers approximation.
The probabilities that the nodes p 1 , p 2 and r are idling are 0.35, 0.35, and 0.22. We calculate them with Corollary 4. Figure 12 shows the adjusted arrival rate λ LC for a system with lost customers, so that the effective arrival rate is λ BO . We see an asymptotic behaviour lim N →∞ λ LC ≈ λ BO . This happens because, in our test system with lost customers, when there are many robots in the system, the probability of an empty resource pool is almost 0, therefore only a few customers are lost. When only a few customers are lost, we do not need to adjust λ LC much. Figure 13 shows average waiting times for an order, after it has arrived into the system and until it is completed at a picking station. We see that an order spends a lot of time in a system with only 18 robots, even if this system is stable. We also see how dramatically the waiting time improves with only one additional robot.  λ LC λ BO Figure 12: Adjusted arrival rate λ LC for a system with lost customers such that the effective arrival rate is λ BO . inner external Figure 13: Turnover times of a task T O task (λ LC , N ), which are the average delay times of a task until it is completed.
Simulation We simulated the RMFS with backordering for 365 days 20 times for each number of robots. For simulation we used SimPy 3.0. Figure 14 shows results for waiting times beginning with 19 robots. The approximation shows the same qualitative behaviour as the original system with backordering. In the interesting region 19-25 robots, the approximation is not very precise but it answers the essential question "how many robots we need'' quite well. Beginning with 26 robots, the approximation reflects the asymptotic behaviour of the original system well.
In Figure 14 we have omitted the results for 18 robots because they have very large mean value and very large standard deviation. This is because the system with 18 robots operates on the edge of instability. This behaviour of the system under these settings is interesting from theoretical point of view, that is why we ran more intensive tests of this system with 200 simulations. The results are in Figure 15. They show how different the average waiting time can be. From practical point of view, we do not recommend to operate a real system under these conditions. Also in this case, we recommend not to trust simulation results if they were obtained by only few simulations. Figure 16 shows that our approximation approximates very well the turnover times. To better judge the quality of the approximation, we need to consider that the turnover times consist of transportation times and waiting times for a picker. The average transportation times are pairwise equal in the original system and in the approximation. We can easily calculate them from service times on appropriate nodes without any approximation: µ −1 sp + r(sp, pp 1 ) · µ −1 pp 1 + r(sp, pp 2 ) · µ −1 pp 1 = 0.0147h. The hard part is to estimate the average waiting times for the picker. Figure 17 shows the results. Our approximation is still good.
Other waiting times, which we cannot calculate directly, are the waiting times for the replenishment, which are shown in Figure 18. We do not need these times for our optimisation problem. However, it demonstrates how well our algorithm estimates other parts of the network.
Although we are happy with these results, we remark that this approximation worked well for our test system but systems with less impressive results are possible, too.  The graph for approximation is right on the graph for simulation. The large fraction of turnover times for the transportation is the same in both systems. It is equal in both systems by construction.

Conclusion
In this paper, we have focused on semi-open queueing networks (SOQNs), which include an external queue for customers and a resource network which consists of an inner network and a resource pool. We have determined closed-form expressions for stability, throughputs and idle probabilities of some nodes. For other performance metrics, we have proposed a new approximation approach to solve problems of an SOQN with backordering.
To approximate the resource network of the SOQN with backordering, in Section 3 we have considered a modification, where newly arriving customers will decide not to join the external queue and are lost if the resource pool is empty ("lost customers"). We have proved that we can adjust the arrival rate so that the throughputs in each node are pairwise identical to those in the original network. We also have proved that the idle probabilities of the nodes with constant service rate are pairwise identical.
To approximate the external queue of the SOQN with backordering, in Section 4 we have used a two-step approach. In step one, we have constructed a reduced SOQN with lost customers, where the inner network consists only of one node, by using Norton's theorem. In step two, we have used the closed-form solution of this reduced SOQN, to estimate the performance of the original SOQN with backordering.
Based on the theoretical foundation above, we have modelled a real-world automatic warehousing system (called robotic mobile fulfilment system, in short: RMFS) as an SOQN with backordering. We have selected for our experiment an RMFS with two picking stations and one replenishment station. Based on the stability analysis in Section 2.2, we got the minimum number of robots for a stable system. However, the stability does not say anything about the turnover time of an order (= waiting time in the external queue + processing time in the inner network), since customers' orders are expected to be processed as quickly as possible in the e-commerce sector. Therefore, we have calculated the processing time in the network based on the approximation model in Section 3 and the waiting time in the external queue based on the approximation model in Section 4. Due to the short computational time for trying different numbers of robots, we got all the important metrics within a couple of seconds.
We have plotted the data to see the relationship between the number of required robots and the average waiting time of a task. Based on our results, we have seen the dramatic reduction in waiting time in the external queue with only one more robot, while we have seen the stagnation of average waiting time by adding more robots. Therefore, it provides an interesting insight for practitioners and researchers to make a trade-off between low investment cost and good service for customers. We made a simulation to analyse the quality of our approximation method. For our test system, it shows good results.

Appendix A Omitted proofs
Proof of Proposition 11. We will show the strict isotonicity of the effective arrival rate λ eff (λ LC ) for any N ≥ 1 by induction in N . To distinguish systems with different numbers of resources, we will use the notation λ (N ) eff (λ LC ) for a system with N resources. We recall the following constants from (25) for L = 1, 2, . . . , N : .
Lemma 2.8 in Daduna et al. [9] states that deleting any node of a cycle with skipping the gap increases the throughput. Consequently, B ≥ 0 in a two-step conclusion, and similarly C ≥ 0.
(ii) From definition of D and E follows by direct comparison D > E.
(iii) The proof will be finished if we can show B ≥ C. To do so, it is sufficient to prove with Shanthikumar and Yao [27]  The left-hand side is the throughput of the Gordon-Newell network according to Definition (2.2) in Shanthikumar and Yao [27].
The right-hand side is the throughput of the Gordon-Newell network with service rate at node 0 increased to λ LC + ε. Corollary 3.1(i) in Shanthikumar and Yao [27] states which verifies (34).