Abstract
Jackson networks are versatile models for analyzing complex networks. In this paper we study generalized Jackson networks with single-server stations, where nodes may have an infinite supply of work. We allow simultaneous breakdown of servers and consider group repair strategies. We establish the existence of a steady-state distribution of the queue-length vector at stable nodes for different types of failure regimes. In steady state the distribution of the failure/repair regime and of the queue-length vector at stable nodes decouples in a product-form way. We provide closed-form solutions for the classical performance measures such as throughput or mean sojourn time at a station.
1 Introduction
Jackson networks (henceforth, JN) are a well-established class of models for production, telecommunication and computer systems, among other applications; for surveys see [8] and [3]. JN have the property that the distribution of the stationary queue-length vector is of product form, which allows for quick numerical evaluation of performance measures, such as mean queue length, mean sojourn times and throughput at nodes. Further modifications with the product-form property of JN have been developed, which cover additional features relevant in modeling networks in the fields mentioned. This paper addresses the following extensions:
(I) Infinite supply [22] has the aim of utilizing the capacity of a server to the fullest. Examples: In service center models an agent, when not answering a call, switches to low-priority work such as answering e-mail and administrative duties. In a model of a highway network, an infinite supply node represents a highway segment with an on-ramp during a rush-hour period where a constant flow of vehicles requiring access to the highway is present. In production processes, an additional inventory with raw material guarantees that the machine will not idle even when there is sometimes no external demand.
(II) Breakdown and repair [16] of servers with (i) simultaneous breakdown of groups of servers and (ii) group repair strategies. For example, repairing several servers simultaneously may lead to more efficient repair actions and thus reduce repair time.
(III) Unstable nodes [5] in the network, where instability of nodes may be due to overload generated by nodes without infinite supply or by nodes with infinite supply, or both.
While (I) and (III), and the combination thereof, have already been dealt with in the literature, the interaction of Jackson networks with infinite supply, instability of nodes and reliability issues has, to the best of our knowledge, not yet received the investigation it deserves.
To fill this gap, we show that the distribution of the stationary queue-length vector of JN with the above features still is of product form. Thereafter, starting from this result, in a section with applications we compute classical performance measures such as throughput, mean queue length and mean service time, which are analytical in the system parameters.
The paper is organized as follows: We start with the definition of the different extended JN models in Sect. 2. In Sect. 3 we illustrate these by a motivating example to clarify the basic notation and concepts. A literature review is provided in Sect. 4. The main technical analysis is performed in Sect. 5, resulting in product-form stationary and/or limiting distributions in Sect. 5.2, and in Sect. 5.3 we provide explicit solutions for standard availability and performance measures, partially computed in nonstandard settings (non-ergodic networks). The main proofs are postponed to the Appendix.
2 Extended Jackson network models
2.1 Standard Jackson networks
The network consists of J exponential single server nodes with service discipline “first-come–first-served” (FCFS); the node set is denoted by \( \tilde{J} = \{ 1 , \ldots , J \}\). An artificial node 0 represents source and sink of the network, and we let \( \tilde{J}_0 = \{ 0, 1 , \ldots , J \}\).
At node i a Poisson stream with rate \( \lambda _i \ge 0\) arrives from the exterior represented by node 0, and we let \( \lambda := \sum _{i\in \tilde{J}} \lambda _i > 0\). Service times at node i are exponential with rate \( \mu _i \). All service times constitute an independent family of variables which are independent of the arrival streams. Customers are indistinguishable and follow the same rules, i.e., there is just one class of customers.
Routing is Markovian; a customer departing from node i immediately proceeds to node j with probability \(r{(i,j)}\ge 0\) and departs from the network with probability \(r(i,0)\ge 0.\) Given the departure node i, the routing is independent of the history of the network. Taking \(r(0,j)=\lambda _j/\lambda ,\ r(0,0)=0\), we assume that the extended routing matrix \(R=(r(i,j):{i,j\in \tilde{J}_0})\) is irreducible.
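The construction of the extended routing matrix and the irreducibility assumption can be checked mechanically. The following is a minimal sketch, not part of the paper: the function names `extended_routing` and `is_irreducible` and the nested-list layout (index 0 for the artificial node) are assumptions made for illustration.

```python
def extended_routing(lam, r):
    """Build the extended routing matrix R on the nodes 0, 1, ..., J.

    lam: external arrival rates [lambda_1, ..., lambda_J]
    r:   (J+1) x (J+1) nested list of routing probabilities;
         row 0 is ignored and overwritten here.
    """
    total = sum(lam)
    if total <= 0:
        raise ValueError("the total arrival rate lambda must be positive")
    R = [list(row) for row in r]
    # r(0,0) = 0 and r(0,j) = lambda_j / lambda, as in the text.
    R[0] = [0.0] + [l / total for l in lam]
    return R


def is_irreducible(R):
    """True if every node reaches every other node along positive entries."""
    n = len(R)

    def reachable(start, forward):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                p = R[u][v] if forward else R[v][u]
                if p > 0 and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    # Strongly connected iff node 0 reaches all nodes and all nodes reach node 0.
    return len(reachable(0, True)) == n and len(reachable(0, False)) == n


# Hypothetical two-node tandem 0 -> 1 -> 2 -> 0 with arrivals at node 1 only.
R = extended_routing([1.0, 0.0], [[0, 0, 0], [0, 0, 1], [1, 0, 0]])
print(is_irreducible(R))
```

The check uses two graph searches from node 0 (forward and backward) rather than matrix powers, which keeps it exact for any size of network.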
In the following overview we indicate the variations of Jackson networks discussed in this paper, where the Roman numerals refer to the labels provided in the Introduction.
2.2 Jackson networks with infinite supply (I)
The standard Jackson network is modified as follows: Nodes in \( V \subseteq \tilde{J}\) have an infinite supply from which customers are put into an idling server. We denote \(W:= \tilde{J}{\setminus }V\) and require \( V \ne \emptyset \) (unless otherwise specified).
Customers from the infinite supply have low priority, and (standard) customers arriving from the outside or from another server have high priority with preemptive-resume regime: Service of a low-priority customer is interrupted as soon as a high-priority customer arrives. Service of low-priority customers is resumed only when the server idles again. When a low-priority customer is served and fed into the network, he/she becomes a high-priority customer and follows the rules for standard customers.
Service times of low-priority customers are independent of the external arrival streams and the service times of high-priority customers.
2.3 Jackson networks with infinite supply and unreliable nodes (II) and (III)
Nodes (servers) in the set \( D \subseteq \tilde{J}\) may break down and are repaired thereafter. In the following we denote by \( \mathcal{P } ( A ) \) the power set of a set A.
Definition 1
The breakdown repair process \(Y=(Y(t): t\ge 0)\) is Markov on state space \(\mathcal{P}(D)\). \(Y(t)=I\) for \(\emptyset \subseteq I\subseteq D\) indicates that (exactly) the nodes in I are broken down. The transition rates of Y out of \(I\subseteq D\) are given as follows:
1. if \(I\subset H\subseteq D\), the nodes in \(H{\setminus }I\) break down with rate \(\alpha (I,H)\ge 0\),
2. if \(\emptyset \subseteq K\subset I\), the nodes in \(I{\setminus }K\) are repaired with rate \(\beta (I,K)\ge 0\).
Note that \(\alpha (\cdot ,\cdot ), \beta (\cdot ,\cdot )\) can be constructed from any pair of functions \(A, B: \mathcal {P}(D) \rightarrow [0,\infty )\) such that \(A(\emptyset ) = B(\emptyset ) =1\), and \(\forall ~ I\subset H\subseteq D,~ {A(H)}/{A(I)}<\infty \), and \(\forall ~\emptyset \subseteq K\subset I,~ {B(I)}/{B(K)}<\infty \) (where we set \(0/0=0\)).
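With these functions, we set, for all subsets of down nodes \(I \subseteq D\),
$$\begin{aligned} \alpha (I,H) = \frac{A(H)}{A(I)} \quad \text {for } I\subset H\subseteq D, \qquad \beta (I,K) = \frac{B(I)}{B(K)} \quad \text {for } \emptyset \subseteq K\subset I. \end{aligned}$$(1)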
The following regime is in force whenever a node breaks down:
- service at this node is interrupted, customers (of high as well as of low priority) are frozen there to wait for restart of the service, which is resumed at the point where it was preempted,
- no new customers are admitted to enter that node,
- customers who select a broken down node to visit are rerouted according to one of the classical rules: stalling, skipping or blocking rs–rd, which will be defined below,
- all these rules, if applicable, are valid for high- and low-priority customers.
Rerouting is a function of Y and applies only to high-priority customers, because on departure from a node with infinite supply low-priority customers are transformed immediately to high priority and only thereafter are rerouted.
Definition 2
If nodes in \(I \subseteq D\) are down, a new routing scheme is set in force which is determined by a routing matrix \(R^I=(r^I(i,j):i,j\in \{0\}\cup \tilde{J}{\setminus }I)\).
Construction of rerouting schemes \(R^I\) follows [16], and we distinguish the following rerouting schemes:
1. Skipping If a customer selects a down node as destination, the customer jumps to this node, spends no time there and immediately performs the next jump according to routing regime R until he arrives at a node in up status or leaves the network. So, if nodes in \(I \subseteq D\) are down, customers are rerouted according to the routing matrix \(R^I=(r^I(i,j):i,j\in \{0\}\cup \tilde{J}{\setminus }I)\):
The external arrival rates during a breakdown of I are
and \(\lambda _k^I=0\) for \(k\in I\). The service intensities are
2. Blocking rs–rd Broken down stations are blocked. A customer whose next destination is down stays at his present node to obtain immediately another service there. After the repeated service (rs) the customer chooses his next destination anew according to R [random destination (rd)]. So, if nodes in \(I \subseteq D\) are down, customers are rerouted according to the matrix \(R^I=(r^I(i,j):i,j\in \{0\}\cup \tilde{J}{\setminus }I)\) with
The external arrival rates during a breakdown of I are \(\lambda _j^I=\lambda _j\), for \(j\in \tilde{J}{\setminus }I\), and \(\lambda _j^I=0\) otherwise. The service intensities are
3. Stalling Whenever a node breaks down the service system is frozen: All arrival processes are interrupted and service anywhere in the network is stopped until all broken down nodes are repaired again. So, if nodes in \(I\ne \emptyset \) are broken down, then for all \(i\in \tilde{J}\) the I-dependent rates are set to \(\lambda ^I_i=\mu ^I_i=0\). The stopped nodes which are in up status are waiting in warm standby, i.e., they can break down although they are stalled. Stalling is applied, for example, in the automotive industry to decrease variability of the flow of materials. Indeed, stalling prevents servers from sending parts to a server that is broken down and thereby prevents inventory from piling up.
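The verbal descriptions of blocking rs–rd and skipping can be turned into a small computation of the rerouting probabilities between up nodes. This is a sketch under stated assumptions, not the formulas of [16]: the function names, the nested-list layout of R (index 0 for the artificial node) and the dictionary output are hypothetical, and rows for node 0 are omitted because external arrivals are handled through the adjusted rates \(\lambda ^I\).

```python
def blocking_rs_rd(R, down):
    """Rerouting probabilities under blocking rs-rd for up nodes i >= 1.

    Probability mass aimed at a down node is added to r(i,i): the customer
    repeats service at i and then draws a new destination.
    Returns {(i, j): probability} for i up (i >= 1) and j up or j = 0.
    """
    up = [k for k in range(len(R)) if k not in down]  # includes node 0
    RI = {}
    for i in up:
        if i == 0:
            continue  # external arrivals are not rerouted here
        blocked = sum(R[i][k] for k in down)
        for j in up:
            RI[(i, j)] = R[i][j] + (blocked if j == i else 0.0)
    return RI


def skipping(R, down, sweeps=200):
    """Rerouting probabilities under skipping for up nodes i >= 1.

    h[k][j] is the probability that a customer who lands on down node k
    next stops at j (an up node, or node 0, i.e. leaves the network).
    It solves h = R_down,up + R_down,down * h by repeated substitution.
    """
    up = [k for k in range(len(R)) if k not in down]
    h = {k: {j: 0.0 for j in up} for k in down}
    for _ in range(sweeps):
        h = {k: {j: R[k][j] + sum(R[k][m] * h[m][j] for m in down)
                 for j in up}
             for k in down}
    RI = {}
    for i in up:
        if i == 0:
            continue
        for j in up:
            RI[(i, j)] = R[i][j] + sum(R[i][k] * h[k][j] for k in down)
    return RI


# Hypothetical tandem 0 -> 1 -> 2 -> 3 -> 0 in which node 2 is down.
R = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
print(blocking_rs_rd(R, {2}))
print(skipping(R, {2}))
```

In the tandem, blocking rs–rd keeps a customer at node 1 until node 2 is repaired, while skipping sends the customer from node 1 directly to node 3. The fixed number of sweeps is a simplification; a linear solve would give the same values exactly.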
Remarks
The parametric form (1) of the breakdown and repair rates stems from a versatile recipe to construct correlated multi-dimensional birth–death processes. With suitable functions A, B we can model, for example, that nodes may break down in isolation or in groups, and that repair may happen similarly. It is not required that nodes which are broken down simultaneously are repaired at the same time.
A statistical procedure to check whether this form is justified is to determine in a first step all possible values \( A ( I ) = \alpha ( \emptyset , I ) \) and \(B ( I ) = \beta (I , \emptyset ) , I\subseteq D \), and then to check stepwise (1).
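The stepwise check described above can be sketched in code. The sketch assumes that the parametric form (1) reads \(\alpha (I,H)=A(H)/A(I)\) and \(\beta (I,K)=B(I)/B(K)\), which is consistent with \(A(I)=\alpha (\emptyset ,I)\), \(B(I)=\beta (I,\emptyset )\) and \(A(\emptyset )=B(\emptyset )=1\); the function names and the dictionary encoding of the rates are hypothetical.

```python
from itertools import combinations


def subsets(D):
    """All subsets of D as frozensets."""
    items = sorted(D)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]


def ratio(num, den):
    """Quotient with the convention 0/0 = 0; x/0 is infinite for x > 0."""
    if den == 0:
        return 0.0 if num == 0 else float("inf")
    return num / den


def fits_form_1(alpha, beta, D, tol=1e-9):
    """Check whether given rates are of the assumed parametric form.

    alpha, beta: dicts keyed by (from_set, to_set) of frozensets;
    missing keys are read as rate 0.
    """
    empty = frozenset()
    # Step 1: read off A(I) = alpha(empty, I) and B(I) = beta(I, empty).
    A = {I: (1.0 if not I else alpha.get((empty, I), 0.0)) for I in subsets(D)}
    B = {I: (1.0 if not I else beta.get((I, empty), 0.0)) for I in subsets(D)}
    # Step 2: compare every intermediate rate with the corresponding quotient.
    for I in subsets(D):
        for H in subsets(D):
            if I < H:  # proper subset
                if abs(alpha.get((I, H), 0.0) - ratio(A[H], A[I])) > tol:
                    return False
                if abs(beta.get((H, I), 0.0) - ratio(B[H], B[I])) > tol:
                    return False
    return True
```

With estimated rates in place of exact ones, the tolerance `tol` would have to be replaced by a statistically justified threshold; the sketch only covers the deterministic comparison.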
3 Example of breakdown and repair process
We explain the basic setup for our analysis with an example of a Jackson network with \( J = 7 \) nodes, as depicted in Fig. 1. There are two arrival streams with arrival rates \( \lambda _1 \) and \( \lambda _2 \). Possible routes are indicated by arrows. The set of nodes V with infinite supply is indicated in the figure by additional incoming dotted arrows. We have \( V = \{ 1 , 5 , 7 \} \) and \( W = \{ 2 , 3, 4, 6 \} \). The network displayed in Fig. 1 is inspired by a model of a highway network. Nodes represent road segments, and nodes with infinite supply model road segments with an on-ramp, where it is assumed that there is a constant flow of incoming traffic to the on-ramp. Note that in queueing models for traffic systems one typically only models the flow in one direction and fits the service rate of the queues to the traffic characteristics; see, for example, [21, 23].
Suppose that \( \lambda _1 < \mu _1, \mu _3 \) and \( \mu _1 > \mu _3 \). Then, without infinite supply at node 1, node 3 is stable in the classical sense as the arrival rate \(\lambda _1\) to node 3 is smaller than the service rate \(\mu _3 \). In the case of infinite supply at node 1, however, node 1 acts as a Poisson arrival stream with rate \( \mu _1 \) to node 3. This causes node 3 to become unstable. Whether a node is stable or not can be decided from the traffic equations of the network; details are provided in Sect. 5.1. We let \( S \subseteq \tilde{J}\) denote the set of stable nodes and U the set of unstable nodes.
The set of unreliable nodes is \( D = \{ 3, 5\} \), and the possible breakdown scenarios are \(\emptyset , \{ 3\} \), \( \{ 5\} \), \( \{ 3 , 5 \} \). The breakdown rate of server \( i = 3, 5 \) is \(A(\{i\}) =\tau _i\), and the corresponding repair rate is \(B(\{i\}) =\rho _i\). In the same vein, let \(A(\{ 3 , 5 \} ) = \tau _ 3 + \tau _5 \) be the rate with which the operating system enters breakdown state \( \{ 3 , 5\} \), and let \(B(\{ 3 , 5 \} ) = 2 \min ( \rho _3 , \rho _5 ) \) be the rate with which the system jumps from breakdown state \( \{ 3 , 5\} \) to the state \( \emptyset \) with all servers operating. These rates \( A ( I ) =\alpha (\emptyset , I)\) and \( B ( K ) =\beta (K,\emptyset )\) are the basic input data for our model. Following (1) we construct transition rates \( \alpha \) and \( \beta \) covering all possible intermediate state transitions. More specifically, for \( i \in D\) let \(\alpha ( \emptyset , \{i\} ) =\tau _i \) and \(\beta (\{i\} , \emptyset ) = \rho _i \), \(\alpha ( \emptyset , \{ 3 , 5 \} ) = \tau _ 3 + \tau _5 \) and \(\beta (\{3, 5\} , \emptyset ) = 2 \min ( \rho _3 , \rho _5 ) \). Eventually, given that node \( i \in D\) is broken down, the breakdown rate of the other node is
$$\begin{aligned} \alpha (\{i\}, \{3,5\}) = \frac{A(\{3,5\})}{A(\{i\})} = \frac{\tau _3+\tau _5}{\tau _i}, \end{aligned}$$
whereas from breakdown scenario \( \{ 3 , 5 \} \) node \( i \in D\) alone has repair rate
$$\begin{aligned} \beta (\{3,5\}, \{3,5\}{\setminus }\{i\}) = \frac{B(\{3,5\})}{B(\{3,5\}{\setminus }\{i\})} = \frac{2\min (\rho _3,\rho _5)}{\rho _j}, \quad j\in D{\setminus }\{i\}. \end{aligned}$$
All other values for breakdown and repair rates are zero.
These breakdown and repair rates from (1) define the generator for the Markov process \(Y = ( Y(t) : t \ge 0 ) \) on state space \((\mathcal {P}(D), 2^{\mathcal {P}(D)})\). By inspection we see that
fulfills
for all \(K, G \in \mathcal {P}(D)\), which implies that, after normalization, \(\pi \) is the steady state of the breakdown and repair process. Even more, we have proved that Y is reversible.
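Since there are only three nonempty breakdown scenarios, the breakdown and repair process has the four states \( \emptyset , \{ 3 \}, \{ 5 \}, \{ 3 , 5 \} \). The stationary distribution \(\pi \) of Y is
$$\begin{aligned} \pi (\emptyset ) = \frac{1}{C},\quad \pi (\{3\}) = \frac{1}{C}\,\frac{\tau _3}{\rho _3},\quad \pi (\{5\}) = \frac{1}{C}\,\frac{\tau _5}{\rho _5},\quad \pi (\{3,5\}) = \frac{1}{C}\,\frac{\tau _3+\tau _5}{2\min (\rho _3,\rho _5)}, \end{aligned}$$
with normalizing constant
$$\begin{aligned} C = 1 + \frac{\tau _3}{\rho _3} + \frac{\tau _5}{\rho _5} + \frac{\tau _3+\tau _5}{2\min (\rho _3,\rho _5)}. \end{aligned}$$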
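The stationary distribution of Y in this example can be checked numerically. The sketch below assumes that \(\pi (I)\) is proportional to \(A(I)/B(I)\), as in the availability factor of Theorem 2, and that the rates are of the form \(\alpha (I,H)=A(H)/A(I)\), \(\beta (H,I)=B(H)/B(I)\). The numerical values of \(\tau _i\) and \(\rho _i\) are made up for illustration, and the function names are hypothetical.

```python
def availability_distribution(A, B):
    """Return (pi, C) with pi(I) = A(I) / (B(I) * C); requires B(I) > 0."""
    weights = {I: A[I] / B[I] for I in A}
    C = sum(weights.values())
    return {I: w / C for I, w in weights.items()}, C


def detailed_balance_holds(pi, A, B, tol=1e-12):
    """Check pi(I) * alpha(I,H) == pi(H) * beta(H,I) for all I strictly inside H."""
    for I in pi:
        for H in pi:
            if I < H:
                breakdown_flow = pi[I] * A[H] / A[I]
                repair_flow = pi[H] * B[H] / B[I]
                if abs(breakdown_flow - repair_flow) > tol:
                    return False
    return True


# Illustrative rates only; they are not taken from the paper.
tau = {3: 0.1, 5: 0.2}
rho = {3: 1.0, 5: 2.0}
empty, s3, s5, s35 = frozenset(), frozenset({3}), frozenset({5}), frozenset({3, 5})
A = {empty: 1.0, s3: tau[3], s5: tau[5], s35: tau[3] + tau[5]}
B = {empty: 1.0, s3: rho[3], s5: rho[5], s35: 2 * min(rho[3], rho[5])}

pi, C = availability_distribution(A, B)
print(C, pi[empty], detailed_balance_holds(pi, A, B))
```

Detailed balance between nested scenarios is exactly the reversibility of Y noted above; the check fails if the intermediate rates are not built from the same functions A and B.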
4 Literature review and related work
Investigation of generalized Jackson networks with infinite supply has recently found much interest in the literature, and it turned out that the feature of infinite supply makes analysis of the network considerably harder than that of classical product-form networks of the BCMP and Kelly type.
Infinite supply of lower-priority work is used frequently, for example, in [11], where it serves to utilize idle times in an M/G/1 queueing system. Recent works using this concept include [10], where a push–pull network with infinite supply is investigated.
A special class of multi-class queueing networks with virtual infinite buffers has been introduced in [9] and [2]. For single-class ergodic networks of Jackson type with infinite supply of work at some nodes, Weiss [22] has obtained a product-form solution of the steady-state queue-length distribution at nodes without infinite supply. He discussed as an example a particular computer communication system that works according to MAN (metropolitan area network) Ethernet RPR (resilient packet ring), where ring traffic has priority over traffic generated at nodes.
Another application from a different field where such models fit is in wireless sensor networks. The nodes (sensors) continuously sense their environment and have to forward the data to a central station (sink). This is usually not possible by direct communication, so the nodes act additionally as transmission stations for data from other sensor nodes. If forwarding transmissions from other nodes has priority, the node’s own data constitute the infinite buffer which generates the infinite supply for the node.
The work of Guo et al. [6] is on general multi-class queueing networks with infinite supply under different scheduling policies for the servers. These policies guide the nodes’ decisions how to dedicate their activities to either the regular standard queues or the infinite virtual queues. The key research question is the interplay of the production of jobs from the infinite supply and stability of the standard queues. Ergodicity problems for such systems are considered in [15]. Another class of models where additional work is added whenever a server becomes idle is queues with vacations. If a server observes an empty queue “he goes away to serve a customer elsewhere” and returns thereafter. If he finds customers waiting there, he immediately starts serving them, but if his queue is still empty on his return he takes “another vacation,” and so on. For a survey, see [4].
The interplay of nodes with infinite supply and local stability issues has been studied in depth in [18].
5 Network processes for unreliable Jackson networks with infinite supply
We consider the Markov process for Jackson networks which at some nodes have infinite supply and where some nodes break down randomly and are repaired thereafter. Breakdowns of nodes in standard Jackson networks were investigated in [16] and [13]. It turns out that breakdown of nodes with infinite supply requires a more specific regime to control breakdown and repair.
Consider a Jackson network with infinite supply and unreliable nodes, where the rerouting regime is either stalling, skipping or blocking rs–rd with the respective rerouting matrices \(R^I\) according to Definition 2. Denote \(R^\emptyset := R\). Then the joint Markovian availability queue-length process \((Y,X)=((Y(t);X_1(t),\ldots ,X_J(t)):t\in \mathbb {R}_+)\) on the state space \(\mathcal {P}(D)\times \mathbb {N}^J\) has transition rate matrix \(Q=(q(z,z'):z,z'\in \mathcal {P}(D)\times \mathbb {N}^J)\) defined for all \((n_1,\ldots ,n_J)\) and all \(I\subseteq D, i,j\in \tilde{J}{\setminus }I, i\ne j\):
and \(q(z,z')=0\) otherwise for \(z\ne z'\), where \(1_{A}( \cdot )\) denotes the indicator mapping with respect to the set A.
The following theorem yields a characterization of the departure streams from nodes. For a proof, we refer to the technical report [17].
Theorem 1
With the above definitions:
(i) If at time t all nodes are up, the departure stream from node \(j\in V\) is a Poisson process with rate \(\mu _j\). Thus, the departure stream from \(j\in V\) to \(i\in \tilde{J}\) is Poisson with rate \(\mu _j r(j,i)\).
(ii) Whenever nodes in \(I\ne \emptyset \) are broken down and either skipping or blocking rs–rd is in force, the departure stream of node \(j\in V{\setminus }I\) with infinite supply in up status is Poisson with rate \(\mu _j\). The departure stream from \(j\in V{\setminus }I\) to \(i\in \tilde{J}{\setminus }I\) is Poisson with rate \(\mu _j r^I(j,i)\). If a node \(k\in V\) with infinite supply is broken down, i.e., \(k\in V\cap I\), the departure stream of this node is interrupted until its server is repaired.
In the case of stalling, all arrival streams stop whenever a breakdown occurs (\(I\ne \emptyset \)) and are reactivated when all nodes return to the up status.
5.1 Extended traffic equations
Different traffic equations are required to analyze the long-time behavior.
Definition 3
The general traffic equations for Jackson networks with infinite supply (without breakdown and repair) are
Node i is stable if \(\eta _i\) from (4) satisfies \(\eta _i<\mu _i\); otherwise, the node is unstable.
The notion of a stable node in a network was introduced by Goodman and Massey [5] when investigating Jackson networks where the describing Markov process X is not necessarily ergodic. If X is ergodic (often called a stable process), all nodes are stable, but as Goodman and Massey have shown it is a valuable distinction which separates the notions of ergodicity and stability.
It is worth noting that for networks without unstable nodes the traffic equations in Definition 3 reduce to the (standard) traffic equations of a Jackson network with infinite supply [22]:
For the different breakdown regimes, the traffic equations put forward in Definition 3 are extended as follows:
Definition 4
The (standard) traffic equations for unreliable Jackson networks with infinite supply are as follows:
(i) In the case of stalling,
$$\begin{aligned} \eta _i=\lambda _i + \sum _{j\in W}\eta _j r(j,i) + \sum _{j\in V} \mu _j r(j,i), \ i\in \tilde{J}, \end{aligned}$$(6)
as long as all nodes are up (\(I=\emptyset \)). Otherwise, \(\eta ^I_i=0\) for all \(i\in \tilde{J}\).
(ii) In the case of blocking rs–rd or skipping, for all \(I\subseteq D\),
$$\begin{aligned} \eta ^I_i=\lambda ^I_i + \sum _{j\in W{\setminus }I}\eta ^I_j r^I(j,i) + \sum _{j\in V{\setminus }I} \mu ^I_j r^I(j,i), \ i\in \tilde{J}{\setminus }I. \end{aligned}$$(7)
The traffic equations for some \(I\subseteq D\) are in force only as long as the availability status is unchanged. Whenever the availability status of the system changes, the traffic equations are adapted. Thus, each traffic equation (7) may have different solutions for different I. The next lemmata provide constraints such that the solutions of (7) are invariant on \(\tilde{J}{\setminus }I\) for all \(\emptyset \subseteq I\subseteq D\).
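The traffic equation (6) for the all-up state can be solved by simple fixed-point iteration. The sketch below is an illustration under assumptions: the function names and the dictionary encoding of rates and routing are hypothetical, and the two-node example mimics the situation of nodes 1 and 3 in Sect. 3 with made-up numbers.

```python
def solve_traffic(lam, mu, r, V, sweeps=10000, tol=1e-12):
    """Fixed-point iteration for equation (6).

    eta_i = lam_i + sum_{j in W} eta_j r(j,i) + sum_{j in V} mu_j r(j,i)

    lam, mu: dicts {node: rate}; r: dict {(j, i): probability}, missing
    pairs count as 0; V: set of nodes with infinite supply.
    """
    nodes = sorted(lam)
    W = [j for j in nodes if j not in V]
    eta = {i: 0.0 for i in nodes}
    for _ in range(sweeps):
        new = {i: lam[i]
               + sum(eta[j] * r.get((j, i), 0.0) for j in W)
               + sum(mu[j] * r.get((j, i), 0.0) for j in V)
               for i in nodes}
        if max(abs(new[i] - eta[i]) for i in nodes) < tol:
            return new
        eta = new
    raise RuntimeError("fixed-point iteration did not converge")


def stable_nodes(eta, mu):
    """Nodes with eta_i < mu_i."""
    return {i for i in eta if eta[i] < mu[i]}


# Node 1 feeds node 3; lambda_1 < mu_3 < mu_1 (illustrative numbers).
lam = {1: 1.0, 3: 0.0}
mu = {1: 2.0, 3: 1.5}
r = {(1, 3): 1.0}

print(stable_nodes(solve_traffic(lam, mu, r, V=set()), mu))   # no infinite supply
print(stable_nodes(solve_traffic(lam, mu, r, V={1}), mu))     # infinite supply at node 1
```

Without infinite supply both nodes satisfy \(\eta _i<\mu _i\); with infinite supply at node 1 the input rate of node 3 rises to \(\mu _1\) and node 3 drops out of the stable set. The same routine applies to (7) if \(\lambda ^I\), \(\mu ^I\) and \(r^I\) restricted to the up nodes are passed in.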
Lemma 1
Consider a Jackson network where nodes in \(D\subseteq \tilde{J}\) are unreliable, and nodes in \(V\subseteq \tilde{J}\) have an infinite supply of work. For all nodes \(i\in W\) without infinite supply, let \(\eta _i<\mu _i\), where \((\eta _i{:}i\in \tilde{J})\) is the unique solution of (5), resp. (6). In case of breakdowns of nodes customers are rerouted according to the blocking rs–rd regime.
(i) If the following reversibility constraints hold:
$$\begin{aligned} \eta _i r(i,j)&=\eta _j r(j,i)\quad \forall i,j\in W, \end{aligned}$$(8)
$$\begin{aligned} \eta _i r(i,j)&=\mu _j r(j,i)\quad \forall i\in W, j\in V, \end{aligned}$$(9)
then \(\eta ^I_i= \eta _i, i\in W{\setminus }I\), are solutions of the traffic equations (7) for all \(I\subseteq D\).
(ii) Let (8) and (9) hold. If we additionally require the reversibility constraint
$$\begin{aligned} \mu _i r(i,j)&=\mu _j r(j,i)\quad \forall i,j\in V, \end{aligned}$$(10)
then \(\eta ^I_i=\eta _i\) for all \(i\in V{\setminus }I\) are solutions of (7) for all \(I\subseteq D\).
Proof
(i) For all \(i\in W{\setminus }I\) and all \(I\subseteq D\) we make the ansatz \(\eta _i=\eta ^I_i\) with \(\eta _i\) the solution of the traffic equations (5). Inserting we obtain
(ii) For any \(I\subseteq D\), the following holds \(\forall i\in V{\setminus }I\):
\(\square \)
Remark 1
The constraints (8), (9) and (10) are different from the classical reversibility constraints which are the local balance equations of the routing process. But the interpretation of (8), (9) and (10) is similar: Customer flow from i to j equals the customer flow from j to i.
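The flow interpretation suggests a uniform numerical test of (8), (9) and (10): assign to each node its output rate (\(\eta _i\) on W, \(\mu _i\) on V) and compare the flows in both directions. The following sketch is illustrative only; the function name, the flag `check_v_pairs` and the dictionary encoding are assumptions.

```python
def reversibility_holds(eta, mu, r, V, check_v_pairs=True, tol=1e-9):
    """Check flow(i) r(i,j) == flow(j) r(j,i) for all pairs of nodes.

    flow(i) is eta_i for i in W and mu_i for i in V, so the test covers
    (8) for i, j in W, (9) for i in W, j in V, and, if check_v_pairs is
    True, also (10) for i, j in V.
    r: dict {(i, j): probability}; missing pairs count as 0.
    """
    nodes = sorted(eta)
    flow = {i: (mu[i] if i in V else eta[i]) for i in nodes}
    for i in nodes:
        for j in nodes:
            if i == j:
                continue
            if not check_v_pairs and i in V and j in V:
                continue
            forward = flow[i] * r.get((i, j), 0.0)
            backward = flow[j] * r.get((j, i), 0.0)
            if abs(forward - backward) > tol:
                return False
    return True
```

Setting `check_v_pairs=False` corresponds to the hypotheses of Lemma 1(i), which needs only (8) and (9). A one-way tandem fails the test because flow in one direction has no counterpart in the other.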
For rerouting in order to skip broken down nodes, we assume that the unreliable nodes in V are rate stable in the sense of [10, p. 76], i.e., these nodes have equal input and output rates.
Lemma 2
For the solution \((\eta _i:i\in \tilde{J})\) of (5) let the following hold: For all nodes \(i\in W\) without infinite supply \(\eta _i<\mu _i\), and
If, in the case of breakdowns of nodes in \(I\subseteq D\), customers are rerouted by skipping I, then the traffic equation (7) is solved by \((\eta ^I_i:= \eta _i,i\in \tilde{J}{\setminus }I).\)
Proof
We make the ansatz \(\eta _i=\eta ^I_i\) for all \(i\in W{\setminus }I\) and all \(I\subseteq D\). We then obtain, with the solution \(\eta _i\) of the traffic equations (5), for any \(I\subseteq D\): \(\forall i\in W{\setminus }I\),
Since \(\eta ^I_j=\eta _j\) holds for all \(j\in W{\setminus }I\) and all \(I\subseteq D\), it follows that, for all \(i\in V{\setminus }I\) and \(I\subseteq D\),
which is the left-hand side of (12) with \(i\in V{\setminus }I\). The above computations in (12) are valid for all \(i\in \tilde{J}{\setminus }I\); hence, it follows that \(\eta ^I_i=\eta _i\) for all \(i\in V{\setminus }I\) and \(I\subseteq D\), too. \(\square \)
We illustrate the adjusted traffic equations with the example from Sect. 3, where nodes in \(D = \{ 3 , 5 \}\) are unreliable. Then condition (11) requires \(\eta _5=\mu _5\) because node \(5\in D\cap V\), and node 5 is not stable. So, (4) is the relevant traffic equation. Due to the network’s feedforward structure we can evaluate the arrival rates directly.
Example 1
For \( I = \emptyset \) the solution of the standard traffic equation (4) is \( \eta \):
If nodes in \( D = \{ 3 , 5 \} \) are down, under stalling \( \eta _ i = 0 \) for all i.
In the case of skipping, we obtain \( \eta _i^I = \eta _i \) for \( i \in \{ 1 ,2 , 4, 6, 7 \}\), and \( \eta _i^I = 0 \) for \( i=3 , 5 \). Blocking rs–rd cannot be implemented in this network if \( \eta _i^I = \eta _i \) for \( i \in \{ 1 ,2 , 4, 6, 7 \}\) is required, because the routing chain is not reversible.
In the following we present examples of networks that fulfill the requirements to apply blocking rs–rd: a network with linear topology and a star-shaped topology network.
Example 2
(Two-way tandem network) Consider a network with \(J=\{1,2,3\}\), \(V=\{2\}\), \(W=\{1,3\}\), and \(D=V\), i.e., the node with infinite supply is prone to failure. Routing is given by
for \( 0< a , b < 1 \), and \( \lambda _i > 0 \), \( i=1,3\). The network is a two-way tandem of three nodes; see Fig. 2. The infinite supply is depicted by a dashed arrow pointing to server 2, and the node that is prone to failure is depicted as a bold circle. Note that by incorporating node 0, the linear topology is transformed into a ring.
For ease of analysis we parameterize the model and set \( \lambda _1 = ( 1 -a ) t \), for \( t > 0 \), \( \lambda _3 = a t \), and \( b = 1 - c \). The service rates are
The standard traffic equations (4) then have the solution
Note that this implies that \( U = \{ 2 \} \).
With this choice of parameters it can be seen after some tedious algebra that the network satisfies the reversibility conditions from Lemma 1 and rate stability from Lemma 2. Thus, all three rerouting disciplines apply.
\( \emptyset , \{ 2 \}\) are the only possible breakdown scenarios, and with the breakdown and repair rates from the motivating example in Sect. 3 we have
Example 3
(Star-shaped network) Consider a network with \(J=\{1,2, \ldots , 6 \}\), \(V=\{2, 3 ,4 \}\), \(W=\{1,5, 6 \}\), and \(D=V\), i.e., all infinite supply nodes are prone to failure. Jobs arrive from outside with rate \( \lambda \) at the central node 1. From node 1 they go with probability \(r/5\) to any of the nodes 2 to 6, for \( r \in ( 0 , 1 ) \). After finishing service at node \( i = 2 , \ldots , 6 \), jobs are sent back to the central node 1. Being served there, they either leave the system with probability \(1- r \), or are sent back to one of the servers in the set \( \{ 2 ,\ldots , 6 \}\) according to the routing scheme described above. The network is a star-shaped network; see Fig. 3. Infinite supply is depicted by dashed arrows, and nodes prone to failure are depicted by bold circles.
The traffic equations are
where \( \eta _5 = \eta _6 = r \eta _1 / 5\), provided that nodes 5, 6 are stable. For blocking rs–rd and skipping to be applicable, we let
which implies \( \eta _1= \lambda / (1 - r )\), and thus
Eventually, we let
in order to let 5, 6 be stable nodes. Indeed, this choice implies \( \eta _i < \mu _i\), \( i= 5,6\). The above conditions imply that the reversibility conditions from Lemma 1 and the rate stability condition from Lemma 2 are satisfied.
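The relation \( \eta _1= \lambda / (1 - r )\) can be verified numerically. The sketch below assumes that the elided condition is rate stability of the infinite supply nodes, i.e. \(\mu _i = r \eta _1/5\) for \(i=2,3,4\), and that the stable leaves 5 and 6 return their whole input \(r\eta _1/5\) to node 1; under these assumptions equation (6) for node 1 reads \(\eta _1=\lambda +\mu _2+\mu _3+\mu _4+2r\eta _1/5\). The function name and the numbers are hypothetical.

```python
def star_rates(lam, r, mu_supply):
    """Solve the node-1 traffic equation of the star network.

    eta_1 = lam + sum(mu_supply) + 2 * (r / 5) * eta_1,
    where mu_supply holds the service rates of nodes 2, 3, 4 (in V) and
    the two stable leaves 5, 6 each receive and return (r / 5) * eta_1.
    Returns (eta_1, eta_leaf).
    """
    eta1 = (lam + sum(mu_supply)) / (1.0 - 2.0 * r / 5.0)
    eta_leaf = r * eta1 / 5.0
    return eta1, eta_leaf


# Illustrative parameters, not taken from the paper.
lam, r = 1.0, 0.5
mu_v = r * lam / (5.0 * (1.0 - r))          # assumed rate-stable choice
eta1, eta_leaf = star_rates(lam, r, [mu_v] * 3)

print(eta1, lam / (1.0 - r))                # both values should agree
print(eta_leaf, mu_v)                       # input rate of a leaf vs. mu_i on V
```

With this choice the flow from node 1 to each leaf, \(\eta _1 r/5\), equals the flow back from that leaf, which is the content of the reversibility constraints (8) and (9) for the star topology.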
Recall that \( D = \{ 2 , 3, 4 \}\). Hence, \( \emptyset , \{ 2 \}\), \( \{ 3\}\), \(\{ 4 \} \), \( \{ 2 ,3 \}\), \( \{3 ,4 \} \), \( \{2 ,4 \}\) and D are the possible breakdown scenarios, and with breakdown and repair rates from the example in Sect. 3 we have
5.2 Long-time behavior
In this section we study the long-time behavior of extended Jackson networks.
Theorem 2
Let \(W\subseteq S\) (nodes without infinite supply are stable). Denote by \(\eta =(\eta _1,\ldots ,\eta _J)\) the solution of the traffic equations (5). Under stalling the following hold:
(i) For nodes without infinite supply, the joint marginal limiting distribution is
$$\begin{aligned} \lim _{t\rightarrow \infty } P(Y(t)&=I;X_i(t)=n_i: i\in W) \nonumber \\&=\left( \sum _{K \subseteq D} \frac{A(K)}{B(K)}\right) ^{-1}\frac{A(I)}{B(I)} \prod _{i\in W} \left( 1-\frac{\eta _i}{\mu _i}\right) \left( \frac{\eta _i}{\mu _i}\right) ^{n_i},\nonumber \\&\qquad (\forall I\subseteq D, ~~(n_i: i\in W)\in \mathbb {N}^{|W|}) \end{aligned}$$(13)
and this is a stationary distribution on W as well.
(ii) If the global network process is started with an initial distribution which has the marginal (13) on W, the arrival stream from \(i\in W\) to \(j\in V\) is Poisson with rate \(\eta _i r(i,j)\) whenever all nodes are in up status. These streams are independent given the nodes are up.
(iii) Assume the global network process is started with an initial distribution which has marginal (13) on W. Then the marginal limiting distribution for a stable node \(i\in V\) with \(r(i,i)=0\) is, for all \(I\subseteq D\) and all \(n_i\in \mathbb {N}\),
$$\begin{aligned} \lim _{t\rightarrow \infty } P(Y(t)=I;X_i(t)=n_i)= \left( \sum _{K \subseteq D} \frac{A(K)}{B(K)}\right) ^{-1}\frac{A(I)}{B(I)} \left( 1-\frac{\eta _i}{\mu _i}\right) \left( \frac{\eta _i}{\mu _i}\right) ^{n_i} \end{aligned}$$(14)
if and only if \(\eta _i<\mu _i\). This is a one-dimensional stationary distribution as well.
Moreover, if \(\eta _i\ge \mu _i\) for node \(i\in V\), then for its limiting probability the following holds:
$$\begin{aligned} \lim _{t\rightarrow \infty } P(Y(t)=I;X_i(t)=n_i)= 0,\quad \forall I\subseteq D, ~n_i\in \mathbb {N}. \end{aligned}$$(15)
The proof and the proof of the next theorem are postponed to the Appendix.
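The product form (13) is straightforward to evaluate. The sketch below computes the right-hand side of (13) for a given availability state and queue-length vector, together with the mean queue length \(\rho _i/(1-\rho _i)\), \(\rho _i=\eta _i/\mu _i\), of the geometric marginal. The function names and the dictionary encoding are assumptions, and the numbers in the usage lines are made up.

```python
def joint_probability(I, n, A, B, eta, mu):
    """Right-hand side of (13).

    I: frozenset of down nodes; n: dict {i: n_i} for the nodes i in W;
    A, B: dicts {frozenset: value} over all subsets of D, with B > 0;
    eta, mu: dicts {i: rate} with eta_i < mu_i for i in W.
    """
    C = sum(A[K] / B[K] for K in A)          # normalizing constant of Y
    p = (A[I] / B[I]) / C                    # availability factor
    for i, n_i in n.items():
        rho = eta[i] / mu[i]
        p *= (1.0 - rho) * rho ** n_i        # geometric factor of node i
    return p


def mean_queue_length(eta_i, mu_i):
    """Mean of the geometric distribution with parameter eta_i / mu_i."""
    rho = eta_i / mu_i
    return rho / (1.0 - rho)


# One unreliable node (node 2) and one node in W (node 1); illustrative values.
up, down = frozenset(), frozenset({2})
A = {up: 1.0, down: 0.5}
B = {up: 1.0, down: 2.0}
eta, mu = {1: 0.5}, {1: 1.0}

print(joint_probability(up, {1: 0}, A, B, eta, mu))
print(mean_queue_length(eta[1], mu[1]))
```

Because the availability factor and the queue-length factors multiply, marginal quantities such as the mean queue length at a node in W do not depend on the breakdown and repair rates.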
Theorem 3
Let \(W\subseteq S\) (nodes without infinite supply are stable). Denote by \(\eta =(\eta _1,\ldots ,\eta _J)\) the solution of the traffic equations (5). In the case of breakdowns, customers are rerouted according to the blocking rs–rd regime or the skipping regime. If blocking rs–rd is in force we require the reversibility constraints (8) and (9). If skipping is in force, let (11) hold. Then for nodes without infinite supply the joint marginal limiting distribution is
and this is a stationary distribution on W as well.
The results of Goodman and Massey [5] are on classical Jackson networks where some nodes are not stable. They prove a product-form limiting distribution for the stable subnetwork, but there is no such result for a stationary distribution. The reason is that the exploding unstable nodes influence the stable part of the network. Over any finite (transient) time horizon [0, t] the departure streams from the unstable nodes are not Poisson. Put differently, only the limiting distribution is known. Fortunately, the proofs of Theorem 2(i) and (iii) and of Theorem 3 allow us to establish the following result without assuming ergodicity of the whole process:
Corollary 1
Under the conditions of Theorems 2 and 3, the process:
is an ergodic homogeneous Markov process of its own. If, in addition, for \(i\in V\) it holds that \(\eta _i<\mu _i\), then the process
is an ergodic homogeneous Markov process of its own for \( i \in V \).
Remark 2
In the setting of Theorem 3 a statement as in Theorem 2(iii) cannot be proved with the methods used here. This is due to the requirement that \(r(i,i)=0\) and the properties of the rerouting regimes skipping and blocking rs–rd. Whenever nodes in \(I\ne \emptyset \) are down, immediate feedback may emerge even at nodes in V with \(r(i,i)=0\). If \(i\in V{\setminus }I\) and \(r(i,j)>0\) for at least one \(j\in I\), then \(r^I(i,i)>0\) may occur.
On the other hand, if the network’s topology prevents occurrence of feedback by skipping or rs–rd regime in case of breakdown, it is possible to prove a counterpart to Theorem 2(iii) in the setting of Theorem 3.
Our motivating example from Sect. 3 in Fig. 1 is a feedforward network according to the following definition. Feedforward networks constitute an important subclass of Jackson networks.
Definition 5
A network with node set \(\tilde{J}\) with \(|\tilde{J}| = J\) is a feedforward network if there exists an enumeration \(\tilde{J} = \{1,2, \ldots , J\}\) of the nodes such that:
Feedforward networks are not reversible, and therefore in the case of breakdowns we must resort to stalling or skipping as a rerouting scheme. The following property of feedforward networks is intuitive.
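To illustrate Definition 5, the following sketch tests whether a given routing matrix admits a feedforward enumeration. It assumes the usual reading of the defining condition, namely that after relabeling customers are routed only from lower-numbered to higher-numbered nodes, so that the routing graph is acyclic and has no self-loops. The function name and the matrix encoding are illustrative and not taken from the paper.

```python
from collections import deque


def feedforward_order(r):
    """Return a node enumeration along which routing only moves forward, or None.

    r[i][j] is the routing probability from node i to node j (source and sink
    are excluded). Assumption: "feedforward" means the routing graph is acyclic,
    i.e. r(i, j) = 0 whenever j <= i after relabeling the nodes.
    """
    n = len(r)
    # Count incoming routing edges per node (a self-loop counts as well).
    indegree = [0] * n
    for i in range(n):
        for j in range(n):
            if r[i][j] > 0:
                indegree[j] += 1

    # Kahn's algorithm: repeatedly remove nodes without incoming edges.
    queue = deque(i for i in range(n) if indegree[i] == 0)
    order = []
    while queue:
        i = queue.popleft()
        order.append(i)
        for j in range(n):
            if r[i][j] > 0:
                indegree[j] -= 1
                if indegree[j] == 0:
                    queue.append(j)

    # All nodes were removed exactly when the routing graph has no cycle.
    return order if len(order) == n else None


# Hypothetical three-node tandem line and a two-node network with feedback.
tandem = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.5], [0.0, 0.0, 0.0]]
feedback = [[0.0, 1.0], [0.5, 0.0]]
```

For the tandem line the function returns an enumeration, whereas for the network with feedback it returns `None`.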
Lemma 3
If, in a feedforward network with node set \(\tilde{J} = \{1,2, \ldots , J\}\), a subset \(\emptyset \subseteq I\subseteq \tilde{J}\) of nodes is down and either skipping or stalling is applied as a rerouting scheme, then
and therefore there is no immediate feedback at any node.
Theorem 4
Consider a feedforward network with \(W\subseteq S\) (nodes without infinite supply are stable). Denote by \(\eta =(\eta _1,\ldots ,\eta _J)\) the solution of the traffic equations (5). In the case of breakdowns, customers are rerouted according to the skipping regime. Assume that (11) holds and that the global network process is started with an initial distribution that has the marginal (13) on W.
Then the marginal limiting distribution for a stable node \(i\in V\) is, for all \(I\subseteq D\) and all \(n_i\in \mathbb {N}\),
if and only if \(\eta _i<\mu _i\). This is a one-dimensional stationary distribution as well.
Moreover, if \(\eta _i\ge \mu _i\) for node \(i\in V\), then for its limiting probability
The proof is similar to that of Theorem 2, part (iii), with Lemma 3.
5.3 Applications
Standard performance evaluation requires ergodicity of the underlying Markov processes, which allows one to approximate long-time average cost functions by integrals of the cost function under stationary distributions.
Our framework overcomes this restriction and allows us to investigate even non-ergodic networks by way of subnetworks where stabilization occurs only in the long run. As stated in Corollary 1, some important subnetworks of stable nodes can be considered as networks of their own. Therefore, for these parts we can extend the traditional analysis directly. We emphasize, however, that even if there exists no equilibrium on the stable subnetworks, performance analysis for long-time averages of cost functions is possible via integrals of the cost function under the limiting distribution on stable nodes or stable subnetworks, respectively. For details we refer to Section 4.2 in [13] and Section 4.6.4 in [12]. In the following, we state our results for the setting of Corollary 1.
The availability process Y in Theorems 2 and 3 is an ergodic Markov process of its own with limiting and stationary distribution
From this, the stationary (time) point availability of a Jackson network with infinite supply and unreliable nodes (or subnetworks thereof) may be computed similarly to [16, p.185] as \( \mathrm{PA}(H)(t):=\sum _{K\subseteq D{\setminus }H} \pi (K) \), for \( H\subseteq D, t\ge 0 \), where \(\pi (I)\) is the probability that exactly the nodes in \(I\subseteq D\) are under repair, given by (17). We now provide an overview of the main performance characteristics.
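The point availability formula can be evaluated by direct enumeration of subsets. The sketch below assumes that the availability distribution \(\pi\) is supplied as a dictionary mapping sets of down nodes to probabilities; the numerical values are hypothetical and are not derived from (17).

```python
from itertools import combinations


def subsets(nodes):
    """Yield every subset of `nodes` as a frozenset."""
    nodes = sorted(nodes)
    for size in range(len(nodes) + 1):
        for combo in combinations(nodes, size):
            yield frozenset(combo)


def point_availability(pi, D, H):
    """PA(H): sum of pi(K) over all K contained in D minus H.

    pi maps frozensets of down nodes to their stationary probability,
    D is the set of unreliable nodes, and H is the set of nodes required
    to be up.
    """
    allowed = set(D) - set(H)
    return sum(pi.get(K, 0.0) for K in subsets(allowed))


# Hypothetical availability distribution for D = {1, 2}.
D = {1, 2}
pi = {
    frozenset(): 0.70,
    frozenset({1}): 0.15,
    frozenset({2}): 0.10,
    frozenset({1, 2}): 0.05,
}
pa_node_1 = point_availability(pi, D, {1})  # node 1 is up, node 2 arbitrary
```

Enumeration is exponential in \(|D{\setminus }H|\), so this is only meant for small sets of unreliable nodes.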
Under stalling, the stationary throughput at node \(i\in W\) (no infinite supply) is \( \eta _i\cdot \pi (\emptyset ),\) the mean queue length is \( ({\eta _i}/{\mu _i}) \left( 1 -({\eta _i}/{\mu _i}) \right) ^{ -1} \), and the mean waiting time is \( \big ({(\mu _i - \eta _i ) \, \pi (\emptyset )}\big )^{-1} \).
Under blocking rs–rd and skipping, the stationary throughput at a node \(i\in W\) is \( \eta _i\cdot \sum _{I\subseteq D, i\notin I}\pi (I) \), the mean queue length at node i is \( ({\eta _i}/{\mu _i}) \left( 1 -({\eta _i}/{\mu _i}) \right) ^{ -1} \), and the mean waiting time is \( \big ({(\mu _i - \eta _i) \, \sum _{I\subseteq D, i\notin I}\pi (I) }\big )^{-1} \).
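The performance measures stated above for the two groups of regimes can be evaluated with a few lines of code. The following is a minimal sketch; the function name, the regime labels and the numerical values are illustrative assumptions, and \(\pi\) is again passed as a dictionary over sets of down nodes.

```python
def performance_measures(eta_i, mu_i, pi, i, regime):
    """Return (throughput, mean queue length, mean waiting time) at node i in W.

    eta_i, mu_i: arrival rate (traffic-equation solution) and service rate.
    pi: dict mapping frozensets of down nodes to stationary probabilities.
    regime: "stalling", "blocking" (rs-rd) or "skipping".
    """
    if not 0 < eta_i < mu_i:
        raise ValueError("node must be stable: 0 < eta_i < mu_i")

    if regime == "stalling":
        # Service takes place only while all nodes are up.
        active = pi.get(frozenset(), 0.0)
    elif regime in ("blocking", "skipping"):
        # Service takes place whenever node i itself is up.
        active = sum(p for down, p in pi.items() if i not in down)
    else:
        raise ValueError("unknown regime: " + regime)

    rho = eta_i / mu_i
    throughput = eta_i * active
    mean_queue_length = rho / (1.0 - rho)
    mean_waiting_time = 1.0 / ((mu_i - eta_i) * active)
    return throughput, mean_queue_length, mean_waiting_time


# Hypothetical data: node 1 with eta = 1, mu = 2, and D = {1, 2}.
pi = {
    frozenset(): 0.70,
    frozenset({1}): 0.15,
    frozenset({2}): 0.10,
    frozenset({1, 2}): 0.05,
}
stalling = performance_measures(1.0, 2.0, pi, 1, "stalling")
skipping = performance_measures(1.0, 2.0, pi, 1, "skipping")
```

In both cases the product of throughput and mean waiting time equals the mean queue length, as Little's law requires.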
The above properties follow from Theorems 2, 3 and 4, which show that the asymptotic mean queue length at a stable node can be computed. Invoking Little’s law, see [19, 20], yields the mean waiting times at stable nodes.
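For the stalling regime, the Little's law step can be written out as follows. The symbols \(L_i\), \(\mathrm{TH}_i\) and \(T_i\) for mean queue length, throughput and mean waiting time are notation introduced here for illustration, and \(\eta _i>0\) is assumed.

```latex
\begin{align*}
L_i &= \frac{\eta_i/\mu_i}{1-\eta_i/\mu_i} = \frac{\eta_i}{\mu_i-\eta_i},
\qquad
\mathrm{TH}_i = \eta_i\,\pi(\emptyset),\\
T_i &= \frac{L_i}{\mathrm{TH}_i}
     = \frac{\eta_i}{(\mu_i-\eta_i)\,\eta_i\,\pi(\emptyset)}
     = \frac{1}{(\mu_i-\eta_i)\,\pi(\emptyset)}.
\end{align*}
```

Under blocking rs–rd and skipping the same computation applies with \(\pi (\emptyset )\) replaced by \(\sum _{I\subseteq D, i\notin I}\pi (I)\).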
6 Conclusion
We have integrated breakdown and repair of servers, infinite-supply servers, and unstable network parts into a single framework for Jackson networks. We obtained closed-form solutions for the steady-state queue-length distribution at stable nodes and for key performance measures. Future research will extend our results to state-dependent and, more generally, path-history-dependent failure rates.
References
[1] Abramowitz, M., Stegun, I.: Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables. Dover Publications Inc., New York (1992). (reprint of the edition 1972)
[2] Adan, I., Weiss, G.: A two-node Jackson network with infinite supply of work. Probab. Eng. Inf. Sci. 19, 191–212 (2005)
[3] Chen, H., Yao, D.D.: Fundamentals of Queueing Networks. Springer, Berlin (2001)
[4] Doshi, B.T.: Single server queues with vacations. In: Takagi, H. (ed.) Stochastic Analysis of Computer and Communication Systems, pp. 217–267. IFIP, Amsterdam (1990)
[5] Goodman, J.B., Massey, W.A.: The non-ergodic Jackson network. J. Appl. Probab. 21, 860–869 (1984)
[6] Guo, Y., Lefeber, E., Nazarathy, Y., Weiss, G., Zhang, H.: Stability of multi-class queueing networks with infinite virtual queues. Queueing Syst. 76(3), 309–342 (2014)
[7] Jackson, J.R.: Networks of waiting lines. Oper. Res. 5, 518–521 (1957)
[8] Kelly, F.P.: Reversibility and Stochastic Networks. Wiley, Chichester (1979)
[9] Kopzon, A., Weiss, G.: A push-pull queueing system. Oper. Res. Lett. 30, 351–359 (2002)
[10] Kopzon, A., Nazarathy, Y., Weiss, G.: A push-pull network with infinite supply of work. Queueing Syst. 65, 75–111 (2009)
[11] Levy, Y., Yechiali, U.: Utilization of idle time in an M/G/1 queueing system. Manage. Sci. 22, 202–211 (1975)
[12] Mylosz, J.: Local stabilization of non-ergodic Jackson networks with unreliable nodes. Ph.D. Thesis, Hamburg (2013)
[13] Mylosz, J., Daduna, H.: On the behavior of stable subnetworks in non-ergodic networks with unreliable nodes. Comput. Netw. 53(8), 1249–1263 (2009)
[14] Melamed, B.: On Poisson traffic processes in discrete-state Markovian systems with applications to queueing theory. Adv. Appl. Probab. 11(1), 218–239 (1979)
[15] Nazarathy, Y., Weiss, G.: Positive Harris recurrence and diffusion scale analysis of a push pull queueing network. Perform. Eval. 67, 201–217 (2010)
[16] Sauer, C., Daduna, H.: Availability formulas and performance measures for separable degradable networks. Econ. Qual. Control 18, 165–194 (2003)
[17] Sommer, J., Daduna, H., Heidergott, B.: Jackson networks with infinite supply and unreliable nodes. Schwerpunkt Mathematische Statistik und Stochastische Prozesse, Fachbereich Mathematik der Universität Hamburg, Hamburg (2014). (Preprint 2014-03)
[18] Sommer, J., Daduna, H., Heidergott, B.: Nonergodic Jackson networks with infinite supply—local stabilization and local equilibrium analysis. J. Appl. Probab. 53, 1125–1142 (2016)
[19] Stidham, S.: \(L = \lambda \cdot W\): a discounted analogue and a new proof. Oper. Res. 20, 1115–1126 (1972)
[20] Stidham, S.: A last word on \(L = \lambda \cdot W\). Oper. Res. 22, 417–421 (1974)
[21] Vandaele, N., Van Woensel, T., Verbruggen, A.: A queuing based traffic flow model. Transp. Res. D Transp. Env. 5, 121–135 (2000)
[22] Weiss, G.: Jackson-networks with unlimited supply of work. J. Appl. Probab. 42, 879–882 (2005)
[23] Van Woensel, T., Vandaele, N.: Modelling traffic flows with queueing models: a review. Asia Pac. J. Oper. Res. 24(04), 1–27 (2006)
Appendix
Proof of Theorem 2
(i) Assume first that all nodes are in up status (\(I=\emptyset \)). We start the proof by invoking the subnetwork argument from the proof of Theorem 13 in [18]. It guarantees that the subnetwork W constitutes a Jackson network in which the source and sink represent \(\{0\}\cup V\). The corresponding queueing process \(\tilde{X}:=((\tilde{X}_i(t):i\in W):t\in \mathbb {R}_+)\) is a Markov process of its own. The traffic equations of the described subnetwork W are given by
so \(\eta _i=\tilde{\eta }_i\) holds for all \(i\in W\). According to Jackson’s theorem (see [7]), \(\tilde{X}\) has the unique stationary and limiting distribution
because \(\eta _i<\mu _i\) for all \(i\in W\). Thus, even if the subnetwork V of nodes with infinite supply is not in equilibrium, the equilibrium on the subnetwork W of nodes without infinite supply is preserved, if the initial distribution has the joint marginal (18).
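The two ingredients of this step, the traffic equations of the subnetwork W and Jackson's product form, can be sketched numerically. The code below assumes that the subnetwork traffic equations read \(\tilde{\eta }_i=\tilde{\lambda }_i+\sum _{j\in W}\tilde{\eta }_j r(j,i)\), with \(\tilde{\lambda }_i=\lambda _i+\sum _{j\in V}\mu _j r(j,i)\) supplied as input, and that the stationary distribution is the classical product of geometric marginals. Function names and numerical values are hypothetical.

```python
def solve_traffic_equations(lam_tilde, r_W, tol=1e-12, max_iter=100_000):
    """Solve eta_i = lam_tilde_i + sum_j eta_j * r_W[j][i] by fixed-point iteration.

    lam_tilde: effective external arrival rates into the nodes of W.
    r_W: routing probabilities restricted to W (substochastic matrix).
    The iteration converges when customers eventually leave W.
    """
    n = len(lam_tilde)
    eta = list(lam_tilde)
    for _ in range(max_iter):
        new = [
            lam_tilde[i] + sum(eta[j] * r_W[j][i] for j in range(n))
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, eta)) < tol:
            return new
        eta = new
    raise RuntimeError("traffic equations did not converge")


def product_form_probability(eta, mu, queue_lengths):
    """Jackson product form: product over i of (1 - rho_i) * rho_i ** n_i."""
    prob = 1.0
    for e, m, n in zip(eta, mu, queue_lengths):
        rho = e / m
        if rho >= 1.0:
            raise ValueError("product form requires eta_i < mu_i")
        prob *= (1.0 - rho) * rho**n
    return prob


# Hypothetical two-node subnetwork W: node 0 feeds node 1 with probability 0.5.
lam_tilde = [1.0, 0.5]
r_W = [[0.0, 0.5], [0.0, 0.0]]
mu = [2.0, 4.0]
eta = solve_traffic_equations(lam_tilde, r_W)
p_empty = product_form_probability(eta, mu, [0, 0])
```

Fixed-point iteration is chosen here only to keep the sketch free of external dependencies; solving the linear system directly is equally valid.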
This joint queue-length process \(\tilde{X}\) is coupled with an availability process Y which only depends on the interaction of the nodes in \(D\subseteq \tilde{J}\) but not on their load. Whenever a node in D breaks down, stalling occurs, so all nodes go into a warm standby and all arrivals and services are interrupted until all nodes return to the up status. The network process \((Y,\tilde{X})\) is a Markov process on the state space \(\mathcal {P}(D)\times \mathbb {N}^{|W|}\). The balance equations for the subnetwork W are, for all \((\emptyset ,n_k:k\in W)\in \{\emptyset \}\times \mathbb {N}^{|W|}\), given by
and, for all \((I,n_k:k\in W)\in \mathcal {P}(D)\times \mathbb {N}^{|W|}\) with \(I\ne \emptyset \),
We have to show that (13) solves these equations. In the following we denote
for all \((I,n_k:k\in W)\in \mathcal {P}(D)\times \mathbb {N}^{|W|}\), which is (13) before normalization, and plug it into the above balance equations instead of \(\pi (I,n_k:k\in W)\).
In the first equation (19) the term
on the left-hand side is equal to the term \( \hat{\pi }(I,n_k:k\in W) \beta (I,\emptyset )=\hat{\pi }(I,n_k:k\in W)B(I) \) on the right-hand side for each \(\emptyset \ne I\subseteq D\). The remainder of (19) is the global balance equation of a classical Jackson network which has the solution (see [7])
Consider the second equation (20) for some fixed \(I\ne \emptyset \). For any \(K\subset I\), \(K\ne \emptyset \), the term
on the left-hand side is equal to the following term on the right-hand side:
Moreover, for any \(I\subset H \subseteq D\), the term
on the left-hand side is equal to the term
on the right-hand side.
The proof of (i) is finished by normalization, which is possible because \(\eta _i<\mu _i\) holds for all \(i\in W\).
(ii) It is well known that ergodic Jackson networks have, in equilibrium, Poisson departure streams from node i to the sink with rate \(\tilde{\eta }_i \tilde{r}(i,0)\); see [14, Example 7.1]. From the proof of (i), we know that the subset W behaves like an ergodic Jackson network with unreliable nodes of its own with \(\tilde{\lambda }_i:=\lambda _i +\sum _{j\in V} \mu _j r(j,i)\) and
Hence, if the subnetwork W is in equilibrium, as long as all nodes are in up status, departures to the sink from nodes \(i\in W\) are Poisson streams with rate \(\eta _i r(i,0)\) and departures from \(i\in W\) to any node \(j\in V\) are also Poisson streams with rate \(\eta _i r(i,j)\), because a portion \({r(i,j)}/({r(i,0)+\sum _{k\in V}r(i,k)})\) of the departure stream from node \(i\in W\) is directed to \(j\in V\).
(iii) Under the condition that all nodes \(j\in \tilde{J}\) are in up status, we start the proof by invoking the M/M/1 argument from the proof of Theorem 13 in [18].
This argument leads to the conclusion that if the subnetwork W is in equilibrium and if \(r(i,i)=0\), node \(i\in V\) behaves as an M/M/1 system of its own. The corresponding queue-length process \(\hat{X}\) is a birth–death process on state space \(\mathbb {N}\) with birth rates \(\hat{\lambda }_i = \eta _i\) and death rates \( \mu _i\).
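As a quick numerical check of the birth–death structure, the sketch below evaluates the global balance equations of an M/M/1 queue at the geometric distribution \((1-\rho )\rho ^{n}\) with \(\rho =\eta _i/\mu _i\). This covers only the case in which all nodes are up; the function names and rates are illustrative assumptions.

```python
def mm1_geometric(rho, n):
    """Stationary probability of n customers in a stable M/M/1 queue."""
    return (1.0 - rho) * rho**n


def balance_residual(lam, mu, n):
    """Outflow minus inflow at state n of the M/M/1 birth-death process.

    lam is the birth rate (here eta_i), mu the death rate. The residual
    vanishes when the geometric distribution satisfies global balance.
    """
    rho = lam / mu
    if n == 0:
        # State 0 has no service completion and no predecessor state.
        return lam * mm1_geometric(rho, 0) - mu * mm1_geometric(rho, 1)
    outflow = (lam + mu) * mm1_geometric(rho, n)
    inflow = lam * mm1_geometric(rho, n - 1) + mu * mm1_geometric(rho, n + 1)
    return outflow - inflow


# Hypothetical rates eta_i = 1 and mu_i = 2.
residuals = [balance_residual(1.0, 2.0, n) for n in range(20)]
```

All residuals are zero up to floating-point rounding, which is the content of the statement that the geometric distribution solves the M/M/1 balance equations.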
This queue-length process \(\hat{X}\) is here coupled with an availability process Y on \(\mathcal {P}(D)\), \(D\subseteq \tilde{J}\), where breakdown and repair of nodes only depend on the interaction of the nodes but not on their queue length. Whenever a node in D breaks down, stalling occurs, so all nodes go into a warm standby and all arrivals and services are interrupted until all nodes return to the up status.
The network process \((Y,\hat{X})\) is a Markov process on the state space \(\mathcal {P}(D)\times \mathbb {N}\). The balance equations are
for all \((\emptyset ,n_i)\in \{\emptyset \}\times \mathbb {N}\) and
for all \((I,n_i)\in \mathcal {P}(D)\times \mathbb {N}\) with \(I\ne \emptyset \).
We have to show that (14) solves these equations. In the following we set
for all \((I,n_i)\in \mathcal {P}(D)\times \mathbb {N}\) as the non-normalized proposed solution density.
In the first equation (21) the term
on the left-hand side is equal to the term \( \hat{\pi }_i(I,n_i) \beta (I,\emptyset )=\hat{\pi }_i(I,n_i)B(I) \) on the right-hand side for each \(\emptyset \ne I\subseteq D\). The remainder of (21) is the global balance equation of an M/M/1 system, which has the solution
since \(\hat{\lambda }_i=\eta _i\) holds.
Consider the second equation (22) for some fixed \(I\ne \emptyset \). For any \(K\subset I\), \(K\ne \emptyset \), we have
and
which yields
Moreover, for any \(I\subset H \subseteq D\), we have
and
which implies
The proof of (iii) is finished by normalization, which is possible because \(\eta _i<\mu _i\).
The limiting probability (15) for unstable nodes with infinite supply follows from the same arguments as in the proof of Theorem 15 in [13].
Proof of Theorem 3
Consider the subset W of nodes without infinite supply. For any subset \(I\subseteq D\) of broken down nodes, we have the following facts for the subset \(W{\setminus }I\) which remain in force as long as I is unchanged:
-
All service times of all up nodes are exponentially distributed, and the service discipline at all nodes is FCFS.
-
Routing of customers is Markovian: A customer completing service at node \(i\in W{\setminus }I\) will either move to some node \(j\in W{\setminus }I\) with probability \(r^I(i,j)\) or leave the subnetwork with probability \(1-\sum _{j\in W{\setminus }I}r^I(i,j)\).
-
At each node \(i\in W{\setminus }I\), we have external arrivals from the source which are independent Poisson streams with rate \(\lambda ^I_i\ge 0\). Furthermore, all arrivals from nodes \(j\in V{\setminus }I\) with infinite supply into nodes \(i\in W{\setminus }I\) are independent Poisson streams at rate \(\mu _j r^I(j,i)\); see Theorem 1. The sum of independent Poisson streams is a Poisson stream; hence, the arrival stream from the outside of the subset \(W{\setminus }I\) into each node \(i\in W{\setminus }I\) is a Poisson process with rate \(\lambda ^I_i + \sum _{j\in V{\setminus }I} \mu _j r^I(j,i)\).
-
All service times and all inter-arrival times are independent of each other.
Let \(\tilde{X}:=((\tilde{X}_i(t):i\in W{\setminus }I):t\in \mathbb {R}_+)\) be the queueing process of this subnetwork. The process is supplemented with a Markov process \(Y=(Y(t):t\in \mathbb {R}_+)\) which describes the availability status of the nodes and therefore gives information on how long the network process on the subnet \(W{\setminus }I\) lives until it jumps to the next Markov process on some randomly chosen subnet \(W{\setminus }K\), \(K\subseteq D\). Rerouting is according to the blocking rs–rd regime (skipping, resp.). The balance equations of the joint availability queue-length process \((Y,\tilde{X}_i:i\in W)\) are, \(\forall (I,n_i:i\in W)\in \mathcal P(D) \times \mathbb {N}^{|W|}\),
We have to show that the distribution given by (16) solves equation (23) for all \((n_i:i\in W)\in \mathbb {N}^{|W|}\) and all \(I\subseteq D\). In the following we set
for all \((n_i:i\in W)\in \mathbb {N}^{|W|}\) and all \(I\subseteq D\), and consider equation (23) for some fixed \(I\subseteq D\).
For any \(K\subset I\), \(K\ne \emptyset \), it holds that
and
which yields
Moreover, for any \(I\subset H \subseteq D\), it holds that
and
which yields
The remainder of (23) is
With \(\eta ^I_i=\lambda ^I_i + \sum _{j\in W{\setminus }I} \eta ^I_j r^I(j,i) + \sum _{j\in V{\setminus }I} \mu _j r^I(j,i)\) [see (7)] this is equivalent to
Under the required condition of either (8) and (9) in the case of blocking rs–rd or (11) in the case of skipping, \(\eta _i=\eta ^I_i\) for all \(i\in W{\setminus }I\) and all \(I\subseteq D\) for the respective reduced traffic equations. Therefore, from Lemmas 1 or 2, respectively, this is equivalent to
Plugging in \(\hat{\pi } (I,n_k:k\in W)=\frac{A(I)}{B(I)} \prod _{i\in W} \left( \frac{\eta _i}{\mu _i}\right) ^{n_i}\) yields
which shows that
Thus, \( \hat{\pi } (I,n_k:k\in W)=\frac{A(I)}{B(I)} \prod _{i\in W} \left( \frac{\eta _i}{\mu _i}\right) ^{n_i} \) solves the balance equations (23). The proof is completed by normalizing \(\hat{\pi }\), which is possible because \(\eta _i<\mu _i\) holds for all \(i\in W\).
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Sommer, J., Berkhout, J., Daduna, H. et al. Analysis of Jackson networks with infinite supply and unreliable nodes. Queueing Syst 87, 181–207 (2017). https://doi.org/10.1007/s11134-017-9542-1
DOI: https://doi.org/10.1007/s11134-017-9542-1