Analysis of Jackson networks with infinite supply and unreliable nodes

Jackson networks are versatile models for analyzing complex networks. In this paper we study generalized Jackson networks with single-server stations, where nodes may have an infinite supply of work. We allow simultaneous breakdown of servers and consider group repair strategies. We establish the existence of a steady-state distribution of the queue-length vector at stable nodes for different types of failure regimes. In steady state the distribution of the failure/repair regime and of the queue-length vector at stable nodes decouples in a product-form way. We provide closed-form solutions for the classical performance measures such as throughput or mean sojourn time at a station.


Introduction
Jackson networks (henceforth, JN) are a well-established class of models for production, telecommunication and computer systems, among other applications; for surveys see [8] and [3]. JN have the property that the distribution of the stationary queue-length vector is of product form, which allows for quick numerical evaluation of performance measures such as mean queue length, mean sojourn times and throughput at nodes. Further modifications with the product-form property of JN have been developed, which cover additional features relevant in modeling networks in the fields mentioned. This paper addresses the following extensions: (I) Infinite supply [22], which aims at utilizing the capacity of a server to the fullest.
Examples: In service center models an agent, when not answering a call, switches to low-priority work such as answering e-mail and administrative duties. In a model of a highway network, an infinite supply node represents a highway segment with an on-ramp during a rush-hour period where a constant flow of vehicles requiring access to the highway is present. In production processes, an additional inventory of raw material guarantees that the machine will not idle even when there is temporarily no external demand. (II) Breakdown and repair [16] of servers with (i) simultaneous breakdown of groups of servers and (ii) group repair strategies. For example, repairing several servers simultaneously may lead to more efficient repair actions and thus reduce repair time. (III) Unstable nodes [5] in the network, where instability of nodes may be due to overload generated by nodes without infinite supply or by nodes with infinite supply, or both.
While (I) and (III), and the combination thereof, have already been dealt with in the literature, the interaction of Jackson networks with infinite supply, instability of nodes and reliability issues has, to the best of our knowledge, not received the attention it deserves.
To fill this gap, we show that the distribution of the stationary queue-length vector of JN with the above features still is of product form. Thereafter, starting from this result, in a section with applications we compute classical performance measures such as throughput, mean queue length and mean service time, which are analytical in the system parameters.
The paper is organized as follows: We start with the definition of the different extended JN models in Sect. 2. In Sect. 3 we illustrate these by a motivating example to clarify the basic notation and concepts. A literature review is provided in Sect. 4. The main technical analysis is performed in Sect. 5, resulting in product-form stationary and/or limiting distributions in Sect. 5.2, and in Sect. 5.3 we provide explicit solutions for standard availability and performance measures, partially computed in nonstandard settings (non-ergodic networks). The main proofs are postponed to the Appendix.
At node i a Poisson stream with rate $\lambda_i \ge 0$ arrives from the exterior, represented by node 0, and we let $\lambda := \sum_{i \in J} \lambda_i > 0$. Service times at node i are exponential with rate $\mu_i$. All service times constitute an independent family of variables which are independent of the arrival streams. Customers are indistinguishable and follow the same rules, i.e., there is just one class of customers.
Routing is Markovian; a customer departing from node i immediately proceeds to node j with probability $r(i, j) \ge 0$ and departs from the network with probability $r(i, 0) \ge 0$. Given the departure node i, the routing is independent of the history of the network. Taking $r(0, j) = \lambda_j/\lambda$ and $r(0, 0) = 0$, we assume that the extended routing matrix $R = (r(i, j) : i, j \in J_0)$ is irreducible.
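The construction of the extended routing matrix can be made concrete with a small sketch. It is only an illustration: the function names `extended_routing` and `is_irreducible` are hypothetical, nodes are numbered 1, ..., J with state 0 standing for the exterior, and irreducibility is checked by plain graph reachability over positive entries.

```python
from fractions import Fraction

def extended_routing(lam, r):
    """Build R on states 0..J (0 = exterior).
    lam[j-1] is the external arrival rate at node j,
    r[i-1][j-1] is the internal routing probability r(i, j)."""
    J = len(lam)
    total = sum(lam)
    if total <= 0:
        raise ValueError("the total external arrival rate must be positive")
    R = [[Fraction(0)] * (J + 1) for _ in range(J + 1)]
    for j in range(1, J + 1):
        R[0][j] = Fraction(lam[j - 1]) / total      # r(0, j) = lambda_j / lambda
    for i in range(1, J + 1):
        for j in range(1, J + 1):
            R[i][j] = Fraction(r[i - 1][j - 1])
        R[i][0] = 1 - sum(R[i][1:])                 # r(i, 0): leave the network
        if R[i][0] < 0:
            raise ValueError("row %d of r sums to more than 1" % i)
    return R

def is_irreducible(R):
    """True if every state reaches every other state along positive entries."""
    n = len(R)
    for start in range(n):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if R[u][v] > 0 and v not in seen:
                    seen.add(v)
                    stack.append(v)
        if len(seen) < n:
            return False
    return True

# Two-node tandem: arrivals enter at node 1, move to node 2, then leave.
R_tandem = extended_routing([1, 0], [[0, 1], [0, 0]])
```

In the tandem example the extended chain cycles 0 → 1 → 2 → 0, so the check succeeds; if node 2 were never visited, the extended matrix would not be irreducible.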
In the following overview we indicate the variations of Jackson networks discussed in this paper, where the Roman numerals refer to the labels provided in the Introduction.

Jackson networks with infinite supply (I)
The standard Jackson network is modified as follows: Nodes in V ⊆ J have an infinite supply from which customers are put into an idling server. We denote W := J \ V and require V ≠ ∅ (unless otherwise specified).
Customers from the infinite supply have low priority, and (standard) customers arriving from the outside or from another server have high priority with a preemptive-resume regime: Service of a low-priority customer is interrupted as soon as a high-priority customer arrives. Service of low-priority customers is resumed only when the server idles again. When a low-priority customer is served and fed into the network, he/she becomes a high-priority customer and follows the rules for standard customers.
Service times of low-priority customers are independent of the external arrival streams and the service times of high-priority customers.

Jackson networks with infinite supply and unreliable nodes (II) and (III)
Nodes (servers) in the set D ⊆J may break down and are repaired thereafter. In the following we denote by P(A) the power set of a set A.
With these functions, we set, for all subsets of down nodes I ⊆ D, the breakdown and repair rates according to (1).
The following regime is in force whenever a node breaks down:
- Service at this node is interrupted; customers (of high as well as of low priority) are frozen there to wait for restart of the service, which is resumed at the point where it was preempted.
- No new customers are admitted to enter that node.
- Customers who select a broken-down node to visit are rerouted according to one of the classical rules: stalling, skipping or blocking rs-rd, which will be defined below.
- All these rules, if applicable, are valid for high- and low-priority customers.
Rerouting is a function of Y and applies only to high-priority customers, because on departure from a node with infinite supply low-priority customers are transformed immediately to high priority and only thereafter are rerouted.

Definition 2
If nodes in I ⊆ D are down, a new routing scheme is set in force which is determined by a routing matrix $R^I$. Construction of the rerouting schemes $R^I$ follows [16], and we distinguish the following rerouting schemes:

1. Skipping. If a customer selects a down node as destination, the customer jumps to this node, spends no time there and immediately performs the next jump according to routing regime R until he arrives at a node in up status or leaves the network. So, if nodes in I ⊆ D are down, customers are rerouted according to the routing matrix $R^I = (r^I(i, j) : i, j \in \{0\} \cup J \setminus I)$: The external arrival rates during a breakdown of I are and $\lambda^I_k = 0$ for k ∈ I. The service intensities are

2. Blocking rs-rd. The external arrival rates during a breakdown of I are $\lambda^I_j = \lambda_j$ for j ∈ J \ I, and $\lambda^I_j = 0$ otherwise. The service intensities are

3. Stalling. Whenever a node breaks down the service system is frozen: All arrival processes are interrupted and service anywhere in the network is stopped until all broken-down nodes are repaired again. So, if nodes in I ≠ ∅ are broken down, then for all i ∈ J the I-dependent rates are set to $\lambda^I_i = \mu^I_i = 0$. The stopped nodes which are in up status are waiting in warm standby, i.e., they can break down although they are stalled. Stalling is applied, for example, in the automotive industry to decrease variability of the flow of materials. Indeed, stalling prevents servers from sending parts to a server that is broken down and thereby prevents piling up inventory.
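The verbal description of skipping (jump to the down node, spend no time there, jump again according to R until an up node or the exterior is reached) can be turned into a small computation. The following is a minimal sketch, not the construction from [16]: it treats the down nodes as transient states and computes, for each down node, the probability of first landing on each up state. The names `solve` and `skipping_matrix` are hypothetical, states are numbered 0, ..., J with 0 the exterior, and it is assumed that customers cannot circulate forever among down nodes.

```python
from fractions import Fraction

def solve(A, B):
    """Solve A X = B exactly by Gauss-Jordan elimination (A square, invertible)."""
    n = len(A)
    M = [[Fraction(x) for x in A[i]] + [Fraction(x) for x in B[i]] for i in range(n)]
    for c in range(n):
        p = next(i for i in range(c, n) if M[i][c] != 0)   # pivot row
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[c])]
    return [row[n:] for row in M]

def skipping_matrix(R, down):
    """Rerouting probabilities on the up states (including 0) when `down` is skipped.
    Returns a dict {(i, j): probability} for up states i, j."""
    n = len(R)
    up = [s for s in range(n) if s not in down]
    dn = sorted(down)
    if not dn:
        return {(i, j): Fraction(R[i][j]) for i in up for j in up}
    # H[a][b]: probability that a customer jumping out of down node dn[a]
    # reaches up state up[b] first, i.e. H = (Id - R_down,down)^(-1) R_down,up.
    A = [[(1 if a == b else 0) - Fraction(R[a][b]) for b in dn] for a in dn]
    B = [[R[a][j] for j in up] for a in dn]
    H = solve(A, B)
    RI = {}
    for i in up:
        for cj, j in enumerate(up):
            via_down = sum(Fraction(R[i][k]) * H[ck][cj] for ck, k in enumerate(dn))
            RI[(i, j)] = Fraction(R[i][j]) + via_down
    return RI

# Three-node tandem 0 -> 1 -> 2 -> 3 -> 0 with node 2 down:
R = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [1, 0, 0, 0]]
RI = skipping_matrix(R, down={2})
```

In the tandem example a customer leaving node 1 skips node 2 and arrives at node 3 with probability 1, and every row of the rerouting matrix still sums to 1.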

Remarks
The parametric form (1) of the breakdown and repair rates stems from a versatile recipe to construct correlated multi-dimensional birth-death processes. With suitable functions A, B we can model, for example that nodes may break down in isolation or in groups, and repair may happen similarly. It is not required that nodes which are broken down simultaneously are repaired at the same time.

Example of breakdown and repair process
We explain the basic setup for our analysis with an example of a Jackson network with J = 7 nodes, as depicted in Fig. 1. There are two arrival streams with arrival rates λ 1 and λ 2 . Possible routes are indicated by arrows. The set of nodes V with infinite supply is indicated in the figure by additional incoming dotted arrows. We have V = {1, 5, 7} and W = {2, 3, 4, 6}. The network displayed in Fig. 1 is inspired by a model of a highway network. Nodes represent road segments and nodes with infinite supply model road segments with an on-ramp, where it is assumed that there is constant flow of incoming traffic to the on-ramp. Note that in queueing models for traffic systems one typically only models the flow in one direction and fits the service rate of the queues to the traffic characteristics; see, for example [21,23].
Suppose that $\lambda_1 < \mu_3 < \mu_1$. Then, without infinite supply at node 1, node 3 is stable in the classical sense, as the arrival rate $\lambda_1$ to node 3 is smaller than the service rate $\mu_3$. In the case of infinite supply at node 1, however, node 1 acts as a Poisson arrival stream with rate $\mu_1$ to node 3. This causes node 3 to become unstable. Whether a node is stable or not can be decided from the traffic equations of the network; details are provided in Sect. 5.1. We let S ⊆ J denote the set of stable nodes and U the set of unstable nodes.
Finally, given that node i ∈ D is broken down, the breakdown rate of the other node is whereas from breakdown scenario {3, 5} node i ∈ D alone has repair rate All other values for breakdown and repair rates are zero.
These breakdown and repair rates from (1) define the generator for the Markov process Y = (Y(t) : t ≥ 0) on state space $(P(D), 2^{P(D)})$. By inspection we see that for all K, G ∈ P(D), which implies that, after normalization, π is the steady state of the breakdown and repair process. Even more, we have proved that Y is reversible.
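The reversibility claim can be checked numerically under an explicit assumption about the rates. The sketch below assumes that (1) has the form α(K, H) = A(H)/A(K) for breakdowns from K to H ⊃ K and β(H, K) = B(H)/B(K) for repairs from H to K ⊂ H, with A(∅) = B(∅) = 1. This form is consistent with the identities α(∅, I) = A(I) and β(I, ∅) = B(I) used in the proof of Theorem 2, but it is an assumption here, as are the function names and the numerical values of A and B. Unlike the example in the text, all transitions are given positive rates.

```python
from fractions import Fraction
from itertools import combinations

def power_set(D):
    items = sorted(D)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def rates(A, B, D):
    """Assumed rates: breakdown K -> H with A(H)/A(K), repair H -> K with B(H)/B(K)."""
    q = {}
    for K in power_set(D):
        for H in power_set(D):
            if K < H:                                   # K is a proper subset of H
                q[(K, H)] = Fraction(A[H]) / A[K]       # breakdown
                q[(H, K)] = Fraction(B[H]) / B[K]       # repair
    return q

def is_reversible(A, B, D):
    """Detailed balance pi(K) q(K, H) = pi(H) q(H, K) for the weights A(K)/B(K)."""
    weight = {K: Fraction(A[K]) / B[K] for K in power_set(D)}
    q = rates(A, B, D)
    return all(weight[K] * q[(K, H)] == weight[H] * q[(H, K)] for (K, H) in q)

def stationary(A, B, D):
    """Normalized weights A(K)/B(K)."""
    weight = {K: Fraction(A[K]) / B[K] for K in power_set(D)}
    total = sum(weight.values())
    return {K: w / total for K, w in weight.items()}

D = {3, 5}
A = {frozenset(): 1, frozenset({3}): 2, frozenset({5}): 3, frozenset({3, 5}): 4}
B = {frozenset(): 1, frozenset({3}): 5, frozenset({5}): 6, frozenset({3, 5}): 7}
pi = stationary(A, B, D)
```

Under the assumed form, both sides of the detailed balance equation equal A(H)/B(K), which is why the check succeeds for any positive choice of A and B.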
We have only three possible breakdown scenarios, and the repair process has states with normalizing constant

Literature review and related work
Investigation of generalized Jackson networks with infinite supply has recently found much interest in the literature, and it turned out that the feature of infinite supply makes analysis of the network considerably harder than that of classical product-form networks of the BCMP and Kelly type. Infinite supply of lower-priority work is used frequently, for example in [11] in an M/G/1 queueing system to utilize idle times. Recent works using this concept of infinite supply are, for example [10], where a push-pull network with infinite supply is investigated.
A special class of multi-class queueing networks with virtual infinite buffers has been introduced in [9] and [2]. For single-class ergodic networks of Jackson type with infinite supply of work at some nodes, Weiss [22] has obtained a product-form solution of the steady-state queue-length distribution at nodes without infinite supply. He discussed as an example a particular computer communication system that works according to MAN (metropolitan area network) Ethernet RPR (resilient packet ring), where ring traffic has priority over traffic generated at nodes.
Another application from a different field where such models fit is in wireless sensor networks. The nodes (sensors) continuously sense their environment and have to forward the data to a central station (sink). This is usually not possible by direct communication, so the nodes act additionally as transmission stations for data from other sensor nodes. If forwarding transmissions from other nodes has priority, the node's own data constitute the infinite buffer which generates the infinite supply for the node.
The work of Guo et al. [6] is on general multi-class queueing networks with infinite supply under different scheduling policies for the servers. These policies guide the nodes' decisions how to dedicate their activities to either the regular standard queues or the infinite virtual queues. The key research question is the interplay of the production of jobs from the infinite supply and stability of the standard queues. Ergodicity problems for such systems are considered in [15]. Another class of models where additional work is added whenever a server becomes idle is queues with vacations. If a server observes an empty queue "he goes away to serve a customer elsewhere" and returns thereafter. If he finds customers waiting there, he immediately starts serving them, but if his queue is still empty on his return he takes "another vacation," and so on. For a survey, see [4].
The interplay of nodes with infinite supply and local stability issues has been studied in depth in [18].

Network processes for unreliable Jackson networks with infinite supply
We consider the Markov process for Jackson networks which at some nodes have infinite supply and where some nodes break down randomly and are repaired thereafter. Breakdowns of nodes in standard Jackson networks were investigated in [16] and [13]. It turns out that breakdown of nodes with infinite supply requires a more specific regime to control breakdown and repair. Consider a Jackson network with infinite supply and unreliable nodes, where the rerouting regime is either stalling, skipping or blocking rs-rd with the respective rerouting matrices $R^I$ according to Definition 2. Denote $R^\emptyset := R$. Then the joint Markovian availability queue-length process (Y, and $q(z, z') = 0$ otherwise for $z \ne z'$, where $1_A(\cdot)$ denotes the indicator mapping with respect to the set A.
The following theorem yields a characterization of the departure streams from nodes. For a proof, we refer to the technical report [17].

Theorem 1 With the above definitions:
(i) If at time t all nodes are up, the departure stream from node j ∈ V is a Poisson process with rate $\mu_j$. Thus, the departure stream from j ∈ V to i ∈ J is Poisson with rate $\mu_j r(j, i)$.
(ii) Whenever nodes in I ≠ ∅ are broken down and either skipping or blocking rs-rd is in force, the departure stream of node j ∈ V \ I with infinite supply in up status is Poisson with rate $\mu_j$. The departure stream from j ∈ V \ I to i ∈ J \ I is Poisson with rate $\mu_j r^I(j, i)$. If a node k ∈ V with infinite supply is broken down, i.e., k ∈ V ∩ I, the departure stream of this node is interrupted until its server is repaired.
In the case of stalling, all arrival streams stop whenever a breakdown occurs (I ≠ ∅) and are reactivated when all nodes return to the up status.

Extended traffic equations
Different traffic equations are required to analyze the long-time behavior.

Definition 3
The general traffic equations for Jackson networks with infinite supply (without breakdown and repair) are Node i is stable if η i from (4) satisfies η i < μ i ; otherwise, the node is unstable.
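To illustrate how stability is decided from traffic equations, here is a minimal sketch. It assumes that the general traffic equations take the Goodman-Massey form, adapted so that a node with infinite supply always emits at its full service rate: η_i = λ_i + Σ_{j∈W} min(η_j, μ_j) r(j, i) + Σ_{j∈V} μ_j r(j, i). This form, the function name `solve_traffic` and the numerical values are assumptions for illustration; nodes are indexed from 0 in the code.

```python
def solve_traffic(lam, mu, r, V, tol=1e-12, max_iter=100000):
    """Fixed-point iteration for the assumed general traffic equations.
    lam, mu: lists of external arrival and service rates,
    r[j][i]: routing probability from node j to node i,
    V: set of node indices with infinite supply."""
    J = len(lam)
    eta = [0.0] * J
    for _ in range(max_iter):
        # Output rate of node j: full service rate if it has infinite supply,
        # otherwise the smaller of its arrival rate and its service rate.
        out = [mu[j] if j in V else min(eta[j], mu[j]) for j in range(J)]
        new = [lam[i] + sum(out[j] * r[j][i] for j in range(J)) for i in range(J)]
        if max(abs(a - b) for a, b in zip(new, eta)) < tol:
            return new
        eta = new
    raise RuntimeError("fixed-point iteration did not converge")

# Two nodes in series (index 0 feeds index 1), mirroring nodes 1 and 3 of the
# motivating example with lambda_1 = 1 < mu_3 = 2 < mu_1 = 3.
lam = [1.0, 0.0]
mu = [3.0, 2.0]
r = [[0, 1], [0, 0]]
eta_plain = solve_traffic(lam, mu, r, V=set())    # [1.0, 1.0]: downstream node stable
eta_supply = solve_traffic(lam, mu, r, V={0})     # [1.0, 3.0]: downstream node unstable
stable = [eta_supply[i] < mu[i] for i in range(2)]
```

With infinite supply at the upstream node, the downstream node receives rate 3 but serves at rate 2, so the stability criterion η_i < μ_i fails for it, in line with the discussion of the motivating example.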
The notion of a stable node in a network was introduced by Goodman and Massey [5] when investigating Jackson networks where the describing Markov process X is not necessarily ergodic. If X is ergodic (often called a stable process), all nodes are stable, but as Goodman and Massey have shown it is a valuable distinction which separates the notions of ergodicity and stability. It is worth noting that for networks without unstable nodes the traffic equations in Definition 3 reduce to the (standard) traffic equations of a Jackson network with infinite supply [22]:

For the different breakdown regimes, the traffic equations put forward in Definition 3 are extended as follows:

Definition 4
The (standard) traffic equations for unreliable Jackson networks with infinite supply are as follows:
(i) In the case of stalling, as long as all nodes are up (I = ∅). Otherwise, $\eta^I_i = 0$ for all i ∈ J.
(ii) In the case of blocking rs-rd or skipping, for all I ⊆ D,

The traffic equations for some I ⊆ D are in force only as long as the availability status is unchanged. Whenever the availability status of the system changes, the traffic equations are adapted. Thus, each traffic equation (7) may have different solutions for different I. The next lemmata provide constraints such that the solutions of (7) are invariant on J \ I for all ∅ ⊆ I ⊆ D.

Lemma 1 Consider a Jackson network where nodes in D ⊆J are unreliable, and nodes in V ⊆J have an infinite supply of work. For all nodes
is the unique solution of (5), resp. (6). In case of breakdowns of nodes customers are rerouted according to the blocking rs-rd regime.
(i) If the following reversibility constraints hold:  (8) and (9) hold. If we additionally require the reversibility constraint then η I i = η i for all i ∈ V \I are solutions of (7) for all I ⊆ D.
Proof (i) For all i ∈ W \I and all I ⊆ D we make the ansatz η i = η I i with η i the solution of the traffic equations (5). Inserting we obtain = μ k r (k,i) (ii) For any I ⊆ D, the following holds ∀i ∈ V \I :

Remark 1
The constraints (8), (9) and (10) are different from the classical reversibility constraints which are the local balance equations of the routing process. But the interpretation of (8), (9) and (10) is similar: Customer flow from i to j equals the customer flow from j to i.
For rerouting in order to skip broken down nodes, we assume that the unreliable nodes in V are rate stable in the sense of [10, p. 76], i.e., these nodes have equal input and output rates.

Lemma 2
For the solution (η i : i ∈J ) of (5) let the following hold: For all nodes i ∈ W without infinite supply η i < μ i , and If, in the case of breakdowns of nodes in I ⊆ D, customers are rerouted by skipping I , then the traffic equation (7) is solved by (η I i := η i , i ∈J \I ). Proof We make the ansatz η i = η I i for all i ∈ W \I and all I ⊆ D. We then obtain, with the solution η i of the traffic equations (5), for any I ⊆ D: ∀i ∈ W \I , = r I ( j,i) Since η I j = η i holds for all j ∈ W \I and all I ⊆ D, it follows that, for all i ∈ V \I and I ⊆ D, which is the left-hand side of (12) with i ∈ V \I . The above computations in (12) are valid for all i ∈J \I ; hence, it follows that η I i = η i for all i ∈ V \I and I ⊆ D, too.
We illustrate the adjusted traffic equations with the example from Sect. 3, where nodes in D = {3, 5} are unreliable. Then condition (11) requires η 5 = μ 5 because node 5 ∈ D ∩ V , and node 5 is not stable. So, (4) is the relevant traffic equation. Due to the network's feedforward structure we can evaluate the arrival rates directly.
If nodes in D = {3, 5} are down, under stalling η i = 0 for all i.
In the case of skipping, we obtain η I i = η i for i ∈ {1, 2, 4, 6, 7}, and η I i = 0 for i = 3, 5. Blocking rs-rd cannot be implemented in this network, if η I i = η i , for i ∈ {1, 2, 4, 6, 7} is required because the routing chain is not reversible.
In the following we present examples of networks that fulfill the requirements to apply blocking rs-rd: a network with linear topology and a star-shaped topology network.
for 0 < a, b < 1, and λ i > 0, i = 1, 2. The network is a two-way tandem of three nodes; see Fig. 2. The infinite supply is depicted by a dashed arrow pointing to server 2, and the node that is prone to failure is depicted as bold circle. Note that by incorporating node 0, the linear topology is transformed into a ring. For ease of analysis we parameterize the model and set λ 1 = (1 − a)t, for t > 0, λ 3 = at, and b = 1 − c. The service rates are The standard traffic equations (4) then have the solution Note that this implies that U = {2}. Jobs arrive from outside with rate λ at the central node 1. From node 1 they go with probability r/5 to any of the nodes 2 to 6, for r ∈ (0, 1). After finishing service at node i = 2, . . . , 6, jobs are sent back to the central node 1. Being served there, they either leave the system with probability 1 − r , or are sent back to one of the servers in the set {2, . . . , 6} according to the routing scheme described above. The network is a star-shaped network; see Fig. 3. Infinite supply is depicted by dashed arrows, and nodes prone to failure are depicted by bold circles. The traffic equations are where η 5 = η 6 = r η 1 /5, provided that nodes 5, 6 are stable. For blocking rs-rd and skipping to be applicable, we let which implies η 1 = λ/(1 − r ), and thus r 5 Eventually, we let in order to let 5, 6 be stable nodes. Indeed, this choice implies η i < μ i , i = 5, 6. The above conditions imply that the reversibility conditions from Lemma 1 and the rate stability condition from Lemma 2 are satisfied.
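For the star-shaped network, the step from the routing description to η_1 = λ/(1 − r) can be written out as a short flow balance. The derivation below is a sketch under the assumption that every outer node i = 2, ..., 6 returns jobs to node 1 at the same rate at which it receives them (it is stable, or rate stable in the sense used for Lemma 2); the closed form for the outer nodes is a direct consequence of the two relations stated in the text.

```latex
\begin{align*}
\eta_i &= \frac{r}{5}\,\eta_1, \qquad i = 2,\dots,6, \\
\eta_1 &= \lambda + \sum_{i=2}^{6} \eta_i = \lambda + r\,\eta_1
  \;\Longrightarrow\; \eta_1 = \frac{\lambda}{1-r}, \\
\eta_i &= \frac{r\,\lambda}{5\,(1-r)}, \qquad i = 2,\dots,6 .
\end{align*}
```

For instance, λ = 2 and r = 1/2 give η_1 = 4 and η_i = 2/5 for each outer node, and indeed 2 + 5 · 2/5 = 4.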

Long-time behavior
In this section we study the long-time behavior of extended Jackson networks.

Theorem 2 Let W ⊆ S (nodes without infinite supply are stable). Denote by $\eta = (\eta_1, \dots, \eta_J)$ the solution of the traffic equations (5). Under stalling the following hold:
(i) For nodes without infinite supply, the joint marginal limiting distribution is and this is a stationary distribution on W as well.
(ii) If the global network process is started with an initial distribution which has the marginal (13) on W , the arrival stream from i ∈ W to j ∈ V is Poisson with rate η i r (i, j) whenever all nodes are in up status. These streams are independent given the nodes are up.
(iii) Assume the global network process is started with an initial distribution which has marginal (13) on W . Then the marginal limiting distribution for a stable node i ∈ V with r (i, i) = 0 is, for all I ⊆ D and all n i ∈ N, if and only if η i < μ i . This is a one-dimensional stationary distribution as well.
Moreover, if $\eta_i \ge \mu_i$ for node i ∈ V, then for its limiting probability the following holds: The proof and the proof of the next theorem are postponed to the Appendix.

Theorem 3 Let W ⊆ S (nodes without infinite supply are stable). Denote by $\eta = (\eta_1, \dots, \eta_J)$ the solution of the traffic equations (5). In the case of breakdowns, customers are rerouted according to the blocking rs-rd regime or the skipping regime. If blocking rs-rd is in force we require the reversibility constraints (8) and (9). If skipping is in force, let (11) hold. Then for nodes without infinite supply the joint marginal limiting distribution is
and this is a stationary distribution on W as well.
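The product-form structure described in Theorems 2 and 3 can be illustrated with a short sketch. It assumes that the joint marginal distribution on W has the form π(I) · Π_{k∈W} (1 − η_k/μ_k)(η_k/μ_k)^{n_k}, i.e., the availability distribution times independent geometric queue-length factors. This form is a reconstruction suggested by the abstract, the mean queue lengths reported in the Applications section and the structure of the proofs, not a quotation of the displayed formula; the function name and the numbers are hypothetical.

```python
from fractions import Fraction

def product_form(pi_I, eta, mu, n):
    """Assumed product-form probability of availability state I (with
    stationary probability pi_I) and queue lengths n[k] at the nodes k in W.
    eta, mu, n are dicts keyed by node."""
    p = Fraction(pi_I)
    for k in n:
        rho = Fraction(eta[k]) / Fraction(mu[k])
        if not rho < 1:
            raise ValueError("node %r is not stable" % (k,))
        p *= (1 - rho) * rho ** n[k]          # geometric factor for node k
    return p

# One stable node (node 2) with eta = 1, mu = 2, availability state of probability 1/2.
p3 = product_form(Fraction(1, 2), {2: 1}, {2: 2}, {2: 3})
# Summing over queue lengths 0..N recovers pi_I * (1 - rho**(N + 1)).
partial = sum(product_form(Fraction(1, 2), {2: 1}, {2: 2}, {2: n}) for n in range(10))
```

The partial sum over queue lengths approaches the availability probability as the cut-off grows, which is the decoupling of the failure/repair regime and the queue lengths mentioned in the abstract.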
The results of Goodman and Massey [5] are on classical Jackson networks where some nodes are not stable. They prove a product-form limiting distribution for the stable subnetwork, but there is no such result for a stationary distribution. The reason is that the exploding unstable nodes influence the stable part of the network. Over any finite (transient) time horizon [0, t] the departure streams from the unstable nodes are not Poisson. Put differently, only the limiting distribution is known. Fortunately, the proofs of Theorem 2(i) and (iii) and of Theorem 3 allow us to establish the following result without assuming ergodicity of the whole process:

Corollary 1 Under the conditions of Theorems 2 and 3, the process:
is an ergodic homogeneous Markov process of its own. If, in addition, for i ∈ V it holds that η i < μ i , then the process is an ergodic homogeneous Markov process of its own for i ∈ V .
Remark 2 In the setting of Theorem 3 a statement as in Theorem 2(iii) cannot be proved with the methods used here. This is due to the requirement that r(i, i) = 0 and the properties of the rerouting regimes skipping and blocking rs-rd. Whenever nodes in I ≠ ∅ are down, immediate feedback may emerge even at nodes in V with r(i, i) = 0. If i ∈ V \ I and r(i, j) > 0 for at least one j ∈ I, then $r^I(i, i) > 0$ may occur.
On the other hand, if the network's topology prevents occurrence of feedback by skipping or rs-rd regime in case of breakdown, it is possible to prove a counterpart to Theorem 2(iii) in the setting of Theorem 3.
Our motivating example from Sect. 3 in Fig. 1 is a feedforward network according to the following definition. Feedforward networks constitute an important subclass of Jackson networks.
Feedforward networks are not reversible, and therefore in the case of breakdowns we must resort to stalling or skipping as a rerouting scheme. The following property of feedforward networks is intuitive.

Lemma 3
If, in a feedforward network with node set J = {1, 2, . . . , J}, a subset ∅ ⊆ I ⊆ J of nodes is down and either skipping or stalling is applied as a rerouting scheme, then and therefore there is no immediate feedback at any node.

Theorem 4 Consider a feedforward network with W ⊆ S (nodes without infinite supply are stable). Denote by $\eta = (\eta_1, \dots, \eta_J)$ the solution of the traffic equations (5). In the case of breakdowns customers are rerouted according to the skipping regime. Assume that (11) holds and that the global network process is started with an initial distribution which has the marginal (13) on W. Then the marginal limiting distribution for a stable node i ∈ V is, for all I ⊆ D and all $n_i \in \mathbb{N}$, if and only if $\eta_i < \mu_i$. This is a one-dimensional stationary distribution as well. Moreover, if $\eta_i \ge \mu_i$ for node i ∈ V, then for its limiting probability The proof is similar to that of Theorem 2, part (iii), with Lemma 3.

Applications
Standard performance evaluation requires ergodicity of the underlying Markov processes which allows one to approximate long-time average cost functions by integrals of the cost function under stationary distributions. Our framework overcomes this restriction and allows us to investigate even nonergodic networks across subnetworks where stabilization occurs only in the long run. As stated in Corollary 1, some important subnetworks of stable nodes can be considered as networks of their own. Therefore, for these parts we can extend the traditional analysis directly. But we emphasize that even if there exists no equilibrium on the stable subnetworks, performance analysis for long-time averages of cost functions is possible via integrals of the cost function under the limiting distribution on stable nodes, respective subnets. For details we refer to Section 4.2 in [13] and Section 4.6.4 in [12]. In the following, we will state our results for the setting of Corollary 1.
The availability process Y in Theorems 2 and 3 is an ergodic Markov process of its own with limiting and stationary distribution π given by (17). From this, the stationary (time) point availability of a Jackson network with infinite supply and unreliable nodes (or subnetworks thereof) may be computed similarly to [16, p. 185] as $PA(H)(t) := \sum_{K \subseteq D \setminus H} \pi(K)$, for H ⊆ D, t ≥ 0, where π(I) is the probability that exactly the nodes in I ⊆ D are under repair.
We provide an overview of the main performance characteristics. Under stalling, the stationary throughput at node i ∈ W (no infinite supply) is $\eta_i \cdot \pi(\emptyset)$, the mean queue length is $(\eta_i/\mu_i)\,(1 - \eta_i/\mu_i)^{-1}$, and the mean waiting time is $\big((\mu_i - \eta_i)\,\pi(\emptyset)\big)^{-1}$. Under blocking rs-rd and skipping, the stationary throughput at a node i ∈ W is $\eta_i \cdot \sum_{I \subseteq D,\, i \notin I} \pi(I)$, the mean queue length at node i is $(\eta_i/\mu_i)\,(1 - \eta_i/\mu_i)^{-1}$, and the mean waiting time is $\big((\mu_i - \eta_i) \sum_{I \subseteq D,\, i \notin I} \pi(I)\big)^{-1}$.
The proof of the above properties follows from Theorems 2, 3 and 4, which show that the asymptotic mean queue length at a stable node can be computed. Invoking Little's law, see [19,20], mean waiting times at stable nodes follow.
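The performance measures listed above are straightforward to evaluate once the availability distribution π is known. The following sketch uses the throughput and mean queue length formulas from the text and obtains the mean waiting time from Little's law as mean queue length divided by throughput. The function names, the regime labels and the numerical distribution π are hypothetical.

```python
from fractions import Fraction

def availability(pi, H):
    """Point availability PA(H): total probability of the states K that
    contain no node of H (all nodes in H are up)."""
    H = frozenset(H)
    return sum(p for K, p in pi.items() if not (K & H))

def performance(pi, eta_i, mu_i, i, regime):
    """Throughput, mean queue length and mean waiting time at a stable node i in W.
    regime is 'stalling' or anything else for blocking rs-rd / skipping."""
    eta_i, mu_i = Fraction(eta_i), Fraction(mu_i)
    if regime == "stalling":
        active = pi[frozenset()]              # node works only if all nodes are up
    else:
        active = availability(pi, {i})        # node works whenever it is up itself
    throughput = eta_i * active
    rho = eta_i / mu_i
    mean_queue = rho / (1 - rho)
    mean_wait = mean_queue / throughput       # Little's law
    return throughput, mean_queue, mean_wait

# Hypothetical availability distribution for D = {3, 5}.
pi = {frozenset(): Fraction(1, 2), frozenset({3}): Fraction(1, 4),
      frozenset({5}): Fraction(1, 8), frozenset({3, 5}): Fraction(1, 8)}
stalled = performance(pi, 1, 2, 3, "stalling")
skipped = performance(pi, 1, 2, 3, "skipping")
```

For node 3 with η = 1 and μ = 2, stalling gives a mean waiting time of 1/((μ − η) π(∅)) = 2, while under skipping the node is active with probability 5/8 and the mean waiting time drops to 8/5.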

Conclusion
We have integrated in Jacksonian networks breakdown and repair of servers together with infinite supply servers and unstable network parts in one framework. We obtained closed-form solutions of the steady-state queue-length distribution at stable nodes and for key performance measures. Future research will be on extending our results to state-dependent and, more generally, to path history-dependent failure rates.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

A Proof of Theorem 2
(i) Assume first that all nodes are in up status (I = ∅). We start the proof by invoking the subnetwork argument from the proof of Theorem 13 in [18]. It guarantees that the subnetwork W constitutes a Jackson network where the source and sink represent {0} ∪ V. The corresponding queueing process $\tilde X := ((X_i(t) : i \in W) : t \in \mathbb{R}_+)$ is a Markov process of its own. The traffic equations of the described subnetwork W are given by so $\eta_i = \tilde\eta_i$ holds for all i ∈ W. According to Jackson's theorem (see [7]), $\tilde X$ has the unique stationary and limiting distribution because $\eta_i < \mu_i$ for all i ∈ W. Thus, even if the subnetwork V of nodes with infinite supply is not in equilibrium, the equilibrium on the subnetwork W of nodes without infinite supply is preserved, if the initial distribution has the joint marginal (18).

This joint queue-length process $\tilde X$ is coupled with an availability process Y which only depends on the interaction of the nodes in D ⊆ J but not on their load. Whenever a node in D breaks down, stalling occurs, so all nodes go into a warm standby and all arrivals and services are interrupted until all nodes return to the up status. The network process $(Y, \tilde X)$ is a Markov process on the state space $P(D) \times \mathbb{N}^{|W|}$. The balance equations for the subnetwork W are, for all $(\emptyset, n_k : k \in W) \in \{\emptyset\} \times \mathbb{N}^{|W|}$, given by and, for all $(I, n_k : k \in W) \in P(D) \times \mathbb{N}^{|W|}$ with I ≠ ∅,

We have to show that (13) solves these equations. In the following we denote by $\hat\pi(I, n_k : k \in W)$, for all $(I, n_k : k \in W) \in P(D) \times \mathbb{N}^{|W|}$, the expression (13) before normalization, and plug it into the above balance equations instead of $\pi(I, n_k : k \in W)$. In the first equation (19) the term $\hat\pi(\emptyset, n_k : k \in W)\,\alpha(\emptyset, I) = \hat\pi(\emptyset, n_k : k \in W)\,A(I) = \hat\pi(I, n_k : k \in W)\,B(I)$ on the left-hand side is equal to the term $\hat\pi(I, n_k : k \in W)\,\beta(I, \emptyset) = \hat\pi(I, n_k : k \in W)\,B(I)$ on the right-hand side for each ∅ ≠ I ⊆ D.
The remainder of (19) is the global balance equation of a classical Jackson network which has the solution (see [7]) Consider the second equation (20) for some fixed I ≠ ∅. For any K ⊂ I, K ≠ ∅, the term on the left-hand side is equal to the following term on the right-hand side: Moreover, for any I ⊂ H ⊆ D, the term on the left-hand side is equal to the term on the right-hand side. The proof of (i) is finished by normalization, which is possible because $\eta_i < \mu_i$ holds for all i ∈ W.
(ii) It is well known that ergodic Jackson networks have, in equilibrium, Poisson departure streams from node i to the sink with rate $\tilde\eta_i\,\tilde r(i, 0)$; see [14, Example 7.1]. From the proof of (i), we know that the subset W behaves like an ergodic Jackson network with unreliable nodes of its own with $\tilde\lambda_i := \lambda_i + \sum_{j \in V} \mu_j r(j, i)$ and $\tilde r(i, 0) := r(i, 0) + \sum_{j \in V} r(i, j)$. Hence, if the subnetwork W is in equilibrium, as long as all nodes are in up status, departures to the sink from nodes i ∈ W are Poisson streams with rate $\eta_i r(i, 0)$ and departures from i ∈ W to any node j ∈ V are also Poisson streams with rate $\eta_i r(i, j)$, because a portion $r(i, j)/\big(r(i, 0) + \sum_{k \in V} r(i, k)\big)$ of the departure stream from node i ∈ W is directed to j ∈ V.
(iii) Under the condition that all nodes j ∈ J are in up status, we start the proof by invoking the M/M/1 argument from the proof of Theorem 13 in [18].
This argument leads to the conclusion that if the subnetwork W is in equilibrium and if r(i, i) = 0, node i ∈ V behaves as an M/M/1 system of its own. The corresponding queue-length process $\hat X$ is a birth-death process on state space $\mathbb{N}$ with birth rates $\hat\lambda_i = \eta_i$ and death rates $\mu_i$. This queue-length process $\hat X$ is here coupled with an availability process Y on P(D), D ⊆ J, where breakdown and repair of nodes only depend on the interaction of the nodes but not on their queue length. Whenever a node in D breaks down, stalling occurs, so all nodes go into a warm standby and all arrivals and services are interrupted until all nodes return to the up status.
The network process $(Y, \hat X)$ is a Markov process on the state space $P(D) \times \mathbb{N}$. The balance equations are for all $(\emptyset, n_i) \in \{\emptyset\} \times \mathbb{N}$ and $\pi_i(I, n_i)$ for all $(I, n_i) \in P(D) \times \mathbb{N}$ with I ≠ ∅.
We have to show that (14) solves these equations. In the following we set Moreover, for any I ⊂ H ⊆ D, we havê which impliesπ The proof of (iii) is finished by normalization, which is possible from η i < μ i . The limiting probability (15) for unstable nodes with infinite supply follows from the same arguments as in the proof of Theorem 15 in [13].
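The balance argument for a single node under stalling can be checked numerically in the simplest case. The sketch below considers one node with arrival rate η and service rate μ that is itself the only unreliable node, with breakdown rate a from the up state and repair rate b from the down state (so α(∅, {i}) = a and β({i}, ∅) = b in the notation of the proof). It verifies that the unnormalized weights ρ^n in the up state and (a/b) ρ^n in the down state, with ρ = η/μ, satisfy the global balance equations. The function name and the single-node setting are assumptions made for illustration.

```python
from fractions import Fraction

def check_stalling_balance(eta, mu, a, b, n_max=20):
    """Check global balance for queue lengths 0..n_max-1 under stalling:
    while the node is down, arrivals and service are frozen and only repair occurs."""
    eta, mu, a, b = (Fraction(x) for x in (eta, mu, a, b))
    rho = eta / mu

    def up(n):                       # unnormalized weight of (up, n)
        return rho ** n if n >= 0 else Fraction(0)

    def down(n):                     # unnormalized weight of (down, n)
        return (a / b) * rho ** n

    for n in range(n_max):
        served = mu if n > 0 else 0
        # Up state: outflow by arrival, service and breakdown equals
        # inflow by arrival, service completion and repair.
        lhs_up = up(n) * (eta + served + a)
        rhs_up = up(n - 1) * eta + up(n + 1) * mu + down(n) * b
        # Down state: outflow by repair equals inflow by breakdown.
        if lhs_up != rhs_up or down(n) * b != up(n) * a:
            return False
    return True

ok = check_stalling_balance(eta=1, mu=2, a=3, b=5)
```

The check works with exact fractions, so it confirms the identity rather than approximating it; normalization of the weights requires ρ < 1, which is the stability condition η_i < μ_i of the theorem.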

B Proof of Theorem 3
Consider the subset W of nodes without infinite supply. For any subset I ⊆ D of broken-down nodes, we have the following facts for the subset W \ I which remain in force as long as I is unchanged:
- All service times of all up nodes are exponentially distributed, and the service discipline at all nodes is FCFS.
- Routing of customers is Markovian: A customer completing service at node i ∈ W \ I will either move to some node j ∈ W \ I with probability $r^I(i, j)$ or leave the subnetwork with probability $1 - \sum_{j \in W \setminus I} r^I(i, j)$.
- At each node i ∈ W \ I, we have external arrivals from the source which are independent Poisson streams with rate $\lambda^I_i \ge 0$. Furthermore, all arrivals from nodes j ∈ V \ I with infinite supply into nodes i ∈ W \ I are independent Poisson streams at rate $\mu_j r^I(j, i)$; see Theorem 1. The sum of independent Poisson streams is a Poisson stream; hence, the arrival stream from the outside of the subset W \ I into each node i ∈ W \ I is a Poisson process with rate $\lambda^I_i + \sum_{j \in V \setminus I} \mu_j r^I(j, i)$.
- All service times and all inter-arrival times are independent of each other.
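Let $\tilde X := ((X_i(t) : i \in W \setminus I) : t \in \mathbb{R}_+)$ be the queueing process of this subnetwork. The process is supplemented with a Markov process $Y = (Y(t) : t \in \mathbb{R}_+)$ which describes the availability status of the nodes and therefore gives information on how long the network process on the subnet W \ I lives until it jumps to the next Markov process on some randomly chosen subnet W \ K, K ⊆ D. Rerouting is according to the blocking rs-rd regime (skipping, resp.). The balance equations of the joint availability queue-length process $(Y, \tilde X_i : i \in W)$ are, for all $(I, n_i : i \in W) \in P(D) \times \mathbb{N}^{|W|}$,
\begin{align*}
&\pi(I, n_k : k \in W) \cdot \Big( \sum_{i \in W \setminus I} \Big( \lambda^I_i + \sum_{j \in V \setminus I} \mu_j r^I(j, i) \Big) + \sum_{i \in W \setminus I} \mu_i \big(1 - r^I(i, i)\big) \cdot 1_{\mathbb{N}_+}(n_i) + \sum_{I \subset H \subseteq D} \alpha(I, H) + \sum_{K \subset I} \beta(I, K) \Big) \\
&\quad = \sum_{i \in W \setminus I} \pi(I, n_k : k \in W \setminus \{i\}, n_i - 1) \cdot \Big( \lambda^I_i + \sum_{j \in V \setminus I} \mu_j r^I(j, i) \Big) \cdot 1_{\mathbb{N}_+}(n_i) \\
&\qquad + \sum_{i \in W \setminus I} \pi(I, n_k : k \in W \setminus \{i\}, n_i + 1) \cdot \mu_i \Big( 1 - \sum_{j \in W \setminus I} r^I(i, j) \Big) \\
&\qquad + \sum_{i \in W \setminus I} \sum_{j \in W \setminus I,\, j \ne i} \pi(I, n_k : k \in W \setminus \{i, j\}, n_i + 1, n_j - 1) \cdot \mu_i r^I(i, j) \cdot 1_{\mathbb{N}_+}(n_j) \\
&\qquad + \sum_{K \subset I} \pi(K, n_k : k \in W) \cdot \alpha(K, I) + \sum_{I \subset H \subseteq D} \pi(H, n_k : k \in W) \cdot \beta(H, I).
\end{align*}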
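Under the required condition of either (8) and (9) in the case of blocking rs-rd or (11) in the case of skipping, $\eta_i = \eta^I_i$ for all i ∈ W \ I and all I ⊆ D for the respective reduced traffic equations. Therefore, from Lemmas 1 or 2, respectively, this is equivalent to
\begin{align*}
&\hat\pi(I, n_k : k \in W) \cdot \Big( \sum_{i \in W \setminus I} \Big( \eta_i - \sum_{j \in W \setminus I} \eta_j r^I(j, i) \Big) + \sum_{i \in W \setminus I} \mu_i \big(1 - r^I(i, i)\big) \cdot 1_{\mathbb{N}_+}(n_i) \Big) \\
&\quad = \sum_{i \in W \setminus I} \hat\pi(I, n_k : k \in W \setminus \{i\}, n_i - 1) \cdot \Big( \eta_i - \sum_{j \in W \setminus I} \eta_j r^I(j, i) \Big) \cdot 1_{\mathbb{N}_+}(n_i) \\
&\qquad + \sum_{i \in W \setminus I} \hat\pi(I, n_k : k \in W \setminus \{i\}, n_i + 1) \cdot \mu_i \Big( 1 - \sum_{j \in W \setminus I} r^I(i, j) \Big) \\
&\qquad + \sum_{i \in W \setminus I} \sum_{j \in W \setminus I,\, j \ne i} \hat\pi(I, n_k : k \in W \setminus \{i, j\}, n_i + 1, n_j - 1) \cdot \mu_i r^I(i, j) \cdot 1_{\mathbb{N}_+}(n_j).
\end{align*}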
Plugging inπ(I, n k : k ∈ W ) = A(I ) which shows that Thus,π(I, n k : k ∈ W ) = A(I ) solves balance equations (23). The last step of proving the theorem is by normalizingπ , which is possible because η i < μ i holds for all i ∈ W .