Stability of linear EDF networks with resource sharing

We consider a linear real-time, multiresource network with generally distributed stochastic primitives and soft customer deadlines, in which some users require service from several shared resources simultaneously. We show that a strictly subcritical network of this type is stable under the preemptive Earliest Deadline First scheduling strategy. Our argument is direct, without using fluid model analysis as an intermediate step. As an application of our main result, we propose a stable proxy for the preemptive Shortest Remaining Processing Time service protocol for linear, strictly subcritical resource sharing networks.


Introduction
Massoulié and Roberts [34] introduced resource sharing networks, commonly called bandwidth sharing models, to model congestion control problems for Internet flows. In such systems, flows, corresponding to continuous transfers of elastic documents, are transmitted, requiring simultaneous service at all nodes along their routes. Various service protocols for resource sharing networks have been proposed. Some of the most popular of them are proportional fairness, originating in a work of Kelly [23], and more general α-fair policies, introduced by Mo and Walrand [36].
Correspondence: Łukasz Kruk, lkruk@hektor.umcs.lublin.pl, Department of Mathematics, Maria Curie-Skłodowska University, Pl. Marii Curie-Skłodowskiej 1, 20-031 Lublin, Poland.
A natural problem in the theory of queueing systems with simultaneous resource possession is their stability when the average load placed on each resource is less than its capacity. Assuming exponentially distributed document sizes, de Veciana, Lee and Konstantopoulos [11] showed stability of weighted max-min fair and proportionally fair policies. Bonald and Massoulié [7] obtained a counterpart of these results for weighted α-fair protocols. Massoulié [33] established stability of a fluid model for a resource sharing system with exponential arrival and document sizes, incorporating additional routing. This allowed him to show stability of stochastic resource sharing networks when file sizes have phase-type distributions. More recently, Gromoll and Williams [18] proposed a fluid model associated with a stochastic resource sharing network with generally distributed interarrival times and document sizes, working under a fairly general bandwidth sharing discipline. Under mild assumptions, they showed that rescaled measure-valued processes corresponding to a sequence of such networks are tight and any weak limit point of this sequence is almost surely a fluid model solution. This, together with stability of fluid models for weighted α-fair policies in linear and tree networks established by Gromoll and Williams [17], may be used to infer stability of the corresponding stochastic systems. The interested reader is referred to [18] for more references regarding stability results for generalized bandwidth sharing policies.
Verloop, Borst and Núñez Queija [40] investigated linear, strictly subcritical resource sharing networks with Poisson arrivals and generally distributed document sizes, working under the Shortest Remaining Processing Time (SRPT) and Least Attained Service (LAS) scheduling. They found that such systems may be unstable and, moreover, for networks with sufficiently many nodes, instability may arise at arbitrarily low traffic loads. From a broader perspective, they have observed that a linear network topology "appears already sufficiently rich to exhibit many of the qualitative phenomena that may occur for general network topologies and route structures". Having said this, we must also admit that there are service disciplines making the underlying linear strictly subcritical resource sharing networks stable, while being unstable for some other network topologies, for example, UFOS (Utilization First, Output Second), introduced by Harrison et al. [21]. In any case, the results of [40] indicate that it is worthwhile to understand stability phenomena in linear networks with resource sharing as an important first step in the analysis of systems with more complex topologies.
Recent years have brought a rapidly increasing demand for real-time services, in which jobs have specific timing requirements. Examples of such services include voice and video transmission, manufacturing systems in which the orders have due dates, real-time control systems and tracking systems. Another important class of applications arises in medical scheduling problems, like organ allocation or prioritizing admissions to emergency rooms.
There are several possible reactions of a real-time system to deadline misses. In this paper, we will focus on the case of soft deadlines in which lateness is permitted and the jobs completed after their deadlines are used.
A natural service protocol for real-time systems is Earliest Deadline First (EDF), in which the job with the shortest remaining lead time, i.e., the difference between its deadline and the current time, is selected for service. A definition of the preemptive EDF policy for a resource sharing network with soft deadlines and arbitrary topology was given by Kruk [26]. In order to describe the contents of the latter paper, for each element $i$ from the set $\mathcal{I}$ of available routes of the network, each time $t \ge 0$ and each $s \in \mathbb{R}$, let $Y_i(t, s)$ denote the cumulative idleness by time $t$ with regard to transmission of flows with lead times at time $t$ not greater than $s$. The main result of Kruk [26] was to show that, under mild distributional assumptions, the preemptive EDF resource sharing protocol minimizes $\sum_{i \in \mathcal{I}} Y_i$ with respect to the pointwise functional inequality. (The latter relation is a partial ordering, according to which, for functions $f$, $g$ of two variables $t$, $s$, we have $f \le g$ if and only if $f(t, s) \le g(t, s)$ for all arguments $t$ and $s$.) One may wonder whether the above (or a similar) minimality result may be used to show stability of the corresponding EDF networks. While this idea looks attractive, we have not been able to obtain any results along this line. Related optimality properties of the EDF protocol were studied by Liu and Layland [32], Panwar and Towsley [38,39], Moyal [37], Kruk et al. [31], Atar et al. [3], in the context of single-server systems, and by Baruah [4], who investigated resource sharing networks with hard deadlines.
In this paper, we consider linear resource sharing networks with renewal arrival streams and generally distributed document sizes. Upon arrival to the network, a flow on each route is assigned a soft random deadline drawn from a distribution associated with this route. Assuming mild conditions on the model stochastic primitives, we show that a strictly subcritical network of this type is stable under the preemptive EDF service policy. This result is in sharp contrast to instability of strictly subcritical linear networks under the SRPT protocol established in [40]. Indeed, there is a deep relation between the EDF and SRPT scheduling strategies. Their similarity was first noticed (at least to our knowledge) by Bender, Chakrabarti and Muthukrishnan [5] and then, more explicitly, by Down, Gromoll and Puha [14], who investigated fluid limits for SRPT queues using an auxiliary process similar to the one introduced by Doytchinov, Lehoczky and Shreve [15] in the context of heavy traffic analysis for EDF queues. In fact, in Sect. 8 of our paper, we apply our main result, together with an idea of Bender, Chakrabarti and Muthukrishnan [5], to propose a stable regularization of the SRPT protocol for linear resource sharing networks.
The main idea of our stability proof is to verify the condition given in Theorem 3.1 of Dai [9] (see (10), to follow) which, roughly speaking, states that after a sufficiently long time, the expected size of the system state is small in comparison to the size of the initial condition as the former gets large. In the literature, this criterion is usually checked by proving stability of the corresponding fluid model, as was originally suggested by Dai [9]. Here, we choose an alternative way, proving Dai's condition (10) directly for the underlying queueing system, without using fluid model analysis as an intermediate step. Our argument is based on two crucial ingredients. The first one, presented in Sect. 5, is an investigation of some general properties of the workload process in linear resource sharing networks. This part of the analysis requires only a very mild assumption on the service protocol, and hence it is applicable to a broad range of disciplines as long as the network is linear. The main result of this section states that in the strictly subcritical case, after a time proportional to the size of the initial condition, the workload on all the "local routes" of the network, requiring only a single resource, is small. The question is whether or not the workload on the "long route", requiring the cooperation of all the system resources, is also small at the same time. In the case of the preemptive EDF protocol, the answer is affirmative, thanks to a current lead time estimate presented in Sect. 4, stating that the smallest of the lead times of flows on the "long route" is close to the smallest of the lead times of flows on the "local routes". This excludes the possibility of congestion on the "long route" without at least one of the "local" ones being congested as well. 
While such an estimate cannot be expected to hold for disciplines other than EDF, we hope that the main idea of our approach will turn out to be useful also for asymptotic analysis of linear resource sharing networks with other service protocols.
To make the above analysis rigorous, it is necessary to overcome several difficulties. First, in order to model an EDF queueing system, the lead times of individual flows or some equivalent information must be stored. In the preemptive case, considered in this paper, the number of partially transmitted flows is unbounded and their residual service times must be stored as well. This calls for modeling the network evolution in an infinite-dimensional state space. Since the original analysis of Dai [9] was constrained to head-of-the-line (HL) disciplines (i.e., such that at any time at most one customer in each class has been partially served), one must proceed carefully, adjusting Dai's [9] approach to preemptive EDF networks which do not enjoy the HL property. Furthermore, in the stability analysis for our system, different (infinite-dimensional) initial states have to be considered. Some of them are statistically "atypical" in various ways, for example they may have very "irregular" structure of initial lead times, or some service times "abnormally" large. Consequently, any statistical regularity in the system's behavior may, in general, be observed only after all the initial flows leave the network. Finally, the partially served flows are difficult to analyze, so it is necessary to show that, in some sense, their influence on the system's asymptotics is negligible. We accomplish the latter task in Sect. 6, showing a suitable state space collapse result which is, in a sense, a counterpart of "crushing lemmas" used in the heavy traffic analysis of "conventional" EDF queueing systems (see, for example, [15]).
There is a connection between our results and stability theory for switched networks. In contrast to the systems discussed above, generalized switches are packet level models in which time is discrete and processing a task takes exactly one time unit. This considerably simplifies the analysis, since the current state of a switch is fully characterized by the finite dimensional vector of queue lengths, any scheduling policy for it is HL and there are no partially transmitted packets there. A natural service discipline for a generalized switch is the Longest Queue First (LQF) algorithm, selecting the set of served links iteratively, in a way somewhat similar to EDF and SRPT, but according to the queue lengths rather than lead times or residual service times. In a seminal paper, Dimakis and Walrand [13] showed stability for a class of LQF systems satisfying the local pooling condition (which is true for linear networks) and, under mild assumptions on stochastic arrivals, also for some more general network topologies, for which a suitable rank condition holds. The main ideas of their analysis were to use the maximal queue length as a Lyapunov function for the underlying Markov chain and to combine diffusion-scale properties of the sample paths with the fluid limit framework. Accordingly, their proof techniques are notably different from the ones used in this paper, in which the dynamics of the underlying Markov process are more complicated, even for relatively simple network topologies. The network graphs satisfying local pooling under primary interference constraints were characterized by Birand et al. [6].
Stability theory for resource sharing networks is clearly related to stability problems for "conventional" multiserver, multiclass queueing networks. The fundamental difference between these two types of systems is that flows in a bandwidth sharing network need access to all the resources on their routes simultaneously, while customers of a multiclass queueing network visit different servers along their routes in succession. However, there are also some important similarities. The results from the theory of multiclass queueing networks which are of greatest relevance to this paper, in addition to [9], are the works by Bramson [8] and Kruk [24]. The former shows stability of general strictly subcritical multiclass EDF networks without preemption, while the latter contains the corresponding result for preemptive, strictly subcritical EDF networks with fixed customer routes. In both those papers, convergence of the fluid-scaled sample paths of the network performance processes to fluid limits, satisfying the corresponding fluid model equations, together with stability of the resulting fluid model, were used to prove stability of the stochastic EDF networks under consideration. The results of [24] may be generalized to strictly subcritical, preemptive EDF networks with Markovian routing, although such a generalization seems to be nontrivial. This paper is devoted to preemptive EDF resource sharing systems, so our analysis is more akin to the one presented in [24]. In particular, our proof of state space collapse from Sect. 6 resembles the proof of the corresponding result from [24].
Finally, let us mention a growing body of literature devoted to scaling limits for "conventional" EDF networks. The case of a single-server, single customer class queue is relatively well understood by now. Fluid limits for such systems with hard deadlines and no preemption, under various distributional assumptions, were investigated by Decreusefond and Moyal [12], Atar, Biswas and Kaspi [1], and, recently, Atar, Biswas, Kaspi and Ramanan [3]. Heavy traffic limits for preemptive EDF queues with soft deadlines were provided by Doytchinov, Lehoczky and Shreve [15], and the corresponding analysis for EDF queues with reneging was given by Kruk, Lehoczky, Shreve and Ramanan [31]. The accuracy of the heavy traffic approximation from [15] was investigated by Kruk, Lehoczky and Shreve [27,28]. There are also a few results concerning multistation EDF systems. Atar, Biswas and Kaspi [2] developed fluid limits for a many-server EDF queue. The heavy traffic analysis of [15] was extended to multiclass feedforward EDF networks by Yeung and Lehoczky [42], and to some acyclic EDF networks by Kruk, Lehoczky, Shreve and Yeung [29]. Developing a fully satisfactory heavy traffic theory for general multiclass, multiserver EDF queueing networks appears to be mathematically challenging, since, as was pointed out by Kruk [25], such networks typically exhibit unconventional heavy traffic behavior. Counterparts of the above-mentioned results for resource sharing EDF networks are still to be developed.
This paper is organized as follows. In Sect. 2 we define a stochastic model for a linear EDF resource sharing network. Section 3 contains our main result and the rest of the paper is devoted to its proof. In Sect. 4 we provide essential current lead time estimates. Section 5 is devoted to investigation of the workload dynamics in linear resource sharing networks. In Sect. 6, we estimate the time of transmission completion for all the initial flows and we show that after this time state space collapse holds. Section 7 contains the proof of the main result. Section 8 concludes.

Notation
The following notation will be used. Let $\mathbb{N}$ denote the set of nonnegative integers and let $\mathbb{R}$ denote the set of real numbers. For $a, b \in \mathbb{R}$, we write $a \vee b$ ($a \wedge b$) for the maximum (minimum) of $a$ and $b$, $a^+$ for $a \vee 0$ and $\lfloor a \rfloor$ for the largest integer less than or equal to $a$. The infimum (supremum) taken over the empty set should be interpreted as $\infty$ ($-\infty$). For a vector $a = (a_1, \dots, a_n) \in \mathbb{R}^n$, let $|a| = \sum_{i=1}^{n} |a_i|$. By convention, a sum over the empty set of indices equals zero. The nonnegative real numbers $[0, \infty)$ will be denoted by $\mathbb{R}_+$.
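The Borel σ-field on a metric space $S$ will be denoted by $\mathcal{B}(S)$. For $B \in \mathcal{B}(\mathbb{R})$, we denote the indicator of the set $B$ by $\mathbb{I}_B$. For a function $g : \mathbb{R} \to \mathbb{R}$ and $T \in (0, \infty)$, […]. Let $\mathcal{M}$ denote the set of finite, nonnegative measures on $\mathcal{B}(\mathbb{R})$. For $\xi \in \mathcal{M}$ and a Borel measurable function $g : \mathbb{R} \to \mathbb{R}$ that is integrable with respect to $\xi$, define $\langle g, \xi \rangle = \int_{\mathbb{R}} g(x)\, \xi(dx)$. […] In particular, $\mu(\mathbb{R}) = 0$ if and only if $L_\mu = \infty$. We denote the measure in $\mathcal{M}$ that puts one unit of mass at a point $x \in \mathbb{R}$ by $\delta_x$.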
All stochastic processes used in this paper are assumed to have paths that are right continuous with finite left limits (r.c.l.l.). We denote by D[0, ∞) the space of r.c.l.l. functions from [0, ∞) into R.

Stochastic model
Our stochastic model of a linear EDF resource sharing network consists of the following: the network topology, stochastic primitives, the service protocol and performance processes describing the time evolution of the system. These are defined below.

Network structure
We consider a network with a finite number of resources (nodes), labeled by $j = 1, \dots, J$, and a finite set of routes, labeled by $i = 1, \dots, I$. Each route may be identified with a nonempty subset of $\mathcal{J} = \{1, \dots, J\}$, interpreted as the set of resources used by this route. Let $A = [a_{ji}]$ be the $J \times I$ incidence matrix in which $a_{ji} = 1$ if resource $j$ is used by route $i$ and $a_{ji} = 0$ otherwise. Let $\mathcal{I} = \{1, \dots, I\}$. Then the set $R(i)$ of resources used by route $i$ may be described by the equation $R(i) = \{ j \in \mathcal{J} : a_{ji} = 1 \}$. In what follows, by the network topology we mean the structure of connections between the resources made by the routes, which is defined by the incidence matrix $A$ (or, equivalently, by the sets $R(i)$, $i \in \mathcal{I}$). Throughout this paper, we assume that the network under consideration is linear, i.e., $I = J + 1$, $R(i) = \{i\}$ for $i = 1, \dots, J$, and $R(J + 1) = \mathcal{J}$.
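The linear topology described above can be made concrete with a small sketch. It builds the incidence matrix for a given number of resources and recovers the sets R(i) from it. The function names `linear_incidence` and `resources_used` are illustrative only, and indices are 0-based here, unlike the 1-based labels in the text.

```python
def linear_incidence(J):
    """Return the J x (J+1) incidence matrix A = [a_ji] of a linear network.

    Columns 0..J-1 are the local routes (route i uses only resource i);
    column J is the long route, which uses every resource.
    """
    A = [[0] * (J + 1) for _ in range(J)]
    for j in range(J):
        A[j][j] = 1  # local route j uses resource j only
        A[j][J] = 1  # the long route uses every resource
    return A


def resources_used(A, i):
    """R(i): the set of resources j with a_ji = 1."""
    return {j for j in range(len(A)) if A[j][i] == 1}


A = linear_incidence(3)
print(A)                     # one row per resource, one column per route
print(resources_used(A, 3))  # resources used by the long route
```

With J = 3 the last column of the matrix is all ones, which reflects the fact that the long route needs every resource simultaneously.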
By a flow on route i we mean a continuous transmission of a file through the resources used by this route. We assume that a flow takes simultaneous possession of all the resources on its route during the transmission. For convenience, we also assume that all the resources have a unit service rate.

Stochastic primitives
Let $(\Omega, \mathcal{A}, P)$ be a probability space on which all the random objects to follow will be defined. The initial condition consists of nonnegative integers $Q_i(0)$, $i \in \mathcal{I}$, counting the numbers of initial flows on each route at time zero, strictly positive constant initial file sizes $\tilde v_{i,k}$ of the initial flows and their corresponding constant initial lead times (deadlines) $\tilde l_{i,k}$, where $i \in \mathcal{I}$, $k = 1, \dots, Q_i(0)$. Without loss of generality, we assume that $\tilde l_{i,k} \le \tilde l_{i,k+1}$ for every $k = 1, \dots, Q_i(0) - 1$, $i \in \mathcal{I}$. The initial flow with service time $\tilde v_{i,k}$ and deadline $\tilde l_{i,k}$ will be called flow $k$ on route $i$.
Let $N_i(\cdot)$ be an exogenous arrival process for the route $i \in \mathcal{I}$. For each $i$ it is a delayed renewal process with rate $\alpha_i$. For $t \ge 0$, $N_i(t)$ represents the number of flows arriving to the $i$-th route in the time interval $(0, t]$. The $k$-th arrival modeled by $N_i(\cdot)$ will be called flow $Q_i(0) + k$ on route $i$. Its arrival time equals […]. The first arrival times $U_{i,1}$, $i \in \mathcal{I}$, are assigned fixed positive values. For notational convenience, we also define the "arrival times" of the initial flows by the formula […]. We assume that for $i \in \mathcal{I}$ and $k \ge 2$, […] and, for some $n_i > 0$ and some nonnegative Borel function $g_i$ such that […]. In other words, the interarrival times of incoming flows are integrable, unbounded and spread out. For $i \in \mathcal{I}$ and $k \ge 1$, a random variable $v_{i,k}$ represents the initial size of the file associated with the $(Q_i(0) + k)$-th flow on route $i$, i.e., the cumulative transfer time of this flow through the network. We assume that for each $i \in \mathcal{I}$ the random variables $\{v_{i,k}\}_{k \ge 1}$ are strictly positive and form an independent and identically distributed (i.i.d.) sequence with finite mean $m_i$. For $i \in \mathcal{I}$ and $k \ge 1$, a random variable $l_{i,k}$ represents the initial lead time for the transmission of the file associated with the $(Q_i(0) + k)$-th flow on route $i$. Thus, the deadline for the $(Q_i(0) + k)$-th transmission on route $i$ equals $U_{i,k} + l_{i,k}$. We assume that for each $i \in \mathcal{I}$ the random variables $\{l_{i,k}\}_{k \ge 1}$ are nonnegative and form an i.i.d. sequence with a finite first moment: […]. In an important special case of a First-In-System, First-Out (FISFO) resource sharing network, we have […]. In this case, in order to derive the current lead time estimates in Sect. 4.1, which are slightly stronger than the general ones presented in Sect. 4.2, for $i \in \mathcal{I}$ such that $Q_i(0) > 0$, we additionally assume the compatibility condition […]. We assume that for each $i \in \mathcal{I}$, the random vectors […]. We also assume that the sequences […]. (We do not rule out a possibility of dependence between $v_{i,k}$ and $l_{i,k}$ for the same indices $i$, $k$; see the discussion in Sect. 8.) Let $\rho_i = \alpha_i m_i$ be the traffic intensity of route $i$. The corresponding traffic intensity for resource $j$ is $\rho_j = \sum_{i=1}^{I} a_{ji} \rho_i$. We say that a resource sharing network is strictly subcritical if $\rho_j < 1$ for every $j \in \mathcal{J}$.
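As an illustration of the strict subcriticality condition, the following sketch computes the load on each resource by summing the route intensities (arrival rate times mean file size) over the routes that use it. The helper names and the numerical rates are hypothetical and chosen only for the example; the incidence matrix is written out for a linear network with two resources.

```python
def resource_loads(A, alpha, m):
    """Load on each resource: sum over routes i of a_ji * alpha_i * m_i."""
    rho_route = [a * mean for a, mean in zip(alpha, m)]
    return [
        sum(A[j][i] * rho_route[i] for i in range(len(rho_route)))
        for j in range(len(A))
    ]


def is_strictly_subcritical(A, alpha, m):
    """True if every resource load is strictly less than 1."""
    return all(load < 1 for load in resource_loads(A, alpha, m))


# Linear network with J = 2 resources: two local routes and one long route.
A = [[1, 0, 1],
     [0, 1, 1]]
alpha = [0.5, 0.25, 0.25]  # assumed arrival rates per route
m = [1.0, 1.0, 1.0]        # assumed mean file sizes per route

print(resource_loads(A, alpha, m))
print(is_strictly_subcritical(A, alpha, m))
```

In a linear network each resource carries exactly two routes, its own local route and the long route, so each load reduces to the sum of those two route intensities.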

Residual file sizes, lead times
For $t \ge 0$, $i \in \mathcal{I}$ and $k \le A_i(t)$, let $w_{i,k}(t)$ denote the residual size of the file (transmission time) of flow $k$ on route $i$ at time $t$. Thus, $w_{i,k}(\cdot)$ decreases during the transmission of flow $k$ on route $i$ and is constant otherwise.
To determine whether flows meet their timing requirements, one must keep track of each flow's lead time, where lead time = deadline − current time.
More formally, let $t \ge 0$, $i \in \mathcal{I}$ and $k \le A_i(t)$. The lead time at time $t$ of flow $k$ on route $i$ is defined by […]. We combine the stochastic primitives defined above into the following measure-valued arrival process: for $i \in \mathcal{I}$ and $t \ge 0$, let […]. Then $V_i(t) = \langle 1, \mathcal{V}_i(t) \rangle$ denotes the total time necessary to complete the transmission of all flows that have arrived at the route $i \in \mathcal{I}$ by time $t$.

Basic performance processes
For $t \ge 0$ and $i \in \mathcal{I}$, the measure-valued state descriptors for route $i$ are defined by […]. The random measure $\mathcal{Q}_i(t)$ (resp. $\mathcal{W}_i(t)$) puts the unit mass (resp., the mass equal to the corresponding residual transmission time) at the lead time of any flow present on route $i$ at time $t$. Then $Q_i(t) = \langle 1, \mathcal{Q}_i(t) \rangle$ denotes the number of flows on route $i \in \mathcal{I}$ at time $t$ and $W_i(t) = \langle 1, \mathcal{W}_i(t) \rangle$ denotes the total time necessary to complete the transmission of these flows, which will be called the workload on route $i$ at time $t$. The current lead time process for route $i$ will be denoted by […]. In other words, $C_i(t)$ is equal to the smallest of the lead times of the flows present on route $i$ at time $t$ if $Q_i(t) > 0$, and $\max_{k \le A_i(t)} l_{i,k}(t) + 1$ otherwise.
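The state descriptors can be illustrated on a finite list of flows. The sketch below represents each flow present on a route by its absolute deadline and its residual transmission time, and computes the queue length, the workload, the workload carried by flows with lead time at most s, and the smallest lead time. The `Flow` class and the function names are assumptions made for this example. The convention for the current lead time of an empty route is not reproduced, so the function returns `None` in that case.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    deadline: float  # absolute deadline of the flow
    residual: float  # residual transmission time, positive while present


def lead_time(flow, t):
    """Lead time = deadline - current time."""
    return flow.deadline - t


def queue_length(flows):
    """Total mass of the queue-length measure: the number of flows present."""
    return len(flows)


def workload(flows):
    """Total mass of the workload measure: sum of residual transmission times."""
    return sum(f.residual for f in flows)


def workload_up_to(flows, t, s):
    """Workload carried by flows whose lead time at time t is at most s."""
    return sum(f.residual for f in flows if lead_time(f, t) <= s)


def current_lead_time(flows, t):
    """Smallest lead time among present flows; None for an empty route."""
    if not flows:
        return None
    return min(lead_time(f, t) for f in flows)


flows = [Flow(deadline=5.0, residual=2.0), Flow(deadline=3.0, residual=1.5)]
print(queue_length(flows), workload(flows), current_lead_time(flows, 1.0))
```

Soft deadlines mean that a lead time may become negative while the flow is still present, and nothing in the sketch excludes this.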

Service protocol
The network operates under the preemptive EDF policy, dynamically allocating bandwidth to flows with the shortest remaining lead time. In the case of preemption, we assume preempt-resume and no setup, switchover or other type of overhead. A precise definition of this protocol for general resource sharing networks was given in [26]. We recall it here, adjusting it to the network under consideration and introducing notation which will be used in Sect. 4. For every $t \ge 0$ and $i \in \mathcal{I}$ such that $Q_i(t) > 0$, let […]. In words, $k_i(t)$ is the index of the "most urgent" flow on route $i$ present in the system at time $t$ (strictly speaking, the smallest such index, if there is more than one). Let $t \ge 0$ be such that $Q(t) \ne 0$ and let $i_0 = i_0(t) \in \mathcal{I}$ be such that $l_{i_0, k_{i_0}(t)}(t)$ is the smallest of the lead times of the flows present in the system at time $t$. Here and elsewhere we assume that ties between routes are broken in an arbitrary deterministic and time-independent manner; for example, here we may choose the smallest index $i_0$ with the required properties. The flow $k_{i_0}(t)$ on route $i_0$ is chosen for transmission with the maximal (i.e., unit) rate at time $t$. If $i_0 = J + 1$, then the assignment of flows for transmission at time $t$ is finished, because no more flows can be transmitted at that time. Otherwise, for each $i \in \{1, \dots, J\} \setminus \{i_0\}$ such that $Q_i(t) > 0$, the flow $k_i(t)$ on route $i$ is also chosen for transmission with the unit rate at time $t$. This assignment is effective until either one of the ongoing transmissions is finished or a new flow arrives to the system, at which point a rearrangement may happen, subject to the same rules.
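The selection rule just described can be sketched as a function that, given the lead times of the flows present on each route of a linear network, returns the flows served at that instant. This is a minimal illustration, not the formal definition from [26]: the function name, the dictionary representation and the use of list positions as flow indices are assumptions, and ties between routes are broken by the smallest route index, as in the example mentioned in the text.

```python
def edf_served_flows(lead_times, J):
    """Flows served at one instant in a linear network under preemptive EDF.

    lead_times maps each route label 1..J+1 to the list of lead times of the
    flows present on it (possibly empty); route J+1 is the long route.
    Returns a dict {route: position of the served flow in that route's list}.
    """
    # Most urgent flow on each nonempty route (smallest position among ties).
    most_urgent = {}
    for i, lts in lead_times.items():
        if lts:
            most_urgent[i] = min(range(len(lts)), key=lambda n: (lts[n], n))
    if not most_urgent:
        return {}  # the network is empty, nothing is served

    # Route holding the globally most urgent flow (smallest route index among ties).
    i0 = min(most_urgent, key=lambda i: (lead_times[i][most_urgent[i]], i))

    if i0 == J + 1:
        # The long route occupies every resource, so nothing else can be served.
        return {i0: most_urgent[i0]}
    # Otherwise every nonempty local route serves its own most urgent flow.
    return {i: k for i, k in most_urgent.items() if i != J + 1}


print(edf_served_flows({1: [4.0, 2.0], 2: [], 3: [1.0]}, J=2))
print(edf_served_flows({1: [0.5], 2: [3.0], 3: [1.0]}, J=2))
```

In the first call the long route holds the smallest lead time and is served alone. In the second call a local route is the most urgent, so both nonempty local routes are served and the long route waits.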

Main result
In this section, after the definition of a suitable Markov process describing the evolution of an EDF resource sharing network, we state our main result, Theorem 1, and provide a sketch of its proof. The details of the proof will be presented in Sects. 4-7.

Markov process background
Let […] be the state space. Under the product topology, $S$ is a locally compact Polish space. The state of the network at any time is given by a point […], where, for $i \in \mathcal{I}$, $q_i$ is the number of flows (i.e., the queue length) on route $i$, $h_i$ describes all flows present on route $i$ at this time so that each of them is listed in terms of its residual transmission time and lead time, and $r_i$ is the residual interarrival time for route $i$. We assume that the flows in $h_i$ are listed in the order of their arrivals to the route, ties are broken in an arbitrary manner and the empty spaces on the list $h_i$ (i.e., not corresponding to any flow present on the route) are positioned after all the listed flows and are filled with zeros. Let $w = (w_i)_{i \in \mathcal{I}}$, where $w_i$ is the sum of the residual transmission times of the flows listed in $h_i$. Let $q = (q_i)_{i \in \mathcal{I}}$, $r = (r_i)_{i \in \mathcal{I}}$ and let […] be the greatest lead time. For $x \in S$, let $|x| = |q| + |w| + |r| +$ […] be the "norm" of $x$.
The $S$-valued process describing the evolution of the EDF resource sharing network is denoted by $X = \{X(t), t \ge 0\}$, where $X(t)$ is the state of the network at time $t$. By definition, the process $X$ has right-continuous sample paths. It is easy to see that $X$ is a Markov process. The evolution of the process $X$ between arrivals and departures is deterministic. Thus, $X$ is a piecewise-deterministic Markov process, so it is actually strong Markov (see [10]).
We will sometimes use a superscript $x \in S$ for various performance processes describing the evolution of an EDF resource sharing network to indicate that the state process $X$ corresponding to this network starts at state $x$. Also, for $x \in S$, by $P_x$ and $E_x$ we denote the probability and the expectation operator, respectively, under the condition that $X(0) = x$.
A Markov process $X$ on the state space $S$ is Harris recurrent if there exists a σ-finite measure $\nu$ on $\mathcal{B}(S)$ such that whenever $A \in \mathcal{B}(S)$ and $\nu(A) > 0$, we have $P_x(\tau_A < \infty) = 1$ for all $x \in S$, where $\tau_A = \inf\{t \ge 0 : X(t) \in A\}$. It is known that Harris recurrence implies the existence of a unique (up to a multiplicative constant) invariant measure; see, for example, [16]. If this measure is finite, $X$ is called positive Harris recurrent.
The following proposition is a variant of Theorem 3.1 in [9], which is very useful in stability theory for queueing networks. It reduces the problem of proving positive Harris recurrence of a Markov process to checking the condition (10) on the asymptotic behavior of this process as the initial condition gets large.
Proposition 1 […] then $X$ is positive Harris recurrent.
The proof of this proposition is the same as the proof of Theorem 3.1 in [9] (see also the proof of Theorem 2.1 (ii) in [35]).
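For orientation, the criterion of Theorem 3.1 in [9] is commonly quoted with a hypothesis of the following shape. This is a sketch based on the usual statement of that result and on the informal description given in the Introduction, not a transcription of condition (10); the constant $\gamma$ is assumed to play the role it has in the proof outline of the main theorem, and any additional regularity assumptions on the process are omitted.

```latex
% Assumed form of the stability criterion (hedged reconstruction):
% after a time proportional to the size of the initial state, the expected
% size of the state is small relative to the size of the initial state.
\[
  \lim_{|x| \to \infty} \frac{1}{|x|}\,
  E_x \bigl| X(\gamma |x|) \bigr| = 0
  \qquad \text{for some } \gamma > 0 .
\]
```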

Main theorem
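Recall that a queueing network is stable when the underlying Markov process is positive Harris recurrent. The following theorem is the main result of this paper.
Theorem 1 Under the assumptions of Sect. 2, a strictly subcritical linear resource sharing network is stable under the preemptive EDF service protocol.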
The rest of this paper is devoted to the proof of Theorem 1. It is long and it proceeds in several steps. Along the way, we provide several auxiliary results, which may be of independent interest. To aid the reader, we first present an outline of our argument. Our goal is to verify that, for a suitable $\gamma > 0$, the condition (10) of Proposition 1 holds. To this end, due to uniform integrability of the random variables involved, it suffices to check that each sequence $x_n$ of initial states such that $|x_n| \to \infty$ contains a subsequence (also denoted by $x_n$ for convenience), along which […] (11) almost surely (a.s.). A state space collapse result established in Sect. 6 reduces the justification of (11) to proving the a.s. convergence […] (12). In order to establish (12), we show that there exist a route $\hat\imath \in \{1, \dots, J\}$ and a constant $\tau_0$ such that, for $t \ge \tau_0 |x_n|$, the workload on route $\hat\imath$ asymptotically dominates the workloads on routes $1, \dots, J$, i.e., […] (13); see Lemma 13 and Remark 2. On the other hand, because the network is strictly subcritical, if the constant $\gamma$ is large enough, there exists a (random) time $\sigma \in [\tau_0, \gamma]$ such that $W^{x_n}_{\hat\imath}(\sigma |x_n|) = 0$; see Lemma 14. Let $\hat\sigma$ be the supremum of such times. By (13), […] (14). The results (13)-(14) are general properties of linear, strictly subcritical resource sharing networks, which hold for a broad range of service policies satisfying a mild "weak non-idleness" condition, defined at the beginning of Sect. 5. We want to prove that, under the EDF discipline, we also have […] (15). This follows from (14), together with a crucial current lead time estimate (Proposition 3), implying that […]. The estimate from Proposition 3, relying heavily on properties of the EDF service protocol, is the heart of our stability proof. Once (13)-(15) are established, (12) follows easily from these three relations, together with the definition of $\hat\sigma$ and the fact that the network is strictly subcritical.

Current lead time estimates
The aim of this section is to prove Proposition 3, stating that after all the initial flows are fully transmitted in a linear EDF resource sharing network, the current lead time on route $J + 1$ is close to the minimum of the current lead times on routes $1, \dots, J$. This excludes the possibility of having a large workload on route $J + 1$ while all other routes are almost empty, and therefore it provides a key ingredient of our stability argument. The proof of Proposition 3, given in Sect. 4.2, is somewhat involved. Therefore, to aid the reader, in Sect. 4.1 we show a corresponding result for a linear FISFO network. The proof for the latter case uses the same ideas as the general one from Sect. 4.2, while being notably simpler. The arguments presented in this section are pathwise, requiring no distributional assumptions on the model stochastic primitives. Throughout this section, we use the notation introduced in Sect. 2.5.

Current lead times in linear FISFO networks
In this subsection, we additionally assume that the network protocol is FISFO, i.e., (6) holds. We also assume the compatibility condition (7). Our aim is to prove Proposition 2, assuring that if at least one flow on every route has already been transmitted, then the current lead time on the long route is close to the minimum of the current lead times on the short routes. The analysis is divided into five cases, considered in Lemmas 1-5, respectively. Note that, by (8) and the explanation following it, the cases in which some of the queues under consideration are empty have to be analyzed separately; see Lemmas 3-5.
Proof The first inequality in (16) follows from the definition of $i_0$ (see Sect. 2.5). For the proof of the second one, note that, by the definition of the FISFO service protocol, the flow $k_{J+1}(t) - 1$ on route $J + 1$ has already been fully transmitted. We will show that (17) holds. Indeed, if (17) were false, then the flow $k_{i_0}(t)$ on route $i_0$ would have arrived at the system no later than the flow $k_{J+1}(t) - 1$ on route $J + 1$ (in fact, earlier, unless both of them were already present in the system at time zero). However, $R(J + 1) \cap R(i_0) \ne \emptyset$ and thus, by the network topology and the FISFO service discipline, the flow $k_{J+1}(t) - 1$ on route $J + 1$ cannot be transmitted as long as the flow $k_{i_0}(t)$ on route $i_0$ is present in the system. We have obtained a contradiction, proving (17).
By (17), we get the second inequality in (16).

Proof The argument is similar to the proof of Lemma 1. The main step is to show that (20) holds (compare (17)). If (21) holds, then the flow $k_{J+1}(t)$ on route $J + 1$ arrived at the system no later than the flow $k_\iota(t) - 1$ on route $\iota$ (in fact, earlier, unless both of them were already present in the system at time zero). However, by the network topology and the FISFO service discipline, the flow $k_\iota(t) - 1$ on route $\iota$ can be transmitted at time $s$ only if $i_0(s) \ne J + 1$, i.e., $s \notin B_t$. By the definitions of $\tau$, $\iota$, $k_\iota(t)$ and the FISFO service discipline, the transmission of the flow $k_\iota(t) - 1$ on route $\iota$ was completed at time $\tau$ and $\iota = i_0(\tau-)$. Hence, we obtain a relation that contradicts (21), proving (20). Next, we argue as in the proof of Lemma 1.
Proof By (8) and the assumption that Q i (t) = 0, under the FISFO service discipline, which proves the first inequality in (27). The second one is an immediate consequence of (6)- (8).
This result follows directly from Lemmas 1-5, because the condition

Current lead times in linear EDF networks
Now we shall present counterparts of the results of Sect. 4.1 for more general linear EDF resource sharing networks. In particular, we no longer require that the conditions (6)- (7) hold. In this subsection, we make repeated use of nonnegativity of initial lead times for incoming flows, without explicit reference. First note that in an EDF resource sharing network, initial flows, together with flows arriving after time zero with deadlines not greater thañ form a priority class, i.e., as long as these flows are present in the network, at least one of them is transmitted with unit rate (compare a similar remark in the proof of Lemma 5.2 in [24]). All of these priority flows arrive at the network by the timel max . Accordingly, the sum of the transmission times of the priority flows is not greater than i∈I Q i (0) v i,k , and hence all the priority flows leave the network by the timel By the same argument, at least one flow coming to each route after time zero, in addition to all the initial flows, is fully transmitted by the time As in Sect. 4.1, in order to prove Proposition 3, the main result of this subsection, we analyze five different cases in the following Lemmas 6-10, respectively. Again, the cases in which some relevant queues are empty require special attention; see Lemmas 8-10.
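The priority-class remark above rests on the fact that the flow with the globally smallest deadline is always served at unit rate. The following is a minimal sketch of one instantaneous EDF allocation in a linear network, under the assumption that route $J + 1$ uses all $J$ resources while each route $i \le J$ uses resource $i$ only. The names `Flow` and `edf_rates`, as well as the tie-breaking rule, are hypothetical and are not taken from Sect. 2.5.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    route: int        # 1..J are the short routes, J + 1 is the long route
    deadline: float   # arrival time plus initial lead time
    remaining: float  # residual transmission time


def edf_rates(flows, J):
    """Transmission rate per route at one instant under the assumed EDF rule."""
    # Earliest-deadline flow currently present on each route.
    head = {}
    for f in flows:
        if f.remaining > 0 and (
            f.route not in head or f.deadline < head[f.route].deadline
        ):
            head[f.route] = f

    rates = {i: 0.0 for i in range(1, J + 2)}
    short_heads = [head[i] for i in range(1, J + 1) if i in head]
    long_head = head.get(J + 1)

    # Assumption: the long route is served only when its head flow has the
    # smallest deadline in the whole network (ties go to the long route).
    if long_head is not None and all(
        long_head.deadline <= h.deadline for h in short_heads
    ):
        rates[J + 1] = 1.0
    else:
        # Otherwise every nonempty short route serves its own earliest deadline.
        for h in short_heads:
            rates[h.route] = 1.0
    return rates
```

In this sketch the route holding the smallest deadline in the network always receives rate 1, which is the property used in the priority-class argument. A different tie-breaking rule would change only the boundary case of equal deadlines.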

Lemma 6 Let t > T be such that Q J +1 (t) > 0 and i
Proof The first inequality in (32) follows from the definition of $i_0$ (see Sect. 2.5). For the proof of the second one, let $\eta$ be the largest index of a flow on route $J + 1$ that has been fully transmitted by time $t$. We have $\eta > Q_{J+1}(0)$, because $t > T$. Let $\tau$ be the time when the transmission of the flow $\eta$ on the route $J + 1$ was completed.

If the flow $k_{i_0}(t)$ on the route $i_0$ arrived at the system before time $\tau$, then $l_{i_0, k_{i_0}(t)}(\tau-) \ge l_{J+1, \eta}(\tau-)$ (otherwise, the system could not finish the transmission of the flow $\eta$ on the route $J + 1$ before the departure of $k_{i_0}(t)$ from the route $i_0$). Consequently, (33) holds. If $\eta < A_{J+1}(t)$, then the flow $\eta + 1$ is still on the route $J + 1$ at time $t$, so (34) holds, where (33) is used in the last step. If $\eta = A_{J+1}(t)$, let $\kappa$ be a flow present on the route $J + 1$ at time $t$. Then $Q_{J+1}(0) < \kappa < A_{J+1}(t)$, because $t > T$, and hence (35) holds, where again (33) is used in the last step.

If the flow $k_{i_0}(t)$ on the route $i_0$ arrived at the system no earlier than at the time $\tau$, then (36) holds. In this case, we have two possibilities. If $\tau$ is so large that $A_{J+1}(\tau) = A_{J+1}(t)$, then $t - \tau < u_{J+1, N_{J+1}(t)+1}$, so (36) implies (37). Since $t > T$, the flow $k_{J+1}(t)$ on route $J + 1$ is not an initial one, so (38) holds, where (37) is used in the last step. If $A_{J+1}(\tau) < A_{J+1}(t)$, then the flow $A_{J+1}(\tau) + 1$ on the route $J + 1$ arrived at the system in the time interval $(\tau, t]$ and, by the definition of $\tau$, it is still in the system at time $t$. Thus, (39) holds. From (36) and (39), we obtain (40).

Summarizing, the inequalities (34)-(35), (38) and (40) imply that if $t > T$, then, irrespective of the case, (32) holds.

Lemma 7 Let t > T be such that Q(t) = 0 and i
where $B_t$ is as in (18), and let $\iota = i_0(\tau-)$. If $Q_\iota(t) > 0$ and $\tau > T$, then (41) holds.

Proof The argument is similar to the proof of Lemma 6. By the definition of $\iota$, we have $\iota \in \{1, \ldots, J\}$. Let $\eta = k_\iota(\tau-)$. Then $\eta > Q_\iota(0)$, because $\tau > T$. The subsequent analysis is divided into the following cases.

If the flow $k_{J+1}(t)$ on the route $J + 1$ arrived at the system before time $\tau$, then (42) holds. Indeed, if (42) were false, the definitions of $\iota$ and $\eta$ would yield a contradiction. The inequality (42) implies (43) (compare (33)).

First assume that there is a flow $\kappa$ present on the route $\iota$ at time $t$ which has arrived at the system no later than the deadline for the flow $\eta$ on the same route, i.e., such that (44) holds. In this case, we obtain an inequality which, together with (43), yields (45).

We now consider the case in which there is no flow satisfying (44) on the route $\iota$ at time $t$. Observe that every flow $\kappa$ arriving at the route $\iota$ after the deadline for the flow $\eta$, but no later than the time $t$ (i.e., such that $U_{\iota, \eta - Q_\iota(0)} + l_{\iota, \eta - Q_\iota(0)} < U_{\iota, \kappa - Q_\iota(0)} \le t$), is still in the system at time $t$. Indeed, such a flow cannot preempt the flow $\eta$ on route $\iota$, so it cannot be transmitted before the time $\tau$. On the other hand, only flows on the route $J + 1$ are transmitted in the time interval $[\tau, t]$, so a flow $\kappa$ on route $\iota$ is not transmitted at any time in this interval, either. In particular, the flow $\kappa = A_\iota(U_{\iota, \eta - Q_\iota(0)} + l_{\iota, \eta - Q_\iota(0)}) + 1$ is still on the route $\iota$ at time $t$, which gives an inequality that, together with (43), yields (46). The inequalities (45)-(46) imply that in both cases analyzed so far, the estimate (41) holds.
The analysis for the case in which the flow k J +1 (t) on the route J + 1 arrived at the system no earlier than at the time τ is completely analogous to the proof of Lemma 6 for the corresponding case.
If the flow k J +1 (t) on the route J + 1 arrived at the system before time τ , then (43) holds by the same argument as in the proof of Lemma 7. We claim that If η = A ι (t), (47) follows immediately from (43). Assume that η < A ι (t). By the definitions of τ and η, the flow η was the last one to leave the route ι by time t, while the flow A ι (t) was the last one to arrive at ι by that time. This, together with the assumption that Q ι (t) = 0 and the definition of the EDF protocol, implies that l ι,η (t) > l ι,A ι (t) (t), so again (43) implies (47). However, N ι (t) > 0, because t ≥L max , and hence, by (23), This, together with (47), implies (22). It remains to consider the situation in which the flow k J +1 (t) on the route J + 1 arrived at the system no earlier than τ . In this case, we have By the definition of τ , there is no transmission on the route ι in the time interval [τ, t], so the assumption that and (22) follows.
Proof If the flow k i 0 (t) on route i 0 arrived at the network no later than the flow A J +1 (t) on route J + 1, then we have (26), by the same reason as in the proof of Lemma 4. By (23), with ι replaced by J + 1, we get which, together with (26), implies (25). If the flow k i 0 (t) on route i 0 arrived at the network later than the flow A J +1 (t) Q i 0 (0) . This, together with the fact that the flow k i 0 (t) on route i 0 is not an initial one, implies that so again (25) holds.

Lemma 10 Let $t \ge \tilde{L}_{\max}$ and $i \in I$ be such that $Q_i(t) = 0$. Then (49) holds.

Proof By (8), the relation (50) holds for every $t \ge 0$, so for $t \ge \tilde{L}_{\max}$ the second inequality in (49) follows. If $t \ge \tilde{L}_{\max}$ is such that $Q_i(t) = 0$, then $N_i(t) > 0$ and the first inequality in (50) is actually an equality. Thus, the first inequality in (49) follows from (23), with $i$ substituted for $\iota$.

Proposition 3 For t > T, we have
Proof The estimate (51) follows from Lemmas 6-10. We will provide a detailed analysis under the assumptions of Lemma 6 (the proof in other cases is similar). Let $\bar{\imath} \in \{1, \ldots, J\}$ be such that $C_{\bar{\imath}}(t) = \min_{i=1,\ldots,J} C_i(t)$. If $Q_{\bar{\imath}}(t) > 0$, then $C_{\bar{\imath}}(t) = C_{i_0}(t)$ and Lemma 6 immediately implies (51). Assume that $Q_{\bar{\imath}}(t) = 0$. Then (8), the first inequality in (32), Lemma 10 and the assumption $t > T > \tilde{l}_{\max}$ again imply (51).

Workload evolution in linear networks
In this section, we investigate properties of the workload process in linear resource sharing networks. The results presented here do not depend on the service protocol, and hence, they are fairly general. We do assume, however, the following weak form of non-idleness: for any $t \ge 0$ and any route $i = 1, \ldots, J$, if $Q_i(t) > 0$, then the sum of the transmission rates on the routes $i$ and $J + 1$ at time $t$ equals 1. (Recall that we have assumed the unit maximal service rate for every resource.) This "weak non-idleness" assumption, mathematically expressed by Eqs. (52)-(54), is clearly satisfied by the EDF service protocol described in Sect. 2.5, but it also holds for any other "reasonable" policy in a linear network.
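The verbal form of the weak non-idleness condition can be checked mechanically for a given allocation. The sketch below encodes only the sentence above, not Eqs. (52)-(54) themselves; the function name `is_weakly_non_idle`, the dictionary layout and the tolerance are assumptions made for illustration.

```python
def is_weakly_non_idle(queue_lengths, rates, J, tol=1e-9):
    """Check the verbal weak non-idleness condition at one time instant.

    queue_lengths and rates are dicts keyed by route 1..J+1 (an assumed
    layout). For every short route i with a nonempty queue, the rates on
    routes i and J + 1 must add up to the unit capacity of resource i.
    """
    long_rate = rates[J + 1]
    return all(
        abs(rates[i] + long_rate - 1.0) <= tol
        for i in range(1, J + 1)
        if queue_lengths[i] > 0
    )
```

For example, with $J = 2$ and a nonempty queue on route 1, serving route 1 at rate 1, serving route 3 at rate 1, or splitting the capacity between them all pass the check, while serving route 1 at rate 0.5 with route 3 idle does not.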
The main results of this section are Lemmas 13 and 14. Lemma 13 shows that there exists a route $\hat{\imath} \in \{1, \ldots, J\}$ such that, after a time proportional to the "norm" of the initial condition, the workload on this route asymptotically dominates the workloads on other routes belonging to the set $\{1, \ldots, J\}$. Lemma 14 provides an upper bound for the time at which a route chosen from $\{1, \ldots, J\}$ becomes empty in a strictly subcritical network. Along the way, we establish a simple comparison result for the Skorokhod map on $\mathbb{R}_+$ (Lemma 11).
We first introduce some auxiliary performance processes. The cumulative transmission time on route i ∈ I by time t will be denoted by T i (t). Note that T i (t) = The netput for each route i ∈ {1, . . . , J } is given by where is the Skorokhod map on [0, ∞). It is well-known (see, for example, [30], Definition 1.1) that, for a given ψ ∈ D[0, ∞), the functions  that (φ, η) and (φ , η ) solve the Skorokhod problem on [0, ∞) for c 0 + ψ and c 0 + ψ , respectively. Then Proof If c 0 ≥ c 0 , then τ = 0 and (56) follows from Theorem 6 (i) in [22]. Suppose that c 0 > c 0 and let where the last inequality follows from the definition of τ 0 . Therefore, Put It is easy to see that the pairs (φ 1 , η 1 ) and (φ 1 , η 1 ) solve the Skorokhod problem on [0, ∞) for c 1 + ψ 1 and c 1 + ψ 1 , respectively. By the definition of τ 0 , we have Thus, by the first part of this proof, . This, together with (57), shows (56).
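For readers who want to experiment with the Skorokhod problem on $[0, \infty)$ used above, the following sketch evaluates the standard one-dimensional reflection formula $\varphi(t) = \psi(t) + \sup_{0 \le s \le t} (-\psi(s))^+$ on a discretely sampled path. It is assumed here that the map used in this section coincides with this standard formula; the function name `skorokhod_map` and the list-based path representation are illustrative choices.

```python
def skorokhod_map(psi):
    """Reflect a sampled path psi at zero.

    Returns (phi, eta), where eta[k] = max(0, max_{j <= k} -psi[j]) is the
    nondecreasing regulator and phi[k] = psi[k] + eta[k] is the reflected,
    nonnegative path.
    """
    phi, eta = [], []
    running_sup = 0.0  # running value of sup (-psi)^+ so far
    for x in psi:
        running_sup = max(running_sup, -x)
        eta.append(running_sup)
        phi.append(x + running_sup)
    return phi, eta


phi, eta = skorokhod_map([1.0, -0.5, -2.0, -1.0])
# phi == [1.0, 0.0, 0.0, 1.0], eta == [0.0, 0.5, 2.0, 2.0]
```

The regulator `eta` increases only at sample points where `phi` is zero, which is the complementarity property characterizing the solution of the Skorokhod problem.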
Let G be the set of elementary events ω ∈ Ω for which By the strong law of large numbers, P(G) = 1.
To proceed further, let x n ∈ S be a sequence of initial states, with the corresponding initial workloads (total transmission times) w n = (w n i ) i∈I and residual interarrival times r n = (r n i ) i∈I , such that uniformly on compacts (u.o.c.) in t (see Lemma 4.2 in [9]). Also, by (59) and the functional strong law of large numbers, for every i ∈ I, we have the convergence u.o.c. in t ≥ 0 on the set G.
The following simple lemma shows that both the initial lead times of the incoming flows and their interarrival times become negligible under fluid scaling.
The proof of the first statement of this lemma is the same as the proof of Lemma 4.1 in [24]. The proof of the second one follows by a similar argument. For i ∈ I and t ≥ 0, we define By (61)-(63), for each i, on the set G, By (65)-(66), (68)-(70) and Lipschitz continuity of Γ 0 with respect to the supremum norm (see, for example, [41], Lemma 13.5.1), for i = 1, . . . , J , we have and the latter convergence is u.o.c. in t ≥ 0 on the set G.

Lemma 13 Let
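Let $\hat{\imath} \in \tilde{I}_{\max}$ and let otherwise. (77)

Remark 1 If $|\tilde{I}| > 1$, then the set $\tilde{I}_{\max}$ depends, in general, on both $n \in \mathbb{N}$ (i.e., the starting state $x_n$) and $\omega \in \Omega$. Consequently, the index $\hat{\imath}$ may also depend on $n$ and $\omega$.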

Lemma 14
Assume that the network is strictly subcritical. Let $x_n$ be a sequence of initial states satisfying (61). Let $C \ge 0$ and let $\delta$ be given by (86). Then for each $i \in \{1, \ldots, J\}$ and $\omega \in G$, there exists $n_0 = n_0(i, \omega)$ such that, for every $n \ge n_0$, (87) holds.

Proof Fix $i \in \{1, \ldots, J\}$ and $\omega \in G$. In what follows, all the random objects under consideration are evaluated at this $\omega$. By (66), there exists $n_0$ such that (88) holds. Let $n \ge n_0$. We claim that (87) holds. Indeed, if $\sigma^n_i \ge \delta$, then $W^{x_n}_i(t) > 0$ for all $t \in [C|x_n|, \delta|x_n|)$ and consequently, by the network topology, the $i$-th resource transmits with unit rate on the entire time interval $[C|x_n|, \delta|x_n|)$. This implies a chain of relations in which the first equality is a consequence of (66), the third inequality follows from (65), the fourth inequality is a consequence of (86) and the last inequality follows from (88). We have obtained a contradiction, proving (87).

State space collapse
In this section, we provide an upper bound for the time of transmission completion of all the initial flows (Lemma 15). We also show that after this time, the workload on any route is approximately equal to a fixed multiple of the corresponding queue length, thus establishing state space collapse (Proposition 4).

Lemma 15 Let
and let $x_n \in S$ be a sequence satisfying (61). For every $\omega \in G$ and all $n$ sufficiently large, we have $T^{x_n}(\omega) < \tau_1 |x_n|$ (90).

Proof By the definition of the "norm" of an initial state, the relation (91) holds for any starting state $x \in S$. Fix $\omega \in G$. In what follows, all the random objects under consideration are evaluated at this $\omega$. By Lemma 12, for $n$ sufficiently large, $\max_{i \in I} l_{i,1} \le L^n \le |x_n|$. This, together with (31) and (91), implies (92) for large $n$, and consequently, by (65)-(66), the estimate (93) holds for each $i \in I$ and $n$ large enough. Equation (30) and the estimates (92)-(93) imply (90), with the constant $\tau_1$ given by (89).
Remark 3 A careful examination of the above proof shows that the constant $\tau_1$ in Lemma 15 (and, consequently, in the statement of Proposition 4, to follow) may be replaced by $1 + \sum_{i \in I} \rho_i + \epsilon$ for any $\epsilon > 0$.
The following proposition, which may be of independent interest, is an analog of Proposition 6.1 in [24]. The main ideas of the proofs of these two results are also similar.
Proposition 4 (State space collapse) Let $\tau_1$ be given by (89), let $T > \tau_1$ and let $x_n \in S$ be a sequence satisfying (61). Then, for each $i \in I$, the relation (94) holds on $G$.

Proof Fix $\omega \in G$. In what follows, all the random objects under consideration are evaluated at this $\omega$. Let $n \ge 1$. By Lemma 15, we may assume that $n$ is large enough that (90) holds. The definition of $T^{x_n}$ implies that all the initial flows from the network with initial state $x_n$ are fully transmitted by that time. Together with (90), this implies that there are no initial flows in this network at any time $t \ge \tau_1 |x_n|$.

For $i \in I$ and $t \in [\tau_1, T]$, let $a^n_i(t)$ be the arrival time at the network with initial state $x_n$ of the flow on route $i$ which was the last one to receive some (even partial) transmission on this route by time $t|x_n|$. By definition, (95) holds. Recall the random variable $L^n$ defined by (64). By the definition of the EDF service protocol, every flow that arrived at route $i$ before $a^n_i(t) - L^n$ (in particular, by $a^n_i(t) - L^n - 1$) has already been fully transmitted by time $t|x_n|$. Similarly, a flow that arrived at route $i$ after $a^n_i(t) + L^n$ cannot preempt the flow that arrived at time $a^n_i(t)$, and hence it has not received any transmission by time $t|x_n|$. All these facts imply inequalities valid for $i \in I$, $t \in [\tau_1, T]$ and $n$ sufficiently large. Scaling these inequalities and using (62), (65)-(66), together with the fact that $t \ge \tau_1 > 1 \ge r_i$, we get the estimates (96), where the $o(1)$ terms are uniform with respect to $t \in [\tau_1, T]$. By (95) and Lemma 12, the above estimates imply (94).

Corollary 1 Let $T > \tau_1$ and let $x_n \in S$ be a sequence satisfying (61). For each $i \in I$ and $t \ge 0$, let

Then sup

This follows immediately from the estimates (96).

Proof of Theorem 1
Consider a linear, strictly subcritical, preemptive EDF resource sharing network, with stochastic primitives satisfying (1)-(5). Let $\gamma$ be defined by (99)-(100), where $\tau_0$, $\tau_1$ are as in (80) and (89), respectively. We will show that (10) holds and thus, by Proposition 1, the network is stable. If (10) is false, there exist $\epsilon > 0$ and a sequence $x_n \in S$ with $|x_n| \to \infty$ such that

$$\mathbb{E}_{x_n} X(\gamma |x_n|) \ge \epsilon |x_n|, \qquad n \ge 1. \qquad (101)$$

Without loss of generality (extracting a subsequence if necessary), we can assume that the sequence $x_n$ satisfies (61). Recall the set $G$ from Sect. 5. We will show that on $G$, (11) holds. Arguing as in the proof of Lemma 4.3a in [9], one can show that

$$\lim_{n \to \infty} \frac{R^{x_n}(\gamma |x_n|)}{|x_n|} = 0. \qquad (102)$$

Let $L_+(t)$ denote the positive part of the greatest lead time of a flow in the system at time $t$. By (99), we have $\gamma > 1$, so the lead times at time $\gamma |x_n|$ of initial flows in the system starting from state $x_n$ are negative. This, together with Lemma 12, implies that

$$\lim_{n \to \infty} \frac{L^{x_n}_+(\gamma |x_n|)}{|x_n|} = 0. \qquad (103)$$
By (86), (99)-(100) and the fact that w i ≤ 1 for all i ∈ I, we have Using Proposition 4 with T = γ , together with the relations (102)-(103), we reduce the proof of (11) to showing the convergence (12) on the set G.
If (12) is false, there exist ω ∈ G, η ∈ (0, 1) and a subsequence of the sequence x n (still denoted by x n for convenience) such that for every n W x n (γ |x n |)(ω) ≥ η|x n |.

Conclusion
In this paper, we have shown stability of linear, strictly subcritical resource sharing networks under the preemptive EDF service protocol. The main idea of the proof was to verify Dai's condition (10) on the growth of the expected "norm" of the performance process as the initial condition gets large. To this end, we used a direct argument, based on the current lead time estimate from Sect. 4 and properties of the workload process in linear resource sharing networks established in Sect. 5. Our approach does not require fluid model analysis, for which the above-mentioned Dai's criterion was originally introduced. This seems to indicate that in some situations it might be simpler to establish (10) by a direct analysis of the original queueing system, instead of showing convergence to an appropriate fluid model and investigating its asymptotic properties.
The above strategy has already been applied in [24], yielding a concise stability proof for a fairly general family of multiclass queueing networks with reneging. The analysis of this paper illustrates the applicability of this approach in a more complicated setting of EDF resource sharing networks.
Our results are readily applicable to the following regularization (in our context, stabilization) of the SRPT service protocol, based on an idea of Bender, Chakrabarti and Muthukrishnan [5]. For $i \in I$, let $l_{i,k} = C \tilde{v}_{i,k}$, $k = 1, \ldots, Q_i(0)$ (128), where $C$ is a large positive constant. It is intuitively clear that, for large $C$, the EDF service discipline in a network satisfying (128) assigns priorities to flows in a way similar to SRPT, at least as long as the waiting times of these flows are not too large. On the other hand, our Theorem 1 implies that every preemptive, linear, strictly subcritical EDF resource sharing network satisfying (1)-(4), (128) is stable. In contrast, some of these networks are known to be unstable under the SRPT protocol [40]. This indicates that exchanging SRPT for its EDF proxy with lead times (128) has a strong long-term smoothing effect on the transmission times of large flows, alleviating their "unfair" discriminatory treatment by SRPT and ultimately resulting in network stability.

More general size-based (or rather size-related) scheduling disciplines may be constructed by replacing (128) with (129), where $f_i : \mathbb{R} \to \mathbb{R}_+$, $i \in I$, are given deterministic Borel functions. Theorem 1 assures stability of the corresponding preemptive, linear, strictly subcritical EDF resource sharing networks subject to (1)-(4), (129), provided that $\mathbb{E} f_i(v_{i,1}) < \infty$ for $i \in I$. Note that if the functions $f_i$ in (129) are decreasing, EDF gives preferential treatment to larger flows, thus potentially increasing the corresponding queue lengths.

The analysis of this paper sheds some light on the stability issue of more general strictly subcritical linear resource sharing networks, working under protocols satisfying the mild "weak non-idleness" assumption of Sect. 5. Recall that, by Lemmas 13, 14 and Remark 2, for each such network with initial state $x_n$, there is a random time $\hat{\sigma}$, bounded by a constant $\gamma$, such that (14) holds.
If (14) implies (15), then the argument given in Sect. 7 shows that

$$|W^{x_n}(\gamma |x_n|)| = o(|x_n|). \qquad (130)$$
For many service disciplines, it is possible to show state space collapse, analogous to the one given by our Proposition 4. In the case of such protocols, (130) implies |Q x n (γ |x n |)| = o(|x n |), and consequently (11), which in turn, by a variant of Theorem 3.1 in [9], ensures that the system is stable. Therefore, the validity of the implication (14) ⇒ (15) is, in many cases, sufficient for network stability. It is plausible that the latter implication may be proved for some service disciplines other than EDF. A trivial example is a protocol under which flows on the long route J + 1 have priority over flows on other routes. We believe that a number of interesting applications of the above methodology, in addition to the EDF case investigated in this paper, may be given. This should be a subject of future research. Another natural future research direction is the development of fluid and heavy traffic approximations of critically loaded linear EDF resource sharing networks. We believe that the estimate (51) will play a key role in this analysis.
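To make the SRPT regularization with lead times (128) more concrete, the sketch below compares which flow a single-resource scheduler would pick under SRPT and under the EDF proxy. It assumes that the lead time assigned on arrival equals $C$ times the flow size, so that the deadline is the arrival time plus $C$ times the size. The tuple layout and the function names are hypothetical.

```python
def edf_proxy_deadline(arrival_time, size, C):
    """Deadline under the assumed lead-time rule: lead time = C * size."""
    return arrival_time + C * size


def pick_edf_proxy(flows, C):
    """Name of the flow with the earliest proxy deadline.

    flows: list of (name, arrival_time, size, remaining_size) tuples.
    """
    return min(flows, key=lambda f: edf_proxy_deadline(f[1], f[2], C))[0]


def pick_srpt(flows):
    """Name of the flow with the shortest remaining processing time."""
    return min(flows, key=lambda f: f[3])[0]


# A small flow arriving shortly after a big one: both rules prefer it.
fresh = [("big", 0.0, 10.0, 10.0), ("small", 1.0, 1.0, 1.0)]
# A small flow arriving after the big one has waited a long time:
# SRPT still prefers the small flow, the EDF proxy now serves the big one.
aged = [("big", 0.0, 10.0, 10.0), ("small", 50.0, 1.0, 1.0)]
```

With $C = 2$, the two rules agree on `fresh` and disagree on `aged`, which illustrates the remark that the proxy mimics SRPT only while waiting times are not too large.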
Finally, it is desirable to show stability for a sufficiently rich class of more general (not necessarily linear) EDF networks with resource sharing. This clearly requires new tools and ideas, which are beyond the scope of this paper. Note that for a resource sharing queueing system with general topology, the maximal potential stability region must be defined in terms of the so-called network utilization rather than the "bottleneck utilization" max j∈J ρ j . Indeed, as was discussed in detail by Gurvich and Van Mieghem [19,20], such systems typically exhibit unavoidable bottleneck idleness. (Compare also Definition 2.1 of the stability region in [6] and the notion of a feasible arrival rate vector in [13].) We hope that a suitable counterpart of the inequality (51) holds for some more general networks, for example those satisfying the local pooling condition, and that it will turn out to be a key ingredient of the proof of the corresponding extension of Theorem 1.
It is likely that some EDF resource sharing networks which do not satisfy the local pooling condition are unstable. In this context, let us consider an example, provided by Dimakis and Walrand [13], of a strictly subcritical six-cycle packet level model with a deterministic arrival of a single packet to each queue in every time slot, which is unstable under LQF with the "unbiased" random tie-breaking rule. If the initial queue lengths are all equal, LQF for this particular system coincides with FISFO, so the corresponding FISFO (and hence EDF) system is also unstable. However, it is easy to see that if the tie-breaking rule for this network is changed so that a "big match" (i.e., a triple of queues) is always selected for simultaneous service if possible, then the system is actually stable as long as it is strictly subcritical. Hence, the abovementioned network instability is due to a suboptimal choice of the tie-breaking rule rather than the service protocol. Nevertheless, it is plausible that an example of an EDF resource sharing network which is unstable under any tie-breaking rule may be constructed along this line (perhaps by replacing the six-cycle by a larger cyclic graph). Interestingly, Dimakis and Walrand [13] have observed that if there is any randomness in the packet arrivals, then the six-cycle system is actually stable, even with the "unbiased" random tie-breaking rule, as long as it is strictly subcritical. However, in the presence of any randomness, either in the arrivals (implied, for example, by each of our conditions (2)-(3)), or in the transmission times, LQF no longer coincides with FISFO (EDF), so the relation between their stability regions is not clear. These issues should be subject to future research.