1 Introduction

The purpose of this article is to loosen conditions for stability in the “Poisson hail” interacting queueing model introduced by [1]. In the discrete setting for this model, there are countably many jobs (identified by countably many points in space-time). Job i requires service \(\tau _i \) from a subset \(B_i \subset \mathbb {Z}^d\). As in the preceding paper, we associate to a job i a (semi-arbitrary) server \(x = x(i) \in \mathbb {Z}^d\) who is in some sense central in the group \(B_i\). We suppose that for each site \(w \in \mathbb {Z}^d\) the jobs i with \(x(i) = w\) arrive according to a Poisson process \(N_w\) of rate \(\lambda \). The jobs i arriving at w will have their subsets \(B_i \) and service times \(\tau _i \) distributed as i.i.d. vectors, so the arrivals at site w may be considered as a marked Poisson process \(\Phi _w\). In other words, \(\Phi _w\) is a Poisson process on \(\mathbb {R}\times \mathbb {R}_+ \times 2^{\mathbb {Z}^d}\). Points in its support are typically denoted by \((t, \tau , B)\), the t’s forming the aforementioned rate-\(\lambda \) Poisson process. The pair \((\tau , B)\) is referred to as the mark of the point t. We also assume that the \(\Phi _w\) are obtained as follows: Let, for each w, \(\widetilde{\Phi }_w\) be an independent copy of \(\Phi _0\). Then let \(\Phi _w\) contain all points of the form \((t, \tau , B+w)\) where \((t,\tau , B)\) is a point of \(\widetilde{\Phi }_w\). Thus, the arrival process (including marks) is translation invariant. Physically, we can think of the system as a model of hailstones of cylindrical shape \(B \times [0,\tau ] \subset \mathbb {Z}^{d+1}\), where \(\tau \) is the height of the stone and B its base. When a hailstone appears at some point of time at which all sites \(w \in B\) are free, it starts melting at rate 1. 
If there is at least one \(w \in B\) occupied by a previously arrived stone, then the current stone will not start melting before all sites in B become free; at the first moment of time this happens, the hailstone starts melting at rate 1. (Only the ground, \(\mathbb {Z}^d\), is hot and heat is not transmitted upwards!) At each time t, we let \(W(t,x)\) be the total work required for x to be free of hailstones provided no stones arrive after t. In queueing terms, \(W(t,x)\) is a workload. In hailstone terms, \(W(t,x)\) is the sum of the heights of all hailstones which contain x in their base and have not been melted yet. Since the superposition of \(N_w\), \(w \in \mathbb {Z}^d\), has infinite rate, it follows that within any time interval of positive length there are infinitely many stones arriving. Thus, \(W(t,\cdot )\) will change infinitely many times in any right neighborhood of t. However, typically, for fixed \(x \in \mathbb {Z}^d\), and any \(\varepsilon > 0\), \(W(t,x)\) depends only on \(W(t-\varepsilon , y)\), for y ranging in a finite (but random) number of sites. This is due to the fact that we only have to look at those \(\Phi _w\) with points \((s,\tau , B)\) such that \(t-\varepsilon \le s \le t\) and \(x \in B\).

Fix \(x \in \mathbb {Z}^d\) and suppose there is \(w \in \mathbb {Z}^d\) such that \((t,\tau , B)\) is a point of \(\Phi _w\). Then

$$\begin{aligned} W(t+,x)={\left\{ \begin{array}{ll} \max _{y \in B} W(t-,y) + \tau , &{} x \in B\\ W(t-, x), &{} x \not \in B. \end{array}\right. } \end{aligned}$$
(1)

By convention, we shall assume that \(t\mapsto W(t,x)\) is right-continuous: \(W(t,x) = W(t+,x)\). On the other hand, if there is no w such that \((t,\tau ,B)\) is a point of \(\Phi _w\) with \(x \in B\), then \(W(s,x)\), \(s \ge t\), decreases linearly for an interval of positive length until either it reaches zero or there is a job arriving at some \(s > t\) at some site w whose base contains x. We have thus completely specified the dynamics of the system. The system considered here differs from that of [3] in that the latter (i) considers only finitely many sites (\(\mathbb {Z}^d\) is replaced by a finite set) but (ii) works for stationary and ergodic arrival processes.
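The dynamics just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the model: a finite list of jobs is given explicitly, sites absent from the dictionary are treated as having zero workload, and the names `advance`, `arrive` and `workload` are hypothetical helpers introduced here.

```python
from typing import Dict, Iterable, List, Tuple

Site = Tuple[int, ...]
Job = Tuple[float, float, List[Site]]  # (arrival time t, height tau, base B)


def advance(W: Dict[Site, float], dt: float) -> None:
    """No arrivals for a duration dt: every workload melts at rate 1, never below 0."""
    for x in W:
        W[x] = max(0.0, W[x] - dt)


def arrive(W: Dict[Site, float], base: Iterable[Site], tau: float) -> None:
    """Update rule (1): sites in the base jump to max_{y in B} W(t-, y) + tau."""
    base = list(base)
    level = max(W.get(y, 0.0) for y in base)
    for x in base:
        W[x] = level + tau


def workload(jobs: List[Job], t_start: float, t_end: float) -> Dict[Site, float]:
    """Workload profile at time t_end, starting empty at t_start (finite job list assumed)."""
    W: Dict[Site, float] = {}
    now = t_start
    for t, tau, base in sorted(jobs, key=lambda job: job[0]):
        if t > t_end:
            break
        advance(W, t - now)
        arrive(W, base, tau)
        now = t
    advance(W, t_end - now)
    return W
```

For instance, a stone of height 2 on sites 0 and 1 at time 0, followed by a stone of height 1 on sites 1 and 2 at time 1, leaves workloads 0.5, 1.5 and 1.5 on sites 0, 1 and 2 at time 1.5.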

The system is said to be stable if (starting from full vacancy at time 0) the distribution of \(W(t,x)\) is tight as t varies for fixed x. The central question to be addressed is: when is the system stable (for \(\lambda \) sufficiently small)? More precisely, for which laws on \((\tau ,B)\) for jobs arriving at the origin is it the case that there exists \(\lambda _0 \in (0, \infty ) \) so that the system is stable for all arrival rates \(\lambda < \lambda _0\)? To avoid trivialities, we assume that B is a finite set, a.s. The founding article [1] showed that the system was indeed stable provided that there is \(c \in (0, \infty ) \) so that

$$\begin{aligned} \mathbb {E}\left[ \hbox {e}^{c\left( \tau + ({\text {diam}}B)^d\right) }\right] < \infty , \end{aligned}$$

where \({\text {diam}}B\) is the diameter of set B, i.e., the maximum of \(|x-y|_\infty \) over all \(x, y \in B\), and where \(|x|_\infty := \max _{1\le i \le d} |x_i|\). The proof in [1] is based on a comparison with an auxiliary branching process with weights requiring the condition stated in the last display.

Our purpose in this paper is to slacken this condition to the existence of the \((d+1+\varepsilon )\)th moment for \(\tau + {\text {diam}}B\). We then (easily) show that this condition is (in a certain weak sense) almost optimal. The key idea is to use laws of large numbers for lattice animals. These were first proved in [2], though for this paper we take as reference the article by James Martin [4]. In analogy to [3], one could also ask whether stability is possible for more general arrival processes. This question, however, is outside the scope of our paper as our method explicitly uses the Poissonian assumptions.

Our principal result is

Theorem 1

Suppose there exists \(\varepsilon >0\) such that \(\tau \) and \({\text {diam}}B\) have finite moments of order \((d+1+\varepsilon )\):

$$\begin{aligned} \mathbb {E}\tau ^{d+1+\varepsilon } + \mathbb {E}({\text {diam}}B)^{d+1+\varepsilon } <\infty . \end{aligned}$$

Then there exists \(\lambda _0 > 0 \) so that for job arrival rate \( \lambda < \lambda _0 \) the system is stable.

That this result is (in a weak sense) the best possible is shown by

Theorem 2

For any \( d+1> \varepsilon > 0\), we can find a (spatially homogeneous) job arrival process so that

$$\begin{aligned} \mathbb {E}\tau ^{d+1-\varepsilon } + \mathbb {E}({\text {diam}}B)^{d+1-\varepsilon } <\infty \end{aligned}$$

and the system is unstable.

Remark 1

The condition of Theorem 1 is equivalent to the following: There exists \(\varepsilon >0\) and \(C>0\) such that, for any \(x\ge 0\),

$$\begin{aligned} \mathbb {P}(\tau + {\text {diam}}B > x) \le \frac{C}{x^{d+1+ \varepsilon }}. \end{aligned}$$
(2)

Similarly, the condition of Theorem 2 is equivalent to the same thing with \(-\varepsilon \) in place of \(\varepsilon \).
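A brief sketch of why the moment and tail formulations in Remark 1 match, with the understanding that the value of \(\varepsilon \) may change from one formulation to the other (this is the standard argument, filled in here for the reader and not taken from the source). Write \(Y = \tau + {\text {diam}}B\) and \(p = d+1+\varepsilon \). If the moments in Theorem 1 are finite, then Markov's inequality and convexity give, for \(x > 0\),

$$\begin{aligned} \mathbb {P}(Y> x) \le \frac{\mathbb {E}Y^{p}}{x^{p}} \le \frac{2^{p-1}\left( \mathbb {E}\tau ^{p} + \mathbb {E}({\text {diam}}B)^{p}\right) }{x^{p}}. \end{aligned}$$

Conversely, if (2) holds with exponent p, then for any \(p^{\prime } = d+1+\varepsilon ^{\prime }\) with \(0< \varepsilon ^{\prime } < \varepsilon \),

$$\begin{aligned} \mathbb {E}Y^{p^{\prime }} = \int _0^\infty p^{\prime } x^{p^{\prime }-1}\, \mathbb {P}(Y> x)\, \hbox {d}x \le 1 + C p^{\prime } \int _1^\infty x^{p^{\prime }-1-p}\, \hbox {d}x = 1 + \frac{C p^{\prime }}{p-p^{\prime }} < \infty , \end{aligned}$$

and the moments of \(\tau \) and \({\text {diam}}B\) of order \(p^{\prime }\) are then finite since both variables are bounded by Y.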

Given stability, it is easy to see that starting from complete vacancy (that is, no workload at any site), the system converges in distribution to an explicitly describable equilibrium. It is natural to ask whether the system possesses other, not necessarily spatially homogeneous, equilibria. While not definitively answering this we show,

Theorem 3

Under the conditions of Theorem 1, there exists \(\lambda _0 > 0 \) so that for arrival rate \(0< \lambda < \lambda _0 \), the only equilibrium for the system that is spatially translation invariant is the limit measure obtained by starting from zero workload.

We now assemble some observations and techniques from earlier papers, [1, 3]. Start the system at time \(-n\) from full vacancy and consider how the workload \(W^n(t,x)\) at time \(t \ge -n\) and site \(x \in \mathbb {Z}^d\) is obtained.

Definition 1

Let \(\Gamma ^n(x,t) \) be the set of locally constant cadlag (= piecewise constant, continuous on the right with left limits at every point) paths \( \gamma : [u,t] \rightarrow \mathbb {Z}^d \) for some \(-n \le u \le t \) such that

  (i) \( \gamma (t) = x\),

  (ii) if \( \gamma (s) \ne \gamma (s-)\), then a job arrived at time s requiring service from both servers \( \gamma (s) \) and \( \gamma (s-)\).

Associate to such a \(\gamma \in \Gamma ^n(x,t)\) the score

$$\begin{aligned} V( \gamma ) = \sum _i \tau _i - (t-u), \end{aligned}$$

where the sum is over jobs \((\tau _i, B_i )\) which arrive at a time \(s_i \in [u,t]\) with \(\gamma (s_i) \in B_i \). Based on the way that the workload evolves [see discussion around Eq. (1)] we obtain that \(W^n(t,x) = \sup _{\gamma \in \Gamma ^n(x,t)} V(\gamma )\). See Fig. 1.

There are three monotonicity properties that the system possesses and which we take into account when analyzing its stability. We start from full vacancy at time \(-n\) and consider \(W^n(t,x)\) for some \(t \ge -n\). Then \(W^n(t,x)\) will increase if we (i) delay all arrivals between \(-n\) and t, or (ii) increase the heights of the stones, or (iii) enlarge their bases.

Thanks to the monotonicity, it was deduced in [1] that it is enough, for the results sought, to consider the case where the set B of servers required for a job i with \(x(i) = 0\) is a cube centered at the origin. We write (for a job arriving at server x in time interval \((m-1,m]\)) \(R^{x,m}_i\) for the value so that \(B = x + [-R^{x,m}_i,R^{x,m}_i]^d\). We will work with time doubly infinite, notwithstanding the fact that we consider the process on \([-n, \infty )\).

Fig. 1

Graphical representation of the part of the process responsible for the computation of the profile \(W(t,\cdot )\) when the system starts with \(W({-}n,\cdot )\equiv 0\). Consider the evaluation of \(W(t,0)\) at site \(x=0\). Horizontal intervals represent hailstone (job) arrivals with heights \(\tau _i\). Only those arrivals which can potentially influence \(W(t,0)\) are shown. Consider a path \(\gamma \) as indicated, from \((u,2)\) to \((t,0)\). Its score is \(V(\gamma )= \tau _1+\tau _3+\tau _7-(t-u)\). \(W(t,0)\) is the maximum of these scores over all such paths starting from some \((u,y)\) and ending at \((t,0)\).

The first step is the discretization of the Poisson processes. We consider for \(m \in \mathbb {Z}\) and \(x \in \mathbb {Z}^d\) the random variables

$$\begin{aligned} R_{x,m} = \sum _{i } R^{x,m}_i, \quad T_{x,m} = \sum _{i} \tau ^{x,m}_i, \end{aligned}$$

the sum taken over all jobs i arriving in the time interval \((m-1,m]\) at site x. Since the summands in \(R_{x,m}\) are i.i.d. and their number is Poisson (and, therefore, light tailed), it is clear that \(R_{x,m}\) has a finite moment of order \(\alpha \) if each summand has a finite moment of order \(\alpha \). Similarly for \(T_{x,m}\).

Lemma 4

If for \( \alpha > 0\), \(\mathbb {E}({\text {diam}}B)^{\alpha } <\infty \) (resp., \(\mathbb {E}\tau _i^{\alpha } <\infty \)), then \(\mathbb {E}R_{x,m}^{\alpha } <\infty \) (resp. \(\mathbb {E}T_{x,m}^{\alpha }<\infty \)).

We will deal with the discretized system where at server w at time n a job arrives requiring service time \(T_{w,n} \) from each server in the cube \([w-R_{w,n},w+R_{w,n} ] ^d \). By the monotonicity (see [1] for details), this discretization is effective in that if we can show stability for the discretized system of jobs, then we will have shown stability for the original system: The workload at time m for this system will dominate that arising from the nondiscretized model. It is also worth noting that we have not given up too much here. In principle, if multiple jobs arrive at w during the interval \((m-1,m]\), then we could lose if one job required a long service but only from w while a second job required a very short service from a large cube of servers centered at w. However, this will be rare for small \(\lambda \), where our analysis is most relevant.
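The discretization step amounts to a simple aggregation, which the following Python sketch makes concrete. The input format (a flat list of tuples with cube radii already extracted) and the function name `discretize` are assumptions made for illustration only.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Site = Tuple[int, ...]
# One job: (site x, arrival time t, service time tau, cube radius R_i)
Job = Tuple[Site, float, float, int]


def discretize(jobs: Iterable[Job]) -> Dict[Tuple[Site, int], Tuple[int, float]]:
    """Map each (x, m) to (R_{x,m}, T_{x,m}), summing over jobs at x with t in (m-1, m]."""
    totals = defaultdict(lambda: [0, 0.0])
    for x, t, tau, radius in jobs:
        m = math.ceil(t)  # t in (m-1, m] is equivalent to ceil(t) == m
        totals[(x, m)][0] += radius
        totals[(x, m)][1] += tau
    return {key: (R, T) for key, (R, T) in totals.items()}
```

Summing the radii (rather than taking their maximum) enlarges the base and summing the service times increases the height, so by the monotonicity properties the aggregated job dominates the individual ones.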

2 (Very) Greedy Lattice Animals (GLA)

As noted, we wish to exploit the celebrated results (see [2]) on greedy lattice animal systems. Recall that a lattice animal of \(\mathbb {Z}^r \) is simply a connected subset (when \(\mathbb {Z}^r \) is considered as a graph with the standard edge set). We are given a collection of i.i.d. positive random variables \(\{X(x)\}_{x \in \mathbb {Z}^r} \). We suppose the existence of \(\varepsilon > 0\) so that \(\mathbb {E}X(0)^{r+1+\varepsilon }<\infty \) or, equivalently, the existence of \(\varepsilon >0\) and \(C < \infty \) so that for all \(t >1\),

$$\begin{aligned} \mathbb {P}(X(0) > t ) \le C/t^{r+1+ \varepsilon }. \end{aligned}$$
(3)

(The 1 in the power is unnecessary but it is in this case that we will use our results.)

We will then parametrize our system by taking i.i.d. random variables \(X^\lambda (x) \) to be equal to X(x) with probability \(\lambda \) and otherwise 0. For \(\zeta \subset \mathbb {Z}^r \), its \(X^\lambda \) value (or score) is simply

$$\begin{aligned} X^\lambda (\zeta ):=\sum _{x \in \zeta } X^\lambda (x). \end{aligned}$$
(4)

The size \(|\zeta |\) of the lattice animal is its cardinality. Note that \(X^\lambda (\zeta ) \rightarrow 0\), as \(\lambda \rightarrow 0\), in probability, for any lattice animal of finite size. We fix positive integer k and \(c_1 > 0\) and consider the event

$$\begin{aligned} A_k:=\big \{\exists \text { lattice animal }\zeta \text { containing }0,\, |\zeta |=2^k,\, X^\lambda (\zeta ) \ge c_1 2^k\big \}. \end{aligned}$$
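For very small sizes the event \(A_k\) can be checked by brute force, which may help to fix ideas. The sketch below is restricted to \(r=2\) and is purely illustrative: the enumeration grows animals one site at a time (its cost explodes quickly with the size), and the helper names `animals_containing_origin`, `thinned_field`, `max_score` and `event_A` are hypothetical.

```python
import random
from typing import Callable, Dict, FrozenSet, Iterable, List, Set, Tuple

Site = Tuple[int, int]


def neighbours(x: Site) -> List[Site]:
    i, j = x
    return [(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)]


def animals_containing_origin(n: int) -> Set[FrozenSet[Site]]:
    """All connected subsets of Z^2 of size n that contain the origin."""
    current: Set[FrozenSet[Site]] = {frozenset({(0, 0)})}
    for _ in range(n - 1):
        grown: Set[FrozenSet[Site]] = set()
        for zeta in current:
            for x in zeta:
                for y in neighbours(x):
                    if y not in zeta:
                        grown.add(zeta | {y})
        current = grown
    return current


def thinned_field(sites: Iterable[Site], lam: float,
                  sample_x: Callable[[random.Random], float],
                  rng: random.Random) -> Dict[Site, float]:
    """X^lambda(x): equal to an independent copy of X with probability lam, else 0."""
    return {x: (sample_x(rng) if rng.random() < lam else 0.0) for x in sites}


def max_score(n: int, field: Dict[Site, float]) -> float:
    """Largest value X^lambda(zeta) over animals of size n containing the origin."""
    return max(sum(field.get(x, 0.0) for x in zeta)
               for zeta in animals_containing_origin(n))


def event_A(k: int, c1: float, field: Dict[Site, float]) -> bool:
    """Indicator of A_k for the given realisation of the field."""
    return max_score(2 ** k, field) >= c1 * 2 ** k
```

A heavy-tailed choice such as `sample_x = lambda rng: rng.paretovariate(4.5)` (an assumed example law, with \(r+1+\varepsilon = 4.5\)) can be passed to `thinned_field` to experiment with small \(\lambda \).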

We wish to prove the following upper bound on the probability of \(A_k\), a result which may be of independent interest. We note that \(\mathbb {P}(A_k)\) depends both on \(\lambda \) and \(c_1\).

Proposition 5

Given any \(c_1 > 0\), there exists a \(\lambda _0 >0\) and a function \(C:[0, \lambda _0 ) \rightarrow (0,\infty )\) so that \(C( \lambda ) \rightarrow 0 \) as \(\lambda \rightarrow 0\) and so that, for \(\lambda < \lambda _0\) and for all positive integers k,

$$\begin{aligned} \mathbb {P}(A_k) \le \frac{C(\lambda )}{2^{k(1+\varepsilon )}}. \end{aligned}$$

Remark 2

  (i) We can use the above to bound the probability that there is a lattice animal of size \(u \ge 2^k\) containing the origin whose value is \(\ge c_1 u\), when \(\lambda \) is small, by considering \(\bigcup _{\ell \ge k} A^{c_1/2}_\ell \) whose probability, by the above, is less than \(C(\lambda ) 2^{-k(1+\varepsilon )} (1-2^{-(1+ \varepsilon )})^{-1}\).

  (ii) The above formalism will certainly apply to our situation with random variables \(R_{x,m} + T_{x,m}\) at each site \(x \in \mathbb {Z}^r\). Indeed, if X(x) denotes the random variable at site x for rate \(\lambda = 1\) conditioned on there being at least one arrival, then it is easy to see that with rate \(\lambda <1\), \(R_{x,m} + T_{x,m}\) is stochastically less than \(X^{1-\hbox {e}^{-\lambda }}(x)\).

Some notation used in the proof and elsewhere. If \(x=(x_1,\ldots ,x_r) \in \mathbb {R}^r\), then \(|x|_\infty := \max _{1 \le i \le r} |x_i|\). The \(L^\infty \) ball \(\mathbb {B}(x,\rho )\) centered at x is the set

$$\begin{aligned} \mathbb {B}(x,\rho )= \{y \in \mathbb {R}^r:\, |y-x|_\infty \le \rho \}. \end{aligned}$$

We also let \(|x|_1:= \sum _{i=1}^r |x_i|\). We use \(\mathbb {1}(A)\) for the indicator of A.

Proof of Proposition 5

We split the value \(X^\lambda (\zeta )\), see Eq. (4), into three parts:

$$\begin{aligned} X^\lambda (\zeta ) = X^\lambda _a(\zeta ) + X^\lambda _b(\zeta ) + X^\lambda _c(\zeta ), \end{aligned}$$
(5)

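where

$$\begin{aligned} X^\lambda _a(\zeta )&:= \sum _{x \in \zeta } X^\lambda (x)\, \mathbb {1}\big (X^\lambda (x) \le 2^{qk}/k^2\big ), \\ X^\lambda _b(\zeta )&:= \sum _{x \in \zeta } X^\lambda (x)\, \mathbb {1}\big (2^{qk}/k^2< X^\lambda (x) \le 2^{vk}\big ), \\ X^\lambda _c(\zeta )&:= \sum _{x \in \zeta } X^\lambda (x)\, \mathbb {1}\big (X^\lambda (x) > 2^{vk}\big ). \end{aligned}$$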

The constants q and v appearing in the splitting are chosen as

$$\begin{aligned} q:=\frac{r}{r+ 1 + \varepsilon }< v < 1. \end{aligned}$$

Define next four events:

$$\begin{aligned} A_{k,a}&:=\big \{\exists \text { lattice animal }\zeta \text { containing } 0,\, |\zeta |=2^k,\, X^\lambda _a(\zeta ) \ge c_1 2^k/10\big \} \\ A_{k,b}&:=\big \{\exists \text { lattice animal }\zeta \text { containing } 0,\, X^\lambda _b(\zeta ) \ge c_1 2^k/10\big \} \\ A_{k,c}&:=\big \{ N^\lambda _c(B_k) \ge m\big \} \\ A_{k,d}&:= A_k \setminus (A_{k,a} \cup A_{k,b} \cup A_{k,c}), \end{aligned}$$

where m is a positive integer satisfying

$$\begin{aligned} m(v(r+1+\varepsilon )-r) > 1+\varepsilon , \end{aligned}$$
(6)

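where \(B_k:=[-2^k, 2^k]^r=\mathbb {B}(0,2^k)\), and where \(N^\lambda _c(B_k)\) is the integer-valued random variable

$$\begin{aligned} N^\lambda _c(B_k):= \#\big \{x \in B_k:\, X^\lambda (x) > 2^{vk}\big \}. \end{aligned}$$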

Note that if a lattice animal \(\zeta \) of size \(|\zeta |=2^k\) contains 0, then \(\zeta \subset B_k\).

We obtain an upper bound for \(\mathbb {P}(A_k)\) via

$$\begin{aligned} \mathbb {P}(A_k) \le \mathbb {P}(A_{k,a}) + \mathbb {P}(A_{k,b}) + \mathbb {P}(A_{k,c}) + \mathbb {P}(A_{k,d}). \end{aligned}$$

Bound for \(\mathbb {P}(A_{k,d})\): On \(A_{k,d}\) the event \(A_k\) occurs, so there is a lattice animal \(\zeta \) of size \(2^k\) containing the origin and having value \(X^\lambda (\zeta )\ge c_1 2^k\). Since \(A_{k,a}\) does not occur, we have \(X^\lambda _a(\zeta ) \le c_1 2^k/10\). Since \(A_{k,b}\) does not occur, we have \(X^\lambda _b(\zeta ) \le c_1 2^k/10\). Therefore, from (5),

$$\begin{aligned} X^\lambda _c(\zeta ) \ge \frac{c_1 2^k}{2}. \end{aligned}$$

But this, together with the fact that \(A_{k,c}\) does not occur, implies that there is \(x \in B_k\) such that \(X^\lambda (x) \ge 2^k c_1/(2m)\). Hence

$$\begin{aligned} \mathbb {P}(A_{k,d}) \le \sum _{x \in B_k} \mathbb {P}\big (X^\lambda (x) \ge 2^k c_1/(2m)\big ) \le K_m \lambda /2^{(1+\varepsilon )k}, \end{aligned}$$

for some constant \(K_m\).

Bound for \(\mathbb {P}(A_{k,c})\): The event \(A_{k,c}\) is the event that the sum of at most \((2\times 2^k+1)^r\) Bernoulli random variables, each taking value 1 with probability at most \(p_k=\lambda C 2^{-kv(r+1+\varepsilon )}\), is at least m. To bound this probability, we observe that if \(S_n(p)\) is the sum of n i.i.d. Bernoulli(p) random variables, then \(\mathbb {P}(S_n(p) \ge m)\) is upper bounded by the probability that there is a set \(A \subset \{1,\ldots ,n\}\) of size m such that all Bernoulli random variables are equal to 1 on A, so

$$\begin{aligned}&\mathbb {P}(S_n(p) \ge m) \le \left( {\begin{array}{c}n\\ m\end{array}}\right) p^m \le \frac{n^m p^m}{m!}.\\&\mathbb {P}(A_{k,c}) \le \big ( (2^{k+1} +1)^{r} p_k \big )^m \le \frac{K_m^{\prime } \lambda }{2^{(1+\varepsilon )k}},\nonumber \end{aligned}$$
(7)

for some constant \(K_m^{\prime }\) and thanks to the choice (6) for m.
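Inequality (7) and the role of the choice (6) can be checked numerically. The following Python sketch compares the exact binomial tail with the two bounds in (7) and computes the smallest integer m permitted by (6); the function names and the sample parameter values are assumptions made for illustration.

```python
from math import comb, factorial, floor


def binom_tail(n: int, p: float, m: int) -> float:
    """Exact P(S_n(p) >= m) for a sum of n i.i.d. Bernoulli(p) variables."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(m, n + 1))


def union_bound(n: int, p: float, m: int) -> float:
    """First bound in (7): choose m indices that must all equal 1."""
    return comb(n, m) * p ** m


def crude_bound(n: int, p: float, m: int) -> float:
    """Second bound in (7): (n p)^m / m!."""
    return (n * p) ** m / factorial(m)


def smallest_m(r: int, eps: float, v: float) -> int:
    """Smallest positive integer m with m (v (r+1+eps) - r) > 1 + eps, as in (6)."""
    gap = v * (r + 1 + eps) - r
    if gap <= 0:
        raise ValueError("need v > r / (r + 1 + eps)")
    return floor((1 + eps) / gap) + 1
```

With \(n \approx 2^{(k+1)r}\) and \(p_k\) of order \(\lambda 2^{-kv(r+1+\varepsilon )}\), the quantity \((n p_k)^m\) decays like \(2^{-km(v(r+1+\varepsilon )-r)}\), which is why (6) yields the exponent \(1+\varepsilon \).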

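Bound for \(\mathbb {P}(A_{k,b})\): We have

$$\begin{aligned} \mathbb {P}(A_{k,b}) \le \mathbb {P}\left( \sum _{x \in B_k} \mathbb {1}\big (X^\lambda (x) > 2^{qk}/k^2\big ) \ge \frac{c_1 2^{k(1-v)}}{10}\right) . \end{aligned}$$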

The sum in the probability is the sum of \(n_k=(2^{k+1}+1)^r\) Bernoulli random variables, each equal to 1 with probability at most \(p_k:= \lambda C (2^{qk}/k^2)^{-(r+1+\varepsilon )}= \lambda C k^{2(r+1+\varepsilon )}/2^{rk}\). We can now apply inequality (7) with \(n=n_k\), \(p=p_k\) and \(m=c_12^{k(1-v)}/10\), and the Stirling formula \(m!\sim \sqrt{2\pi m}\hbox {e}^{m (\log m -1)}\), to obtain that the required probability \(\mathbb {P}(A_{k,b})\) is not bigger than

$$\begin{aligned} \lambda \hbox {e}^{-m (\log m - 2(r+1+\varepsilon ) \log k) (1+o(1))} = \lambda o \left( 2^{-(1+\varepsilon )k}\right) , \end{aligned}$$

as \(k\rightarrow \infty \). Therefore,

$$\begin{aligned} \mathbb {P}(A_{k,b}) \le \frac{K^{\prime \prime } \lambda }{2^{(1+\varepsilon )k}}, \end{aligned}$$

for some constant \(K^{\prime \prime }\).

Bound for \(\mathbb {P}(A_{k,a})\): We repeat the argument given in [4] (or [2]).

Lemma 6

(Lemma 1 in [2], Lemma 2.1 in [4]) For any lattice animal \(\zeta \) of size n containing the origin and any \(1 \le \ell \le n\) we can find a sequence \(0 = u_0, u_1, \ldots , u_h\) of points in \(\mathbb {Z}^r\), h being the integer part of \(2n/\ell \), and \(|u_i-u_{i-1}|_{\infty }\le 1\) for all \(1 \le i \le h\), so that

$$\begin{aligned} \zeta \subset \bigcup _{i=0}^ h \mathbb {B}(\ell u_i, 2\ell ). \end{aligned}$$

Proof

For \(y=(y^1,\ldots ,y^r)\in \mathbb {R}^r\), let \(\lfloor y/\ell \rfloor \) be the point x in \(\mathbb {Z}^r\) such that \(x^i = \lfloor y^i/\ell \rfloor \) (the quotient of the division by \(\ell \), componentwise). Clearly, \(\ell x \le y < \ell (x+1)\), componentwise, so \(|y-\ell x|_\infty \le \ell \). If \(\zeta \) is a lattice animal containing 0 we can find a sequence \(\pi =(\pi _0,\ldots ,\pi _{2n})\) such that successive elements are either identical or neighbors in \(\mathbb {Z}^r\) (\(\pi \) is a path) and such that \(\{\pi _0,\ldots ,\pi _{2n}\}=\zeta \). (To do this, consider a spanning tree of \(\zeta \) and form \(\pi \) by traversing the tree “from the bottom.”) Then \(|\pi _i-\pi _j|_\infty \le \ell \) if \(|i-j| \le \ell \). Define \(u_i \in \mathbb {Z}^r\) by \(u_i:= \lfloor \pi _{i\ell }/\ell \rfloor \), \(i=0,\ldots , h\). Since \(|\pi _{i\ell }-\pi _{(i-1)\ell }|_\infty \le \ell \), we have \(|u_i-u_{i-1}|_\infty \le 1\) for all i. Furthermore, if \(x \in \zeta \), then \(x=\pi _t\) for some \(0 \le t \le 2n\). Let \(k=\lfloor t/\ell \rfloor \). Then \(|\pi _t-\ell u_{k}|_\infty \le |\pi _t-\pi _{k\ell }|_\infty + |\pi _{k\ell }-\ell u_{k}|_\infty \le \ell +\ell \), so \(x=\pi _t \in \mathbb {B}(\ell u_k,2\ell )\). \(\square \)
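The construction in this proof is easy to carry out explicitly. The Python sketch below builds the traversal \(\pi \) by a depth-first walk (one possible way of traversing a spanning tree, chosen here for convenience), pads it to length \(2n+1\), and returns the centres \(u_0,\ldots ,u_h\); the function names are hypothetical and the animal is assumed to be given as a finite set of integer tuples containing the origin.

```python
from typing import List, Set, Tuple

Site = Tuple[int, ...]


def sup_norm(x: Site, y: Site) -> int:
    return max(abs(a - b) for a, b in zip(x, y))


def neighbours(x: Site) -> List[Site]:
    return [x[:i] + (x[i] + s,) + x[i + 1:] for i in range(len(x)) for s in (1, -1)]


def traversal(zeta: Set[Site], origin: Site) -> List[Site]:
    """Walk pi_0, ..., pi_{2n} through zeta: successive entries are equal or neighbours."""
    path, seen, stack = [origin], {origin}, [origin]
    while stack:
        x = stack[-1]
        nxt = next((y for y in neighbours(x) if y in zeta and y not in seen), None)
        if nxt is None:
            stack.pop()                # backtrack to the parent in the DFS tree
            if stack:
                path.append(stack[-1])
        else:
            seen.add(nxt)
            stack.append(nxt)
            path.append(nxt)
    assert seen == set(zeta), "zeta must be connected and contain the origin"
    # The walk has 2n - 1 entries; repeat the last one to reach length 2n + 1.
    path += [path[-1]] * (2 * len(zeta) + 1 - len(path))
    return path


def covering_centres(zeta: Set[Site], ell: int) -> List[Site]:
    """Centres u_0, ..., u_h with u_i = floor(pi_{i*ell} / ell), h = floor(2n / ell)."""
    origin = (0,) * len(next(iter(zeta)))
    pi = traversal(zeta, origin)
    h = (2 * len(zeta)) // ell
    return [tuple(c // ell for c in pi[i * ell]) for i in range(h + 1)]
```

Python's `//` is floor division also for negative integers, so it matches the componentwise \(\lfloor \cdot /\ell \rfloor \) used in the proof.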

From this, it is immediate that for given \(\ell \) there are at most \(9^{r 2n/\ell }\) such \(2\ell \) ball coverings. We use this result with \(n= 2^k \). We consider \(\ell \) of “scale” \(2^i\) with \(2^i \le 2^{qk}/k^2\). Let \(i_0 \) be the maximal such value. For given i, we choose the value \(\ell = \ell (i)\) to equal the integer part of \(\lambda ^{-1/2r}\, 2^{i/q}\). With this value the probability that an \(L^\infty \) ball of radius \(2\ell \) contains a site having \(X^\lambda \) value at least \(2^i\) is small for \(\lambda \) small but not (in principle) negligible. From this, it is easily seen that given a sequence \(u_0, u_1,\ldots ,u_h\) satisfying the above (and therefore given an \(\ell \) covering), the probability that

$$\begin{aligned} \text {the number of sites within the covering having value at least } 2^i \text { is at least }2(2^k/\ell ) c_1 \end{aligned}$$

is bounded above by \(20^{-2 r \cdot 2^k/\ell }\) for \(\lambda \) small. Thus, we see that outside an event of probability \(9^{hr} 20^{-2 \cdot 2^kr/\ell }\), this bound will hold for all \(\ell \) coverings. Summing over i such that \(2^i \le 2^{kq}/k^2\) we have that outside probability

$$\begin{aligned} \sum _ {2^i \le 2^{kq}/k^2} \left( \frac{1}{2}\right) ^{2 \cdot 2^kr/\ell (i)} \le 2 \left( \frac{1}{2}\right) ^{2 \cdot 2^kr/\ell (i_0)} \le 2 \left( \frac{1}{2}\right) ^{c(\lambda k^{2/q})} \end{aligned}$$

for each such i and for each corresponding \(\ell (i)\) covering, the number of sites in the covering whose \(X^\lambda \) value is at least \(2^i\) is at most \(2(2^k/\ell )c_1\).

Thus (outside of probability \(2 (\frac{1}{2})^{c(\lambda k^{2/q})}\) for some universal c), we have, for any lattice animal \(\zeta \) of size \(2^k\),

which is bounded by \(\text {Constant}(\varepsilon ) 2^k \lambda ^{(1+\varepsilon )/2r}\). The conclusion follows for large k. Thus, we have shown the proposition.

Corollary 7

Define

$$\begin{aligned} B^{c_1}_u(x):= \{\exists \,\mathrm{lattice\,animal}\,\zeta \,\mathrm{containing}\, x,\, |\zeta | \ge u,\, X^\lambda (\zeta ) \ge c_1 |\zeta |\}. \end{aligned}$$

For \(c_{1} < 1\) fixed, there exists a constant \(\lambda _1 = \lambda _1 (c_1) \) and a function H defined on \([0, \lambda _1 )\) tending to zero as \(\lambda \) tends to zero, so that for all \(0< \lambda < \lambda _1\) and all positive integers R,

$$\begin{aligned} \mathbb {P}\left( \bigcup _{x \in [-R,R]^r} B^{c_1}_u(x)\right) \le {\left\{ \begin{array}{ll} \displaystyle \frac{H(\lambda )}{{(u+1)}^{1+ \varepsilon }}, &{} u \ge R \\ \displaystyle \frac{H(\lambda ) R^{r}}{{(u+1)}^{r +1+ \varepsilon }}, &{} u \le R. \end{array}\right. } \end{aligned}$$

Proof

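We treat the case \(u\ge R \) only as that for \(u \le R\) is essentially the same. Fix \(0< c_2 < c_1\). Let

$$\begin{aligned} N:= \#\big \{x \in [-R,R]^r:\, B^{c_2}_u(x) \text { occurs}\big \}. \end{aligned}$$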

By Proposition 5 and Remark 2(i), \(\mathbb {P}(B^{c_2}_u(x)) \le C(\lambda )/2^{k(1+\varepsilon )}\) where k is the largest integer with \(2^k \le u\). Therefore,

$$\begin{aligned} \mathbb {E}(N) \le \frac{C(\lambda )(2R+1)^r}{2^{k(1+\varepsilon )}}. \end{aligned}$$

Now suppose that event \(\bigcup _{x \in [-R,R]^r} B^{c_1}_u(x) \) occurs. Then for some \(x \in [-R, R]^r\) and some lattice animal \(\xi \) containing x, \(X^\lambda (\xi ) \ge c_1 | \xi |\). Now for every \(y \in [-R, R]^r\) with \(|y-x|_\infty \le \frac{R(c_1-c_2)}{c_2r}\), we can create a new lattice animal \(\xi ^{\prime }\) containing both y and \(\xi \) by adding at most \((c_1-c_2)R/c_2\) points to \(\xi \). Since we assumed \(R \le u \le |\xi |\), we have \(|\xi | \le |\xi ^{\prime }| \le (c_1/c_2) |\xi |\). By positivity of the random variables

$$\begin{aligned} X^\lambda (\xi ^{\prime }) \ge X^\lambda (\xi ) \ge c_1 | \xi | \ge c_2 | \xi ^{\prime }|. \end{aligned}$$

Thus, the event \(\bigcup _{x \in [-R,R]^r} B^{c_1}_u(x) \) is a subset of the event that random variable N defined at the start of the proof is at least \((\frac{(c_1 -c_2)R}{rc_2})^r\). Our result now follows from Markov’s inequality. \(\square \)

3 Cluster Formation and Their Properties

In this section, we construct clusters for our Poisson hail corresponding to integer intervals \((m-1,m]\). The clusters themselves will follow a clustering procedure of [1] and will depend only on the random variables \(\{R_{x,m}\}_{x}\). Our departure will consist in the temporal (or workload) variable we associate to each cluster. Our clusters will have the property that if \(C \subset \mathbb {Z}^d \) is a cluster and \(\gamma : (m-1,m] \rightarrow \mathbb {Z}^d\) is a path satisfying property (ii) of Definition 1, then

$$\begin{aligned} \gamma (m) \in C \Rightarrow \gamma (s) \in C\quad \text {for all}\quad s \in (m-1,m]. \end{aligned}$$
(8)

Recall that we discretized time by replacing all tasks for site x arriving in \((m-1, m]\) with a single task at time m of “radius”

$$\begin{aligned} R_{x,m} \equiv \sum R_i^{x,m}, \end{aligned}$$

summed over all tasks arriving at x in time interval \((m-1,m]\). We denote by \(t^{x,m}_i \) the times of the arrivals, i.e., the points of the Poisson process \(N_x\) in the interval \((m-1, m]\). The indices i are coordinated so that for site x a job arrives at time \(t^{x,m}_i\) requiring \(\tau ^{x,m}_i\) units of service from servers in \(x + [-R_{i}^{x,m}, R_{i}^{x,m}]^d\).

By Lemma 4,

$$\begin{aligned} \mathbb {P}(R_{x,m} \ge u) \le \frac{C}{(u+1)^{d+1+\varepsilon }}, \end{aligned}$$

for some \(\varepsilon >0\) and some finite constant \(C= C( \varepsilon )\) for any \( \varepsilon \) conforming to the hypotheses of Theorem 1.

For fixed “time” m and \(y\in \mathbb {Z}^d\) let

$$\begin{aligned} D_{y,m}:=\mathbb {B}(y, R_{y,m}) \end{aligned}$$

be the \(L^\infty \) ball centered at y and having radius \(R_{y,m}\). The cluster \(C(x,m)\) containing x is defined as the union of such \(D_{y,m}\) over y having the property that there exists integer K and sites \(y=z_0, \ldots , z_K=x\) such that \(D_{z_i,m} \cap D_{z_{i-1},m} \not = \varnothing \), for all \(1\le i \le K\).
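On a finite window the cluster construction can be sketched with a union-find structure. This is an illustration only: truncating to a finite window ignores balls centred outside it, two balls are taken to intersect when \(|y-z|_\infty \le R_{y,m} + R_{z,m}\), and the names `find`, `ball` and `clusters` are hypothetical.

```python
from itertools import combinations, product
from typing import Dict, List, Set, Tuple

Site = Tuple[int, ...]


def find(parent: Dict[Site, Site], x: Site) -> Site:
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def ball(y: Site, rho: int) -> Set[Site]:
    """Integer points of the sup-norm ball of radius rho centred at y."""
    offsets = product(range(-rho, rho + 1), repeat=len(y))
    return {tuple(c + o for c, o in zip(y, off)) for off in offsets}


def clusters(radius: Dict[Site, int]) -> List[Set[Site]]:
    """Clusters at one time level, given the radius R_{y,m} of every site y in the window."""
    parent = {y: y for y in radius}
    for y, z in combinations(radius, 2):
        # Two sup-norm balls intersect iff the centres are within the sum of the radii.
        if max(abs(a - b) for a, b in zip(y, z)) <= radius[y] + radius[z]:
            parent[find(parent, y)] = find(parent, z)
    groups: Dict[Site, Set[Site]] = {}
    for y in radius:
        groups.setdefault(find(parent, y), set()).update(ball(y, radius[y]))
    return list(groups.values())
```

The pairwise loop is quadratic in the window size, which is acceptable for small experiments but would need a spatial index for large windows.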

Let \(D(x,m)\) be the diameter of the cluster \(C(x,m)\). It is clear that these clusters have the property (8) above. What is not a priori clear is that even with very small rate \(\lambda \) the clusters will be a.s. finite. However, the preceding section enables us to prove

Lemma 8

Assume that \(\mathbb {P}(X(0)>t) \le C/t^{d+1+\varepsilon }\).

Then there exists a function \(K(\lambda )\) tending to zero as \(\lambda \) tends to zero so that, for \(\lambda \) sufficiently small and all positive integers z,

$$\begin{aligned} \mathbb {P}(D(0,m) \ge z) \le \frac{K(\lambda )}{z^{1+\varepsilon }}. \end{aligned}$$

Proof

Consider the GLA system with random variables \(\{Z(x)\}_{x \in \mathbb {Z}^d}\) for

$$\begin{aligned} Z(x) = R_{x,m}. \end{aligned}$$

If the diameter D(0, m) of the cluster C(0, m) containing the origin exceeds z, then there must exist L and \(0 = x_0, x_1, \ldots , x_L\) so that, for all \(1 \le i \le L\),

$$\begin{aligned} |x_{i-1} - x_i|_\infty \le R_{x_i,m} + R_{x_{i-1},m} \end{aligned}$$
(9)

and \(|x_L| \ge z\). We choose \(\zeta \) to be the lattice animal \(\bigcup _{i=1} ^ L P(x_{i-1}, x_i) \) where \(P(x_{i-1}, x_i)\) is a path connecting \(x_{i-1} \) and \(x_i \) of length \(|x_{i-1} - x_i |_1\). (Recall that \(|x|_1\) denotes the \(L^1\) norm.) Then \(\zeta \) is a lattice animal in \(\mathbb {Z}^d\) containing the origin for which \(\sum _{y \in \zeta } Z(y) \ge \sum _{i=0}^L Z(x_i)\). By (9),

$$\begin{aligned} \sum _{i=0}^L Z(x_i) \ge \frac{1}{2} \sum _{i=1}^L |x_{i-1} - x_i|_\infty \ge \frac{1}{2d} \sum _{i=1}^L |x_{i-1} - x_i|_1 \ = |\zeta |/2d \ge z/2d. \end{aligned}$$

The result follows from Proposition 5 applied to \(c_1 < \frac{1}{4d}\). \(\square \)

Arguing as in Corollary 7, we obtain

Corollary 9

There is a function \(C(\lambda )\) tending to zero as \(\lambda \rightarrow 0\) so that for \(\lambda \) small, for all L, and for \(R \le L/2\),

$$\begin{aligned} \mathbb {P}(\exists x \in [-R,R]^d\quad \mathrm{with}\quad D(x,m) \ge L) \le \frac{C(\lambda )}{L^{1+ \varepsilon }}. \end{aligned}$$

while for \( \lambda \) small and \(R \ge L/2\)

$$\begin{aligned} \mathbb {P}(\exists x \in [-R,R]^d\quad \mathrm{with}\quad D(x,m) \ge L) \le \frac{C(\lambda ) R ^ d }{L^{d + 1+ \varepsilon }}. \end{aligned}$$

We now consider the “time” \(T(x,m)\) associated with the cluster \(C(x,m)\). This definition is a little less direct than that for \(D(x,m)\): Given \(x \in \mathbb {Z}^d\) and integer m (and so given cluster \(C(x,m)\)), \(T(x,m)\) is equal to the maximum value of

$$\begin{aligned} \sum _{i=0}^L \tau _{j(i)}^{x_i,m} \end{aligned}$$

over sequences \(x_0, x_1, \ldots , x_L \in C(x,m)\) and \(m \ge t_0 \ge t_1 \ge \cdots \ge t_L \ge {m-1}\) so that, for all i, a job arrives at \(x_i\) at time \(t_i = t^{x_i,m}_{j(i)}\) having work time \(\tau ^{x_i,m}_{j(i)}\) and

$$\begin{aligned} |x_{i-1} -x_i|_\infty \le R^{x_i,m}_{j(i)} + R^{x_{i-1},m}_{j(i-1)}, \quad 1\le i \le L. \end{aligned}$$

We remark that, under the latter two conditions, if \(x_0 \in C(x,m) \), then necessarily the “subsequent” \(x_i\) are also in this cluster. We note also that this definition (which requires more information than the discretized data) ensures that, for any site in \(C(x,m)\), the waiting time accrued during time interval \((m-1,m]\) is less than or equal to \(T(x,m)\).

Lemma 10

There exists a function \(K(\lambda )\) which tends to zero as \(\lambda \) tends to zero so that for \(\lambda \) sufficiently small and for all \(z \ge 1\),

$$\begin{aligned} \mathbb {P}( T(0,m) \ge z ) \le \frac{K( \lambda )}{z^{1+ \varepsilon }}. \end{aligned}$$

Proof

By Lemma 8 we may suppose that \(D(0,m) \le z/100\). Again if T(0, m) takes a value exceeding z, then there must exist L and a sequence \(x_0,x_1, \ldots , x_L \in C(0,m)\) and times \(m \ge t_0 \ge t_1 \ge \cdots \ge t_L \ge {m-1}\) so that, for all \(i \le L\), there is a job arrival at \(x_i \) at time \(t_i\) and

$$\begin{aligned} |x_{i-1} - x_i |_{\infty } \le R^{x_i,m}_{j(i)} + R^{x_{i-1},m}_{j(i-1)}, \quad 1 \le i \le L, \end{aligned}$$

and also \(\sum _i \tau ^{x_i,m}_{j(i)} \ge z\). It is important to note that we do not assume that the \(x_i\) are distinct. Indeed, it is for this reason that we use the bound involving \(R^{x_i,m}_{j(i)}\) rather than \(R_{x_i,m}\). However, if y is equal to \(x_{i_1}, x_{i_2}, \ldots , x_{i_r}\), then of course \(R_{y,m} \ge \sum _k R^{x_{i_k},m}_{j(i_k)}\) and equally \(T_{y,m} \ge \sum _k \tau ^{x_{i_k},m}_{j(i_k)}\). Thus, arguing as in Lemma 8 but with \(Z(x) = R_{x,m} + T_{x,m}\), we obtain that for the lattice animal \(\zeta = \bigcup _i P(x_{i-1}, x_i)\) the GLA(Z) score (i.e., \(\sum _{x \in \zeta } (R_{x,m} + T_{x,m} ) \)) will exceed \((z+|\zeta |)/4d\). \(\square \)

Again we have

Corollary 11

There exists a function \(C(\lambda )\) which tends to zero as \(\lambda \) tends to zero so that for \(\lambda \) small, for all L, and for \(R \le L/2\),

$$\begin{aligned} \mathbb {P}(\exists x \in [-R,R]^d\quad \mathrm{with}\quad T(x,m) \ge L) \le \frac{C( \lambda )}{L^{1+ \varepsilon }}. \end{aligned}$$

while for \( \lambda \) small and \(R \ge L/2\)

$$\begin{aligned} \mathbb {P}(\exists x \in [-R,R]^d\quad \mathrm{with}\quad T(x,m) \ge L) \le \frac{C(\lambda ) R ^ d }{L^{d + 1+ \varepsilon }}. \end{aligned}$$

4 Workload Bounds and Stability

We now apply the foregoing to analyze the workload stability for small values of \(\lambda \). It is enough to show tightness of the workload \(W^n(0,0)\) at time 0 when the system starts empty at time \(-n\). Recall that \(W^n(0,0)\) is obtained as the maximum of scores \(V(\gamma )\) where \(\gamma \) ranges in the set of paths \(\Gamma ^n(0,0)\). See Definition 1.

Due to the monotonicity properties of the system, \(W^n(0,0)\) is readily seen to be bounded above by the quantity \(W^{n,D}(0,0)\) which corresponds to the discretized system and is given by

$$\begin{aligned} W^{n,D}(0,0) = \sup _\gamma V^D(\gamma ), \end{aligned}$$

where the supremum is taken over discrete time indexed paths \(\gamma : [-r,0] \rightarrow \mathbb {Z}^d\) for some \(0 \le r \le n\) satisfying

  (i) \(\gamma (0) = 0\),

  (ii) for each \(-n < i \le 0\), \(\gamma (i-1)\) belongs to cluster \(C(\gamma (i), i)\).

The score \(V^D(\gamma )\) of \(\gamma \) is given by

$$\begin{aligned} V^D( \gamma ) = \left( \sum _{i=0}^{r-1} T(\gamma (-i),-i) \right) - r. \end{aligned}$$

We now consider a cube H of length R in \(\mathbb {Z}^{d+1} = \mathbb {Z}^{d} \times \mathbb {Z}\) where the first d coordinates are considered as “spatial” and the last one temporal. Accordingly, we write H as \(H^\prime \times I\) where I is a temporal interval of length R and \(H^\prime \) is a cube of length R in \(\mathbb {Z}^d\). We define the variable

$$\begin{aligned} V(H,u):= & {} \text {number of clusters}\, C(x,m)\,\text {intersecting}\,H^{\prime } \times \{m\} \\&\text {for some}\, m \in I\,\text {and having}\,D(x,m)+T(x,m) \ge u. \end{aligned}$$

Clusters are by definition enclosed in a slab \(\mathbb {Z}^d \times \{m\} \) for some m and the clusters at different temporal levels m are independent. Thus (after repeated use of Lemma 2 of [1]), we easily obtain the following from Corollaries 9 and 11.

Proposition 12

There exists a constant \(K_\lambda \) so that, for \(u \le R\), \(V(H,u)\) is stochastically less than a Poisson random variable of parameter \(\frac{K_\lambda R^{d+1}}{{(u+1)}^{d +1+ \varepsilon }}\). For \(u \ge R\) it is stochastically less than a Poisson random variable of parameter \( \frac{K_\lambda R}{{(u+1)}^{1+ \varepsilon }} \). Furthermore, as \(\lambda \) tends to zero, \(K_\lambda \) tends to zero.

Given the above, for a cluster C corresponding to the temporal interval \((m-1,m]\), we define \(X_m(C)\) to be the value \(D(x,m) + T(x,m)\) for a (and so any) x in C.

In analyzing \(V^D( \gamma ) \) we will consider a (nonstandard) lattice animal system on \(\mathbb {Z}^{d+1}\). Instead of having i.i.d. random variables indexed by points \((x,m) \in \mathbb {Z}^{d+1}\), we will consider a lattice animal model based on the random variables \(X_m(C)\) which, while independent for distinct values of the index m, are not independent in general. Given a lattice animal \(\Xi \subset \mathbb {Z}^{d+1}\), we write \(\Xi _m\) to denote \(\Xi \cap (\mathbb {Z}^{d} \times \{m\})\) (of course in general \(\Xi _m\) will not be a lattice animal). Obviously given \(\Xi \), all but finitely many \(\Xi _m\) will be empty. The value \(V^C(\Xi )\) associated with such a lattice animal \(\Xi \) will be the sum of the values of the clusters that \(\Xi \) intersects,

$$\begin{aligned} V^C(\Xi ) = \sum _m \ \sum _{C:\, (C \times \{m\}) \cap \Xi _m \ne \varnothing } X_m(C). \end{aligned}$$

To analyze \(\sup V^C(\Xi )\) over all lattice animals containing the origin of \(\mathbb {Z}^{d+1}\) and of cardinality N, we proceed as in Sect. 2. For a positive integer k, we let

$$\begin{aligned} L_k:= 2^{k(1+\varepsilon /(2(d+1)))}. \end{aligned}$$
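To see why this scale is convenient, one can compute the Poisson parameter that Proposition 12 gives for a cube of side \(R = L_k\) and threshold \(u = 2^k \le L_k\). This is only a sketch, and it assumes the exponent in \(L_k\) is read as \(\varepsilon /(2(d+1))\), so that \(L_k^{d+1} = 2^{k(d+1) + k\varepsilon /2}\):

$$\begin{aligned} \frac{K_\lambda L_k^{d+1}}{(2^k+1)^{d+1+\varepsilon }} \le \frac{K_\lambda \, 2^{k(d+1) + k\varepsilon /2}}{2^{k(d+1+\varepsilon )}} = K_\lambda \, 2^{-k\varepsilon /2} \le K_\lambda \, 2^{-\varepsilon /2}, \quad k \ge 1. \end{aligned}$$

Under this reading, the number of clusters of value at least \(2^k\) meeting a single cube of side \(L_k\) is dominated by a Poisson variable whose parameter is at most \(K_\lambda 2^{-\varepsilon /2}\), uniformly in \(k \ge 1\).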

(We are primarily interested in k with \(L_k \le N/\log ^2(N)\).) We know from Sect. 2 that there are fewer than \(K_d^{2N/L_k+1}\) collections of \(L_k\) cubes in \(\mathbb {Z}^{d+1}\), each collection denoted as \(\{C^k_1, C^k_2, \ldots , C^k_{2N/L_k + 1}\}\), so that each \(\Xi \) considered is contained in the union of the \(C^k_i\) for one of these collections. Given such a collection we have, by Proposition 12, that for any j, \(V(C^k_j, 2^k)\) is stochastically less than a Poisson random variable of parameter \(K_\lambda /2^{\varepsilon / 2}\). Furthermore (again by Lemma 2 of [1]), having identified all clusters intersecting \(\bigcup _{i<j} C^k_i\), the conditional number of “extra” clusters intersecting \(C^k_j\) is stochastically less than this Poisson random variable. Thus, just as in Sect. 2, we obtain the following: If \(N_k(\Xi )\) is the number of clusters of value more than \(2^k\) that intersect \(\Xi \), then, for all k with \(N/L_k \ge \log ^2(N)\) and N large,

$$\begin{aligned} \mathbb {P}\left( \sup _\Xi N_k(\Xi ) \ge 3K_\lambda N/L_k \right) \le \hbox {e}^{-N/L_k}. \end{aligned}$$

Thus for every lattice animal \(\Xi \) of size N containing the origin, outside an event of probability bounded by \(\text {Const} \times \hbox {e}^{-\log ^2(N)}\), the contribution to \(V^C(\Xi )\) from clusters C having \(X_m(C) \) less than \(\left( \frac{N}{\log ^2(N)} \right) ^{\frac{1}{1+ \varepsilon /(2(d+1))}} = N_0\) is less than

$$\begin{aligned} 3 N K_\lambda \sum _{2^k \le N_0} 2^k/2^{k(1+ \varepsilon /(2(d+1)) )} \le N K_\lambda C( \varepsilon ), \end{aligned}$$

for some finite \(C(\varepsilon )\), where (we stress) \(K _ \lambda \) tends to zero as \(\lambda \) tends to zero.
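For concreteness, the sum above can be bounded by a geometric series. The following is a sketch of one admissible constant, not necessarily the one intended; it reads the exponent as \(\varepsilon /(2(d+1))\) and writes \(\theta \) (a symbol introduced here only for illustration) for that quantity:

$$\begin{aligned} \sum _{k \ge 0} \frac{2^k}{2^{k(1+\theta )}} = \sum _{k \ge 0} 2^{-k\theta } = \frac{1}{1-2^{-\theta }}, \quad \theta = \frac{\varepsilon }{2(d+1)}. \end{aligned}$$

So the displayed bound holds with, for instance, \(C(\varepsilon ) = 3/(1-2^{-\theta })\). This choice depends on \(\varepsilon \) and d only, not on N or \(\lambda \), which is what allows the factor \(K_\lambda \) to make the whole bound small.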

Given this bound we easily deal with the clusters having value greater than \(N_0\) using Proposition 12 and the arguments of Sect. 2 and obtain

Proposition 13

For the above system, for each \(\delta >0\), there exists a sufficiently small \(\lambda >0\), such that the probability that there exists a lattice animal \(\Xi \) of size at least n, containing the origin of \(\mathbb {Z}^{d+1}\) and with \(V^C(\Xi ) > \delta |\Xi |\) is less than \(C_\delta / n^{\varepsilon /2} \) for all positive integers n.

We now apply this to the values \(V^D(\gamma )\). We denote by \(\Gamma _m\) the set of discrete time paths \(\gamma : [-m,0] \rightarrow \mathbb {Z}^d\) satisfying the stipulated conditions: \(\gamma (0) = 0\) and for all \(-m < i \le 0\), \(\gamma (i-1) \in C(\gamma (i),i)\). It is immediate that if a curve in continuous time, \(\gamma : [-m,0] \rightarrow \mathbb {Z}^d\), is in \(\Gamma ^m(0,0)\), then its “skeleton” \(\gamma (-m), \gamma (-m+1), \ldots , \gamma (0) = 0\) is in \(\Gamma _m\).

Proposition 14

There exist \(\lambda _0 >0\) and \(C < \infty \) such that for all \(n \ge 1\) and for all \(\lambda < \lambda _0\), the probability that there exists an \(m \ge n\) so that \(V^D(\gamma ) > -m/2\), for some \(\gamma \in \Gamma _m (0,0)\), is bounded by \(C/n^{\varepsilon /2}\).

Proof

We associate to each path \(\gamma (-m), \gamma (-m+1), \ldots , \gamma (0)\) the score

$$\begin{aligned} \sum _{j=-m+1}^0 T(\gamma (j),j) + \sum _{j=-m+1}^0 |\gamma (j)-\gamma (j+1)|_1. \end{aligned}$$

Note that, by definition of \(T(x,n)\), for each j, the sum of durations for jobs which arrive at time \(s \in (j-1,j]\) and so that the job requires service from both server \(\gamma (s)\) and \(\gamma (s-)\) must be less than \(T(\gamma (j),j) = T(\gamma (j-1),j)\). Thus, in particular for a continuous time curve \(\gamma \in \Gamma ^m(0,0)\),

$$\begin{aligned} V^D(\gamma ) + m \le \sum _{j=-m+1}^0 T(\gamma (j), j) \le \sum _{j=-m+1}^0 T(\gamma (j),j) + \sum _{j=-m+1}^{0} |\gamma (j)- \gamma (j+1)|_1. \end{aligned}$$

We can associate the path \(\gamma \) with the “lattice animal” \(\Xi ^\gamma \) in \(\mathbb {Z}^{d+1}\) consisting of the points \((\gamma (i),i)\) for \(i = -m,-m+1, \ldots , 0\) together with, for each \(-m < i \le 0\), the points \((y,i)\) which lie on a path \(P_i\) from \((\gamma (i),i) \) to \((\gamma (i+1),i)\) which lies within \(\mathbb {Z}^d \times \{i\}\) and has length \(|\gamma (i) - \gamma (i+1)|_1\). Thus, this lattice animal \(\zeta = \Xi ^\gamma \) has size

$$\begin{aligned} |\zeta |:= & {} m+1 + \sum _{i=-m+1}^0 (|\gamma (i) - \gamma (i-1)|_1-1)_{+}\\\le & {} m +1 + d \sum _{i=-m+1} ^ 0 D(\gamma (i), i ) = m +1 + d \sum _{i=-m+1} ^ 0 D(\gamma (i-1), i ). \end{aligned}$$

We note that, while obviously \(|\Xi ^\gamma | \ge m+1\), there are no nonrandom upper bounds for the cardinality. Thus, the inequality above can be rewritten as

$$\begin{aligned}&V^C(\Xi ^\gamma ) \ge V^D(\gamma ) + m, \\&V^C(\Xi ^ \gamma ) \ge \frac{1}{d} \sum _{j=-m+1} ^ {0} T(\gamma (j), j) + \sum _{j=-m+1}^{0} |\gamma (j)- \gamma (j+1) |_1. \end{aligned}$$

We now invoke Proposition 13 with \(\delta = \frac{1}{20d} \) to deduce that (with \(\lambda \) sufficiently small) the (bad) event

$$\begin{aligned} B = \left\{ \exists \text { lattice animal } \Xi ,\, 0 \in \Xi ,\, |\Xi | \ge m,\, V^C(\Xi ) \ge \delta |\Xi |\right\} \end{aligned}$$

has probability bounded by \(C/m^{\varepsilon /2} \) for some universal C. Our analysis now splits into two cases. In both cases, we suppose that the bad event B does not occur.

Firstly suppose that \(|\Xi ^\gamma | \le 4dm+m\). In this case

$$\begin{aligned} V^D(\gamma ) \le \sum _{j=-m+1}^0 T(\gamma (j), j) - m. \end{aligned}$$

But, on event \(B^c\), \(\sum _{j=-m+1}^0 T(\gamma (j), j) \le V^C(\Xi ^\gamma ) \le \frac{4dm+m}{20d}\). This implies that

$$\begin{aligned} V^D(\gamma ) \le -3m/4. \end{aligned}$$
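The arithmetic behind this first case can be spelled out as follows. This is a sketch using only \(\delta = 1/(20d)\), \(d \ge 1\), and the fact that on \(B^c\) the animal \(\Xi ^\gamma \), which contains the origin and has at least m points, satisfies \(V^C(\Xi ^\gamma ) < \delta |\Xi ^\gamma |\):

$$\begin{aligned} \sum _{j=-m+1}^0 T(\gamma (j), j)&\le V^C(\Xi ^\gamma ) \le \frac{(4d+1)m}{20d} \le \frac{5dm}{20d} = \frac{m}{4}, \\ V^D(\gamma )&\le \frac{m}{4} - m = -\frac{3m}{4}. \end{aligned}$$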

On the other hand, suppose that \(|\Xi ^\gamma | > 4dm+m\). Now we have

$$\begin{aligned} V^C(\Xi ^\gamma ) \ge \sum _{j=-m+1}^{0} |\gamma (j)- \gamma (j+1)|_1 / d \ge |\Xi ^\gamma |/2d. \end{aligned}$$

But this is impossible on event \(B^c\).
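To make the contradiction explicit, write S for the sum of the \(\ell _1\) displacements of \(\gamma \) in the preceding display (a shorthand introduced here, identifying it with the displacement sum in the size bound for \(\Xi ^\gamma \)). As a sketch: the size bound gives \(|\Xi ^\gamma | \le m + 1 + S\), and \(|\Xi ^\gamma | > 4dm + m \ge 5m\) forces \(m + 1 \le |\Xi ^\gamma |/2\) for \(m \ge 1\), so that

$$\begin{aligned} S&\ge |\Xi ^\gamma | - (m+1) \ge \frac{|\Xi ^\gamma |}{2}, \\ V^C(\Xi ^\gamma )&\ge \frac{S}{d} \ge \frac{|\Xi ^\gamma |}{2d} > \frac{|\Xi ^\gamma |}{20d} = \delta |\Xi ^\gamma |. \end{aligned}$$

Since \(\Xi ^\gamma \) contains the origin and has more than m points, the last inequality says precisely that the event B occurs.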

Thus, we have shown that, on event \(B^c\), for all \(m \ge n\), the event \(A_m = \{ \exists \gamma \in \Gamma ^m(0,0) : V^D( \gamma ) > -2m/3\}\) does not occur, and so \(\mathbb {P}(A_m) \le C/m^{\varepsilon /2}\). So \(\mathbb {P}(\bigcup _{(6/5)^m\ge n} A_{(6/5)^m})\le C^\prime /n^{\varepsilon /2}\). But it is easily seen that the event \(\bigcup _{(6/5)^m \ge n} A_{(6/5)^m}\) contains the event \(\bigcup _{m\ge n} \{\exists \gamma \in \Gamma ^m(0,0) : V^D(\gamma ) \ge -m/2\}\). \(\square \)
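The passage from the individual bounds on \(\mathbb {P}(A_m)\) to the bound on the union along the geometric subsequence is a geometric series. As a sketch (ignoring integer parts, and writing \(k_0\) for the smallest k with \((6/5)^k \ge n\), a label introduced here):

$$\begin{aligned} \sum _{k \ge k_0} \frac{C}{\left( (6/5)^{k}\right) ^{\varepsilon /2}} = \frac{C\, (6/5)^{-k_0 \varepsilon /2}}{1-(6/5)^{-\varepsilon /2}} \le \frac{C}{1-(6/5)^{-\varepsilon /2}} \cdot \frac{1}{n^{\varepsilon /2}}, \end{aligned}$$

so that one may take \(C^\prime = C/(1-(6/5)^{-\varepsilon /2})\).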

Proof of Theorem 1

From Proposition 14 we have that the workload \(W^{n,D}(0,0)\) of the discretized system is tight as n varies. On the other hand, by monotonicity, the limit as \(n \rightarrow \infty \) of \(W^{n}(0,0)\) exists a.s. Tightness ensures that this limit is finite. If we now start the system empty at time 0 and consider the workload \(W(x,n)\) for some \(n >0\), we have that \(W(x,n)\) is in distribution equal to \(W^n(0,0)\). Therefore, \(W(x,n)\) converges in distribution as \(n \rightarrow \infty \). \(\square \)

Remark 3

We have actually shown something stronger than tightness: namely, that, starting with an initially empty system, the workload profile at time t converges in distribution, as \(t \rightarrow \infty \), to some distribution which we will denote by \(\mu \). Standard arguments show that \(\mu \) is an invariant measure: if we start with \(W(0, \cdot )\) distributed according to \(\mu \), then \(W(t,\cdot )\) also has distribution \(\mu \). Since \(W(t,\cdot )\) is translation invariant in space, so is the limit \(\mu \). We have thus proved the existence of an invariant probability measure which is also spatially invariant.

5 Necessity and Proof of Theorem 2

Let \(0< \varepsilon < d+1\). We consider the case where the stone heights (job service times) \(\tau \) satisfy, for every positive integer t,

$$\begin{aligned} \mathbb {P}(\tau \ge t) = \frac{1}{t^{d+1-\varepsilon }}, \end{aligned}$$

and the stone base B is the cube

$$\begin{aligned} B = [-\tau , \tau ]^d. \end{aligned}$$

We consider the number of job arrivals in the space-time cube \([0, t)^{d+1}\) of duration at least 2t for integer t, that is, the number of arrivals \((B, \tau )\) so that

  (1) \(\tau \ge 2t\),

  (2) B is a cube of side length \(2 \tau + 1\) centered at a site in \([0,t)^d\) and

  (3) the job arrives at a time in [0, t).

This random variable has expectation \(t^\varepsilon \lambda / 2^{d+1- \varepsilon } \) which for fixed \(\lambda \) tends to infinity as t becomes large. In particular for t large this expectation strictly exceeds \(\frac{1}{2} \). We fix such a t now.
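The expectation can be checked directly. As a sketch, assuming (as in the model) that each of the \(t^d\) sites of \([0,t)^d\) receives jobs at rate \(\lambda \) over a time interval of length t, and that marks are independent of arrival times:

$$\begin{aligned} \lambda \cdot t^d \cdot t \cdot \mathbb {P}(\tau \ge 2t) = \frac{\lambda \, t^{d+1}}{(2t)^{d+1-\varepsilon }} = \frac{t^{\varepsilon } \lambda }{2^{d+1-\varepsilon }}. \end{aligned}$$

In particular, this exceeds \(\frac{1}{2}\) as soon as \(t^{\varepsilon } > 2^{d-\varepsilon }/\lambda \), which makes explicit how large t must be taken for a given \(\lambda \).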

Obviously this applies to any translation of the cube and the random variables associated to disjoint space time cubes are independent.

We consider the path \(\gamma ^n \) in \(\Gamma ^{nt}(0,0) \) which is identically the origin for all \(s \in [-tn, 0]\). We note that under our assumptions on \((B, \tau )\) any job arriving at a site in \([0,t)^d\) having \(\tau \ge 2t\) requires service from the origin. Hence the path \(\gamma ^n\) has value at least

$$\begin{aligned} \sum _{j=1}^n 2t X_j - nt, \end{aligned}$$

where \(X_j\) is the number of jobs arriving during interval \((-jt, -(j-1)t]\) and satisfying (1) and (2) above. By the law of large numbers, \(V(\gamma ^n )\) tends to infinity a.s., as n tends to infinity. This is enough to establish instability of the workload in this case no matter what the value of \(\lambda >0\) might be.
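In more detail, the \(X_j\) are i.i.d., since they count arrivals in disjoint time intervals of equal length. Write \(\mu _t := t^\varepsilon \lambda / 2^{d+1-\varepsilon }\) for their common mean (a symbol introduced here), which exceeds \(\frac{1}{2}\) by the choice of t. A sketch of the law of large numbers step then reads:

$$\begin{aligned} \frac{1}{n} \left( \sum _{j=1}^n 2t X_j - nt \right) \longrightarrow 2t \mu _t - t = t\,(2\mu _t - 1) > 0 \quad \text {a.s. as } n \rightarrow \infty , \end{aligned}$$

so the lower bound on the value of \(\gamma ^n\) grows linearly in n.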

Remark 4

In fact, this argument can easily be generalized to show that if for each arriving job, \(\tau = R\) and if, for some \(\varepsilon > 0\), \(E( \tau ^{d+1- \varepsilon } ) = \infty \), then the system does not have stability.

6 Uniqueness

We now briefly address the question of uniqueness of invariant measures for the workloads when the power law condition holds and when \(\lambda \) is sufficiently small. We know that if condition (2) is satisfied and the parameter \(\lambda \) is sufficiently small, then the distribution \(\mu \) of workloads, obtained by starting the system at time \(-n\) with the workloads identically zero and letting \(n \rightarrow \infty \), is invariant. The question that naturally arises is whether other equilibria for the workload, under Poisson arrival of jobs, are possible.

We consider systems that are stationary under spatial translations and show the following.

Theorem 15

Under the condition (2) above, there exists \(\lambda _0 \) so that if the arrival rate \(\lambda \) is less than \(\lambda _0\), and \(\nu \) is an invariant probability for the system on the space of workloads that is preserved by spatial translation, then \(\nu = \mu \).

In this section, the assumption that all jobs require service from cubes of servers is not “without loss of generality” so we remark that we only use the weak “irreducibility” condition that for every neighbor e of the origin there exist sequences

$$\begin{aligned} 0 = x_0, x_1, \ldots , x_r = e \end{aligned}$$

and bases

$$\begin{aligned} B_1, B_2, \ldots , B_r \end{aligned}$$

so that for all i, \(x_{i-1}, x_i \in B_i\) and jobs with base \(B_i\) occur with strictly positive probability.

To show the claimed uniqueness it suffices to show that for such a measure \(\nu \) and any bounded cylinder function h, we have

$$\begin{aligned} \int h(\eta ) \nu ( d \eta ) = \int h(\eta ) \mu ( d \eta ). \end{aligned}$$

Assuming that \(\nu \) is invariant this is equivalent to

$$\begin{aligned} \mathbb {E}^{\nu } [h(W_n)] = \int h(\eta ) \mu ( d \eta ) \end{aligned}$$

for any n (and so, in particular, for n large). Given this and our construction of the measure \(\mu \), it will be enough to show that for \(\varepsilon ^{\prime } > 0\) and h as above, both fixed,

$$\begin{aligned} \left| \mathbb {E}^\nu [h(W_n)] - \mathbb {E}^{\vec {0}}[h(W_n)] \right| < \varepsilon ^{\prime }, \end{aligned}$$

for n large. This will be our objective in the following.

As \(\nu \) is temporally invariant, we have, by the ergodic theorem [5] that, for every M, a.s.,

where \(\mathcal {I}_T\) is the \(\sigma \)-field of events that are invariant under temporal shifts. Thus, for an \(\varepsilon > 0\) fixed, we can find an M so large that,

with probability at least \(1-\varepsilon ^2\). By (spatial) translation invariance we then have for (every) \(x \in \mathbb {Z}^d\)

with probability at least \(1-\varepsilon ^2\). Let us call the above event \(B^x_M\). Thus, we will have, by the ergodic theorem applied to spatial shifts, that

where \(\mathcal {I}_S\) is the sigma field of spatially shift invariant events. Thus, we have that, for \(k_0\) sufficiently large, with probability at least \(1- 2\varepsilon \),

We now note that the existence at time 0, say, of a large workload V at a site, say 0, implies that with reasonable probability the workload will be of order V for a time of order V in the time interval [0, V] for a cube of sites of side length of order V.

Proposition 16

There exists \(c_1 \in (0, \infty )\) so that, for all V large enough, uniformly over initial workloads \(W(0,\cdot )\) with \(W(0,0)>V\), with probability at least \(c_1\), we have, for all \(x \in [-c_1V,c_1V]^d\),

$$\begin{aligned} \quad W(x,t) > V/4,\quad \mathrm{for\,all}\,t \in [V/2, 3V/4]. \end{aligned}$$

Proof

From our “irreducibility” assumptions on the distribution of jobs, it is clear that there exists, for each neighbor e of the origin 0, a sequence of jobs with bases \(B_1,B_2, \ldots , B_R\) so that, for each i, \(B_i \cap B_{i+1} \ne \varnothing \), \(0 \in B_1\) and \(e \in B_R\), and the rate at which a job with base \(B_i\) arrives is strictly positive. Taking \(R_1\) to be the maximum over the R's as the neighbor e varies and c to be the minimum over the arrival rates of the \(B_i\) as e and i vary, we obtain that, for any x, there exists a “path” \(B_1,B_2, \ldots , B_R\) so that \(R \le d R_1|x|_\infty \), for each i, \(B_i \cap B_{i+1} \ne \varnothing \), \(0 \in B_1\) and \(x \in B_R\), and the rate at which a job with base \(B_i\) arrives is at least c. Thus, for every \(x \in [-c_1V,c_1V]^d \), the probability that \(W(x,V/2) \le V/2\) is bounded by the probability that a Poisson random variable of parameter \(Vc/2\) is less than \(dc_1VR_1\). The result now follows easily from Poisson tail probabilities. \(\square \)
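One standard way to quantify the Poisson tail in this last step is the Chernoff bound; the following is a sketch under that choice, not necessarily the estimate intended. For X Poisson of parameter \(\mu \) and \(0 < a \le \mu \), optimizing \(\mathbb {E}[\hbox {e}^{-\theta X}]\hbox {e}^{\theta a}\) over \(\theta \ge 0\) gives

$$\begin{aligned} \mathbb {P}(X \le a) \le \hbox {e}^{-\mu } \left( \frac{\hbox {e} \mu }{a} \right) ^{a} = \exp \left( -\mu \left( 1 - \rho - \rho \log (1/\rho ) \right) \right) , \quad \rho = \frac{a}{\mu }. \end{aligned}$$

With \(\mu = Vc/2\) and \(a = d c_1 V R_1\), the ratio \(\rho = 2 d c_1 R_1 / c\) (notation introduced here) does not depend on V. Since \(\rho + \rho \log (1/\rho )\) is increasing on (0, 1) and is less than \(\frac{1}{2}\) at \(\rho = 1/10\), choosing \(c_1\) so small that \(\rho \le 1/10\) gives a bound of at most \(\hbox {e}^{-Vc/4}\) for each site. A union bound over the at most \((2c_1V+1)^d\) sites of \([-c_1V,c_1V]^d\) then still tends to zero as V grows.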

We note that if \(V>4M\) (assuming as we may that \(\varepsilon < 1/3\)), then \(W(x,t) > V/4\), for \(t \in [V/2, 3V/4]\), implies that event \(B^x_M \) does not occur. This implies that

Proposition 17

If for some x with \(|x| \le KV\) we have \(W(x,0) \ge V >4M\), then with probability at least \(c_1\),

for \(\varepsilon \) fixed small enough.

This yields the simple corollary.

Corollary 18

For M and \(\varepsilon \) as above, let \(A(V,K)\) be the event that \(W(x,0) > t\) for some \(t > V\) and some \(|x| \le Kt\). Then, under measure \(\nu \), the probability that \(A(V,K)\) occurs is less than \(\varepsilon /c_1\) provided \(c_1/(2K)^d > \varepsilon \).

From this result, our claim is straightforward.