Application: Social Networks, Communication Networks

Topics: Continuous-Time Markov Chains, Product-Form Queueing Networks

6.1 Social Networks

We provide the proofs of the theorems in Sect. 5.1.

Theorem 6.1 (Spreading of a Message)

Let Z be the number of nodes that eventually receive the message.

  1. (a)

    If μ < 1, then P(Z < ∞) = 1 and E(Z) < ∞;

  2. (b)

    If μ > 1, then P(Z = ∞) > 0.

Proof

For part (a), let X n be the number of nodes that are n steps from the root. If X n = k, we can write X n+1 = Y 1 + ⋯ + Y k where Y j is the number of children of node j at level n. By assumption, E(Y j) = μ for all j. Hence,

$$\displaystyle \begin{aligned} E[X_{n+1} \mid X_n = k] = E(Y_1 + \cdots + Y_k) = \mu k. \end{aligned}$$

Hence, E[X n+1X n] = μX n. Taking expectations shows that E(X n+1) = μE(X n), n ≥ 0. Consequently,

$$\displaystyle \begin{aligned} E(X_n) = \mu^n, n \geq 0. \end{aligned}$$

Now, the sequence Z n = X 0 + ⋯ + X n is nonnegative and increases to \(Z = \sum _{n=0}^ \infty X_n\). By the Monotone Convergence Theorem (MCT), it follows that E(Z n) → E(Z). But

$$\displaystyle \begin{aligned} E(Z_n) = \mu^0 + \cdots + \mu^n = \frac{1 - \mu^{n+1}}{1 - \mu}. \end{aligned}$$

Hence, E(Z) = 1∕(1 − μ) < ∞. Consequently, P(Z < ∞) = 1.

For part (b), one first observes that the theorem does not state that P(Z = ∞) = 1. For instance, assume that each node has three children with probability 0.5 and has no child otherwise. Then μ = 1.5 > 1 and P(Z = 1) = P(X 1 = 0) = 0.5, so that P(Z = ∞) ≤ 0.5 < 1. We define X n, Y j, and Z n as in the proof of part (a).

Let α n = P(X n > 0). Consider the X 1 children of the root. Since α n+1 is the probability that there is at least one survivor after n + 1 generations, it is the probability that at least one of the X 1 children of the root has a survivor after n generations. Hence,

$$\displaystyle \begin{aligned} 1 - \alpha_{n+1} = E( (1 - \alpha_n )^{X_1} ), n \geq 0. \end{aligned}$$

Indeed, if X 1 = k, the probability that none of the k children of the root has a survivor after n generations is (1 − α n)^k. Hence,

$$\displaystyle \begin{aligned} \alpha_{n+1} = 1 - E((1 - \alpha_n)^{X_1}) =: g(\alpha_n), n \geq 0. \end{aligned}$$

Also, α 0 = 1. As n → ∞, one has α n → α ∞ = P(X n > 0, for all n). Figure 6.1 shows that α ∞ > 0. The key observations are that

$$\displaystyle \begin{aligned} g(0) &= 0 \\ g(1) &= P(X_1 > 0) < 1 \\ g'(0) &= E(X_1 (1 - \alpha)^{X_1 - 1}) \mid_{\alpha = 0} = \mu > 1 \\ g'(1) &= E(X_1 (1 - \alpha)^{X_1 - 1}) \mid_{\alpha = 1} = 0, \end{aligned} $$

so that the figure is as drawn. □

Fig. 6.1
figure 1

The proof that α ∞ > 0
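
To make the fixed-point picture concrete, here is a minimal numerical sketch (in Python; not part of the text) for the example above, in which each node has three children with probability 0.5 and none otherwise. It iterates α n+1 = g(α n) from α 0 = 1 and compares the limit with a Monte Carlo estimate of the survival probability; both numbers come out near 0.38.

```python
import numpy as np

rng = np.random.default_rng(0)

# Example offspring distribution: 3 children with probability 0.5, else 0; mu = 1.5.
# Then g(a) = 1 - E[(1 - a)^{X_1}] = 1 - (0.5 + 0.5 (1 - a)^3).
def g(a):
    return 1.0 - (0.5 + 0.5 * (1.0 - a) ** 3)

alpha = 1.0                                  # alpha_0 = P(X_0 > 0) = 1
for _ in range(100):
    alpha = g(alpha)                         # alpha_{n+1} = g(alpha_n)
print("fixed point alpha_inf ~", alpha)      # survival probability, about 0.38

# Monte Carlo check: simulate the generation sizes X_n directly.
def survives(max_gen=40, cap=10_000):
    pop = 1
    for _ in range(max_gen):
        if pop == 0:
            return False
        if pop > cap:                        # so large that extinction is negligible
            return True
        pop = 3 * rng.binomial(pop, 0.5)     # each node has 3 children w.p. 0.5
    return pop > 0

trials = 10_000
print("Monte Carlo estimate   ~", sum(survives() for _ in range(trials)) / trials)
```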

Theorem 6.2 (Cascades)

Assume p k = p ∈ (0, 1] for all k ≥ 1. Then, all nodes turn red with probability at least equal to θ where

$$\displaystyle \begin{aligned} \theta = \exp\left\{- \frac{1 - p}{p} \right\}. \end{aligned}$$

Proof

The probability that node n does not listen to anyone is a n = (1 − p)^n. Let X be the index of the first node that does not listen to anyone. Then

$$\displaystyle \begin{aligned} P(X > n) &= (1 - a_1)(1 - a_2) \cdots (1 - a_n) \leq \exp\{- a_1 - \cdots - a_n\} \\ &= \exp\left\{- \frac{1}{p} ((1 - p) - (1 - p)^{n+1})\right\}. \end{aligned} $$

Now,

$$\displaystyle \begin{aligned} P(X = \infty) = \lim_n P(X > n) \geq \exp\left\{- \frac{1 - p}{p} \right\} = \theta. \end{aligned}$$

Thus, with probability at least θ, every node listens to at least one previous node. When that is the case, all the nodes turn red. To see this, suppose not, and let n be the first node that stays blue. This is impossible, since node n listens to at least one previous node and all the previous nodes are red. □

6.2 Continuous-Time Markov Chains

Our goal is to understand networks where packets travel from node to node until they reach their destination. In particular, we want to study the delay of packets from source to destination and the backlog in the nodes.

It turns out that the analysis of such systems is much easier in continuous time than in discrete time. To carry out such analysis, we have to introduce continuous-time Markov chains. We do this on a few simple examples.

6.2.1 Two-State Markov Chain

Figure 6.2 illustrates a random process {X t, t ≥ 0} that takes values in {0, 1}. A random process is a collection of random variables indexed by t ≥ 0. Saying that such a random process is defined means that one can calculate the probability that \(\{X_{t_1} = x_1, X_{t_2} = x_2, \ldots , X_{t_n} = x_n\}\) for any value of n ≥ 1, any 0 ≤ t 1 ≤⋯ ≤ t n, and x 1, …, x n ∈{0, 1}. We explain below how one could calculate such a probability.

Fig. 6.2
figure 2

A random process on {0, 1}

We call X t the state of the process at time t. The possible values {0, 1} are also called states. The state X t evolves according to rules characterized by two positive numbers λ and μ. As Fig. 6.2 shows, if X 0 = 0, the state remains equal to zero for a random time T 0 that is exponentially distributed with parameter λ, thus with mean 1∕λ. The state X t then jumps to 1 where it stays for a random time T 1 that is exponentially distributed with rate μ, independent of T 0, and so on. The definition is similar if X 0 = 1. In that case, X t keeps the value 1 for an exponentially distributed time with rate μ, then jumps to 0, etc.

Thus, the pdf of T 0 is

$$\displaystyle \begin{aligned} f_{T_0} (t) = \lambda \exp\{- \lambda t\} 1\{t \geq 0\}. \end{aligned}$$

In particular,

$$\displaystyle \begin{aligned} P(T_0 \leq \epsilon) \approx f_{T_0} (0) \epsilon = \lambda \epsilon, \mbox{ for } \epsilon \ll 1. \end{aligned}$$

Throughout this chapter, the symbol ≈ means “up to a quantity negligible compared to 𝜖.” It is shown in Theorem 15.3 that an exponentially distributed random variable is memoryless. That is,

$$\displaystyle \begin{aligned} P[T_0 > t + s \mid T_0 > t] = P(T_0 > s), s, t \geq 0. \end{aligned}$$

The memoryless property and the independence of the exponential times T k imply that {X t, t ≥ 0} starts afresh from X s at time s. Figure 6.3 illustrates that property. Mathematically, it says that given {X t, t ≤ s} with X s = k, the process {X s+t, t ≥ 0} has the same properties as {X t, t ≥ 0} given that X 0 = k, for k = 0, 1 and for any s ≥ 0. Indeed, if X s = 0, then the residual time that X t remains in 0 is exponentially distributed with rate λ and is independent of what happened before time s, because the time in 0 is memoryless and independent of the previous times in 0 and 1. This property is written as

$$\displaystyle \begin{aligned} P[\{X_{s + t}, t \geq 0\} \in A \mid X_s = k; X_t, t \leq s] = P[\{X_t, t \geq 0\} \in A \mid X_0 = k], \end{aligned}$$

for k = 0, 1, for all s ≥ 0, and for all sets A of possible trajectories. A generic set A of trajectories is

$$\displaystyle \begin{aligned} A = \{(x_t, t \geq 0) \in C_+ \mid x_{t_1} = i_1, \ldots, x_{t_n} = i_n\} \end{aligned}$$

for given 0 < t 1 < ⋯ < t n and i 1, …, i n ∈{0, 1}. Here, C + is the set of right-continuous functions of t ≥ 0 that take values in {0, 1}.

Fig. 6.3
figure 3

The process X t starts afresh from X s at time s

This property is the continuous-time version of the Markov property for Markov chains. One says that the process X t satisfies the Markov property and that {X t, t ≥ 0} is a continuous-time Markov chain (CTMC).

For instance,

$$\displaystyle \begin{aligned} & P[X_{s + 2.5} = 1, X_{s + 4} = 0, X_{s + 5.1} = 0 \mid X_s = 0; X_t, t \leq s] \\ &~~~ = P[X_{2.5} = 1, X_4 = 0, X_{5.1} = 0 \mid X_0 = 0]. \end{aligned} $$

The Markov property generalizes to situations where s is replaced by a random time τ that is defined by a causal rule, i.e., a rule that does not look ahead. For instance, as in Fig. 6.4, τ can be the second time that X t visits state 0. Or τ could be the first time that it visits state 0 after having spent at least 3 time units in state 1. The property does not extend to non-causal times such as one time unit before X t visits state 1. Random times τ defined by causal rules are called stopping times. This more general property is called the strong Markov property. To prove this property, one conditions on the value s of τ and uses the fact that the future evolution does not depend on this value since the event {τ = s} depends only on {X t, t ≤ s}.

Fig. 6.4
figure 4

The process X t starts afresh from X τ at the stopping time τ

For 0 < 𝜖 ≪ 1 one has

$$\displaystyle \begin{aligned} P[X_{t+\epsilon} = 1 \mid X_t = 0] \approx \lambda \epsilon. \end{aligned}$$

Indeed, the process jumps from 0 to 1 in 𝜖 time units if the exponential time in 0 is less than 𝜖, which has probability approximately λ𝜖.

Similarly,

$$\displaystyle \begin{aligned} P[X_{t+\epsilon} = 0 \mid X_t = 1] \approx \mu \epsilon. \end{aligned}$$

We say that the transition rate from 0 to 1 is equal to λ and that from 1 to 0 is equal to μ to indicate that the probability of a transition from 0 to 1 in 𝜖 units of time is approximately λ𝜖 and that from 1 to 0 is approximately μ𝜖.

Figure 6.5 illustrates these transition rates. This figure is called the state transition diagram.

Fig. 6.5
figure 5

The state transition diagram

The previous two identities imply that

$$\displaystyle \begin{aligned} P(X_{t+\epsilon} = 1) &= P(X_t = 0, X_{t+\epsilon} = 1) + P(X_t = 1, X_{t+\epsilon} = 1) \\ &= P(X_t {=} 0)P[X_{t{+}\epsilon} {=} 1 \mid X_t {=} 0] {+} P(X_t {=} 1)P[X_{t{+}\epsilon} {=}1 \mid X_t {=} 1] \\ &\approx P(X_t = 0) \lambda \epsilon + P(X_t = 1)(1 - P[X_{t+\epsilon} = 0 \mid X_t = 1]) \\ &\approx P(X_t = 0) \lambda \epsilon + P(X_t = 1)(1 - \mu \epsilon). \end{aligned} $$

Also, similarly, one finds that

$$\displaystyle \begin{aligned} P(X_{t+\epsilon} = 0) \approx P(X_t = 0)(1 - \lambda \epsilon) + P(X_t = 1) \mu \epsilon.\end{aligned} $$

We can write these identities in a convenient matrix notation as follows. For t ≥ 0, one defines the row vector π t as

$$\displaystyle \begin{aligned} \pi_t = [P(X_t = 0), P(X_t = 1)]. \end{aligned}$$

One also defines the transition rate matrix Q as follows:

$$\displaystyle \begin{aligned} Q = \left[ \begin{array}{c c} - \lambda & \lambda \\ \mu & - \mu \end{array} \right]. \end{aligned}$$

With that notation, the previous identities can be written as

$$\displaystyle \begin{aligned} \pi_{t+ \epsilon} \approx \pi_t (\mathbf{I} + Q \epsilon), \end{aligned}$$

where I is the identity matrix. Subtracting π t from both sides, dividing by 𝜖, and letting 𝜖 → 0, we find

$$\displaystyle \begin{aligned} \frac{d}{dt} \pi_t = \pi_t Q. \end{aligned} $$
(6.1)

By analogy with the scalar equation dx t∕dt = ax t whose solution is \(x_t = x_0 \exp \{at\}\), we conclude that

$$\displaystyle \begin{aligned} \pi_t = \pi_0 \exp\{Qt\}, \end{aligned} $$
(6.2)

where

$$\displaystyle \begin{aligned} \exp\{Qt\} := \mathbf{I} + Qt + \frac{1}{2!} Q^2t^2 + \frac{1}{3!} Q^3t^3 + \cdots. \end{aligned}$$

Note that

$$\displaystyle \begin{aligned} \frac{d}{dt} \exp\{Qt\} = \mathbf{0} + Q + Q^2 t + \frac{1}{2!} Q^3 t^2 + \cdots = Q \exp\{Qt\}. \end{aligned}$$

Observe also that π t = π for all t ≥ 0 if and only if π 0 = π and

$$\displaystyle \begin{aligned} \pi Q = 0. \end{aligned} $$
(6.3)

Indeed, if π t = π for all t, then (6.1) implies that \(0 = \frac {d}{dt} \pi _t =\pi _t Q = \pi Q\). Conversely, if π 0 = π with πQ = 0, then

$$\displaystyle \begin{aligned} \pi_t = \pi_0 \exp\{Qt\} = \pi \exp\{Qt\} = \pi \left(\mathbf{I} + Qt + \frac{1}{2!} Q^2t^2 + \frac{1}{3!} Q^3t^3 + \cdots\right) = \pi. \end{aligned}$$

These equations πQ = 0 are called the balance equations. They are

$$\displaystyle \begin{aligned}{}[\pi(0), \pi(1)] \left[ \begin{array}{c c} - \lambda & \lambda \\ \mu & - \mu \end{array} \right] = 0, \end{aligned}$$

i.e.,

$$\displaystyle \begin{aligned} & \pi(0) (- \lambda) + \pi(1) \mu = 0 \\ & \pi(0) \lambda - \pi(1) \mu = 0. \end{aligned} $$

These two equations are identical. To determine π, we use the fact that π(0) + π(1) = 1. Combined with the previous identity, we find

$$\displaystyle \begin{aligned}{}[\pi(0), \pi(1)] = \left[ \frac{\mu}{\lambda + \mu}, \frac{\lambda}{\lambda + \mu}\right]. \end{aligned}$$

The identity π t+𝜖 ≈ π t(I + Q𝜖) shows that one can view {X n𝜖, n = 0, 1, …} as a discrete-time Markov chain with transition matrix P = I + Q𝜖. Figure 6.6 shows the transition diagram that corresponds to this transition matrix. The invariant distribution for P is such that πP = π, i.e., π(I + Q𝜖) = π, so that πQ = 0, not surprisingly.

Fig. 6.6
figure 6

A discrete-time approximation of X t

Note that this discrete-time Markov chain is aperiodic because states have self-loops. Thus, we expect that

$$\displaystyle \begin{aligned} \pi_{n \epsilon} \to \pi, \mbox{ as } n \to \infty . \end{aligned}$$

Consequently, we expect that, in continuous time,

$$\displaystyle \begin{aligned} \pi_t \to \pi, \mbox{ as } t \to \infty. \end{aligned}$$
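
For the two-state chain, these claims are easy to check numerically. The sketch below (Python; the rates λ = 1 and μ = 2 are arbitrary choices, not from the text) verifies the balance equations and computes π t = π 0 exp{Qt} with a matrix exponential, which approaches [μ∕(λ + μ), λ∕(λ + μ)] as t grows.

```python
import numpy as np
from scipy.linalg import expm

lam, mu = 1.0, 2.0                       # arbitrary rates (assumptions, not from the text)
Q = np.array([[-lam, lam],
              [mu, -mu]])                # rate matrix of the two-state chain

pi = np.array([mu, lam]) / (lam + mu)    # [mu/(lam+mu), lam/(lam+mu)]
print("pi Q =", pi @ Q)                  # balance equations: ~ [0, 0]

pi0 = np.array([1.0, 0.0])               # start in state 0
for t in (0.1, 1.0, 10.0):
    print(f"t = {t:4.1f}   pi_t =", pi0 @ expm(Q * t))   # converges to pi
```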

6.2.2 Three-State Markov Chain

The previous Markov chain alternates between the states 0 and 1. More general Markov chains visit states in a random order. We explain that feature in our next example with 3 states. Fortunately, this example suffices to illustrate the general case. We do not have to look at Markov chains with 4, 5, … states to describe the general model.

In the example shown in Fig. 6.7, the rules of evolution are characterized by positive numbers q(0, 1), q(0, 2), q(1, 2), and q(2, 0). One also defines q 0, q 1, q 2, Γ(0, 1), and Γ(0, 2) as in the figure.

Fig. 6.7
figure 7

A three-state Markov chain

If X 0 = 0, the state X t remains equal to 0 for some random time T 0 that is exponentially distributed with rate q 0. At time T 0, the state jumps to 1 with probability Γ(0, 1) or to state 2 otherwise, with probability Γ(0, 2). If X t jumps to 1, it stays there for an exponentially distributed time T 1 with rate q 1 that is independent of T 0. More generally, when X t enters state k, it stays there for a random time that is exponentially distributed with rate q k that is independent of the past evolution. From this definition, it should be clear that the process X t satisfies the Markov property.

Define π t = [π t(0), π t(1), π t(2)] where π t(k) = P(X t = k) for k = 0, 1, 2. One has, for 0 < 𝜖 ≪ 1,

$$\displaystyle \begin{aligned} P[X_{t+\epsilon} = 1 \mid X_t = 0] \approx q_0 \epsilon \varGamma(0, 1) = q(0,1) \epsilon. \end{aligned}$$

Indeed, the process jumps from 0 to 1 in 𝜖 time units if the exponential time with rate q 0 is less than 𝜖 and if the process then jumps to 1 instead of jumping to 2.

Similarly,

$$\displaystyle \begin{aligned} P[X_{t+\epsilon} = 2 \mid X_t = 0] \approx q_0 \epsilon \varGamma(0, 2) = q(0,2) \epsilon. \end{aligned}$$

Also,

$$\displaystyle \begin{aligned} P[X_{t + \epsilon} = 1 \mid X_t = 1] \approx 1 - q_1 \epsilon, \end{aligned}$$

since this is approximately the probability that the exponential time with rate q 1 is larger than 𝜖. Moreover,

$$\displaystyle \begin{aligned} P[X_{t + \epsilon} = 1 \mid X_t = 2] \approx 0, \end{aligned}$$

because the probability that both the exponential time with rate q 2 in state 2 and the exponential time with rate q 0 in state 0 are less than 𝜖 is roughly (q 2 𝜖) × (q 0 𝜖), and this is negligible compared to 𝜖.

These observations imply that

$$\displaystyle \begin{aligned} \pi_{t+\epsilon}(1) &= P(X_t = 0, X_{t+ \epsilon} = 1) + P(X_t = 1, X_{t+ \epsilon} = 1) +P(X_t = 2, X_{t+ \epsilon} = 1) \\ &= P(X_t {=} 0)P[X_{t{+}\epsilon} {=} 1 \mid X_t {=} 0] {+} P(X_t {=} 1)P[X_{t{+}\epsilon} {=} 1 \mid X_t {=} 1] \\ &~~~~~ + P(X_t = 2)P[X_{t+\epsilon} = 1 \mid X_t = 2] \\ &\approx \pi_t(0) q(0,1) \epsilon + \pi_t(1) (1 - q_1 \epsilon). \end{aligned} $$

Proceeding in a similar way shows that

$$\displaystyle \begin{aligned} \pi_{t + \epsilon}(0) &\approx \pi_t(0)(1 - q_0 \epsilon) + \pi_t(2) q(2, 0) \epsilon \\ \pi_{t + \epsilon}(2) & \approx \pi_t(1) q(1, 2) \epsilon + \pi_t(2) (1 - q_2 \epsilon). \end{aligned} $$

Similarly to the two-state example, let us define the rate matrix Q as follows:

$$\displaystyle \begin{aligned} Q = \left[ \begin{array}{c c c} - q_0 & q(0, 1) & q(0, 2) \\ 0 & - q_1 & q(1, 2) \\ q(2, 0) & 0 & - q_2 \end{array} \right]. \end{aligned}$$

The previous identities can then be written as follows:

$$\displaystyle \begin{aligned} \pi_{t + \epsilon} \approx \pi_t [ \mathbf{I} + Q \epsilon]. \end{aligned}$$

Subtracting π t from both sides, dividing by 𝜖, and letting 𝜖 → 0 then shows that

$$\displaystyle \begin{aligned} \frac{d}{dt} \pi_t = \pi_t Q. \end{aligned}$$

As before, the solution of this equation is

$$\displaystyle \begin{aligned} \pi_t = \pi_0 \exp\{Qt\}, t \geq 0. \end{aligned}$$

The distribution π is invariant if and only if

$$\displaystyle \begin{aligned} \pi Q = 0. \end{aligned}$$

Once again, we note that {X n𝜖, n = 0, 1, …} is approximately a discrete-time Markov chain with transition matrix P = I + Q𝜖 shown in Fig. 6.8. This Markov chain is aperiodic, and we conclude that

$$\displaystyle \begin{aligned} P(X_{n \epsilon} = k ) \to \pi(k), \mbox{ as } n \to \infty. \end{aligned}$$

Thus, we can expect that

$$\displaystyle \begin{aligned} \pi_t \to \pi, \mbox{ as } t \to \infty. \end{aligned}$$

Also, since X n𝜖 is irreducible, the long-term fraction of time that it spends in each state converges to the corresponding component of π, and we can then expect the same for X t.

Fig. 6.8
figure 8

The transition matrix of the discrete-time approximation
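
These computations can be automated. The following sketch (Python; the rates q(0,1) = 1, q(0,2) = 2, q(1,2) = 3, q(2,0) = 4 are arbitrary choices) builds the rate matrix Q of Fig. 6.7, solves the balance equations πQ = 0 together with the normalization Σ π(i) = 1, and checks that π 0 exp{Qt} approaches π for large t.

```python
import numpy as np
from scipy.linalg import expm

# Arbitrary example rates for the chain of Fig. 6.7.
q01, q02, q12, q20 = 1.0, 2.0, 3.0, 4.0
q0, q1, q2 = q01 + q02, q12, q20

Q = np.array([[-q0, q01, q02],
              [0.0, -q1, q12],
              [q20, 0.0, -q2]])

# Solve pi Q = 0 together with sum(pi) = 1 as a least-squares problem.
A = np.vstack([Q.T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print("pi =", pi)

# Cross-check: pi_t = pi_0 exp{Qt} approaches pi for large t.
pi0 = np.array([1.0, 0.0, 0.0])
print("pi_30 =", pi0 @ expm(Q * 30.0))
```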

6.2.3 General Case

Let \(\mathcal {X}\) be a countable or finite set. The process {X t, t ≥ 0} is defined as follows. One is given a probability distribution π on \(\mathcal {X}\) and a rate matrix \(Q = \{q(i, j), i, j \in \mathcal {X}\}\).

By definition, Q is such that

$$\displaystyle \begin{aligned} q(i, j) \geq 0, \forall i \neq j \mbox{ and } \sum_j q(i, j) = 0, \forall i. \end{aligned}$$

Definition 6.1 (Continuous-Time Markov Chain)

A continuous-time Markov chain with initial distribution π and rate matrix Q is a process {X t, t ≥ 0} such that P(X 0 = i) = π(i). Also,

$$\displaystyle \begin{aligned} P[X_{t + \epsilon} = j | X_t = i, X_u, u < t] = 1\{i = j\} + \epsilon q(i, j) + o(\epsilon). \end{aligned}$$

This definition means that the process jumps from i to j ≠ i with probability q(i, j)𝜖 in 𝜖 ≪ 1 time units. Thus, q(i, j) is the probability of jumping from i to j, per unit of time. Note that the sum of these expressions over all j gives 1, as it should.

One construction of this process is as follows. Say that X t = i. One then chooses a random time τ that is exponentially distributed with rate q i := −q(i, i). At time t + τ, the process jumps to state j with probability Γ(i, j) = q(i, j)∕q i for j ≠ i (Fig. 6.9).

Fig. 6.9
figure 9

Construction of a continuous-time Markov chain

Thus, if X t = i, the probability that X t+𝜖 = j is the probability that the process jumps in (t, t + 𝜖), which is q i 𝜖, times the probability that it then jumps to j, which is Γ(i, j). Hence,

$$\displaystyle \begin{aligned} P[X_{t + \epsilon} = j | X_t = i] = q_i \epsilon \frac{q(i, j)}{q_i} = q(i, j) \epsilon, \end{aligned}$$

up to o(𝜖). Thus, the construction yields the correct transition probabilities.
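
The construction translates directly into a simulation: hold an exponential time with rate q i, then jump according to Γ(i, j) = q(i, j)∕q i. Here is a minimal sketch (Python; the two-state rates λ = 1, μ = 2 are arbitrary, and every state is assumed to have q i > 0) that also estimates the long-run fraction of time spent in each state.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ctmc(Q, x0, t_end):
    """Simulate a CTMC with rate matrix Q started at x0, up to time t_end.

    Returns the lists of jump times and of visited states."""
    times, states = [0.0], [x0]
    t, i = 0.0, x0
    while True:
        qi = -Q[i, i]                       # rate of leaving state i (assumed > 0)
        t += rng.exponential(1.0 / qi)      # exponential holding time with rate qi
        if t > t_end:
            return times, states
        probs = Q[i].clip(min=0.0) / qi     # Gamma(i, j) = q(i, j) / q_i for j != i
        i = rng.choice(len(Q), p=probs)     # next state
        times.append(t)
        states.append(i)

# Example: the two-state chain with rates lambda = 1, mu = 2 (arbitrary).
Q = np.array([[-1.0, 1.0], [2.0, -2.0]])
T = 10_000.0
times, states = simulate_ctmc(Q, 0, T)

# Long-run fraction of time in each state, to compare with pi = [2/3, 1/3].
durations = np.diff(np.append(times, T))
frac = np.array([durations[np.array(states) == k].sum() for k in (0, 1)]) / T
print("empirical fractions:", frac)
```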

As we observed in the examples,

$$\displaystyle \begin{aligned} \frac{d}{dt} \pi_t = \pi_t Q, \end{aligned}$$

so that

$$\displaystyle \begin{aligned} \pi_t = \pi_0 \exp\{Qt\}. \end{aligned}$$

Moreover, a distribution π is invariant if and only if it solves the balance equations

$$\displaystyle \begin{aligned} 0 = \pi Q. \end{aligned}$$

These equations, state by state, say that

$$\displaystyle \begin{aligned} \pi(i) q_i = \sum_{j \neq i} \pi(j) q(j, i), \forall i \in \mathcal{X}. \end{aligned}$$

These equations express the equality of the rate of leaving a state and the rate of entering that state.

Define

$$\displaystyle \begin{aligned} P_t(i, j) = P[X_{s + t} = j \mid X_s = i], \mbox{ for } i, j \in \mathcal{X} \mbox{ and } s, t \geq 0. \end{aligned}$$

The Markov property implies that

$$\displaystyle \begin{aligned} P(X_{t_1} = i_1, \ldots , X_{t_n} = i_n) = P(X_{t_1} = i_1)P_{t_2 - t_1}(i_1, i_2) P_{t_3 - t_2}(i_2, i_3) \cdots P_{t_n - t_{n-1}}(i_{n-1}, i_n),\end{aligned} $$

for all \(i_1, \ldots , i_n \in \mathcal {X}\) and all 0 < t 1 < ⋯ < t n.

Moreover, this identity implies the Markov property. Indeed, if it holds, one has

$$\displaystyle \begin{aligned} & P[X_{t_{m+1}} = i_{m+1}, \ldots , X_{t_n} = i_n \mid X_{t_1} = i_1, \ldots , X_{t_m} = i_m] \\ &~~~ = \frac{P(X_{t_1} = i_1, \ldots , X_{t_n} = i_n)}{P(X_{t_1} = i_1, \ldots , X_{t_m} = i_m)} \\ &~~~ = \frac{P(X_{t_1} = i_1)P_{t_2 - t_1}(i_1, i_2) P_{t_3 - t_2}(i_2, i_3) \cdots P_{t_n - t_{n-1}}(i_{n-1}, i_n)} {P(X_{t_1} = i_1)P_{t_2 - t_1}(i_1, i_2) P_{t_3 - t_2}(i_2, i_3) \cdots P_{t_{m} - t_{m-1}}(i_{m-1}, i_{m})} \\ &~~~ = P_{t_{m+1} - t_{m}}(i_{m}, i_{m+1}) \cdots P_{t_n - t_{n-1}}(i_{n-1}, i_n). \end{aligned} $$

Hence,

$$\displaystyle \begin{aligned} & P[X_{t_{m+1}} = i_{m+1}, \ldots , X_{t_n} = i_n \mid X_{t_1} = i_1, \ldots , X_{t_m} = i_m] \\ &~~~ = \frac{P(X_{t_{m}} = i_{m}) P_{t_{m+1} - t_{m}}(i_{m}, i_{m+1}) \cdots P_{t_n - t_{n-1}}(i_{n-1}, i_n)} { P(X_{t_{m}} = i_{m})}\\ &~~~ = \frac{P(X_{t_{m}} = i_{m}, \ldots , X_{t_n} = i_n)}{P(X_{t_{m}} = i_{m})}\\ &~~~ = P[X_{t_{m+1}} = i_{m+1}, \ldots , X_{t_n} = i_n \mid X_{t_{m}} = i_{m}] . \end{aligned} $$

If X t has the invariant distribution, one has

$$\displaystyle \begin{aligned} P(X_{t_1} = i_1, \ldots , X_{t_n} = i_n) = \pi(i_1)P_{t_2 - t_1}(i_1, i_2) P_{t_3 - t_2}(i_2, i_3) \cdots P_{t_n - t_{n-1}}(i_{n-1}, i_n), \end{aligned}$$

for all \(i_1, \ldots , i_n \in \mathcal {X}\) and all 0 < t 1 < ⋯ < t n.

Here is the result that corresponds to Theorem 15.1. We define irreducibility, transience, and null and positive recurrence as in discrete time. There is no notion of periodicity in continuous time.

Theorem 6.1 (Big Theorem for Continuous-Time Markov Chains)

Consider a continuous-time Markov chain.

  1. (a)

    If the Markov chain is irreducible, the states are either all transient, all positive recurrent, or all null recurrent. We then say that the Markov chain is transient, positive recurrent, or null recurrent, respectively.

  2. (b)

    If the Markov chain is positive recurrent, it has a unique invariant distribution π and π(i) is the long-term fraction of time that X t is equal to i. Moreover, the probability π t(i) that the Markov chain X t is in state i converges to π(i).

  3. (c)

    If the Markov chain is not positive recurrent, it does not have an invariant distribution and the fraction of time that it spends in any state goes to zero.

\({\blacksquare }\)

6.2.4 Uniformization

We saw earlier that a CTMC can be approximated by a discrete-time Markov chain that has a time step 𝜖 ≪ 1. There are two other DTMCs that have a close relationship with the CTMC: the jump chain and the uniformized chain. We explain these chains for the CTMC X t in Fig. 6.7.

The jump chain is X t observed when it jumps. As Fig. 6.7 shows, this DTMC has a transition matrix equal to Γ where

$$\displaystyle \begin{aligned} \varGamma(i, j) = \left\{ \begin{array}{l l} q(i, j)/q_i, & \mbox{ if } i \neq j \\ 0, & \mbox{ if } i = j. \end{array} \right. \end{aligned}$$

Let ν be the invariant distribution of this jump chain. That is, ν = νΓ. Since ν(i) is the long-term fraction of steps that the jump chain spends in state i, and since the CTMC X t spends an average time 1∕q i in state i whenever it visits that state, the fraction of time that X t spends in state i should be proportional to ν(i)∕q i. That is, one expects

$$\displaystyle \begin{aligned} \pi(i) = A \nu(i) / q_i \end{aligned}$$

for some constant A. That is, for every state j, one should have

$$\displaystyle \begin{aligned} \sum_i [A \nu(i) / q_i] q(i, j) = 0. \end{aligned}$$

To verify that equality, we observe that

$$\displaystyle \begin{aligned} \sum_i [\nu(i) / q_i] q(i, j) = \sum_{i \neq j} \nu(i) \varGamma(i, j) + \nu(j) q(j, j)/q_j = \nu(j) - \nu(j) = 0. \end{aligned}$$

We used the fact that νΓ = ν and q(j, j) = −q j.
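
Here is a small numerical check of the relation π(i) ∝ ν(i)∕q i (a Python sketch; the rates are the same arbitrary three-state example as above, not from the text).

```python
import numpy as np

# Arbitrary three-state example: q(0,1)=1, q(0,2)=2, q(1,2)=3, q(2,0)=4.
Q = np.array([[-3.0, 1.0, 2.0],
              [0.0, -3.0, 3.0],
              [4.0, 0.0, -4.0]])
q = -np.diag(Q)                                   # q_i = -q(i, i)

Gamma = Q / q[:, None]                            # Gamma(i, j) = q(i, j)/q_i for j != i ...
np.fill_diagonal(Gamma, 0.0)                      # ... and 0 on the diagonal

# Invariant distribution nu of the jump chain: nu Gamma = nu, sum(nu) = 1.
A = np.vstack([Gamma.T - np.eye(3), np.ones(3)])
nu, *_ = np.linalg.lstsq(A, np.array([0, 0, 0, 1.0]), rcond=None)

# pi(i) proportional to nu(i)/q_i should solve pi Q = 0.
pi = (nu / q) / (nu / q).sum()
print("pi =", pi, "   pi Q =", pi @ Q)            # pi Q ~ 0
```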

The uniformized chain is not the jump chain. It is a discrete-time Markov chain obtained from the CTMC as follows. Let λ ≥ q i for all i. The rate at which X t changes state is q i when it is in state i. Let us add a dummy jump from i to i with rate λ − q i. The rate of jumps, including these dummy jumps, of this new Markov chain Y t is now constant and equal to λ.

The transition matrix P of Y t is such that

$$\displaystyle \begin{aligned} P(i,j) = \left\{ \begin{array}{l l} (\lambda - q_i)/\lambda, & \mbox{ if } i = j \\ q(i,j)/\lambda, & \mbox{ if } i \neq j. \end{array} \right. \end{aligned}$$

To see this, assume that Y t = i. The next jump will occur with rate λ. With probability (λ − q i)∕λ, it is a dummy jump from i to i. With probability q i∕λ it is an actual jump where Y t jumps to j ≠ i with probability Γ(i, j). Hence, Y t jumps from i to i with probability (λ − q i)∕λ and from i to j ≠ i with probability (q i∕λ)Γ(i, j) = q(i, j)∕λ.

Note that

$$\displaystyle \begin{aligned} P = \mathbf{I} + \frac{1}{\lambda} Q, \end{aligned}$$

where I is the identity matrix.

Now, define Z n to be the jump chain of Y t, i.e., the Markov chain with transition matrix P. Since the jumps of Y t occur at rate λ, independently of the value of the state Y t, we can simulate Y t as follows. Let N t be a Poisson process with rate λ. The jump times {t 1, t 2, …} of N t will be the jump times of Y t. The successive values of Y t are those of Z n. Formally,

$$\displaystyle \begin{aligned} Y_t = Z_{N_t}. \end{aligned}$$

That is, if N t = n, then we define Y t = Z n. Since the CTMC Y t spends 1∕λ on average between jumps, the invariant distribution of Y t should be the same as that of X t, i.e., π. To verify this, we check that πP = π, i.e., that

$$\displaystyle \begin{aligned} \pi \left( \mathbf{I} + \frac{1}{\lambda}Q\right) = \pi. \end{aligned}$$

That identity holds since πQ = 0. Thus, the DTMC Z n has the same invariant distribution as X t. Observe that Z n is not the same as the jump chain of X t. Also, it is not a discrete-time approximation of X t. This DTMC shows that a CTMC can be seen as a DTMC where one replaces the constant time steps by i.i.d. exponentially distributed time steps between the jumps.
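
The sketch below (Python; the two-state rates and the uniformization rate λ = 2 are arbitrary choices) forms P = I + Q∕λ, checks that πP = π, and simulates Y t = Z N t by drawing exponential inter-jump times with rate λ and moving according to P at each jump, dummy jumps included.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-state chain with rates lambda = 1, mu = 2 (arbitrary); uniformization rate lam_u.
Q = np.array([[-1.0, 1.0], [2.0, -2.0]])
lam_u = 2.0                                  # any lam_u >= max_i q_i works
P = np.eye(2) + Q / lam_u                    # transition matrix of Z_n
pi = np.array([2 / 3, 1 / 3])
print("pi P =", pi @ P)                      # equals pi, since pi Q = 0

# Simulate Y_t = Z_{N_t}: jump times form a Poisson process of rate lam_u,
# and at each jump the state moves according to P (dummy jumps included).
t_end, t, state = 10_000.0, 0.0, 0
time_in_state = np.zeros(2)
while t < t_end:
    dt = rng.exponential(1.0 / lam_u)        # inter-jump time of the Poisson process
    time_in_state[state] += min(dt, t_end - t)
    t += dt
    state = rng.choice(2, p=P[state])
print("empirical fractions:", time_in_state / t_end)   # ~ pi = [2/3, 1/3]
```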

6.2.5 Time Reversal

As a preparation for our study of networks of queues, we note the following result.

Theorem 6.2 (Kelly’s Lemma)

Let Q be the rate matrix of a Markov chain on \(\mathcal {X}\) . Let also \(\tilde Q\) be another rate matrix on \(\mathcal {X}\) . Assume that π is a distribution on \(\mathcal {X}\) and that

$$\displaystyle \begin{aligned} & q_i = \tilde q_i, i \in \mathcal{X} \mathit{\mbox{ and }} \\ & \pi(i) q(i, j) = \pi(j) \tilde q(j,i), \forall i \neq j. \end{aligned} $$

Then πQ = 0. \({\blacksquare }\)

Proof

We have

$$\displaystyle \begin{aligned} \sum_{j \neq i} \pi(j)q(j, i) = \sum_{j \neq i} \pi(i) \tilde q(i, j) = \pi(i) \sum_{j \neq i} \tilde q(i, j) = \pi(i) \tilde q_i = \pi(i) q_i, \end{aligned}$$

so that πQ = 0. □

The following result explains the meaning of \(\tilde Q\) in the previous theorem. We state it without proof.

Theorem 6.3

Assume that X t has the invariant distribution π. Then X t reversed in time is a Markov chain with rate matrix \(\tilde Q\) given by

$$\displaystyle \begin{aligned} \tilde q(i,j) = \frac{\pi(j)q(j, i)}{\pi(i)}. \end{aligned}$$

\({\blacksquare }\)
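
Numerically, Theorem 6.3 and Kelly’s Lemma can be checked as follows (a Python sketch with the same arbitrary three-state rates as before): compute π from πQ = 0, form \(\tilde q(i, j) = \pi(j)q(j, i)/\pi(i)\), and verify that \(\tilde Q\) is a rate matrix with the same diagonal as Q and the same invariant distribution.

```python
import numpy as np

# Arbitrary three-state example (same rates as before).
Q = np.array([[-3.0, 1.0, 2.0],
              [0.0, -3.0, 3.0],
              [4.0, 0.0, -4.0]])

# Invariant distribution from pi Q = 0, sum(pi) = 1.
A = np.vstack([Q.T, np.ones(3)])
pi, *_ = np.linalg.lstsq(A, np.array([0, 0, 0, 1.0]), rcond=None)

# Rate matrix of the time-reversed chain: q~(i, j) = pi(j) q(j, i) / pi(i).
Q_rev = (Q.T * pi) / pi[:, None]
print("Q_rev =\n", Q_rev)
print("row sums:", Q_rev.sum(axis=1))        # ~ 0: Q_rev is a rate matrix
print("same diagonal:", np.allclose(np.diag(Q_rev), np.diag(Q)))   # q~_i = q_i
print("pi Q_rev =", pi @ Q_rev)              # ~ 0: pi is also invariant for Q_rev
```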

6.3 Product-Form Networks

Theorem 6.4 (Invariant Distribution of Network)

Assume λ k < μ k and let ρ k = λ k∕μ k, for k = 1, 2, 3. Then the Markov chain X t has a unique invariant distribution π that is given by

$$\displaystyle \begin{aligned} & \pi(x_1, x_2, x_3) = \pi_1(x_1)\pi_2(x_2)\pi_3(x_3) \\ & \pi_k(n) = (1 - \rho_k) \rho_k^n, n \geq 0, k = 1, 2 \\ & \pi_3( a_1, a_2, \ldots, a_n) = p(a_1)p(a_2) \cdots p(a_n) (1 - \rho_3) \rho_3^n, \\ &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ n \geq 0, a_k \in \{1, 2\}, k = 1, \ldots, n, \end{aligned} $$

where p(1) = λ 1∕(λ 1 + λ 2) and p(2) = λ 2∕(λ 1 + λ 2).

Proof

Figure 6.10 shows a guess for the time-reversal of the network.

Fig. 6.10
figure 10

The network (top) and a guess for its time-reversal (bottom). The bottom network is obtained from the top one by reversing the flows of customers. It is a bold guess that the arrivals have exponential inter-arrival times and their rates are independent of the current queue lengths

Let Q be the rate matrix of the top network and \(\tilde Q\) that of the bottom one. Let also π be as stated in the theorem. We show that \(\pi , Q, \tilde Q\) satisfy the conditions of Kelly’s Lemma.

For instance, we verify that

$$\displaystyle \begin{aligned} & \pi([3,2,[1,1,2,1]])q([3,2,[1,1,2,1]], [4,2,[1,1,2]]) \\ &~~ = \pi([4,2,[1,1,2]]) \tilde q([4,2,[1,1,2]], [3,2,[1,1,2,1]]). \end{aligned} $$

Looking at the figure, we can see that

$$\displaystyle \begin{aligned} q([3,2,[1,1,2,1]], [4,2,[1,1,2]]) &= \mu_3 p_1 \\ \tilde q([4,2,[1,1,2]], [3,2,[1,1,2,1]]) &= \mu_1 p_1. \end{aligned} $$

Thus, the previous identity reads

$$\displaystyle \begin{aligned} \pi([3,2,[1,1,2,1]]) \mu_3 p_1 = \pi([4,2,[1,1,2]]) \mu_1 p_1, \end{aligned}$$

i.e.,

$$\displaystyle \begin{aligned} \pi([3,2,[1,1,2,1]]) \mu_3 = \pi([4,2,[1,1,2]]) \mu_1. \end{aligned}$$

Given the expression for π, this is

$$\displaystyle \begin{aligned} & (1 - \rho_1)\rho_1^3 \times (1 - \rho_2)\rho_2^2 \times p(1)p(1)p(2)p(1) (1 - \rho_3) \rho_3^4 \mu_3 \\ &~~~~~ = (1 - \rho_1)\rho_1^4 \times (1 - \rho_2)\rho_2^2 \times p(1)p(1)p(2) (1 - \rho_3) \rho_3^3 \mu_1. \end{aligned} $$

After simplifications, this identity is seen to be equivalent to

$$\displaystyle \begin{aligned} p(1) \rho_3 \mu_3 = \rho_1 \mu_1, \end{aligned}$$

i.e.,

$$\displaystyle \begin{aligned} \frac{\lambda_1}{\lambda_3} \frac{\lambda_3}{\mu_3} \mu_3 = \frac{\lambda_1}{\mu_1} \mu_1 \end{aligned}$$

and this equation is seen to be satisfied. A similar argument shows that Kelly’s lemma is satisfied for all pairs of states. □
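
The algebra above can also be checked symbolically. The sketch below (Python with sympy, not part of the text) encodes the invariant distribution of Theorem 6.4, assumes λ 3 = λ 1 + λ 2 for the arrival rate at queue 3 (as the last simplification step implicitly uses), and verifies the balance identity from the proof.

```python
import sympy as sp

l1, l2, m1, m2, m3 = sp.symbols('lambda_1 lambda_2 mu_1 mu_2 mu_3', positive=True)
l3 = l1 + l2                        # assumed arrival rate at queue 3 (sum of the two flows)
r1, r2, r3 = l1 / m1, l2 / m2, l3 / m3   # rho_k = lambda_k / mu_k
p1, p2 = l1 / l3, l2 / l3           # p(1), p(2) as in Theorem 6.4

def pi(n1, n2, q3):
    """pi(x1, x2, x3) of Theorem 6.4 with x1 = n1, x2 = n2, x3 = the list q3."""
    w = sp.Integer(1)
    for a in q3:
        w *= p1 if a == 1 else p2
    return (1 - r1) * r1**n1 * (1 - r2) * r2**n2 * w * (1 - r3) * r3**len(q3)

lhs = pi(3, 2, [1, 1, 2, 1]) * m3   # pi([3,2,[1,1,2,1]]) mu_3
rhs = pi(4, 2, [1, 1, 2]) * m1      # pi([4,2,[1,1,2]])  mu_1
print(sp.simplify(lhs - rhs))       # prints 0: the balance identity holds
```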

6.4 Proof of Theorem 5.7

The first step in using the theorem is to solve the flow conservation equations. Call the white jobs class 1 and the gray job class 2. Then we see that

$$\displaystyle \begin{aligned} \lambda_1^1 = \lambda_2^1 = \gamma, \lambda_1^2 = \lambda_2^2 = \alpha \end{aligned}$$

solve the flow conservation equations for any α > 0. We have to assume γ < μ for the servers to be able to keep up with the white jobs. With this assumption, we can choose α small enough so that \(\lambda _1 = \lambda _2 = \lambda := \gamma + \alpha < \min \{\mu _1, \mu _2\}.\)

The second step is to use the theorem to obtain the invariant distribution. It is

$$\displaystyle \begin{aligned} \pi(x_1, x_2) = A h(x_1)h(x_2) \end{aligned}$$

with

$$\displaystyle \begin{aligned} h(x_i) = \left(\frac{\gamma}{\mu}\right)^{n_1(x_i)} \left(\frac{\alpha}{\mu}\right)^{n_2(x_i)} = \rho_1^{n_1(x_i)}\rho_2^{n_2(x_i)}, \end{aligned}$$

where ρ 1 = γ∕μ, ρ 2 = α∕μ, and n c(x) is the number of jobs of class c in x i, for c = 1, 2. To calculate A, we note that there are n + 1 states x i with n class 1 jobs and 1 class 2 job, and 1 state x i with n class 1 jobs and no class 2 job. Indeed, the class 2 customer can be in n + 1 positions in the queue with the n customers of class 1.

Also, all the possible pairs (x 1, x 2) must have one class 2 customer either in queue 1 or in queue 2. Thus,

$$\displaystyle \begin{aligned} 1 = \sum_{(x_1, x_2)} \pi(x_1, x_2) = A \sum_{m = 0}^\infty \sum_{n = 0}^\infty G(m, n), \end{aligned}$$

where

$$\displaystyle \begin{aligned} G(m, n) = (m+1) \rho_1^{m+n} \rho_2 + (n+1) \rho_1^{m+n} \rho_2. \end{aligned}$$

In this expression, the first term corresponds to the states with m class 1 customers and one class 2 customer in queue 1 and n customers of class 1 in queue 2; the second term corresponds to the states with m customers of class 1 in queue 1, and n customers of class 1 and one customer of class 2 in queue 2. Thus, AG(m, n) is the probability that there are m customers of class 1 in the first queue and n customers of class 1 in the second queue.

Hence,

$$\displaystyle \begin{aligned} 1 = A \sum_{m = 0}^\infty \sum_{n = 0}^\infty [ (m+1) \rho_1^{m+n} \rho_2 + (n+1) \rho_1^{m+n} \rho_2] = 2A \sum_{m = 0}^\infty \sum_{n = 0}^\infty (m+1) \rho_1^{m+n} \rho_2, \end{aligned}$$

by symmetry of the two terms. Thus,

$$\displaystyle \begin{aligned} 1 = 2 A \rho_2 \left[ \sum_{m = 0}^\infty (m+1) \rho_1^{m}\right]\left[\sum_{n = 0}^\infty \rho_1^n\right]. \end{aligned}$$

To compute the sum, we use the following identities:

$$\displaystyle \begin{aligned} \sum_{n=0}^\infty \rho^n = (1 - \rho)^{-1}, \mbox{ for } 0 < \rho < 1 \end{aligned}$$

and

$$\displaystyle \begin{aligned} \sum_{n=0}^\infty (n+1) \rho^n = \frac{\partial}{\partial \rho} \sum_{n=0}^\infty \rho^{n+1} = \frac{\partial}{\partial \rho} [(1 - \rho)^{-1} - 1] = (1 - \rho)^{-2}. \end{aligned}$$

Thus, one has

$$\displaystyle \begin{aligned} 1 = 2 A \rho_2 (1 - \rho_1)^{-3}, \end{aligned}$$

so that

$$\displaystyle \begin{aligned} A = \frac{(1 - \rho_1)^3}{2 \rho_2}. \end{aligned}$$
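
A quick numerical check of the normalization (a Python sketch; the values ρ 1 = 0.6 and ρ 2 = 0.3 are arbitrary): summing A G(m, n) over a large truncated range should give 1.

```python
# Numerical check that A = (1 - rho_1)^3 / (2 rho_2) normalizes the distribution.
rho1, rho2 = 0.6, 0.3                        # arbitrary sample values with rho1 < 1
A = (1 - rho1) ** 3 / (2 * rho2)

total = 0.0
N = 500                                      # truncation of the double series
for m in range(N):
    for n in range(N):
        G = (m + 1) * rho1 ** (m + n) * rho2 + (n + 1) * rho1 ** (m + n) * rho2
        total += A * G
print(total)                                 # ~ 1.0
```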

Third, we calculate the expected number L of jobs of class 1 in the two queues. One has

$$\displaystyle \begin{aligned} L &= \sum_{m=0}^\infty \sum_{n=0}^\infty A (m+n) G(m, n) \\ &= \sum_{m=0}^\infty \sum_{n=0}^\infty A (m + n) (m+1) \rho_1^{m+n} \rho_2 + \sum_{m=0}^\infty \sum_{n=0}^\infty A (m + n) (n+1) \rho_1^{m+n} \rho_2 \\ &= 2 \sum_{m=0}^\infty \sum_{n=0}^\infty A (m + n) (m+1) \rho_1^{m+n} \rho_2, \end{aligned} $$

where the last identity follows from the symmetry of the two terms. Thus,

$$\displaystyle \begin{aligned} L &= 2 \sum_{m=0}^\infty \sum_{n=0}^\infty A m (m+1) \rho_1^{m+n} \rho_2 + 2 \sum_{m=0}^\infty \sum_{n=0}^\infty A n (m+1) \rho_1^{m+n} \rho_2 \\ &= 2 A \rho_2 \left[\sum_{m=0}^\infty m (m+1) \rho_1^m\right]\left[\sum_{n=0}^\infty \rho_1^n\right] + 2A \rho_2 \left[\sum_{m=0}^\infty (m+1) \rho_1^m\right]\left[\sum_{n=0}^\infty n \rho_1^n\right] \\ &= 2 A \rho_2 \left[\sum_{m=0}^\infty m (m+1) \rho_1^m\right] (1 - \rho_1)^{-1} + 2A \rho_2 (1 - \rho_1)^{-2} \left[\sum_{n=0}^\infty n \rho_1^n\right]. \end{aligned} $$

To calculate the sums, we use the fact that

$$\displaystyle \begin{aligned} \sum_{m=0}^\infty m(m+1) \rho^m &= \rho \sum_{m=0}^\infty m(m+1) \rho^{m-1} \\ &= \rho \frac{\partial^2}{\partial \rho^2} \sum_{m=0}^\infty \rho^{m+1} = \rho \frac{\partial^2}{\partial \rho^2} [(1 - \rho)^{-1} - 1] \\ &= 2 \rho(1 - \rho)^{-3}. \end{aligned} $$

Also,

$$\displaystyle \begin{aligned} \sum_{n=0}^\infty n \rho_1^n = \rho_1 \sum_{n=0}^\infty n \rho_1^{n-1} = \rho_1 \sum_{n=0}^\infty (n+1) \rho_1^{n} = \rho_1 (1 - \rho_1)^{-2}. \end{aligned}$$

Hence,

$$\displaystyle \begin{aligned} L &= 2 A \rho_2 \times 2 \rho_1(1 - \rho_1)^{-3} \times (1 - \rho_1)^{-1} + 2A \rho_2 (1 - \rho_1)^{-2} \times \rho_1 (1 - \rho_1)^{-2} \\ &= 6A \rho_2 \rho_1 (1 - \rho_1)^{-4}. \end{aligned} $$

Substituting the value for A that we derived above, we find

$$\displaystyle \begin{aligned} L = 3 \frac{\rho_1}{1 - \rho_1}. \end{aligned}$$
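
Similarly, one can confirm L = 3ρ 1∕(1 − ρ 1) by truncating the double sum (a Python sketch with the same arbitrary ρ 1, ρ 2 as above):

```python
# Check L = 3 rho_1 / (1 - rho_1) against the truncated double sum of A (m + n) G(m, n).
rho1, rho2 = 0.6, 0.3                        # same arbitrary sample values as above
A = (1 - rho1) ** 3 / (2 * rho2)

L = 0.0
N = 500
for m in range(N):
    for n in range(N):
        G = ((m + 1) + (n + 1)) * rho1 ** (m + n) * rho2
        L += A * (m + n) * G
print(L, 3 * rho1 / (1 - rho1))              # both ~ 4.5
```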

Finally, we get the average time W that jobs of class 1 spend in the network: by Little’s law, W = L∕γ.

Without the gray job, the expected delay W′ of the white jobs would be the sum of the delays in two M/M/1 queues, i.e., W′ = L′∕γ where

$$\displaystyle \begin{aligned} L' = 2 \frac{\rho_1}{1 - \rho_1}. \end{aligned}$$

Hence, we find that

$$\displaystyle \begin{aligned} W = 1.5 W', \end{aligned}$$

so that using a hello message increases the average delay of the class 1 customers by 50%.

6.5 References

The time-reversal arguments are developed in Kelly (1979). That book also explains many other models that can be analyzed using that approach. See also Bremaud (2008), Lyons and Peres (2017), Neely (2010).