Abstract
This chapter provides the derivations of the results in the previous chapter. It also develops the theory of continuous-time Markov chains.
Section 6.1 proves the results on the spreading of rumors. Section 6.2 presents the theory of continuous-time Markov chains that are used to model queueing networks, among many other applications. That section explains the relationships between continuous-time and related discrete-time Markov chains. Sections 6.3 and 6.4 prove the results about product-form networks by using a time-reversal argument.
Application: Social Networks, Communication Networks
Topics: Continuous-Time Markov Chains, Product-Form Queueing Networks
6.1 Social Networks
We provide the proofs of the theorems in Sect. 5.1.
Theorem 6.1 (Spreading of a Message)
Let Z be the number of nodes that eventually receive the message.
(a) If μ < 1, then P(Z < ∞) = 1 and E(Z) < ∞;
(b) If μ > 1, then P(Z = ∞) > 0.
Proof
For part (a), let X n be the number of nodes that are n steps from the root. If X n = k, we can write X n+1 = Y 1 + ⋯ + Y k where Y j is the number of children of node j at level n. By assumption, E(Y j) = μ for all j. Hence,
\[ E[X_{n+1} \mid X_n = k] = E(Y_1) + \cdots + E(Y_k) = k\mu. \]
Hence, E[X n+1∣X n] = μX n. Taking expectations shows that E(X n+1) = μE(X n), n ≥ 0. Consequently,
\[ E(X_n) = \mu^n E(X_0) = \mu^n, \quad n \ge 0, \]
since X 0 = 1.
Now, the sequence Z n = X 0 + ⋯ + X n is nonnegative and increases to \(Z = \sum _{n=0}^ \infty X_n\). By the Monotone Convergence Theorem (MCT), it follows that E(Z n) → E(Z). But
\[ E(Z_n) = E(X_0) + \cdots + E(X_n) = 1 + \mu + \cdots + \mu^n \to \frac{1}{1-\mu}, \mbox{ as } n \to \infty. \]
Hence, E(Z) = 1∕(1 − μ) < ∞. Consequently, P(Z < ∞) = 1.
For part (b), one first observes that the theorem does not state that P(Z = ∞) = 1. For instance, assume that each node has three children with probability 0.5 and has no child otherwise. Then μ = 1.5 > 1 and P(Z = 1) = P(X 1 = 0) = 0.5, so that P(Z = ∞) ≤ 0.5 < 1. We define X n, Y j, and Z n as in the proof of part (a).
Let α n = P(X n > 0). Consider the X 1 children of the root. Since α n+1 is the probability that there is one survivor after n + 1 generations, it is the probability that at least one of the X 1 children of the root has a survivor after n generations. Hence,
\[ \alpha_{n+1} = E\left[1 - (1-\alpha_n)^{X_1}\right]. \]
Indeed, if X 1 = k, the probability that none of the k children of the root has a survivor after n generations is (1 − α n)k. Hence,
\[ \alpha_{n+1} = \sum_k P(X_1 = k)\left[1 - (1-\alpha_n)^k\right] = E\left[1 - (1-\alpha_n)^{X_1}\right] =: g(\alpha_n). \]
Also, α 0 = 1. As n →∞, one has α n → α ∗ = P(X n > 0 for all n). Figure 6.1 shows that α ∗ > 0. The key observations are that
\[ g(0) = 0, \quad g'(0) = E(X_1) = \mu > 1, \quad \mbox{and } g \mbox{ is nondecreasing and concave on } [0,1], \]
so that the figure is as drawn: the graph of g lies above the diagonal near 0 and the iterates α n+1 = g(α n), started from α 0 = 1, decrease to the largest fixed point α ∗ > 0 of g. □
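To illustrate the fixed-point argument, here is a minimal numerical sketch (not from the text) for the example above, in which each node has three children with probability 0.5 and none otherwise. Iterating α n+1 = g(α n) from α 0 = 1 converges to α ∗ = (3 −√5)∕2 ≈ 0.382:

```python
# Iterate alpha_{n+1} = g(alpha_n) = E[1 - (1 - alpha_n)^{X_1}] for the
# example offspring distribution: X_1 = 3 w.p. 0.5, and X_1 = 0 otherwise.

def g(alpha, p=0.5, k=3):
    # E[1 - (1 - alpha)^{X_1}] = p * (1 - (1 - alpha)^k) + (1 - p) * 0
    return p * (1 - (1 - alpha) ** k)

alpha = 1.0  # alpha_0 = 1
for _ in range(100):
    alpha = g(alpha)

print(f"survival probability alpha* ~ {alpha:.6f}")  # ~ 0.381966 = (3 - sqrt(5))/2
```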
Theorem 6.2 (Cascades)
Assume p k = p ∈ (0, 1] for all k ≥ 1. Then, all nodes turn red with probability at least equal to θ, where
\[ \theta = \prod_{n=1}^{\infty}\left(1 - (1-p)^n\right) > 0. \]
Proof
The probability that node n does not listen to anyone is a n = (1 − p)n. Let X be the index of the first node that does not listen to anyone. Then
\[ P(X > n) = \prod_{m=1}^{n}(1 - a_m), \]
since the nodes choose whom to listen to independently. Now,
\[ P(X = \infty) = \lim_{n \to \infty}\prod_{m=1}^{n}(1 - a_m) = \prod_{m=1}^{\infty}\left(1 - (1-p)^m\right) = \theta, \]
and this infinite product is positive because \(\sum_m a_m < \infty\).
Thus, with probability at least θ, every node listens to at least one previous node. When that is the case, all the nodes turn red. To see this, suppose instead that some node stays blue and let n be the first such node. That is not possible, since node n listened to at least one previous node and all previous nodes are red. □
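As a numerical aside (a sketch under the reconstruction of θ above, not from the text), the infinite product converges quickly because its factors approach 1 geometrically, so a truncated product already gives θ accurately:

```python
# Approximate theta = prod_{n >= 1} (1 - (1 - p)^n) by truncating the product.
# The omitted factors differ from 1 by at most (1 - p)^n, so the truncation
# error vanishes geometrically in n_max.

def theta(p, n_max=1000):
    prod = 1.0
    for n in range(1, n_max + 1):
        prod *= 1.0 - (1.0 - p) ** n
    return prod

for p in (0.1, 0.3, 0.5):
    print(f"p = {p}: theta ~ {theta(p):.4f}")
```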
6.2 Continuous-Time Markov Chains
Our goal is to understand networks where packets travel from node to node until they reach their destination. In particular, we want to study the delay of packets from source to destination and the backlog in the nodes.
It turns out that the analysis of such systems is much easier in continuous time than in discrete time. To carry out such analysis, we have to introduce continuous-time Markov chains. We do this on a few simple examples.
6.2.1 Two-State Markov Chain
Figure 6.2 illustrates a random process {X t, t ≥ 0} that takes values in {0, 1}. A random process is a collection of random variables indexed by t ≥ 0. Saying that such a random process is defined means that one can calculate the probability of the event \(\{X_{t_1} = x_1, X_{t_2} = x_2, \ldots , X_{t_n} = x_n\}\) for any n ≥ 1, any 0 ≤ t 1 ≤⋯ ≤ t n, and x 1, …, x n ∈{0, 1}. We explain below how one could calculate such a probability.
We call X t the state of the process at time t. The possible values {0, 1} are also called states. The state X t evolves according to rules characterized by two positive numbers λ and μ. As Fig. 6.2 shows, if X 0 = 0, the state remains equal to zero for a random time T 0 that is exponentially distributed with parameter λ, thus with mean 1∕λ. The state X t then jumps to 1 where it stays for a random time T 1 that is exponentially distributed with rate μ, independent of T 0, and so on. The definition is similar if X 0 = 1. In that case, X t keeps the value 1 for an exponentially distributed time with rate μ, then jumps to 0, etc.
Thus, the pdf of T 0 is
\[ f_{T_0}(t) = \lambda e^{-\lambda t}, \quad t \ge 0. \]
In particular,
\[ P(T_0 \le \epsilon) = 1 - e^{-\lambda\epsilon} \approx \lambda\epsilon, \quad \mbox{for } 0 < \epsilon \ll 1. \]
Throughout this chapter, the symbol ≈ means “up to a quantity negligible compared to 𝜖.” It is shown in Theorem 15.3 that an exponentially distributed random variable is memoryless. That is, if T is exponentially distributed, then
\[ P(T > t + s \mid T > s) = P(T > t), \quad s, t \ge 0. \]
The memoryless property and the independence of the exponential times T k imply that {X t, t ≥ 0} starts afresh from X s at time s. Figure 6.3 illustrates that property. Mathematically, it says that given {X t, t ≤ s} with X s = k, the process {X s+t, t ≥ 0} has the same properties as {X t, t ≥ 0} given that X 0 = k, for k = 0, 1 and for any s ≥ 0. Indeed, if X s = 0, then the residual time that X t remains in 0 is exponentially distributed with rate λ and is independent of what happened before time s, because the time in 0 is memoryless and independent of the previous times in 0 and 1. This property is written as
\[ P[\{X_{s+t}, t \ge 0\} \in A \mid X_u, u \le s; X_s = k] = P[\{X_t, t \ge 0\} \in A \mid X_0 = k], \]
for k = 0, 1, for all s ≥ 0, and for all sets A of possible trajectories. A generic set A of trajectories is
\[ A = \{x \in C_+ \mid x(t_1) = i_1, \ldots, x(t_n) = i_n\}, \]
for given 0 < t 1 < ⋯ < t n and i 1, …, i n ∈{0, 1}. Here, C + is the set of right-continuous functions of t ≥ 0 that take values in {0, 1}.
This property is the continuous-time version of the Markov property of discrete-time Markov chains. One says that the process X t satisfies the Markov property, and one calls {X t, t ≥ 0} a continuous-time Markov chain (CTMC).
For instance,
\[ P[X_{s+t} = 1 \mid X_u, u \le s; X_s = 0] = P[X_t = 1 \mid X_0 = 0]. \]
The Markov property generalizes to situations where s is replaced by a random time τ that is defined by a causal rule, i.e., a rule that does not look ahead. For instance, as in Fig. 6.4, τ can be the second time that X t visits state 0. Or τ could be the first time that it visits state 0 after having spent at least 3 time units in state 1. The property does not extend to non-causal times, such as one time unit before X t visits state 1. Random times τ defined by causal rules are called stopping times. This more general property is called the strong Markov property. To prove this property, one conditions on the value s of τ and uses the fact that the future evolution does not depend on this value, since the event {τ = s} depends only on {X t, t ≤ s}.
For 0 < 𝜖 ≪ 1 one has
\[ P[X_{t+\epsilon} = 1 \mid X_t = 0] \approx \lambda\epsilon. \]
Indeed, the process jumps from 0 to 1 in 𝜖 time units if the exponential time in 0 is less than 𝜖, which has probability approximately λ𝜖.
Similarly,
\[ P[X_{t+\epsilon} = 0 \mid X_t = 1] \approx \mu\epsilon. \]
We say that the transition rate from 0 to 1 is equal to λ and that from 1 to 0 is equal to μ to indicate that the probability of a transition from 0 to 1 in 𝜖 units of time is approximately λ𝜖 and that from 1 to 0 is approximately μ𝜖.
Figure 6.5 illustrates these transition rates. This figure is called the state transition diagram.
The previous two identities imply that
\[ \pi_{t+\epsilon}(1) \approx \pi_t(0)\lambda\epsilon + \pi_t(1)(1 - \mu\epsilon), \quad \mbox{where } \pi_t(i) := P(X_t = i). \]
Also, similarly, one finds that
\[ \pi_{t+\epsilon}(0) \approx \pi_t(0)(1 - \lambda\epsilon) + \pi_t(1)\mu\epsilon. \]
We can write these identities in a convenient matrix notation as follows. For t ≥ 0, one defines the row vector π t as
\[ \pi_t = [\pi_t(0), \pi_t(1)]. \]
One also defines the transition rate matrix Q as follows:
\[ Q = \begin{bmatrix} -\lambda & \lambda \\ \mu & -\mu \end{bmatrix}. \]
With that notation, the previous identities can be written as
\[ \pi_{t+\epsilon} \approx \pi_t(I + Q\epsilon), \]
where I is the identity matrix. Subtracting π t from both sides, dividing by 𝜖, and letting 𝜖 → 0, we find
\[ \frac{d}{dt}\pi_t = \pi_t Q. \qquad\qquad (6.1) \]
By analogy with the scalar equation dx t∕dt = ax t whose solution is \(x_t = x_0 \exp \{at\}\), we conclude that
\[ \pi_t = \pi_0 e^{Qt}, \quad t \ge 0, \]
where
\[ e^{Qt} := \sum_{n=0}^{\infty} \frac{(Qt)^n}{n!}. \]
Note that
\[ \frac{d}{dt} e^{Qt} = Q\, e^{Qt} = e^{Qt} Q, \]
as one verifies by differentiating the series term by term, so that \(\pi_t = \pi_0 e^{Qt}\) indeed solves (6.1).
Observe also that π t = π for all t ≥ 0 if and only if π 0 = π and
\[ \pi Q = 0. \]
Indeed, if π t = π for all t, then (6.1) implies that \(0 = \frac {d}{dt} \pi _t =\pi _t Q = \pi Q\). Conversely, if π 0 = π with πQ = 0, then
\[ \pi_t = \pi e^{Qt} = \pi \sum_{n=0}^{\infty}\frac{(Qt)^n}{n!} = \pi, \]
since πQ n = 0 for every n ≥ 1.
These equations πQ = 0 are called the balance equations. They are
\[ [\pi(0), \pi(1)] \begin{bmatrix} -\lambda & \lambda \\ \mu & -\mu \end{bmatrix} = [0, 0], \]
i.e.,
\[ -\lambda\pi(0) + \mu\pi(1) = 0 \quad \mbox{and} \quad \lambda\pi(0) - \mu\pi(1) = 0. \]
These two equations are identical. To determine π, we use the fact that π(0) + π(1) = 1. Combined with the previous identity, we find
\[ \pi = \left[\frac{\mu}{\lambda+\mu}, \frac{\lambda}{\lambda+\mu}\right]. \]
The identity π t+𝜖 ≈ π t(I + Q𝜖) shows that one can view {X n𝜖, n = 0, 1, …} as a discrete-time Markov chain with transition matrix P = I + Q𝜖. Figure 6.6 shows the transition diagram that corresponds to this transition matrix. The invariant distribution for P is such that πP = π, i.e., π(I + Q𝜖) = π, so that πQ = 0, not surprisingly.
Note that this discrete-time Markov chain is aperiodic because its states have self-loops. Thus, we expect that
\[ P(X_{n\epsilon} = i) \to \pi(i), \mbox{ as } n \to \infty, \mbox{ for } i = 0, 1. \]
Consequently, we expect that, in continuous time,
\[ \pi_t(i) \to \pi(i), \mbox{ as } t \to \infty, \mbox{ for } i = 0, 1. \]
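As a sanity check, one can evaluate π t = π 0e^{Qt} numerically and watch it converge to [μ∕(λ + μ), λ∕(λ + μ)]. The following sketch (illustrative only; the rates λ = 1 and μ = 2 are arbitrary choices) uses the matrix exponential from SciPy:

```python
import numpy as np
from scipy.linalg import expm  # matrix exponential

lam, mu = 1.0, 2.0                    # arbitrary illustrative rates
Q = np.array([[-lam, lam],
              [mu, -mu]])

pi0 = np.array([1.0, 0.0])            # start in state 0
for t in (0.1, 1.0, 10.0):
    pi_t = pi0 @ expm(Q * t)          # pi_t = pi_0 exp(Qt)
    print(f"t = {t:4.1f}: pi_t = {pi_t}")

print("invariant:", np.array([mu, lam]) / (lam + mu))  # [2/3, 1/3]
```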
6.2.2 Three-State Markov Chain
The previous Markov chain alternates between the states 0 and 1. More general Markov chains visit states in a random order. We explain that feature in our next example with 3 states. Fortunately, this example suffices to illustrate the general case. We do not have to look at Markov chains with 4, 5, … states to describe the general model.
In the example shown in Fig. 6.7, the rules of evolution are characterized by positive numbers q(0, 1), q(0, 2), q(1, 2), and q(2, 0). One also defines q 0, q 1, q 2, Γ(0, 1), and Γ(0, 2) as in the figure.
If X 0 = 0, the state X t remains equal to 0 for some random time T 0 that is exponentially distributed with rate q 0. At time T 0, the state jumps to 1 with probability Γ(0, 1) or to state 2 otherwise, with probability Γ(0, 2). If X t jumps to 1, it stays there for an exponentially distributed time T 1 with rate q 1 that is independent of T 0. More generally, when X t enters state k, it stays there for a random time that is exponentially distributed with rate q k that is independent of the past evolution. From this definition, it should be clear that the process X t satisfies the Markov property.
Define π t = [π t(0), π t(1), π t(2)] where π t(k) = P(X t = k) for k = 0, 1, 2. One has, for 0 < 𝜖 ≪ 1,
\[ P[X_{t+\epsilon} = 1 \mid X_t = 0] \approx (q_0\epsilon)\Gamma(0,1) = q(0,1)\epsilon. \]
Indeed, the process jumps from 0 to 1 in 𝜖 time units if the exponential time with rate q 0 is less than 𝜖 and if the process then jumps to 1 instead of jumping to 2.
Similarly,
\[ P[X_{t+\epsilon} = 2 \mid X_t = 0] \approx q(0,2)\epsilon. \]
Also,
\[ P[X_{t+\epsilon} = 1 \mid X_t = 1] \approx 1 - q_1\epsilon, \]
since this is approximately the probability that the exponential time with rate q 1 is larger than 𝜖. Moreover,
\[ P[X_{t+\epsilon} = 1 \mid X_t = 2] \approx 0, \]
because the probability that both the exponential time with rate q 2 in state 2 and the exponential time with rate q 0 in state 0 are less than 𝜖 is roughly (q 2 𝜖) × (q 0 𝜖), and this is negligible compared to 𝜖.
These observations imply that
\[ \pi_{t+\epsilon}(1) \approx \pi_t(0)q(0,1)\epsilon + \pi_t(1)(1 - q_1\epsilon). \]
Proceeding in a similar way shows that
\[ \pi_{t+\epsilon}(0) \approx \pi_t(0)(1 - q_0\epsilon) + \pi_t(2)q(2,0)\epsilon \]
and
\[ \pi_{t+\epsilon}(2) \approx \pi_t(0)q(0,2)\epsilon + \pi_t(1)q(1,2)\epsilon + \pi_t(2)(1 - q_2\epsilon). \]
Similarly to the two-state example, let us define the rate matrix Q as follows:
\[ Q = \begin{bmatrix} -q_0 & q(0,1) & q(0,2) \\ 0 & -q_1 & q(1,2) \\ q(2,0) & 0 & -q_2 \end{bmatrix}. \]
The previous identities can then be written as follows:
\[ \pi_{t+\epsilon} \approx \pi_t(I + Q\epsilon). \]
Subtracting π t from both sides, dividing by 𝜖, and letting 𝜖 → 0 then shows that
\[ \frac{d}{dt}\pi_t = \pi_t Q. \]
As before, the solution of this equation is
\[ \pi_t = \pi_0 e^{Qt}, \quad t \ge 0. \]
The distribution π is invariant if and only if
\[ \pi Q = 0. \]
Once again, we note that {X n𝜖, n = 0, 1, …} is approximately a discrete-time Markov chain with transition matrix P = I + Q𝜖 shown in Fig. 6.8. This Markov chain is aperiodic, and we conclude that
\[ P(X_{n\epsilon} = i) \to \pi(i), \mbox{ as } n \to \infty. \]
Thus, we can expect that
\[ \pi_t(i) \to \pi(i), \mbox{ as } t \to \infty, \mbox{ for } i = 0, 1, 2. \]
Also, since X n𝜖 is irreducible, the long-term fraction of time that it spends in the different states converges to π, and we can then expect the same for X t.
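The following sketch (illustrative, with arbitrary rates q(0, 1) = 1, q(0, 2) = 2, q(1, 2) = 1.5, and q(2, 0) = 0.5) simulates the chain by alternating exponential holding times and jumps, and compares the empirical fractions of time with the solution of πQ = 0:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example rates for the chain of Fig. 6.7 (illustrative values).
q01, q02, q12, q20 = 1.0, 2.0, 1.5, 0.5
q = [q01 + q02, q12, q20]            # total rates out of states 0, 1, 2

def next_state(i):
    if i == 0:  # jump to 1 w.p. Gamma(0,1) = q01/q0, else to 2
        return 1 if rng.random() < q01 / q[0] else 2
    return {1: 2, 2: 0}[i]           # states 1 and 2 have a single successor

# Simulate: exponential holding time with rate q_i, then jump.
time_in_state = np.zeros(3)
i, t, T = 0, 0.0, 100_000.0
while t < T:
    hold = rng.exponential(1 / q[i])
    time_in_state[i] += hold
    t += hold
    i = next_state(i)

print("empirical fractions:", time_in_state / time_in_state.sum())

# Compare with pi solving pi Q = 0, sum(pi) = 1.
Q = np.array([[-q[0], q01, q02],
              [0.0, -q[1], q12],
              [q20, 0.0, -q[2]]])
A = np.vstack([Q.T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print("invariant pi:      ", pi)
```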
6.2.3 General Case
Let \(\mathcal {X}\) be a countable or finite set. The process {X t, t ≥ 0} is defined as follows. One is given a probability distribution π on \(\mathcal {X}\) and a rate matrix \(Q = \{q(i, j), i, j \in \mathcal {X}\}\).
By definition, Q is such that
\[ q(i,j) \ge 0 \ \mbox{ for all } j \ne i, \quad \mbox{and} \quad \sum_j q(i,j) = 0 \ \mbox{ for all } i. \]
Definition 6.1 (Continuous-Time Markov Chain)
A continuous-time Markov chain with initial distribution π and rate matrix Q is a process {X t, t ≥ 0} such that P(X 0 = i) = π(i) for all \(i \in \mathcal{X}\). Also,
\[ P[X_{t+\epsilon} = j \mid X_t = i, X_u, u \le t] = 1\{i = j\} + q(i,j)\epsilon + o(\epsilon). \]
◇
This definition means that the process jumps from i to j ≠ i with probability q(i, j)𝜖 in 𝜖 ≪ 1 time units. Thus, q(i, j) is the probability of jumping from i to j, per unit of time. Note that the sum of these expressions over all j gives 1, as it should.
One construction of this process is as follows. Say that X t = i. One then chooses a random time τ that is exponentially distributed with rate q i := −q(i, i). At time t + τ, the process jumps to state j ≠ i with probability Γ(i, j) = q(i, j)∕q i (Fig. 6.9).
Thus, if X t = i, the probability that X t+𝜖 = j ≠ i is the probability that the process jumps in (t, t + 𝜖), which is approximately q i 𝜖, times the probability that it then jumps to j, which is Γ(i, j). Hence,
\[ P[X_{t+\epsilon} = j \mid X_t = i] \approx (q_i\epsilon)\Gamma(i,j) = q(i,j)\epsilon, \quad j \ne i, \]
up to o(𝜖). Thus, the construction yields the correct transition probabilities.
As we observed in the examples,
\[ \frac{d}{dt}\pi_t = \pi_t Q, \]
so that
\[ \pi_t = \pi_0 e^{Qt}, \quad t \ge 0. \]
Moreover, a distribution π is invariant if and only if it solves the balance equations
\[ \pi Q = 0. \]
These equations, state by state, say that
\[ \pi(i)\sum_{j \ne i} q(i,j) = \sum_{j \ne i}\pi(j)q(j,i), \quad i \in \mathcal{X}. \]
These equations express the equality of the rate of leaving a state and the rate of entering that state.
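For a finite state space, the balance equations together with the normalization ∑ iπ(i) = 1 form a linear system. Here is a minimal helper (a sketch, not from the text) that solves it:

```python
import numpy as np

def invariant_distribution(Q):
    """Solve pi Q = 0 with sum(pi) = 1 for a finite rate matrix Q."""
    n = Q.shape[0]
    # Stack the balance equations (Q^T pi^T = 0) with the normalization row.
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Example: the two-state chain with rates lambda = 1, mu = 2.
Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
print(invariant_distribution(Q))  # [2/3, 1/3] = [mu, lambda] / (lambda + mu)
```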
Define
\[ P_t := e^{Qt}, \quad \mbox{so that } P_t(i,j) = P[X_{s+t} = j \mid X_s = i] \mbox{ for } s, t \ge 0. \]
The Markov property implies that
\[ P(X_{t_1} = i_1, X_{t_2} = i_2, \ldots, X_{t_n} = i_n) = P(X_{t_1} = i_1)\, P_{t_2 - t_1}(i_1, i_2) \cdots P_{t_n - t_{n-1}}(i_{n-1}, i_n) \]
for all \(i_1, \ldots , i_n \in \mathcal {X}\) and all 0 < t 1 < ⋯ < t n.
Moreover, this identity implies the Markov property. Indeed, if it holds, one has
\[ P[X_{t_n} = i_n \mid X_{t_1} = i_1, \ldots, X_{t_{n-1}} = i_{n-1}] = \frac{P(X_{t_1} = i_1, \ldots, X_{t_n} = i_n)}{P(X_{t_1} = i_1, \ldots, X_{t_{n-1}} = i_{n-1})} = P_{t_n - t_{n-1}}(i_{n-1}, i_n). \]
Hence,
\[ P[X_{t_n} = i_n \mid X_{t_1} = i_1, \ldots, X_{t_{n-1}} = i_{n-1}] = P[X_{t_n} = i_n \mid X_{t_{n-1}} = i_{n-1}], \]
which is the Markov property.
If X t has the invariant distribution, one has
\[ P(X_{t_1} = i_1, \ldots, X_{t_n} = i_n) = \pi(i_1)\, P_{t_2 - t_1}(i_1, i_2) \cdots P_{t_n - t_{n-1}}(i_{n-1}, i_n) \]
for all \(i_1, \ldots , i_n \in \mathcal {X}\) and all 0 < t 1 < ⋯ < t n.
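For a finite chain, these finite-dimensional distributions are easy to evaluate numerically. A short sketch (with the arbitrary two-state rates used earlier) computes the joint law at two times under the invariant distribution:

```python
import numpy as np
from scipy.linalg import expm

# Joint distribution at two times for the stationary two-state chain:
# P(X_{t1} = i1, X_{t2} = i2) = pi(i1) * P_{t2 - t1}(i1, i2).

lam, mu = 1.0, 2.0
Q = np.array([[-lam, lam],
              [mu, -mu]])
pi = np.array([mu, lam]) / (lam + mu)

t1, t2 = 1.0, 3.0
P = expm(Q * (t2 - t1))       # P_{t2 - t1} = exp(Q (t2 - t1))
joint = pi[:, None] * P       # joint[i1, i2]
print(joint, joint.sum())     # the entries sum to 1
```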
Here is the result that corresponds to Theorem 15.1. We define irreducibility, transience, and null and positive recurrence as in discrete time. There is no notion of periodicity in continuous time.
Theorem 6.1 (Big Theorem for Continuous-Time Markov Chains)
Consider a continuous-time Markov chain.
(a) If the Markov chain is irreducible, the states are either all transient, all positive recurrent, or all null recurrent. We then say that the Markov chain is transient, positive recurrent, or null recurrent, respectively.
(b) If the Markov chain is positive recurrent, it has a unique invariant distribution π, and π(i) is the long-term fraction of time that X t is equal to i. Moreover, the probability π t(i) that the Markov chain X t is in state i converges to π(i).
(c) If the Markov chain is not positive recurrent, it does not have an invariant distribution, and the fraction of time that it spends in any state goes to zero.
\({\blacksquare }\)
6.2.4 Uniformization
We saw earlier that a CTMC can be approximated by a discrete-time Markov chain that has a time step 𝜖 ≪ 1. There are two other DTMCs that have a close relationship with the CTMC: the jump chain and the uniformized chain. We explain these chains for the CTMC X t in Fig. 6.7.
The jump chain is X t observed when it jumps. As Fig. 6.7 shows, this DTMC has a transition matrix equal to Γ, where
\[ \Gamma(i,j) = \frac{q(i,j)}{q_i} \ \mbox{ for } j \ne i, \quad \mbox{and} \quad \Gamma(i,i) = 0. \]
Let ν be the invariant distribution of this jump chain. That is, ν = νΓ. Since ν(i) is the long-term fraction of time that the jump chain is in state i, and since the CTMC X t spends an average time 1∕q i in state i whenever it visits that state, the fraction of time that X t spends in state i should be proportional to ν(i)∕q i. That is, one expects
\[ \pi(i) = A\,\frac{\nu(i)}{q_i}, \quad i \in \mathcal{X}, \]
for some constant A. That is, one should have
\[ \sum_i \frac{\nu(i)}{q_i}\, q(i,j) = 0, \quad \mbox{for all } j. \]
To verify that equality, we observe that
\[ \sum_i \frac{\nu(i)}{q_i}\, q(i,j) = \sum_{i \ne j}\nu(i)\Gamma(i,j) + \frac{\nu(j)}{q_j}\, q(j,j) = \nu(j) - \nu(j) = 0. \]
We used the fact that νΓ = ν and q(i, i) = −q i.
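The relation π(i) = Aν(i)∕q i is easy to check numerically. The sketch below (reusing the illustrative three-state rates from above) computes ν from ν = νΓ, rescales by 1∕q i, normalizes, and verifies that the result solves πQ = 0:

```python
import numpy as np

# Reusing the three-state example rates from above (illustrative values).
q01, q02, q12, q20 = 1.0, 2.0, 1.5, 0.5
q = np.array([q01 + q02, q12, q20])

# Jump chain transition matrix: Gamma(i,j) = q(i,j)/q_i, Gamma(i,i) = 0.
Gamma = np.array([[0.0, q01 / q[0], q02 / q[0]],
                  [0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0]])

# nu = nu Gamma: invariant distribution of the jump chain.
A = np.vstack([(Gamma - np.eye(3)).T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
nu, *_ = np.linalg.lstsq(A, b, rcond=None)

# pi(i) proportional to nu(i)/q_i should solve pi Q = 0.
pi = (nu / q) / (nu / q).sum()
Q = np.array([[-q[0], q01, q02],
              [0.0, -q[1], q12],
              [q20, 0.0, -q[2]]])
print("pi:", pi, " pi Q:", pi @ Q)  # pi Q ~ [0, 0, 0]
```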
The uniformized chain is not the jump chain. It is a discrete-time Markov chain obtained from the CTMC as follows. Let λ ≥ q i for all i. The rate at which X t changes state is q i when it is in state i. Let us add a dummy jump from i to i with rate λ − q i. The rate of jumps, including these dummy jumps, of this new Markov chain Y t is now constant and equal to λ.
The transition matrix P of Y t is such that
\[ P(i,j) = \frac{q(i,j)}{\lambda} \ \mbox{ for } j \ne i, \quad \mbox{and} \quad P(i,i) = 1 - \frac{q_i}{\lambda}. \]
To see this, assume that Y t = i. The next jump will occur with rate λ. With probability (λ − q i)∕λ, it is a dummy jump from i to i. With probability q i∕λ it is an actual jump where Y t jumps to j ≠ i with probability Γ(i, j). Hence, Y t jumps from i to i with probability (λ − q i)∕λ and from i to j ≠ i with probability (q i∕λ)Γ(i, j) = q(i, j)∕λ.
Note that
\[ P = I + \frac{1}{\lambda}\,Q, \]
where I is the identity matrix.
Now, define Z n to be the jump chain of Y t, i.e., the Markov chain with transition matrix P. Since the jumps of Y t occur at rate λ, independently of the value of the state Y t, we can simulate Y t as follows. Let N t be a Poisson process with rate λ. The jump times {t 1, t 2, …} of N t will be the jump times of Y t. The successive values of Y t are those of Z n. Formally,
\[ Y_t = Z_{N_t}, \quad t \ge 0. \]
That is, if N t = n, then we define Y t = Z n. Since the CTMC Y t spends 1∕λ on average between jumps, the invariant distribution of Y t should be the same as that of X t, i.e., π. To verify this, we check that πP = π, i.e., that
\[ \pi\Big(I + \frac{1}{\lambda}\,Q\Big) = \pi. \]
That identity holds since πQ = 0. Thus, the DTMC Z n has the same invariant distribution as X t. Observe that Z n is not the same as the jump chain of X t. Also, it is not a discrete-time approximation of X t. This DTMC shows that a CTMC can be seen as a DTMC where one replaces the constant time steps by i.i.d. exponentially distributed time steps between the jumps.
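Uniformization also gives a numerically stable way to compute transient distributions: since \(e^{Qt} = \sum_n e^{-\lambda t}\frac{(\lambda t)^n}{n!}P^n\), one can average powers of P with Poisson weights. Here is a minimal sketch (not from the text; the helper name and the two-state rate matrix are our illustrative choices):

```python
import numpy as np

def uniformized_pi_t(pi0, Q, t, tol=1e-12):
    """Compute pi_t = pi_0 exp(Qt) via uniformization:
    exp(Qt) = sum_n Poisson(n; lam*t) P^n, with P = I + Q/lam."""
    lam = float(np.max(-np.diag(Q)))   # lam >= q_i for all i
    P = np.eye(Q.shape[0]) + Q / lam
    term = pi0.astype(float)           # pi_0 P^n, starting at n = 0
    weight = np.exp(-lam * t)          # Poisson(0; lam*t)
    total = weight * term
    mass, n = weight, 0
    while mass < 1.0 - tol:            # stop once the Poisson mass is exhausted
        n += 1
        term = term @ P
        weight *= lam * t / n          # Poisson(n; lam*t)
        total += weight * term
        mass += weight
    return total

Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
print(uniformized_pi_t(np.array([1.0, 0.0]), Q, t=10.0))  # ~ [2/3, 1/3]
```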
6.2.5 Time Reversal
As a preparation for our study of networks of queues, we note the following result.
Theorem 6.2 (Kelly’s Lemma)
Let Q be the rate matrix of a Markov chain on \(\mathcal {X}\). Let also \(\tilde Q\) be another rate matrix on \(\mathcal {X}\). Assume that π is a distribution on \(\mathcal {X}\) and that
\[ \pi(i)\,q(i,j) = \pi(j)\,\tilde q(j,i), \quad \mbox{for all } i \ne j, \]
and
\[ \sum_{j \ne i} q(i,j) = \sum_{j \ne i} \tilde q(i,j), \quad \mbox{for all } i. \]
Then πQ = 0. \({\blacksquare }\)
Proof
We have, for every state i,
\[ \sum_j \pi(j)q(j,i) = \sum_{j \ne i}\pi(i)\tilde q(i,j) + \pi(i)q(i,i) = \pi(i)\Big(\sum_{j \ne i}\tilde q(i,j) - \sum_{j \ne i} q(i,j)\Big) = 0, \]
so that πQ = 0. □
The following result explains the meaning of \(\tilde Q\) in the previous theorem. We state it without proof.
Theorem 6.3
Assume that X t has the invariant distribution π. Then X t reversed in time is a Markov chain with rate matrix \(\tilde Q\) given by
\[ \tilde q(i,j) = \frac{\pi(j)\,q(j,i)}{\pi(i)}, \quad i \ne j. \]
\({\blacksquare }\)
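The formula for \(\tilde Q\) is easy to check numerically: with π solving πQ = 0, the matrix with entries π(j)q(j, i)∕π(i) has rows summing to zero and admits the same invariant distribution, as Kelly's lemma predicts. A small sketch with arbitrary rates:

```python
import numpy as np

# Illustrative sketch (the rates are arbitrary): compute the reversed rate
# matrix q~(i,j) = pi(j) q(j,i) / pi(i) and check it is a valid rate matrix.

Q = np.array([[-3.0, 1.0, 2.0],
              [0.0, -1.5, 1.5],
              [0.5, 0.0, -0.5]])

# Invariant distribution: pi Q = 0, sum(pi) = 1.
A = np.vstack([Q.T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

Qtilde = (pi[None, :] * Q.T) / pi[:, None]  # q~(i,j) = pi(j) q(j,i) / pi(i)

# Rows of Qtilde sum to zero precisely because pi Q = 0, and pi is also
# invariant for Qtilde.
print(np.allclose(Qtilde.sum(axis=1), 0.0), np.allclose(pi @ Qtilde, 0.0))
```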
6.3 Product-Form Networks
Theorem 6.4 (Invariant Distribution of Network)
Assume λ k < μ k and let ρ k = λ k∕μ k, for k = 1, 2, 3. Then the Markov chain X t has a unique invariant distribution π that is given by the product form
\[ \pi(x_1, x_2, x_3) = (1-\rho_1)\rho_1^{x_1}\,(1-\rho_2)\rho_2^{x_2}\,(1-\rho_3)\prod_{m=1}^{n}\rho_3\,p(z_m), \]
where x 3 = (z 1, …, z n) is the ordered list of the classes of the n customers in the third queue, p(1) = λ 1∕(λ 1 + λ 2), and p(2) = λ 2∕(λ 1 + λ 2).
Proof
Figure 6.10 shows a guess for the time-reversal of the network.
Let Q be the rate matrix of the top network and \(\tilde Q\) that of the bottom one. Let also π be as stated in the theorem. We show that \(\pi , Q, \tilde Q\) satisfy the conditions of Kelly’s Lemma.
For instance, we verify the condition
\[ \pi(x)\,q(x, x') = \pi(x')\,\tilde q(x', x) \]
for a typical pair of states x and x′ that differ by the move of a single customer. Looking at the figure, we can read off the rates q(x, x′) in the top network and \(\tilde q(x', x)\) in the bottom network for such a pair. Substituting the expression for π given in the theorem and simplifying, one checks that the condition is satisfied. A similar argument shows that the conditions of Kelly's lemma hold for all pairs of states, and that the total transition rates out of each state agree in the two networks. □
6.4 Proof of Theorem 5.7
The first step in using the theorem is to solve the flow conservation equations. Let us call class 1 that of the white jobs and class 2 that of the gray job. Then we see that the rates
\[ \lambda_1 = \lambda_2 = \gamma + \alpha, \]
where γ is the arrival rate of the white jobs and α is the rate at which the gray job circulates through the queues,
solve the flow conservation equations for any α > 0. We have to assume γ < μ for the services to be able to keep up with the white jobs. With this assumption, we can choose α small enough so that \(\lambda _1 = \lambda _2 = \lambda := \gamma + \alpha < \min \{\mu _1, \mu _2\}.\)
The second step is to use the theorem to obtain the invariant distribution. It is
\[ \pi(x_1, x_2) = A\, f(x_1)\, f(x_2), \]
with
\[ f(x_i) = \rho_1^{n_1(x_i)}\rho_2^{n_2(x_i)}, \]
where ρ 1 = γ∕μ, ρ 2 = α∕μ, and n c(x i) is the number of jobs of class c in x i, for c = 1, 2. To calculate A, we note that there are n + 1 states x i with n class 1 jobs and 1 class 2 job, and 1 state x i with n class 1 jobs and no class 2 job. Indeed, the class 2 customer can be in any of n + 1 positions in the queue with the n customers of class 1.
Also, all the possible pairs (x 1, x 2) must have one class 2 customer either in queue 1 or in queue 2. Thus,
\[ 1 = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} A\, G(m,n), \]
where
\[ G(m,n) = (m+1)\rho_1^m\rho_2 \cdot \rho_1^n + \rho_1^m \cdot (n+1)\rho_1^n\rho_2. \]
In this expression, the first term corresponds to the states with m class 1 customers and one class 2 customer in queue 1 and n customers of class 1 in queue 2; the second term corresponds to the states with m customers of class 1 in queue 1, and n customers of class 1 and one customer of class 2 in queue 2. Thus, AG(m, n) is the probability that there are m customers of class 1 in the first queue and n customers of class 1 in the second queue.
Hence,
\[ \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} G(m,n) = 2\rho_2\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}(m+1)\rho_1^{m+n}, \]
by symmetry of the two terms. Thus,
\[ 1 = 2A\rho_2\left(\sum_{m=0}^{\infty}(m+1)\rho_1^m\right)\left(\sum_{n=0}^{\infty}\rho_1^n\right). \]
To compute the sum, we use the following identities:
\[ \sum_{m=0}^{\infty}\rho^m = \frac{1}{1-\rho} \]
and
\[ \sum_{m=0}^{\infty}(m+1)\rho^m = \frac{1}{(1-\rho)^2}, \quad \mbox{for } 0 \le \rho < 1. \]
Thus, one has
\[ 1 = 2A\rho_2\,\frac{1}{(1-\rho_1)^2}\cdot\frac{1}{1-\rho_1} = \frac{2A\rho_2}{(1-\rho_1)^3}, \]
so that
\[ A = \frac{(1-\rho_1)^3}{2\rho_2}. \]
Third, we calculate the expected number L of jobs of class 1 in the two queues. One has
\[ L = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}(m+n)\,A\,G(m,n) = 2A\rho_2\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}(m+n)(m+1)\rho_1^{m+n}, \]
where the last identity follows from the symmetry of the two terms. Thus,
\[ L = 2A\rho_2\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\left[m(m+1) + n(m+1)\right]\rho_1^m\rho_1^n. \]
To calculate the sums, we use the fact that
\[ \sum_{m=0}^{\infty} m(m+1)\rho^m = \frac{2\rho}{(1-\rho)^3}. \]
Also,
\[ \sum_{n=0}^{\infty} n\rho^n = \frac{\rho}{(1-\rho)^2}. \]
Hence,
\[ L = 2A\rho_2\left[\frac{2\rho_1}{(1-\rho_1)^3}\cdot\frac{1}{1-\rho_1} + \frac{1}{(1-\rho_1)^2}\cdot\frac{\rho_1}{(1-\rho_1)^2}\right] = \frac{6A\rho_2\rho_1}{(1-\rho_1)^4}. \]
Substituting the value for A that we derived above, we find
\[ L = \frac{3\rho_1}{1-\rho_1}. \]
Finally, we get the average time W that jobs of class 1 spend in the network: W = L∕γ = 3∕(μ − γ), since ρ 1 = γ∕μ.
Without the gray job, the expected delay W′ of the white jobs would be the sum of the delays in two M/M/1 queues, i.e., W′ = L′∕γ where
\[ L' = \frac{2\rho_1}{1-\rho_1}. \]
Hence, we find that
\[ W = \frac{3}{2}\,W', \]
so that using a hello message increases the average delay of the class 1 customers by 50%.
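The computation above is easy to verify numerically. The following sketch (with assumed values ρ 1 = 0.4 and ρ 2 = 0.1) truncates the double sums and compares them with the closed forms for A and L:

```python
# Numerical sanity check of the computation above (illustrative values).
# Truncate the double sums over (m, n) and compare with the closed forms.

rho1, rho2 = 0.4, 0.1   # rho1 = gamma/mu, rho2 = alpha/mu (assumed values)
N = 200                  # truncation level; terms decay geometrically

G_sum, L_sum = 0.0, 0.0
for m in range(N):
    for n in range(N):
        G = (m + 1) * rho1**m * rho2 * rho1**n + rho1**m * (n + 1) * rho1**n * rho2
        G_sum += G
        L_sum += (m + n) * G

A = 1.0 / G_sum
print("A        :", A, " closed form:", (1 - rho1) ** 3 / (2 * rho2))
print("L = A*sum:", A * L_sum, " closed form:", 3 * rho1 / (1 - rho1))
```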