Application: Social Networks, Communication Networks

Topics: Random Graphs, Queueing Networks

5.1 Spreading Rumors

Picture yourself in a social network. You are connected to a number of “friends” who are also connected to friends. You send a message to some of your friends and they in turn forward it to some of their friends. We are interested in the number of people who eventually get the message.

To explore this question, we model the social network as a random tree of which you are the root. You send a message to a random number of your friends that we model as the children of the root node. Similarly, every node in the graph has a random number of children. Assume that the numbers of children of the different nodes are independent, identically distributed, and have mean μ. The tree is potentially infinite, a clear mathematical idealization. The model ignores cycles in friendships, another simplification.

The model is illustrated in Fig. 5.1. Thus, the graph only models the people who get a copy of the message. For the problem to be non-trivial, we assume that there is a positive probability that some nodes have no children, i.e., that someone does not forward the message. Without this assumption, the message always spreads forever.

Fig. 5.1

The spreading of a message as a random tree

We have the following result.

Theorem 5.1 (Spreading of a Message)

Let Z be the number of nodes that eventually receive the message.

  (a) If μ < 1, then P(Z < ∞) = 1 and E(Z) < ∞;

  (b) If μ > 1, then P(Z = ∞) > 0.

\({\blacksquare }\)

We prove that result in the next chapter. The result should be intuitive: if μ < 1, the spreading dies out, like a population that does not reproduce enough. This model is also relevant for the spread of epidemics or cyber viruses.
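The dichotomy in Theorem 5.1 is easy to see in simulation. The sketch below assumes a specific offspring distribution, chosen for convenience and not part of the model above: each person forwards the message to two friends with probability q and to nobody otherwise, so that μ = 2q. The function and parameter names are ours.

```python
import random

def total_recipients(q, cap=5_000, rng=None):
    """Total number Z of nodes in a random tree where each node has
    2 children with probability q and 0 otherwise (mean mu = 2q).
    Returns cap if the tree reaches cap nodes ('infinite' spread)."""
    rng = rng or random
    unprocessed, total = 1, 1
    while unprocessed > 0 and total < cap:
        unprocessed -= 1
        if rng.random() < q:      # this node forwards to two friends
            unprocessed += 2
            total += 2
    return total

rng = random.Random(0)
# Subcritical case: mu = 2 * 0.4 = 0.8 < 1, so Z is finite w.p. 1.
sub = [total_recipients(0.4, rng=rng) for _ in range(1000)]
# Supercritical case: mu = 2 * 0.8 = 1.6 > 1, so Z is infinite w.p. > 0.
sup = [total_recipients(0.8, rng=rng) for _ in range(1000)]
frac_infinite = sum(z >= 5_000 for z in sup) / len(sup)
print(max(sub), frac_infinite)
```

With q = 0.4, every run dies out quickly; with q = 0.8, roughly three quarters of the runs hit the cap, matching the survival probability 1 − (1 − q)∕q of this particular offspring distribution.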

5.2 Cascades

If most of your friends prefer Apple over Samsung, you may follow the majority. In turn, your advice will influence other friends. How big is such an influence cascade?

We model that situation with nodes arranged in a line, in the chronological order of their decisions, as shown in Fig. 5.2. Node n listens to the advice of a subset of {0, 1, …, n − 1} who have decided before him. Specifically, node n listens to the advice of node n − k independently with probability p k, for k = 1, …, n. If the majority of these friends are blue, node n turns blue; if the majority are red, node n turns red; in case of a tie, node n flips a fair coin and turns red with probability 1∕2 or blue otherwise. Assume that, initially, node 0 is red. Does the fraction of red nodes become larger than 0.5, or does the initial effect of node 0 vanish?

Fig. 5.2

An influence cascade

A first observation is that if nodes listen only to their left-neighbor with probability p ∈ (0, 1), the cascade ends. Indeed, there is a first node that does not listen to its neighbor and then turns red or blue with equal probabilities. Consequently, there will be a string of red nodes followed by a string of blue nodes, and so on. By symmetry, the lengths of those strings are independent and identically distributed. It is easy to see they have a finite mean. The SLLN then implies that the fraction of red nodes among the first n nodes converges to 0.5. In other words, the influence of the first node vanishes.
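A short simulation illustrates this convergence; the sketch below (our own, with illustrative parameter values) implements the left-neighbor rule directly.

```python
import random

def left_neighbor_cascade(n, p, seed=0):
    """Node 0 is red (True). Each later node copies its left neighbor's
    color with probability p; otherwise it flips a fair coin."""
    rng = random.Random(seed)
    colors = [True]
    for _ in range(1, n):
        if rng.random() < p:
            colors.append(colors[-1])          # listen to the left neighbor
        else:
            colors.append(rng.random() < 0.5)  # independent fair coin
    return colors

colors = left_neighbor_cascade(200_000, p=0.9)
frac_red = sum(colors) / len(colors)
print(frac_red)  # close to 0.5: the influence of node 0 vanishes
```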

The situation is less obvious if p k = p < 1 for all k. Indeed, in this case, as n gets large, node n is more likely to listen to many previous neighbors. The slightly surprising result is that, no matter how small p is, there is a positive probability that all the nodes turn red.

Theorem 5.2 (Cascades)

Assume p k = p ∈ (0, 1] for all k ≥ 1. Then, all nodes turn red with probability at least equal to θ where

$$\displaystyle \begin{aligned} \theta = \exp\left\{- \frac{1 - p}{p} \right\}. \end{aligned}$$

\({\blacksquare }\)

We prove the result in the next chapter. It turns out that, with positive probability, every node listens to at least one previous node. In that case, all the nodes turn red.

5.3 Seeding the Market

Some companies distribute free products to spread their popularity. What is the best fraction of customers who should get free products? To explore this question, let us go back to our model where each node listens only to its left-neighbor with probability p. The system is the same as before, except that each node gets a free product and turns red with probability λ. The fraction of red nodes increases in λ and we write it as ψ(λ). If the cost of a product is c and the selling price is s, the company makes a profit (s − c)ψ(λ) − cλ since it makes a profit s − c from a buyer and loses c for each free product. The company then can select λ to optimize its profit. Next, we calculate ψ(λ).

Let π(n − 1) be the probability that user n − 1 is red. If user n listens to n − 1, he turns red unless n − 1 is blue and he does not get a free product. If he does not listen to n − 1, he turns red with probability 0.5 if he does not get a free product and with probability one otherwise. Thus,

$$\displaystyle \begin{aligned} \pi(n) &= p(1 - (1 - \pi(n-1))(1 - \lambda)) + (1 - p)(0.5(1 - \lambda) + \lambda) \\ &= p(1 - \lambda) \pi(n-1) + 0.5(1 + \lambda - p + \lambda p). \end{aligned} $$

Since p(1 − λ) < 1, the value of π(n) converges to the value ψ(λ) that solves the fixed point equation

$$\displaystyle \begin{aligned} \psi(\lambda) = p(1 - \lambda)\psi(\lambda) + 0.5(1 + \lambda - p + \lambda p). \end{aligned}$$

Hence,

$$\displaystyle \begin{aligned} \psi(\lambda) = 0.5 \frac{1 + \lambda - p + \lambda p}{1 - p(1 - \lambda)}. \end{aligned}$$
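As a sanity check, one can iterate the recursion for π(n) and compare the limit with the closed form; a small sketch (names and parameter values are illustrative):

```python
def psi_closed_form(p, lam):
    # psi(lambda) = 0.5 (1 + lambda - p + lambda p) / (1 - p (1 - lambda))
    return 0.5 * (1 + lam - p + lam * p) / (1 - p * (1 - lam))

def psi_by_iteration(p, lam, steps=200):
    # Iterate pi(n) = p (1 - lambda) pi(n-1) + 0.5 (1 + lambda - p + lambda p),
    # starting from pi(0) = 1 (node 0 is red).  Since p (1 - lambda) < 1,
    # the iteration contracts to the fixed point psi(lambda).
    pi = 1.0
    for _ in range(steps):
        pi = p * (1 - lam) * pi + 0.5 * (1 + lam - p + lam * p)
    return pi

print(psi_closed_form(0.8, 0.1), psi_by_iteration(0.8, 0.1))
```

Note that ψ(0) = 0.5 and ψ(1) = 1, as expected: with no free products the initial influence vanishes, and if everyone gets a free product, everyone turns red.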

To maximize the profit (s − c)ψ(λ) − cλ, we substitute the expression for ψ(λ) in the profit and we set the derivative with respect to λ equal to zero. After some algebra, we find that the optimal λ is given by

$$\displaystyle \begin{aligned} \lambda^* = \min\left\{1, \frac{1}{p}\left( \sqrt{\frac{0.5 (1 - p)(s - c)}{c}} - (1 - p)\right) \right\}. \end{aligned}$$

Not surprisingly, λ increases with the profit margin (s − c)∕c and decreases with p.
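One can also check the optimal λ numerically by a grid search over the profit (s − c)ψ(λ) − cλ; the parameter values below (p = 0.5, s = 4, c = 1) are illustrative.

```python
def psi(p, lam):
    # psi(lambda) = 0.5 (1 + lambda - p + lambda p) / (1 - p (1 - lambda))
    return 0.5 * (1 + lam - p + lam * p) / (1 - p * (1 - lam))

def profit(p, lam, s, c):
    return (s - c) * psi(p, lam) - c * lam

def lam_star(p, s, c):
    # Stationary point of the profit, clipped to [0, 1].
    lam = ((0.5 * (1 - p) * (s - c) / c) ** 0.5 - (1 - p)) / p
    return min(1.0, max(0.0, lam))

p, s, c = 0.5, 4.0, 1.0
grid = [k / 10_000 for k in range(10_001)]
best = max(grid, key=lambda lam: profit(p, lam, s, c))
print(lam_star(p, s, c), best)  # both close to 0.732
```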

5.4 Manufacturing of Consent

Three people walk into a bar. No, this is not a joke. They chat and, eventually, leave with the same majority opinion. As such events repeat, the opinion of the population evolves. We explore a model of this evolution.

Consider a population of 2N ≥ 4 people. Initially, half believe red and the other half believe blue. We choose three people at random. If two are blue and one is red, they all become blue, and they return to the general population. The other cases are similar. The same process then repeats. Let X n be the number of blue people after n steps, for n ≥ 1 and let X 0 = N. Then X n is a Markov chain with two absorbing states: 0 and 2N. Indeed, if X n = k for some k ∈{1, …, 2N − 1}, there is a positive probability of choosing three people where two have one opinion and the third has a different one. After their meeting, X n+1 ≠ X n. From state 1, the chain can only stay put or move to 0, so that it is eventually absorbed in 0; similarly, from state 2N − 1, it is eventually absorbed in 2N. Moreover, P(k, k) > 0, P(k, k + 1) > 0, and P(k, k − 1) > 0 for all k ∈{2, …, 2N − 2}. Consequently, with probability one,

$$\displaystyle \begin{aligned} \lim_{n \to \infty} X_n \in \{0, 2N\}. \end{aligned}$$

Thus, eventually, everyone is blue or everyone is red. By symmetry, the two limits have probability 0.5.

What is the effect of the media on the limiting consensus? Let us modify our previous model by assuming that when two blue and one red person meet, they all turn blue with probability 1 − p and remain as before with probability p. Here p models the power of the media at convincing people to stay red. If two red and one blue meet, they all turn red.

We have, for k ∈{2, …, 2N − 2},

$$\displaystyle \begin{aligned} P[X_{n+1} = k + 1 \mid X_n = k] = (1 - p) 3 \frac{k(k-1)(2N - k)}{2N(2N - 1)(2N - 2)} =: p(k). \end{aligned}$$

Indeed, X n increases with probability 1 − p from k to k + 1 if in the meeting two people are blue and one is red. The probability that the first one is blue is k∕(2N) since there are k blue people among 2N. The probability that the second is also blue is then (k − 1)∕(2N − 1). Also, the probability that the third is red is (2N − k)∕(2N − 2) since there are 2N − k red people among the 2N − 2 who remain after picking two blue. Finally, there are three orderings in which one could pick one red and two blue.

Similarly, for k ∈{2, …, 2N − 2},

$$\displaystyle \begin{aligned} P[X_{n+1} = k - 1 \mid X_n = k] = 3 \frac{(2N - k)(2N - k - 1)k}{2N(2N - 1)(2N - 2)} =: q(k). \end{aligned}$$

We want to calculate

$$\displaystyle \begin{aligned} \alpha (k) = P[ T_{2N} < T_0 \mid X_0 = k], \end{aligned}$$

where T 0 is the first time that X n = 0 and T 2N is the first time that X n = 2N. Then, α(N) is the probability that the population eventually becomes all blue.

The first step equations are, for k ∈{2, …, 2N − 2},

$$\displaystyle \begin{aligned} \alpha(k) = p(k)\alpha(k+1) + q(k) \alpha(k-1) + (1 - p(k) - q(k)) \alpha(k), \end{aligned}$$

i.e.,

$$\displaystyle \begin{aligned} (p(k) + q(k)) \alpha(k) = p(k)\alpha(k+1) + q(k) \alpha(k-1). \end{aligned}$$

The boundary conditions are α(1) = 0, α(2N − 1) = 1.

We solve these equations numerically, using Python. Our procedure is as follows. We let α(1) = 0 and α(2) = A, for some constant A. We then solve recursively

$$\displaystyle \begin{aligned} \alpha(k+1) &= \left(1 + \frac{q(k)}{p(k)}\right) \alpha(k) - \frac{q(k)}{p(k)} \alpha(k-1) \\ &= \left(1 + \frac{2N - k - 1}{(1 - p)(k - 1)}\right) \alpha(k) - \frac{2N - k - 1}{(1 - p)(k - 1)} \alpha(k-1), \\ &\qquad k = 2, 3, \ldots, 2N - 2. \end{aligned} $$

Eventually, we find α(2N − 1). This value is proportional to A. Since α(2N − 1) = 1, we then divide all the α(k) by α(2N − 1). Not elegant, but it works. We repeat this process for p = 0, 0.02, 0.04, …, 0.14. Figure 5.3 shows the results for N = 450, i.e., for a population of 900 people.
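The procedure just described can be coded in a few lines. The sketch below is our rendition of it; a smaller population (N = 50) keeps the intermediate values manageable.

```python
def absorption_probabilities(N, p):
    """alpha(k) = P(all blue | X_0 = k), computed by the shooting method:
    alpha(1) = 0, alpha(2) = A = 1, recurse upward, then rescale so that
    alpha(2N - 1) = 1.  Here q(k)/p(k) = (2N - k - 1) / ((1 - p)(k - 1))."""
    M = 2 * N
    alpha = [0.0] * (M + 1)
    alpha[2] = 1.0                     # arbitrary constant A
    for k in range(2, M - 1):          # k = 2, ..., 2N - 2
        r = (M - k - 1) / ((1 - p) * (k - 1))
        alpha[k + 1] = (1 + r) * alpha[k] - r * alpha[k - 1]
    A = alpha[M - 1]
    alpha = [a / A for a in alpha]     # rescale so alpha(2N - 1) = 1
    alpha[M] = 1.0                     # absorbing state 2N
    return alpha

# With no media pressure (p = 0) the model is symmetric, so starting from
# a 50/50 split the population is equally likely to turn all blue.
alpha0 = absorption_probabilities(50, 0.0)
alpha1 = absorption_probabilities(50, 0.1)
print(alpha0[50], alpha1[50])
```

With p = 0, symmetry gives α(N) = 0.5; with p > 0, α(N) drops below 0.5, which is the effect shown in Fig. 5.3.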

Fig. 5.3

The effect of the media. Here, p is the probability that someone remains red after chatting with two blue people. The graph shows the probability that the whole population turns blue instead of red. A small amount of persuasion goes a long way

5.5 Polarization

In most countries, the population is split among different political and religious persuasions. How is this possible if everyone is faced with the same evidence? One effect is that interactions are not fully mixing. People belong to groups that may converge to a consensus based on the majority opinion of the group.

To model this effect, we consider a population of N people. An adjacency matrix G specifies which people are friends. Here, G(v, w) = 1 if v and w are friends and G(v, w) = 0 otherwise.

Initially, people are blue or red with equal probabilities. We pick one person at random. If that person has a majority of red friends, she becomes red. If the majority of her friends are blue, she becomes blue. If it is a tie, she does not change. We repeat the process. Note that the graph does not change; it is fixed throughout. We want to explore how the coloring of people evolves over time.

Let X n(v) ∈{B, R} be the state of person v at time n, for n ≥ 0 and v ∈{1, …, N}. We pick v at random. We count the number of red friends and blue friends of v. They are given by

$$\displaystyle \begin{aligned} \sum_w G(v, w) 1\{X_n(w) = R\} \mbox{ and } \sum_w G(v, w) 1\{X_n(w) = B\}. \end{aligned}$$

Thus,

$$\displaystyle \begin{aligned} X_{n+1}(v) = \left\{ \begin{array}{l l} R, & \mbox{ if } \sum_w G(v, w) 1\{X_n(w) {\,=\,} R\} > \sum_w G(v, w) 1\{X_n(w) {\,=\,} B\} \\ B, & \mbox{ if } \sum_w G(v, w) 1\{X_n(w) {\,=\,} R\} < \sum_w G(v, w) 1\{X_n(w) {\,=\,} B\} \\ X_n(v), & \mbox{ otherwise.} \end{array} \right. \end{aligned}$$

We have the following result.

Theorem 5.3

The state X n = {X n(v), v = 1, …, N} of the system always converges. However, the limit may be random. \({\blacksquare }\)

Proof

Define the function V (X n) as follows:

$$\displaystyle \begin{aligned} V(X_n) = \sum_v \sum_w 1\{X_n(v) \neq X_n(w)\}. \end{aligned}$$

That is, V (X n) is the number of disagreements among friends. The rules of evolution guarantee that V (X n+1) ≤ V (X n) and that P(V (X n+1) < V (X n)) > 0 unless P(X n+1 = X n) = 1. Indeed, if the state of v changes, it is to make that person agree with more of her neighbors. Also, if there is no v who can reduce her number of disagreements, then the state can no longer change. These properties imply that the state converges.

A simple example shows that the limit may be random. Consider four people at the vertices of a square that represents G. Assume that two opposite vertices are blue and the other two are red. If the first person v to reconsider her opinion is blue, she turns red, and the limit is all red. If v is red, the limit is all blue. Thus, the limit is equally likely to be all red or all blue. □

In the limit, it may be that a fraction of the nodes are red and the others are blue. For instance, if the nodes are arranged in a line graph, then the limit is alternating sequences of at least two red nodes and sequences of at least two blue nodes.
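The dynamics are easy to simulate. The sketch below replays the square example from the proof of Theorem 5.3 and tracks the number of disagreements V; the adjacency structure and the seed are illustrative.

```python
import random

def majority_step(G, colors, v):
    """One update: person v adopts the strict majority color of her friends;
    a tie leaves her color unchanged.  True = red, False = blue."""
    red = sum(1 for w in G[v] if colors[w])
    blue = len(G[v]) - red
    if red > blue:
        colors[v] = True
    elif blue > red:
        colors[v] = False

def disagreements(G, colors):
    # V(X) = number of ordered pairs of friends that disagree
    return sum(1 for v in G for w in G[v] if colors[v] != colors[w])

# The square of the proof: vertices 0-1-2-3 in a cycle, opposite
# corners share a color (0, 2 blue; 1, 3 red).
G = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
colors = {0: False, 1: True, 2: False, 3: True}
rng = random.Random(1)
V_trace = [disagreements(G, colors)]
for _ in range(200):
    majority_step(G, colors, rng.randrange(4))
    V_trace.append(disagreements(G, colors))
print(V_trace[0], V_trace[-1], set(colors.values()))
```

Each run ends with all four people sharing a color, and V never increases along the way; which color wins depends on the random order of updates.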

The properties of the limit depend on the adjacency graph G. One might think that a close group of friends should have the same color, but that is not necessarily the case, as the example of Fig. 5.4 shows.

Fig. 5.4

A close group of friends, the four vertices of the square, do not share the same color

5.6 M/M/1 Queue

We discuss a simple model of a queue, called an M/M/1 queue. We turn to networks in the next section. This section uses concepts from continuous-time Markov chains that we develop in the next chapter. Thus, the discussion here is a bit informal, but is hopefully clear enough to be read first.

Figure 5.5 illustrates a queue where customers (this is the standard terminology) arrive and a server serves them one at a time, in a first come, first served order. The times between arrivals are independent and exponentially distributed with rate λ. Thus, the average time between two consecutive arrivals is 1∕λ, so that λ customers arrive per unit of time, on average. The service times are independent and exponentially distributed with rate μ. The durations of the service times and the arrival times are independent. The expected value of a service time is 1∕μ. Thus, if the queue were always full, there would be μ service completions per unit time, on average. If λ < μ, the server can keep up with the arrivals, and the queue should empty regularly. If λ > μ, one can expect the number of customers in the queue to increase without bound.

Fig. 5.5

An M/M/1 queue, a possible realization, and its state transition diagram

In the notation M/M/1, the first M indicates that the inter-arrival times are memoryless, the second M indicates that the service times are memoryless, and the 1 indicates that there is one server. As you may expect, there are related notations such as D/M/3 or M/G/5, and so on, where the inter-arrival times and the service times have other properties and there are multiple servers.

Let X t be the number of customers in the queue at time t, for t ≥ 0. We call X t the queue length process. The middle part of Fig. 5.5 shows a possible realization of that process. Observing the queue length process up to some time t provides information about previous inter-arrival times and service times and also about when the last arrival occurred and when the last service started. Since the inter-arrival times and service times are independent and memoryless, this information is independent of the time until the next arrival or the next service completion. In particular, given {X s, s ≤ t}, the likelihood that a new arrival occurs during (t, t + 𝜖] is approximately λ𝜖 for 𝜖 ≪ 1; also the likelihood that a service completes during (t, t + 𝜖] is approximately μ𝜖 if X t > 0 and zero if X t = 0.

The bottom part of Fig. 5.5 is a state transition diagram that indicates the rates of transitions. For instance, the arrow from 1 to 2 is marked with λ to indicate that, in 𝜖 ≪ 1 s, the queue length jumps from 1 to 2 with probability λ𝜖. The figure shows that arrivals (that increase the queue length) occur at the same rate λ, independently of the queue length. Also, service completions (that reduce the queue length) occur at rate μ as long as the queue is nonempty.

Note that

$$\displaystyle \begin{aligned} P(X_{t+\epsilon} = 0) &= P(X_t = 0, X_{t+\epsilon} = 0) + P(X_t = 1, X_{t+\epsilon} = 0) \\ &\approx P(X_t = 0)(1 - \lambda \epsilon) + P(X_t = 1) \mu \epsilon. \end{aligned} $$

The first identity is the law of total probability: the event {X t+𝜖 = 0} is the union of the two disjoint events {X t = 0, X t+𝜖 = 0} and {X t = 1, X t+𝜖 = 0}. The second identity uses the fact that {X t = 0, X t+𝜖 = 0} occurs when X t = 0 and there is no arrival during (t, t + 𝜖]. This event has probability P(X t = 0) multiplied by (1 − λ𝜖) since arrivals are independent of the current queue length. The other term is similar.

Now, imagine that π is a pmf on \(\mathbb {Z}_{\geq 0} := \{0, 1, \ldots \}\) such that P(X t = i) = π(i) for all time t and \(i \in \mathbb {Z}_{\geq 0}\). That is, assume that π is an invariant distribution for X t. In that case, P(X t+𝜖 = 0) = π(0), P(X t = 0) = π(0), and P(X t = 1) = π(1). Hence, the previous identity implies that

$$\displaystyle \begin{aligned} \pi(0) \approx \pi(0)(1 - \lambda \epsilon) + \pi(1) \mu \epsilon. \end{aligned}$$

Subtracting π(0)(1 − λ𝜖) from both terms gives

$$\displaystyle \begin{aligned} \pi(0) \lambda \epsilon \approx \pi(1) \mu \epsilon. \end{aligned}$$

Dividing by 𝜖 shows that

$$\displaystyle \begin{aligned} \pi(0) \lambda = \pi(1) \mu. \end{aligned} $$
(5.1)

Similarly, for i ≥ 1, one has

$$\displaystyle \begin{aligned} P(X_{t+\epsilon} = i) &= P(X_t = i-1, X_{t+\epsilon} = i) + P(X_t = i, X_{t+\epsilon} = i) \\ &\quad + P(X_t = i+1, X_{t+\epsilon} = i) \\ &\approx P(X_t = i-1) \lambda \epsilon + P(X_t = i)(1 - \lambda \epsilon - \mu \epsilon) + P(X_t = i+1) \mu \epsilon. \end{aligned} $$

Hence,

$$\displaystyle \begin{aligned} \pi(i) \approx \pi(i-1) \lambda \epsilon + \pi(i)(1 - \lambda \epsilon - \mu \epsilon) + \pi(i+1) \mu \epsilon. \end{aligned}$$

This relation implies that

$$\displaystyle \begin{aligned} \pi(i) (\lambda + \mu) = \pi(i-1) \lambda + \pi(i+1) \mu, i \geq 1. \end{aligned} $$
(5.2)

The Eqs. (5.1)–(5.2) are called the balance equations. Thus, if π is invariant for X t, it must satisfy the balance equations. Looking back at our calculations, we also see that if π satisfies the balance equations, and if P(X t = i) = π(i) for all i, then P(X t+𝜖 = i) = π(i) for all i. Thus, π is invariant for X t if and only if it satisfies the balance equations.

One can solve the balance equations (5.1)–(5.2) as follows. Equation (5.1) shows that π(1) = ρπ(0) with ρ = λ∕μ. Subtracting (5.1) from (5.2) yields

$$\displaystyle \begin{aligned} \pi(1) \lambda = \pi(2) \mu. \end{aligned}$$

This equation then shows that π(2) = π(1)ρ = π(0)ρ 2. Continuing in this way shows that π(n) = π(0)ρ n for n ≥ 0. To find π(0), we use the fact that ∑n π(n) = 1. That is

$$\displaystyle \begin{aligned} \sum_{n=0}^\infty \pi(0) \rho^n = 1. \end{aligned}$$

If ρ ≥ 1, i.e., if λ ≥ μ, this is not possible. In that case there is no invariant distribution. If ρ < 1, then the previous equation becomes

$$\displaystyle \begin{aligned} \pi(0) \frac{1}{1 - \rho} = 1, \end{aligned}$$

so that π(0) = 1 − ρ and

$$\displaystyle \begin{aligned} \pi(n) = (1 - \rho) \rho^n, n \geq 0. \end{aligned}$$

In particular, when X t has the invariant distribution π, one has

$$\displaystyle \begin{aligned} E(X_t) = \sum_{n = 0}^\infty n (1 - \rho) \rho^n = \frac{\rho}{1 - \rho} = \frac{\lambda}{\mu - \lambda} =: L. \end{aligned}$$

To calculate the average delay W of a customer in the queue, one can use Little’s Law L = λW. This identity implies that

$$\displaystyle \begin{aligned} W = \frac{1}{\mu - \lambda}. \end{aligned}$$
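These formulas are easy to verify numerically. The sketch below (with illustrative rates λ = 3 and μ = 5) checks that the geometric distribution solves the balance equations (5.1)–(5.2) and reproduces L and W.

```python
lam, mu = 3.0, 5.0
rho = lam / mu        # rho = 0.6 < 1, so the queue is stable

# Invariant distribution pi(n) = (1 - rho) rho^n, truncated for the check.
K = 200
pi = [(1 - rho) * rho ** n for n in range(K)]

# Residuals of the balance equations; both should be essentially zero.
res0 = pi[0] * lam - pi[1] * mu
res = max(abs(pi[i] * (lam + mu) - pi[i - 1] * lam - pi[i + 1] * mu)
          for i in range(1, K - 1))

L = sum(n * pi[n] for n in range(K))             # mean queue length
W = sum((k + 1) / mu * pi[k] for k in range(K))  # mean delay of an arrival
print(res0, res)
print(L, lam / (mu - lam), W, 1 / (mu - lam))
```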

Another way of deriving this expression is to realize that if a customer finds k other customers in the queue upon his arrival, he has to wait for k + 1 service completions before he leaves. Since every service completion lasts 1∕μ on average, his average delay is (k + 1)∕μ. Now, the probability that this customer finds k other customers in the queue is π(k). To see this, note that the probability that a customer who enters the queue between time t and t + 𝜖 finds k customers in the queue is

$$\displaystyle \begin{aligned} P[X_t = k \mid X_{t+\epsilon} = X_t + 1]. \end{aligned}$$

Now, the conditioning event is independent of X t, because the arrivals occur at rate λ, independently of the queue length. Thus, the expression above is equal to P(X t = k) = π(k). Hence,

$$\displaystyle \begin{aligned} W = \sum_{k=0}^\infty \frac{k+1}{\mu} \pi(k) = \sum_{k=0}^\infty \frac{k+1}{\mu} (1 - \rho) \rho^k = \frac{1}{\mu - \lambda}, \end{aligned}$$

as some simple algebra shows.

5.7 Network of Queues

Figure 5.6 shows a representative network of queues. Two types of customers arrive into the network, with respective rates γ 1 and γ 2. The first type goes through queue 1, then queue 3, and should leave the network. However, with probability p 1 these customers must go back to queue 1 and try again. In a communication network, this event models a transmission error where a packet (a group of bits) gets corrupted and has to be retransmitted. The situation is similar for the other type. Thus, in 𝜖 ≪ 1 time unit, a packet of the first type arrives with probability γ 1 𝜖, independently of what happened previously. This is similar to the arrivals into an M/M/1 queue. Also, we assume that the service times are exponentially distributed with rate μ k in queue k, for k = 1, 2, 3.

Fig. 5.6

A network of queues

Let \(X_t^k\) be the number of customers in queue k at time t, for k = 1, 2 and t ≥ 0. Let also \(X_t^3\) be the list of customer types in queue 3 at time t. For instance, in Fig. 5.6, one has \(X_t^3 = (1, 1, 2, 1)\), from tail to head of the queue, to indicate that the customer at the head of the queue is of type 1, that he is followed by a customer of type 2, etc. Because of the memoryless property of the exponential distribution, the process \(X_t = (X_t^1, X_t^2, X_t^3)\) is a Markov chain: observing the past up to time t does not help predict the time of the next arrival or service completion.

Figure 5.6 shows the transition rates out of the current state (3, 2, (1, 1, 2, 1)). For instance, with rate μ 3 p 1, a service completes in queue 3 and that customer has to go back to queue 1, so that the new state is (4, 2, (1, 1, 2)). The other transitions are similar.

One can then, in principle, write down the balance equations and try to solve them. This looks like a very complex task and it seems very unlikely that one could solve these equations analytically. However, a miracle occurs and one has the remarkably simple result stated in the next theorem. Before we state the result, we need to define λ 1, λ 2, and λ 3. As sketched in Fig. 5.6, for k = 1, 2, 3, the quantity λ k is the rate at which customers go through queue k, in the long term. These rates should be such that

$$\displaystyle \begin{aligned} \lambda_1 &= \gamma_1 + \lambda_1 p_1 \\ \lambda_2 &= \gamma_2 + \lambda_2 p_2 \\ \lambda_3 &= \lambda_1 + \lambda_2. \end{aligned} $$

For instance, the rate λ 1 at which customers enter queue 1 is the rate γ 1 plus the rate at which customers of type 1 leaving queue 3 are sent back to queue 1. Customers of type 1 go through queue 3 at rate λ 1, since they come out of queue 1 at rate λ 1; also, a fraction p 1 of these customers go back to queue 1. The other expressions can be understood similarly. The equations above are called the flow conservation equations.

These equations admit the following solution:

$$\displaystyle \begin{aligned} \lambda_1 = \frac{\gamma_1}{1 - p_1}, \lambda_2 = \frac{\gamma_2}{1 - p_2}, \lambda_3 = \lambda_1 + \lambda_2. \end{aligned}$$
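The flow conservation equations can also be solved by simple fixed-point iteration, which mirrors how the feedback traffic builds up; a sketch with illustrative rates and feedback probabilities:

```python
gamma = [1.0, 2.0]   # external arrival rates (illustrative)
p = [0.2, 0.1]       # feedback probabilities (illustrative)

# Closed-form solution.
lam1 = gamma[0] / (1 - p[0])
lam2 = gamma[1] / (1 - p[1])

# Fixed-point iteration of lam1 = gamma1 + lam1 p1 and lam2 = gamma2 + lam2 p2.
x1 = x2 = 0.0
for _ in range(200):
    x1 = gamma[0] + x1 * p[0]
    x2 = gamma[1] + x2 * p[1]

lam3 = x1 + x2       # lam3 = lam1 + lam2
print(x1, x2, lam3)  # 1.25, about 2.2222, about 3.4722
```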

Theorem 5.4 (Invariant Distribution of Network)

Assume λ k < μ k and let ρ k = λ k∕μ k , for k = 1, 2, 3. Then the Markov chain X t has a unique invariant distribution π that is given by

$$\displaystyle \begin{aligned} \pi(x_1, x_2, x_3) &= \pi_1(x_1)\pi_2(x_2)\pi_3(x_3) \\ \pi_k(n) &= (1 - \rho_k) \rho_k^n, n \geq 0, k = 1, 2 \\ \pi_3( a_1, a_2, \ldots, a_n) &= p(a_1)p(a_2) \cdots p(a_n) (1 - \rho_3) \rho_3^n, \\ n &\geq 0, a_k \in \{1, 2\}, k = 1, \ldots, n, \end{aligned} $$

where p(1) = λ 1∕(λ 1 + λ 2) and p(2) = λ 2∕(λ 1 + λ 2).

\({\blacksquare }\)

This result shows that the invariant distribution has a product form.

We prove this result in the next chapter. It indicates that under the invariant distribution π, the states of the three queues are independent. Moreover, the state of queue 1 has the same invariant distribution as an M/M/1 queue with arrival rate λ 1 and service rate μ 1, and similarly for queue 2. Finally, queue 3 has the same invariant distribution as a single queue with arrival rates λ 1 and λ 2 and service rate μ 3: the length of queue 3 has the same distribution as an M/M/1 queue with arrival rate λ 1 + λ 2 and the types of the customers in the queue are independent and of type 1 with probability p(1) and 2 with probability p(2).

This result is remarkable not only for its simplicity but mostly because it is surprising. The independence of the states of the queues is shocking: the arrivals into queue 3 are the departures from the other two queues, so it seems that if customers are delayed in queues 1 and 2, one should have larger values for \(X_t^1\) and \(X_t^2\) and a smaller one for the length of queue 3. Thus, intuition suggests a strong dependency between the queue lengths. Moreover, the fact that the invariant distributions of the queues are the same as for MM∕1 queues is also shocking. Indeed, if there are many customers in queue 1, we know that a fraction of them will come back into the queue, so that future arrivals into queue 1 depend on the current queue length, which is not the case for an MM∕1 queue. The paradox is explained in a reference.

We use this theorem to calculate the delay of customers in the network.

Theorem 5.5

For k = 1, 2, the average delay W k of customers of type k is given by

$$\displaystyle \begin{aligned} W_k = \frac{1}{1 - p_k} \left( \frac{1}{\mu_k - \lambda_k} + \frac{1}{\mu_3 - \lambda_1 - \lambda_2}\right), \end{aligned}$$

where

$$\displaystyle \begin{aligned} \lambda_1 = \frac{\gamma_1}{1 - p_1} \mathit{\mbox{ and }} \lambda_2 = \frac{\gamma_2}{1 - p_2}. \end{aligned}$$

\({\blacksquare }\)

Proof

We use Little’s Law that says that L k = γ k W k where L k is the average number of customers of type k in the network. Consider the case k = 1. The other one is similar. L 1 is the average number of customers in queue 1 plus the average number of customers of type 1 in queue 3.

The average length of queue 1 is λ 1∕(μ 1 − λ 1) because the invariant distribution of queue 1 is the same as that of an M/M/1 queue with arrival rate λ 1 and service rate μ 1.

The average length of queue 3 is (λ 1 + λ 2)∕(μ 3 − λ 1 − λ 2) because the invariant distribution of queue 3 is the same as that of a queue with arrival rates λ 1 and λ 2 and service rate μ 3. Also, the probability that any customer in queue 3 is of type 1 is p(1) = λ 1∕(λ 1 + λ 2). Thus, the average number of customers of type 1 in queue 3 is

$$\displaystyle \begin{aligned} p(1) \frac{\lambda_1 + \lambda_2}{\mu_3 - \lambda_1 - \lambda_2} = \frac{\lambda_1}{\mu_3 - \lambda_1 - \lambda_2}. \end{aligned}$$

Hence,

$$\displaystyle \begin{aligned} L_1 = \frac{\lambda_1}{\mu_1 - \lambda_1} + \frac{\lambda_1}{\mu_3 - \lambda_1 - \lambda_2}. \end{aligned}$$

Combined with Little’s Law, this expression yields W 1. □

5.8 Optimizing Capacity

We use our network model to optimize the rates of the transmitters. The basic idea is that nodes with more traffic should have faster transmitters. To make this idea precise, we formulate an optimization problem: minimize a delay cost subject to a given budget for buying the transmitters.

We carry out the calculations not because of the importance of the specific example (it is not important!) but because they are representative of problems of this type.

Consider once again the network in Fig. 5.6. Assume that the cost of the transmitters is c 1 μ 1 + c 2 μ 2 + c 3 μ 3. The delay cost is d 1 W 1 + d 2 W 2 where W k is the average delay for packets of type k (k = 1, 2). The problem is then as follows:

$$\displaystyle \begin{aligned} & \mbox{Minimize }D(\mu_1, \mu_2, \mu_3) := d_1 W_1 + d_2 W_2 \\ & \mbox{subject to } C(\mu_1, \mu_2, \mu_3) := c_1 \mu_1 + c_2 \mu_2 + c_3 \mu_3 \leq B. \end{aligned} $$

Thus, the objective function is

$$\displaystyle \begin{aligned} D(\mu_1, \mu_2, \mu_3) = \sum_{k=1, 2} \frac{d_k}{1 - p_k} \left( \frac{1}{\mu_k - \lambda_k} + \frac{1}{\mu_3 - \lambda_1 - \lambda_2}\right). \end{aligned}$$

We convert the constrained optimization problem into an unconstrained one by replacing the constraint by a penalty. That is, we consider the problem

$$\displaystyle \begin{aligned} \mbox{Minimize } D(\mu_1, \mu_2, \mu_3) + \alpha (C(\mu_1, \mu_2, \mu_3) - B), \end{aligned}$$

where α > 0 is a Lagrange multiplier that penalizes capacities that have a high cost. To solve this problem for a given value of α, we set to zero the derivative of this expression with respect to each μ k. For k = 1, 2 we find

$$\displaystyle \begin{aligned} 0 &= \frac{\partial}{\partial \mu_k} D(\mu_1, \mu_2, \mu_3) + \alpha \frac{\partial}{\partial \mu_k} C(\mu_1, \mu_2, \mu_3) \\ &= - \frac{d_k}{1 - p_k} \frac{1}{(\mu_k - \lambda_k)^2} + \alpha c_k. \end{aligned} $$

For k = 3, we find

$$\displaystyle \begin{aligned} 0 = - \frac{d_1/(1 - p_1) + d_2/(1 - p_2)}{(\mu_3 - \lambda_1 - \lambda_2)^2} + \alpha c_3. \end{aligned}$$

Hence,

$$\displaystyle \begin{aligned} \mu_k &= \lambda_k + \left( \frac{d_k}{\alpha c_k (1 - p_k)}\right)^{1/2}, \mbox{ for } k = 1, 2 \\ \mu_3 &= \lambda_1 + \lambda_2 + \left( \frac{d_1/(1 - p_1) + d_2/(1 - p_2)}{\alpha c_3} \right)^{1/2}. \end{aligned} $$

These identities express μ 1, μ 2, and μ 3 in terms of α. Using these expressions in C(μ 1, μ 2, μ 3), we find that the cost is given by

$$\displaystyle \begin{aligned} C(\mu_1, \mu_2, \mu_3) &= c_1 \lambda_1 + c_2 \lambda_2 + c_3 (\lambda_1 + \lambda_2) \\ &\quad + \frac{1}{\sqrt{\alpha}} \left[ \sum_{k=1,2} \left(\frac{d_kc_k}{1 - p_k}\right)^{1/2} + c_3^{1/2} \left( \sum_{k = 1,2} \frac{d_k}{1 - p_k} \right)^{1/2} \right]. \end{aligned} $$

Using C(μ 1, μ 2, μ 3) = B then enables us to solve for α. As a last step, we substitute that value of α in the expressions for the μ k. We find,

$$\displaystyle \begin{aligned} \mu_k &= \lambda_k + D \left(\frac{d_k}{c_k(1 - p_k)}\right)^{1/2} , \mbox{ for } k = 1, 2 \\ \mu_3 &= \lambda_1 + \lambda_2 + D \left(\frac{1}{c_3} \sum_{k = 1,2} \frac{d_k}{1 - p_k}\right)^{1/2}, \end{aligned} $$

where

$$\displaystyle \begin{aligned} D = \frac{B - c_1 \lambda_1 - c_2 \lambda_2 - c_3 (\lambda_1 + \lambda_2)} { \sum_{k=1,2} \left( \frac{d_k c_k}{1 - p_k} \right)^{1/2} + c_3^{1/2} \left(\sum_{k=1,2} \frac{d_k}{1 - p_k} \right)^{1/2}}. \end{aligned}$$

These results show that, for k = 1, 2, the capacity μ k increases with d k, i.e., the cost of delays of packets of type k; it also decreases with c k, i.e., the cost of providing that capacity.

A numerical solution can be obtained using a scipy optimization tool called minimize. Here is the code.

```python
import numpy as np
from scipy.optimize import minimize

d = [1, 2]       # delay cost coefficients
c = [2, 3, 4]    # capacity cost coefficients
l = [3, 2]       # rates: l[0] = lambda1, etc.
p = [0.1, 0.2]   # error probabilities
B = 60           # capacity budget
UB = 50          # upper bound on capacity

# x = (mu1, mu2, mu3): x[0] = mu1, etc.
def objective(x):  # objective to minimize
    z = 0
    for k in range(2):
        z = z + (d[k]/(1 - p[k]))*(1/(x[k] - l[k]) + 1/(x[2] - l[0] - l[1]))
    return z

def constraint(x):  # budget constraint >= 0
    z = B
    for k in range(3):
        z = z - c[k]*x[k]
    return z

x0 = [5, 5, 10]         # initial value for the optimization
b0 = (l[0], UB)         # lower and upper bounds for x[0]
b1 = (l[1], UB)         # lower and upper bounds for x[1]
b2 = (l[0] + l[1], UB)  # lower and upper bounds for x[2]
bnds = (b0, b1, b2)     # bounds for the three variables x
con = {'type': 'ineq', 'fun': constraint}  # specifies the constraint
sol = minimize(objective, x0, method='SLSQP', bounds=bnds, constraints=con)
print(sol)              # sol is the solution
```

The code produces an approximate solution. The advantage is that one does not need any analytical skills. The disadvantage is that one does not get any qualitative insight.

5.9 Internet and Network of Queues

Can one model the internet as a network of queues? If so, does the result of the previous section really apply? Well, the mathematical answers are maybe and maybe.

The internet transports packets (groups of bits) from node to node. The nodes are sources and destinations such as computers, webcams, smartphones, etc., and network nodes such as switches or routers. The packets go from buffer to buffer. These buffers look like queues. The service times are the transmission times of packets. The transmission time of a packet (in seconds) is the number of bits in the packet divided by the rate of the transmitter (in bits per second). The packets have random lengths, so the service times are random. So, the internet looks like a network of queues. However, there are some important ways in which our network of queues is not an exact model of the internet. First, the packet lengths are not exponentially distributed. Second, a packet keeps the same number of bits as it moves from one queue to the next. Thus, the service times of a given packet in the different queues are all proportional to each other. Third, the time between the arrivals of two successive packets from a given node cannot be smaller than the transmission time of the first packet. Thus, the arrival times and the service times in one queue are not independent and the times between arrivals are not exponentially distributed.

The real question is whether the internet can be approximated by a network similar to that of the previous section. For instance, if we use that model, are we very far off when we try to estimate delays or queue lengths? Experiments suggest that the approximation may be reasonable to first order. One intuitive justification is the diversity of streams of packets. It goes as follows. Consider one specific queue in a large network node of the internet. This node is traversed by packets that come from many different sources and go to many destinations. Thus, successive packets that arrive at the queue may come from different previous nodes, which reduces the dependency of the arrivals and the service times. The service time distribution certainly affects the delays. However, the results obtained assuming an exponential distribution may provide a reasonable estimate.

5.10 Product-Form Networks

The example of the previous sections generalizes as follows. There are N ≥ 1 queues and C ≥ 1 classes of customers. At each queue i, customers of class c ∈{1, …, C} arrive with rate \(\gamma _i^c\), independently of the past and of other arrivals. Queue i serves customers with rate μ i. When a customer of class c completes service in queue i, it goes to queue j and becomes a customer of class d with probability \(r_{i, j}^{c, d}\), for i, j ∈{1, …, N} and c, d ∈{1, …, C}. That customer leaves the network with probability \(r_{i, 0}^c = 1 - \sum _{j = 1}^N \sum _{d = 1}^C r_{i, j}^{c, d}\). That is, a customer of class c who completes service in queue i either goes to another queue or leaves the network.

Define \(\lambda _i^c\) as the average rate of customers of class c that go through queue i, for i ∈{1, …, N} and for c ∈{1, …, C}. Assume that the rate of arrivals of customers of a given class into a queue is equal to the rate of departures of those customers from the queue. Then the rates \(\lambda _i^c\) should satisfy the following flow conservation equations:

$$\displaystyle \begin{aligned} \lambda_i^c = \gamma_i^c + \sum_{j=1}^N\sum_{d = 1}^C \lambda_j^d r_{j, i}^{d, c}, \quad i \in \{1, \ldots, N\}, c \in \{1, \ldots, C\}. \end{aligned}$$
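Stacking the pairs (i, c) into a single index turns the flow conservation equations into the linear system λ = γ + Rᵀλ, which an open network lets us solve directly. A minimal sketch, with made-up routing probabilities and arrival rates for N = 2 queues and C = 2 classes:

```python
import numpy as np

# Illustrative instance: N = 2 queues, C = 2 classes (0-indexed here).
# r[i, c, j, d] is the probability that a class-c customer leaving
# queue i joins queue j as a class-d customer.
N, C = 2, 2
gamma = np.array([[1.0, 0.5],   # gamma[i, c]: external arrival rates
                  [0.0, 0.3]])
r = np.zeros((N, C, N, C))
r[0, 0, 1, 0] = 0.7   # class 0 leaving queue 0 -> queue 1, class 0
r[0, 1, 1, 1] = 0.5   # class 1 leaving queue 0 -> queue 1, class 1
r[1, 0, 0, 1] = 0.2   # class 0 leaving queue 1 -> queue 0 as class 1
# (the remaining probability mass corresponds to leaving the network)

# Stack (i, c) into one index and solve lambda = gamma + R^T lambda.
R = r.reshape(N*C, N*C)            # R[(i,c), (j,d)] = r_{i,j}^{c,d}
g = gamma.reshape(N*C)
lam = np.linalg.solve(np.eye(N*C) - R.T, g)
print(lam.reshape(N, C))           # lam[i, c] = lambda_i^c
```

Because every customer eventually leaves, the matrix I − Rᵀ is invertible and the solution is unique.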

Let also X(t) = {X i(t), i = 1, …, N} where X i(t) is the configuration of queue i at time t ≥ 0. That is, X i(t) is the list of customer classes in queue i, from the tail of the queue to the head of the queue. For instance, X i(t) = 132 if the customer at the tail of queue i is of class 1, the customer in front of her is of class 3, and the customer at the head of the queue, who is being served, is of class 2. If the queue is empty, then X i(t) = [], where [] designates the empty string.

One then has the following theorem.

Theorem 5.6 (Product-Form Networks)

  1. (a)

    Let \(\{\lambda _i^c, i = 1, \ldots , N; c = 1, \ldots , C\}\) be a solution of the flow conservation equations. If \(\lambda _i := \sum _{c = 1}^C \lambda _i^c < \mu _i\) for i = 1, …, N, then X(t) is a Markov chain and its invariant distribution is given by

    $$\displaystyle \begin{aligned} \pi(x) = A \varPi_{i=1}^N g_i(x_i), \end{aligned}$$

    where

    $$\displaystyle \begin{aligned} g_i(c_1 \cdots c_n) = \frac{\lambda_i^{c_1} \cdots \lambda_i^{c_n}}{\mu_i^n} \end{aligned}$$

    and A is a constant such that π sums to one over all the possible configurations of the queues.

  2. (b)

    If the network is open in that every customer can leave the network, then the invariant distribution becomes

    $$\displaystyle \begin{aligned} \pi(x) = \varPi_{i=1}^N \pi_i(x_i), \end{aligned}$$

    where

    $$\displaystyle \begin{aligned} \pi_i(c_1 \cdots c_n) = \left(1 - \frac{\lambda_i}{\mu_i}\right) \frac{\lambda_i^{c_1} \cdots \lambda_i^{c_n}}{\mu_i^n}. \end{aligned}$$

    In this case, under the invariant distribution, the queue lengths at time t are all independent, the length of queue i has the same distribution as that of an M∕M∕1 queue with arrival rate λ i and service rate μ i , and the customer classes are all independent and are equal to c with probability \(\lambda _i^c/\lambda _i\).

The proof of this theorem is the same as that of the particular example given in the next chapter.
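Part (b) is easy to explore numerically. The sketch below, with illustrative rates for one queue visited by two classes, evaluates the per-queue probability of a configuration and checks that summing over all configurations of a given length recovers the geometric M∕M∕1 distribution:

```python
from itertools import product

def pi_i(config, lam_c, mu):
    """Probability of one queue's configuration under part (b) of Thm 5.6.

    config : list of customer classes, e.g. [1, 2] (tail to head)
    lam_c  : dict mapping class c to its rate lambda_i^c through the queue
    mu     : service rate of the queue
    """
    rho = sum(lam_c.values()) / mu
    p = 1 - rho
    for cls in config:
        p *= lam_c[cls] / mu
    return p

# Illustrative (assumed) rates: two classes through a single queue.
lam_c = {1: 0.3, 2: 0.2}
mu = 1.0
rho = sum(lam_c.values()) / mu

# Summing pi_i over all configurations of length n gives the geometric
# M/M/1 law (1 - rho) * rho**n, for every n.
for n in range(4):
    total = sum(pi_i(list(cfg), lam_c, mu)
                for cfg in product(lam_c, repeat=n))
    print(n, total, (1 - rho) * rho**n)
```

The check also makes the last claim of the theorem visible: each position in the queue is of class c with probability λ i^c∕λ i, independently of the others.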

5.10.1 Example

Figure 5.7 shows a network with two types of jobs. There is a single gray job that visits the two queues as shown. The white jobs go through the two queues once. The gray job models “hello” messages that the queues keep on exchanging to verify that the system is alive. For ease of notation, we assume that the service rates in the two queues are identical.

Fig. 5.7
figure 7

A network of queues

We want to calculate the average time that the white jobs spend in the system and compare that value to the case when there is no gray job. That is, we want to understand the “cost” of using hello messages. The point of the example is to illustrate the methodology for networks where some customers never leave. The calculations show the following somewhat surprising result.

Theorem 5.7

Using a hello message increases the expected delay of the white jobs by 50%.

We prove the theorem in the next chapter. In that proof, we use Theorem 5.6 to calculate the invariant distribution of the system, derive the expected number L of white jobs in the network, then use Little’s Law to calculate the average delay W of the white jobs as W = L∕γ. We then compare that value to the case where there is no gray job.

5.11 References

The literature on social networks is vast and growing. The textbook Easley and Kleinberg (2012) contains many interesting models and results. The text Shah (2009) studies the propagation of information in networks.

The book Kelly (1979) is the most elegant presentation of the theory of queueing networks. It is readily available online. The excellent notes Kelly and Yudovina (2013) discuss recent results. The nice textbook Srikant and Ying (2014) explains network optimization and other performance evaluation problems. The books Bremaud (2017) and Lyons and Peres (2017) are excellent sources for deeper studies of networks. The text Walrand (1988) is more clumsy but may be useful.

5.12 Problems

Problem 5.1

There are K users of a social network who collaborate to estimate some quantity by exchanging information. At each step, a pair (i, j) of users is selected uniformly at random and user j sends a message to user i with his estimate. User i then replaces his estimate by the average of his estimate and that of user j. Show that the estimates of all the users converge in probability to the average value of the initial estimates. This is an example of a consensus algorithm.

Hint

Let X n(i) be the estimate of user i at step n and X n the vector with components X n(i). Show that

$$\displaystyle \begin{aligned} E[X_{n+1}(i) \mid X_n] = (1 - \alpha) X_n(i) + \alpha A, \end{aligned}$$

where α = 1∕(2(K − 1)) and A =∑i X 0(i)∕K. Consequently,

$$\displaystyle \begin{aligned} E[| X_{n+1}(i) - A | \mid X_n ] = (1 - \alpha) |X_{n}(i) - A|, \end{aligned}$$

so that

$$\displaystyle \begin{aligned} E[| X_{n+1}(i) - A |] = (1 - \alpha)E[ |X_{n}(i) - A|] \end{aligned}$$

and

$$\displaystyle \begin{aligned} E[|X_n(i) - A|] \to 0. \end{aligned}$$

Markov’s inequality then shows that P(|X n(i) − A| > 𝜖) → 0 for any 𝜖 > 0.
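The consensus dynamics are easy to simulate. A minimal sketch, with K = 10 users and illustrative initial estimates 0, …, K − 1, shows the spread of the estimates shrinking essentially to zero, so all users end up agreeing on a common value:

```python
import random

def consensus_step(x):
    """One gossip step: a random ordered pair (i, j) is chosen and
    user i replaces his estimate by the average of x[i] and x[j]."""
    i, j = random.sample(range(len(x)), 2)
    x[i] = (x[i] + x[j]) / 2

random.seed(0)
K = 10
x = [float(v) for v in range(K)]   # illustrative initial estimates
A = sum(x) / K                     # average of the initial estimates
for _ in range(20000):
    consensus_step(x)
spread = max(x) - min(x)
print(A, spread)   # the spread of the estimates is essentially zero
```

Since every update is a convex combination, the estimates always stay within the range of the initial values while the spread contracts.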

Problem 5.2

Jobs arrive at rate γ in the system shown in Fig. 5.8. With probability p, a customer is sent to queue 1, independently of the other jobs; otherwise, the job is sent to queue 2. For i = 1, 2, queue i serves the jobs at rate μ i. Find the value of p that minimizes the average delay of jobs in the system. Compare the resulting average delay to that of the system where the jobs are in one queue and join the available server when they reach the head of the queue, and the fastest server if both are idle, as shown in the bottom part of Fig. 5.8.

Fig. 5.8
figure 8

Optimizing p (top) versus joining the free server (bottom)

Hint

The system of the top part of the figure is easy to analyze: with probability p, a job faces the average delay 1∕(μ 1 − γp) in the top queue and with probability 1 − p the job faces the average delay 1∕(μ 2 − γ(1 − p)). One then finds the value of p that minimizes the expected delay. For the system in the bottom part of the figure, the state is n with n ≥ 2 when there are at least two jobs and the two servers are busy, or (1, s) where s ∈{1, 2} indicates which server is busy, or 0 when the system is empty. One then needs to find the invariant distribution of the state, compute the average number of jobs, and use Little’s Law to find the average delay. The state transition diagram is shown in Fig. 5.9.

Fig. 5.9
figure 9

The state transition diagram. Here, μ := μ 1 + μ 2
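For the top system, the optimal split can be found numerically. Here is a sketch with assumed parameters γ = 2, μ 1 = 4, μ 2 = 3, chosen so that every p ∈ [0, 1] keeps both queues stable:

```python
from scipy.optimize import minimize_scalar

# Assumed illustrative parameters (not from the problem statement).
gamma, mu1, mu2 = 2.0, 4.0, 3.0

def avg_delay(p):
    """Expected delay when a fraction p of the jobs joins queue 1."""
    if gamma*p >= mu1 or gamma*(1 - p) >= mu2:
        return float('inf')   # unstable split (never hit for these values)
    return p/(mu1 - gamma*p) + (1 - p)/(mu2 - gamma*(1 - p))

res = minimize_scalar(avg_delay, bounds=(0.0, 1.0), method='bounded')
print(res.x, res.fun)   # optimal p and the corresponding average delay
```

The optimal p sends more than half the traffic to the faster server, but not all of it: keeping both queues moderately loaded beats overloading the fast one.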

Problem 5.3

This problem compares parallel queues to a single queue. There are N servers. Each server serves customers at rate μ. The customers arrive at rate Nλ. In the first system, the customers are split into N queues, one for each server. Customers arrive at each queue with rate λ. The average delay is that of an M∕M∕1 queue, i.e., 1∕(μ − λ). In the second system, the customers join a single queue. The customer at the head of the queue then goes to the next available server. Calculate the average delay in this system. Write a Python program to plot the average delays of the two systems as a function of ρ := λ∕μ for different values of N.

Hint

The state diagram is shown in Fig. 5.10.

Fig. 5.10
figure 10

The state transition diagram
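The single-queue system here is an M∕M∕N queue, whose average delay is given by the Erlang C formula. A sketch comparing the two systems, for assumed values μ = 1, ρ = 0.8, N = 4:

```python
from math import factorial

def mmn_delay(lam_total, mu, N):
    """Average time in system for an M/M/N queue (Erlang C formula)."""
    a = lam_total / mu                 # offered load
    rho = a / N                        # per-server utilization (< 1)
    p0 = 1 / (sum(a**k / factorial(k) for k in range(N))
              + a**N / (factorial(N) * (1 - rho)))
    erlang_c = a**N / (factorial(N) * (1 - rho)) * p0   # P(wait > 0)
    return 1/mu + erlang_c / (N*mu - lam_total)

mu, rho, N = 1.0, 0.8, 4               # assumed illustrative values
lam = rho * mu                         # per-queue arrival rate when split
w_split = 1 / (mu - lam)               # N parallel M/M/1 queues
w_single = mmn_delay(N * lam, mu, N)   # one shared queue, N servers
print(w_split, w_single)               # the shared queue is faster
```

For N = 1 the formula reduces to the M∕M∕1 delay 1∕(μ − λ), and the shared queue is always at least as fast as the split system: pooling the servers avoids the situation where one queue is backed up while another server idles.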

Problem 5.4

In this problem, we explore a system of parallel queues where the customers join the shortest queue. Customers arrive at rate Nλ and there are N queues, each with a server who serves customers at rate μ > λ. When a customer arrives, she joins the shortest queue. The goal is to analyze the expected delay in the system. Unfortunately, this problem cannot be solved analytically. So, your task is to write a Python program to evaluate the expected delay numerically. The first step is to draw the state transition diagram. Approximate the system by discarding customers who arrive when there are already M customers in the system. The second step is to write the balance equations. Finally, one writes a program to solve the equations numerically.
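As a complement to solving the balance equations, the expected delay can be estimated by simulating the uniformized Markov chain. This sketch uses assumed parameters (total arrival rate 3.2, μ = 1, N = 4) and obtains the delay from Little’s Law:

```python
import random

def jsq_avg_delay(lam_total, mu, N, T=200000, seed=1):
    """Estimate the mean delay under join-the-shortest-queue by simulation.

    Uniformization: events occur at rate lam_total + N*mu; each event is
    an arrival w.p. lam_total/rate, otherwise a potential service
    completion at a uniformly chosen server (a dummy event if that
    server is idle)."""
    random.seed(seed)
    rate = lam_total + N * mu
    q = [0] * N
    t, area = 0.0, 0.0
    for _ in range(T):
        dt = random.expovariate(rate)
        area += sum(q) * dt            # time-average of the total backlog
        t += dt
        if random.random() < lam_total / rate:
            q[q.index(min(q))] += 1    # arrival joins the shortest queue
        else:
            i = random.randrange(N)
            if q[i] > 0:
                q[i] -= 1              # service completion at server i
    L = area / t                       # average number in the system
    return L / lam_total               # Little's Law: W = L / lambda

W = jsq_avg_delay(lam_total=3.2, mu=1.0, N=4)
print(W)
```

The estimate falls between the delay of the pooled single queue and that of a blind random split, which is what one expects: joining the shortest queue recovers much of the benefit of pooling.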

Problem 5.5

Figure 5.11 shows a system of N queues that serve jobs at rate μ. If there is a single job, it takes on average N∕μ time units for it to go around the circle. Thus, the average rate at which a job leaves a particular queue is μ∕N. Show that when there are two jobs, this rate is 2μ∕(N + 1).

Fig. 5.11
figure 11

The system
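Both rates can be checked by simulating the cyclic network; the values N = 5 and μ = 1 below are illustrative. With one circulating job the estimated departure rate from a given queue is close to μ∕N, and with two jobs it is close to 2μ∕(N + 1):

```python
import random

def cycle_departure_rate(N, mu, M, T=100000, seed=2):
    """Simulate M jobs circulating through N exponential queues of rate mu
    and estimate the long-run departure rate from queue 0."""
    random.seed(seed)
    q = [0] * N
    q[0] = M                      # start all jobs in queue 0
    t, deps = 0.0, 0
    for _ in range(T):
        busy = [i for i in range(N) if q[i] > 0]
        t += random.expovariate(mu * len(busy))  # next completion time
        i = random.choice(busy)   # each busy server equally likely to finish
        q[i] -= 1
        q[(i + 1) % N] += 1       # the job moves to the next queue
        if i == 0:
            deps += 1
    return deps / t

N, mu = 5, 1.0                    # assumed illustrative values
r1 = cycle_departure_rate(N, mu, 1)   # one circulating job
r2 = cycle_departure_rate(N, mu, 2)   # two circulating jobs
print(r1, mu / N)
print(r2, 2 * mu / (N + 1))
```

With two jobs the servers overlap part of the time, so the per-queue rate exceeds 2μ∕(2N) but stays below μ, consistent with the claimed value 2μ∕(N + 1).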