The proportion of the population never hearing a rumour

Sudbury (J Appl Prob 22:443–446, 1985) showed for the Maki–Thompson model of rumour spreading that the proportion of the population never hearing the rumour converges in probability to a limiting constant (approximately equal to 0.203) as the population size tends to infinity. We extend the analysis to a generalisation of the Maki–Thompson model.


Introduction
The following model of rumour spreading was introduced by Maki and Thompson [2] as a variant of an earlier model of Daley and Kendall [1]: there is a population of size n, some of whom initially know a rumour and are referred to as infected. Time is discrete. In each time step, an infected individual chosen uniformly at random (or arbitrarily) contacts a member of the population chosen uniformly at random (including itself). If the contacted individual has not yet heard the rumour (i.e., is susceptible), it becomes infected; otherwise, the contacting individual loses interest in spreading the rumour and is termed removed, though it remains in the population and can be contacted by other infectives. (In the Daley–Kendall model, if an infective contacts another infective, both become removed, whereas in the Maki–Thompson model only the initiator of the contact is removed.) The process ends when there are no more infectives. A natural question to ask is how many individuals remain susceptible at this terminal time, and consequently never hear the rumour. It was shown by Sudbury [4] that in the large population limit of n tending to infinity, the random proportion of the population never hearing the rumour converges in probability to a limiting constant.
We consider the following generalisation of the Maki–Thompson model: each infective loses interest in spreading the rumour (and becomes removed) after k failed attempts, i.e., after contacting infected or removed individuals k times in total. Here, k ≥ 1 is a specified constant, which is a parameter of the model; if k = 1, we recover the original model. Our main result is as follows.
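For concreteness, the generalised dynamics can be simulated directly. The sketch below is ours, not from the paper (the helper name `simulate` and all parameter choices are assumptions); for k = 1 the estimated proportion never hearing the rumour should land near Sudbury's constant 0.203.

```python
import random

def simulate(n, k, rng):
    """One run of the generalised Maki-Thompson model; returns the
    number of susceptibles left when no infectives remain."""
    heard = [False] * n      # True once an individual has heard the rumour
    heard[0] = True
    fails = {0: 0}           # infective -> number of failed attempts so far
    while fails:
        i = next(iter(fails))        # any infective may initiate ("arbitrarily")
        j = rng.randrange(n)         # contact chosen uniformly, self included
        if not heard[j]:
            heard[j] = True          # susceptible contact: a new infective
            fails[j] = 0
        else:
            fails[i] += 1            # failed attempt
            if fails[i] == k:        # k failures: infective becomes removed
                del fails[i]
    return heard.count(False)

rng = random.Random(1)
n, k, runs = 2000, 1, 200
est = sum(simulate(n, k, rng) for _ in range(runs)) / (runs * n)
print(round(est, 3))   # expected to be close to 0.203 for k = 1
```

Increasing k drives the estimate down, in line with the exponential decay discussed after Theorem 1.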
Theorem 1. Consider the generalisation of the Maki–Thompson model described above, parametrised by k and starting with a single infective and n − 1 susceptibles. Let S_∞ denote the number of susceptibles when the process terminates, i.e., when the number of infectives hits zero. Then

S_∞/n → y* in probability as n → ∞,

where y* is the unique solution in (0, 1) of the equation (k + 1)(1 − y) = − log y, and logarithms are natural unless specified otherwise.
The proof is presented in the next section. We observe that y* = y*(k) is a decreasing function of k, and is well approximated by e^{−(k+1)} for large k. This tells us that, qualitatively, the proportion of the population not hearing the rumour decays exponentially in the number of failed attempts before agents lose interest in spreading the rumour.
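The constant y* is easy to compute numerically. Below is a minimal bisection sketch (the helper name `y_star` is ours): g(y) = (k + 1)(1 − y) + log y is negative near 0 and positive at its maximiser y = 1/(k + 1), so the root is bracketed on (0, 1/(k + 1)].

```python
import math

def y_star(k, iters=200):
    """Unique root in (0, 1) of (k + 1)(1 - y) + log y = 0, by bisection."""
    g = lambda y: (k + 1) * (1 - y) + math.log(y)
    lo, hi = 1e-15, 1.0 / (k + 1)    # g(lo) < 0 < g(hi)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(y_star(1), 4))                   # Sudbury's constant, approx. 0.2032
for k in (1, 2, 5, 10):
    print(k, y_star(k), math.exp(-(k + 1)))  # y*(k) against e^{-(k+1)}
```

The printed table shows y*(k) decreasing in k and approaching e^{−(k+1)}, as claimed.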
Pittel [3] showed in the Maki-Thompson model that the proportion of nodes not hearing the rumour, suitably centred and rescaled, converges in distribution to a normal random variable. An extension of this result to our generalised model is an open problem.

Model and Analysis
Denote by S_t the number of susceptibles present in time slot t. If at least one infective is present during this time slot, then there is an infection attempt, which succeeds with probability S_t/n (or S_t/(n − 1) if an infective never contacts itself; the distinction is immaterial for large n). In that case, S_{t+1} = S_t − 1. Otherwise, the number of failed attempts associated with the infective node which initiated the contact is incremented by 1; if its value becomes equal to k, the infective node becomes removed. We could describe this process as a Markov chain by keeping track of I_t^0, I_t^1, . . . , I_t^{k−1}, which denote respectively the number of infective nodes which have seen 0, 1, . . . , k − 1 failed infection attempts. A simpler Markovian representation is obtained by keeping track of I_t, the number of infection attempts available in time step t, which increases by k whenever a new node is infected. We initialise the process with S_0 = n − 1 and I_0 = k; the process terminates when I_t hits zero for the first time. If I_t > 0, then

(S_{t+1}, I_{t+1}) = (S_t − 1, I_t + k) w.p. S_t/n,   (S_{t+1}, I_{t+1}) = (S_t, I_t − 1) w.p. 1 − S_t/n,   (1)

where we use the abbreviation w.p. for "with probability". Let T denote the random time at which the process terminates, i.e., when I_t hits zero for the first time. We see from (1) that each attempt either succeeds, decreasing S_t by 1 and increasing I_t by k, or fails, decreasing I_t by 1; hence I_t = k + (k + 1)(n − 1 − S_t) − t for all t ≤ T. Define S̃_t, t = 0, 1, 2, . . ., to be a Markov process on the state space {0, 1, . . . , n − 1} with transition probabilities

S̃_{t+1} = S̃_t − 1 w.p. S̃_t/n,   S̃_{t+1} = S̃_t w.p. 1 − S̃_t/n,   (2)

and initial condition S̃_0 = n − 1. Then S̃_t and S_t have the same transition probabilities while I_t is non-zero; hence, it is clear that we can couple the processes S_t and S̃_t in such a way that they are equal until the random time T. Consequently, we can write

T = min{t ≥ 0 : t = k + (k + 1)(n − 1 − S̃_t)},   (3)

which relates T to a level crossing time of a lazy random walk. As the random walk S̃_t is non-increasing, S̃_T is explicitly determined by T; we have

S̃_T = n − 1 − (T − k)/(k + 1).   (4)

While it is possible to study the random variable T directly by analysing the random walk S̃_t, we will follow the work of Sudbury [4] and consider a somewhat indirect approach.
The random walk S̃_t is exactly the same as the random walk s_k in that paper, but the level-crossing required for stopping is different.
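The bookkeeping above can be sanity-checked by simulation (our own sketch; `run_chain` is a hypothetical helper). Each success adds k to the attempt budget and each failure removes one, so at termination the elapsed time satisfies T = k + (k + 1)(n − 1 − S_T) on every sample path:

```python
import random

def run_chain(n, k, rng):
    """Simulate (S_t, I_t), where I_t is the remaining failed-attempt budget."""
    S, I, t = n - 1, k, 0
    while I > 0:
        if rng.random() < S / n:   # successful contact: a new infective
            S -= 1
            I += k                 # fresh budget of k failed attempts
        else:
            I -= 1                 # one failed attempt used up
        t += 1
    return S, t

rng = random.Random(0)
for _ in range(100):
    n, k = rng.randrange(10, 200), rng.randrange(1, 4)
    S_T, T = run_chain(n, k, rng)
    assert T == k + (k + 1) * (n - 1 - S_T)   # level-crossing identity
print("identity holds on 100 random runs")
```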
Define the filtration F_t = σ(S̃_u, 1 ≤ u ≤ t), t ∈ N, and notice that the random time T defined in (3) is a stopping time, i.e., the event {T ≤ t} is F_t-measurable. Moreover, T is bounded by (k + 1)n. Let

M_1(t) = (n/(n − 1))^t S̃_t,   M_2(t) = (n/(n − 2))^t S̃_t(S̃_t − 1).

The lemma below is an exact analogue of a corresponding result in [4] and follows easily from the transition probabilities in (2), so the proof is omitted.

Lemma 1. The processes M_1(t) and M_2(t) are martingales with respect to the filtration F_t.
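Taking M_1(t) = (n/(n − 1))^t S̃_t and M_2(t) = (n/(n − 2))^t S̃_t(S̃_t − 1) as the natural analogues of Sudbury's martingales (our reading of the omitted display), the martingale property reduces to a one-step identity for the lazy walk, which can be verified exactly in rational arithmetic:

```python
from fractions import Fraction

def check_martingales(n):
    """Exact one-step check: the lazy walk moves s -> s-1 w.p. s/n, else stays.
    Verifies E[M_i(t+1) | S~_t = s] = M_i(t) for M1 and M2."""
    for s in range(n):
        for t in range(4):
            p = Fraction(s, n)
            e1 = p * (s - 1) + (1 - p) * s                      # E[S~_{t+1}]
            e2 = p * (s - 1) * (s - 2) + (1 - p) * s * (s - 1)  # E[S~_{t+1}(S~_{t+1}-1)]
            r1, r2 = Fraction(n, n - 1), Fraction(n, n - 2)
            assert r1 ** (t + 1) * e1 == r1 ** t * s            # M1 martingale step
            assert r2 ** (t + 1) * e2 == r2 ** t * s * (s - 1)  # M2 martingale step

for n in (3, 5, 12):
    check_martingales(n)
print("martingale property verified exactly")
```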
Applying the optional stopping theorem (OST) to M_1(t ∧ T), we get

E[(n/(n − 1))^T S̃_T] = S̃_0 = n − 1.   (5)

We now show that, for large n, the random variables above concentrate around their mean values and, after suitable rescaling, converge in probability.
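Reading (5) as E[(n/(n − 1))^T S̃_T] = n − 1 (our reconstruction), the identity can be confirmed exactly for small n by dynamic programming over the lazy walk, stopping when t = k + (k + 1)(n − 1 − S̃_t):

```python
from fractions import Fraction

def exact_ost_identity(n, k):
    """Exact E[(n/(n-1))^T * S_T] for the stopped lazy walk, by forward DP."""
    r = Fraction(n, n - 1)
    dist = {(0, n - 1): Fraction(1)}   # (t, s) -> probability
    total = Fraction(0)
    while dist:
        nxt = {}
        for (t, s), pr in dist.items():
            if t == k + (k + 1) * (n - 1 - s):   # attempt budget exhausted: stop
                total += pr * r ** t * s
                continue
            p = Fraction(s, n)                   # probability the walk steps down
            if p > 0:
                nxt[(t + 1, s - 1)] = nxt.get((t + 1, s - 1), 0) + pr * p
            nxt[(t + 1, s)] = nxt.get((t + 1, s), 0) + pr * (1 - p)
        dist = nxt
    return total

for n, k in [(4, 1), (5, 2), (4, 3)]:
    assert exact_ost_identity(n, k) == n - 1   # optional stopping identity
print("E[(n/(n-1))^T S_T] = n - 1 verified exactly")
```

Exact rationals avoid any floating-point doubt: the optional stopping identity holds with equality, not just approximately.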

Lemma 2. Let S̃_T denote the final number of susceptibles and T the random time (number of attempts to spread the rumour) after which the process terminates in a population of size n; the dependence of T and S̃_T on n has been suppressed in the notation. Then,

(1/n)(n/(n − 1))^T S̃_T → 1 in probability as n → ∞.

Proof. The proof is largely reproduced from [4] but is included for completeness. It proceeds by bounding the variance of the random variable of interest and invoking Chebyshev's inequality. We have by (5) that

E[(n/(n − 1))^T S̃_T] = n − 1,

whereas, applying the OST to M_2(t ∧ T), we get

E[(n/(n − 2))^T S̃_T(S̃_T − 1)] = S̃_0(S̃_0 − 1) = (n − 1)(n − 2).   (6)

Combining the last two equations, we can write

Var((n/(n − 1))^T S̃_T) = E[((n/(n − 1))^{2T} − (n/(n − 2))^T) S̃_T^2] + E[(n/(n − 2))^T S̃_T] + (n − 1)(n − 2) − (n − 1)^2.

Now, the first term in the above sum is negative, since (1 − 1/n)^2 > 1 − 2/n. Next, since T is bounded above by (k + 1)n, we have

E[(n/(n − 2))^T S̃_T] ≤ ((n − 1)/(n − 2))^{(k+1)n} E[(n/(n − 1))^T S̃_T] = ((n − 1)/(n − 2))^{(k+1)n} (n − 1) ∼ e^{k+1} n,

where we have used the fact that E[(n/(n − 1))^T S̃_T] = S̃_0 to obtain the asymptotic equivalence on the last line. (Recall that, for sequences x_n and y_n, we write x_n ∼ y_n to denote that x_n/y_n → 1 as n → ∞.) Thus, we conclude that

Var((1/n)(n/(n − 1))^T S̃_T) ≤ (e^{k+1} n (1 + o(1)) − (n − 1))/n²,

which tends to zero as n tends to infinity, since S̃_0 = n − 1. The claim of the lemma now follows from (5) and Chebyshev's inequality.
Consider the sequence of random vectors (T/n, S̃_T/n), which take values in the compact set K = [0, k + 1] × [0, 1]; the dependence of T and S̃_T on n has not been made explicit in the notation. Define f : K → R² by

f(x, y) = (x − (k + 1)(1 − y), y − e^{−x}).

The first coordinate of f(T/n, S̃_T/n) tends to zero deterministically by (4). For the second, note that T ≤ (k + 1)n and log(n/(n − 1)) = 1/n + O(1/n²), so Lemma 2 implies that e^{T/n} S̃_T/n → 1 in probability, and hence that S̃_T/n − e^{−T/n} → 0 in probability. Then we see from (4) and Lemma 2 that

f(T/n, S̃_T/n) → (0, 0) in probability as n → ∞.   (7)

We want to use this to prove convergence in probability of the sequences T/n and S̃_T/n. Firstly, we observe that if f(x, y) = (0, 0), then y solves the equation (k + 1)(1 − y) + log y = 0, and x = (k + 1)(1 − y). The function y → (k + 1)(1 − y) + log y is strictly concave and is zero at y = 1; by considering its derivative at 1 and its value near 0, it can be seen that the function has one other zero, which lies in (0, 1). Call this value y* and define x* = (k + 1)(1 − y*). We now have the following.

Lemma 3. Fix δ > 0. Then, as n tends to infinity,

P((T/n, S̃_T/n) ∈ B_δ(0, 1) ∪ B_δ(x*, y*)) → 1,

where B_δ(x, y) denotes the open ball of radius δ centred on (x, y).
Proof. Suppose this is not the case. Then, there is an α > 0 and infinitely many n such that

P((T/n, S̃_T/n) ∉ B_δ(0, 1) ∪ B_δ(x*, y*)) ≥ α.

Since f is continuous, so is its norm. Hence, its minimum on the compact set K \ (B_δ(0, 1) ∪ B_δ(x*, y*)) is attained, and must be strictly positive as f has no zeros other than (0, 1) and (x*, y*). Hence, there is an ε > 0 such that ‖f(x, y)‖ ≥ ε whenever (x, y) ∉ B_δ(0, 1) ∪ B_δ(x*, y*). Thus, we have shown that there are infinitely many n such that

P(‖f(T/n, S̃_T/n)‖ ≥ ε) ≥ α,

which contradicts (7). This proves the claim of the lemma.
We now need the following elementary tail bound on the binomial distribution in order to complete the proof of Theorem 1.
Lemma 4. Let X be binomially distributed with parameters n and p, denoted X ∼ Bin(n, p). Then, for any q > p, we have

P(X ≥ nq) ≤ exp(−n(q log(q/p) − q + p)).

Proof. Recall the well-known large deviations bound

P(X ≥ nq) ≤ exp(−n(q log(q/p) + (1 − q) log((1 − q)/(1 − p)))),

which is a consequence of Sanov's theorem. This inequality, and slight variants of it, are known as Bernstein or Chernoff bounds. The claim of the lemma follows from the above inequality by noting that

(1 − q) log((1 − q)/(1 − p)) ≥ p − q,

which follows from the inequality log x ≤ x − 1.
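With the bound read as P(X ≥ nq) ≤ exp(−n(q log(q/p) − q + p)) for q > p (our reconstruction of the display), a quick numerical comparison against the exact binomial tail:

```python
import math
from math import comb

def binom_tail(n, p, m):
    """Exact P(X >= m) for X ~ Bin(n, p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(m, n + 1))

def lemma4_bound(n, p, q):
    """Claimed bound on P(X >= n q), valid for q > p."""
    return math.exp(-n * (q * math.log(q / p) - q + p))

for n, p, q in [(50, 0.1, 0.3), (100, 0.05, 0.2), (200, 0.2, 0.5)]:
    assert binom_tail(n, p, math.ceil(n * q)) <= lemma4_bound(n, p, q)
print("tail bound verified on sample parameters")
```

The bound is weaker than the full Chernoff–Sanov exponent, but it is explicit enough for the summation that follows.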
By Lemma 3, it remains to show that the probability of (T/n, S̃_T/n) lying in B_δ(0, 1) vanishes; on this event, S̃_T/n ≥ 1 − δ. Hence, it follows from Lemma 4 that, for ε < k/(k + 1), we have

P(S̃_T/n ≥ 1 − ε) ≤ n^{−k} + Σ_{j=1}^{⌊εn⌋} exp( −(k + 1)j [ (k/(k + 1)) log( kn/((k + 1)j) ) − k/(k + 1) + j/n ] ),

where the term n^{−k} accounts for the walk terminating with no successes at all (the first k attempts all being lazy steps, each with probability 1/n). It is easy to see that both terms above vanish as n tends to infinity. This completes the proof of the theorem.