Large Deviations for Subcritical Bootstrap Percolation on the Erdős–Rényi Graph

We study atypical behavior in bootstrap percolation on the Erdős–Rényi random graph. Initially a set S is infected. Other vertices are infected once at least r of their neighbors become infected. Janson et al. (Ann Appl Probab 22(5):1989–2047, 2012) locate the critical size of S, above which the infection is likely to spread almost everywhere. Below this threshold, they prove a central limit theorem for the size of the eventually infected set. In this work, we calculate the rate function for the event that a small set S eventually infects an unexpected number of vertices, and identify the least-cost trajectory realizing such a large deviation.

exploration process. For instance, we have more recently used these methods to study the performance of the greedy independent set algorithm on sparse random graphs [26].
In bootstrap percolation, some subset S_0 ⊂ [n] is initially infected. Other vertices are infected once at least r of their neighbors become infected. Most of the literature has focused on the typical behavior. Of particular interest is the critical size at which a uniformly random initial set S_0 becomes likely to infect most of the graph. Less is known about the atypical behavior, such as when a small set S_0 is capable of eventually infecting many more vertices than expected (e.g. influencers or superspreaders in a social network, viral marketing, etc.).
For analytical convenience, we rephrase the dynamics in terms of an exploration process (cf. [23,30,32]) in which vertices are infected one at a time. At any given step, vertices are either susceptible, infected or healthy. All susceptible vertices become infected eventually, and then remain infected. When a vertex is infected, some of the currently healthy vertices may become susceptible. The process ends once a stable configuration has been reached in which no vertices are susceptible.
More formally, at each step t, there are sets I_t and S_t of infected and susceptible vertices. Vertices in [n] \ (I_t ∪ S_t) are currently healthy. Initially, I_0 = ∅. In step t ≥ 1, some vertex v_t ∈ S_{t−1} is infected. All remaining edges from v_t are revealed. To obtain S_t from S_{t−1}, we remove v_t and add all neighbors of v_t with exactly r − 1 neighbors in I_{t−1}. We then add v_t to I_{t−1} to obtain I_t. The process ends at step t^* = min{t ≥ 1 : S_t = ∅}, when no further vertices can be infected. For technical convenience, we set |S_t| = 0 for all t ≥ t^*. Let I^* = I_{t^*} denote the eventually infected set. Since one vertex is infected in each step t ≤ t^*, we have |I_t| = t and |S_t| ≥ |S_{t−1}| − 1 for all such t. In particular, t^* = |I^*|. Clearly, I^* does not depend on the order in which vertices are infected.
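To make the exploration process concrete, here is a minimal simulation sketch. The vertex labeling, the function name bootstrap_exploration, and the realization of the need-to-know edge revelation as independent coin flips are our own illustrative choices, not taken from [23].

```python
import random

def bootstrap_exploration(n, p, s0, r, seed=0):
    """Run the exploration process for r-neighbor bootstrap percolation
    on G(n, p), infecting one susceptible vertex per step, and return
    t* = |I*|, the size of the eventually infected set."""
    rng = random.Random(seed)
    # marks[u] = number of infected neighbors of the healthy vertex u
    marks = {u: 0 for u in range(s0, n)}
    susceptible = list(range(s0))          # S_0 = {0, ..., s0 - 1}
    infected = 0                           # |I_t| = t
    while susceptible:                     # stop at t* = min{t : S_t empty}
        v = susceptible.pop()              # infect v_t; order does not matter
        infected += 1
        newly_susceptible = []
        for u in marks:                    # reveal edges from v_t to healthy
            if rng.random() < p:           # vertices on a need-to-know basis
                marks[u] += 1
                if marks[u] == r:          # u now has r infected neighbors
                    newly_susceptible.append(u)
        for u in newly_susceptible:
            del marks[u]
            susceptible.append(u)
    return infected
```

As sanity checks: with p = 0 no new vertices are ever reached, so |I^*| = |S_0|; with p = 1 and |S_0| ≥ r, everything is eventually infected.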

Theorem 1 ([23] Theorem 3.1) Let p be as in
The extreme cases p ∼ c/n and p ∼ c/n^{1/r} are also addressed in [23], where the model behaves differently. We assume (1) throughout this work.
Moreover, in the subcritical case, a central limit theorem is proved in [23] (see Theorem 3.8). In this work, we study large deviations from the typical behavior in the subcritical case α < 1.
The asymptotically optimal trajectory ŷ_{α,β}(x) for |S_{xϑ}|/ϑ is given at (9) below (see also Fig. 2). The rate function ξ(α, β) is found by substituting this trajectory into the associated cost function (8). Detailed heuristics are given in Sect. 1.5 below. See Sect. 2 for the proof of Theorem 3.
The point ϑ (associated with β = 1) is critical. As such, we simply have that ξ(α, β) = ξ(α, 1) for β > 1. The reason for this is that the underlying branching process (the Binomial chain |S_t| discussed in Sects. 1.4 and 1.5 below) governing the dynamics becomes critical upon surviving to time t = ϑ. Surviving beyond this point, supposing that it has been reached, is no longer exponentially unlikely. In other words, the optimal (asymptotic) trajectory ŷ(x) that |S_{xϑ}|/ϑ typically follows in order to survive beyond x = 1 is equal to ŷ_{α,1}(x) on [0, 1] (this has cost −ξ(α, 1)). From then on (x > 1), there is a zero-cost path that ŷ(x) can follow.
We note here that in [23] (see Theorem 3.1) it is shown that |I^*|/ϑ converges in probability to the typical value ϕ_α. By Theorem 3 (and the Borel–Cantelli lemma), it follows that this convergence holds almost surely.

Related Work
Torrisi et al. [33] established a full large deviations principle in the supercritical case, α > 1, where typically |I^*| ∼ n. As discussed in [33], the main step in this regard is establishing sharp tail estimates (as in our Theorem 3 above). The full large deviations principle then follows by "elementary topological considerations." Although we have not pursued it, we suspect that a full large deviations principle also holds in the present subcritical setting. In closing, let us remark that it might be interesting to investigate the nature of G_{n,p}, conditioned on the event that a given S_0 eventually infects a certain number of vertices, or on the existence of such a set S_0.

Motivation
We came to this problem in studying H-bootstrap percolation on G_{n,p}, as introduced by Balogh et al. [11], in which all edges of G_{n,p} are initially infected and any other edge in an otherwise infected copy of H becomes infected. In the case H = K_4, there is a useful connection with the usual r-neighbor bootstrap percolation model when r = 2. Theorem 3 (when r = 2 and ϑ = Θ(log n)) plays a role (together with [9,27]) in locating the critical probability p_c ∼ 1/√(3n log n), at which it becomes likely that all edges of K_n are eventually infected. This solves an open problem in [11].

Contagious Sets
A susceptible set S_0 is called contagious if it eventually infects all of G_{n,p} (i.e., I^* = [n]). Such sets have been studied for various graphs (e.g. [13,18,19,28]). Recently, Feige et al. [16] considered the G_{n,p} case.
By Theorem 1, G_{n,p} has contagious sets of size Θ(ϑ); however, there exist contagious sets that are much smaller. In [16], upper and lower bounds are obtained for the minimal size m(G_{n,p}, r) of a contagious set in G_{n,p}. More recently [9], we identified the sharp threshold p_c for contagious sets of the smallest possible size r. For p < p_c, Theorem 3 yields lower bounds for m(G_{n,p}, r) that sharpen those in [16] by a linear, multiplicative factor in r. Of course, finding sets of this size (if they exist) is a difficult and interesting problem (cf. the NP-complete problem of target set selection from viral marketing [14,25]).

Corollary 4 Suppose that, for some
Then, with high probability, the stated lower bound holds. This result follows by an easy union bound, applying Theorem 3 in the case α = 0 and β = 1; see Appendix A.4.
By [9] this lower bound is sharp for p close to p c , that is, when ϑ ∼ log n. The methods in [9] might establish sharpness at least for ϑ ≤ O(log n).

Binomial Chain
As in [23], we study the bootstrap percolation dynamics using the Binomial chain construction, based on the work of Scalia-Tomba [30] (cf. Sellke [32]). We state here only the properties of this framework that we require, and refer the reader to Sect. 2 of [23] for the details.
Let N_t be the number of vertices that have become susceptible at some time s ∈ (0, t], so that |S_t| = N_t − t + |S_0|. By revealing edges (incident to infected vertices) on a need-to-know basis, the process N_t can be expressed as the sum of n − |S_0| independent and identically distributed processes, each of which is 0 until some NegBin(r, p) time, and then jumps to 1 (and remains at 1 thereafter). Informally, when a vertex is infected, it gives each of its neighbors a "mark." A vertex which was not initially susceptible is susceptible or infected at a given time if it has received at least r marks by this time. In this way (see [23,30]), it can be shown that |S_t| is a Markov process, with N_t distributed as Bin(n − |S_0|, π_t), where π_t = P(Bin(t, p) ≥ r). Moreover, its increments are distributed as |S_t| − |S_{t−1}| = Bin(n − |S_0| − N_{t−1}, (π_t − π_{t−1})/(1 − π_{t−1})) − 1.
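As a sanity check on this representation, the chain |S_t| can be sampled directly from its conditionally binomial increments, without building the graph. The following is a sketch under the increment representation above; the function names and parameter choices are ours.

```python
import math
import random

def pi(t, p, r):
    """pi_t = P(Bin(t, p) >= r) = P(NegBin(r, p) <= t)."""
    return 1.0 - sum(math.comb(t, k) * p**k * (1 - p)**(t - k)
                     for k in range(r))

def binomial_chain(n, p, s0, r, seed=0):
    """Sample the trajectory of |S_t|: each still-healthy vertex becomes
    susceptible in step t independently with conditional probability
    (pi_t - pi_{t-1}) / (1 - pi_{t-1}), and one susceptible vertex is
    removed (infected) per step."""
    rng = random.Random(seed)
    healthy = n - s0          # vertices with fewer than r marks so far
    S = s0                    # |S_0|
    traj = [S]
    t = 0
    while S > 0:
        t += 1
        q = (pi(t, p, r) - pi(t - 1, p, r)) / (1.0 - pi(t - 1, p, r))
        new = sum(rng.random() < q for _ in range(healthy))  # Bin(healthy, q)
        healthy -= new
        S += new - 1          # increment is Bin(...) - 1
        traj.append(S)
    return traj               # t* = len(traj) - 1 = |I*|
```

For p = 0 the chain simply counts down from |S_0| to 0, so t^* = |S_0|, matching the exploration process.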

Heuristics
We first briefly recall the heuristic for Theorem 1 given in Sect. 6 of [23]. By the law of large numbers, with high probability N_t ≈ (n − |S_0|)π_t, and so |S_t| ≈ (n − |S_0|)π_t − t + |S_0|. Next, we describe a natural heuristic, using the Euler–Lagrange equation, that allows us to anticipate the least-cost, deviating trajectories (the functions ŷ_{α,β} in (9) below), which lead to Theorem 3. The proof, given in Sect. 2 below, makes this rigorous by a discrete analogue of the Euler–Lagrange equation. We expect this method to be of use in studying the tail behavior of other random processes.
Consider a trajectory y(x) ≥ 0 from α^r to 0 over [0, β]. Suppose that |S_{xϑ}|/ϑ has followed this trajectory until step t − 1 = xϑ. In the next step t, some vertex v_t ∈ S_{t−1} is infected. The number of vertices that are neighbors of v_t and of exactly r − 1 of the t − 1 vertices infected in previous steps s < t is approximately Poisson with mean np^r (t−1 choose r−1) ≈ x^{r−1} (this approximation holds by (1) and standard combinatorial estimates). Such vertices become susceptible in step t. Therefore, to continue along this trajectory, we require this Poisson random variable to take the value y′(x) + 1. (The "+1" accounts for the vertex v_t ∈ S_{t−1} that is infected in step t, and so removed from the next susceptible set S_t.) As is well known, this event has approximate log probability −I_{x^{r−1}}(y′(x) + 1), where I_λ(u) = u log(u/λ) − u + λ is the Legendre–Fenchel transform of the cumulant-generating function of a mean λ Poisson. Hence |S_{xϑ}|/ϑ ≈ y(x) on [0, β] with approximate log probability −ϑ ∫_0^β I_{x^{r−1}}(y′(x) + 1) dx (cf. (13) below). Maximizing this integral is particularly simple, since the integrand depends on y′, but not y. The Euler–Lagrange equation implies that the least-cost trajectory takes the form y(x) = bx^r − x + α^r, for some constant b, except where possibly the boundary constraint y(x) ≥ 0 might intervene. Since, as noted above, |S_t| ≥ |S_{t−1}| − 1 for all t, we may assume that β ≥ α^r. That is, any trajectory y(x) of |S_{xϑ}|/ϑ decreases no faster than −x. Also note that (α − α^r)/α^r = 1/(rα^{r−1}), and that for any larger b > 1/(rα^{r−1}) the function bx^r − x + α^r has no zeros in [0, 1].
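For the reader's convenience, the Euler–Lagrange computation sketched above can be carried out explicitly (our own worked derivation, under the conventions of this heuristic):

```latex
% The integrand F(x, y') = I_{x^{r-1}}(y'(x) + 1) does not depend on y,
% so the Euler-Lagrange equation d/dx (F_{y'}) = F_y = 0 gives F_{y'} = const:
\[
\frac{\partial}{\partial y'}\, I_{x^{r-1}}(y'+1)
  \;=\; \log\frac{y'(x)+1}{x^{r-1}} \;=\; \text{const},
\qquad\text{so}\qquad
y'(x) + 1 \;=\; c\,x^{r-1}
\]
% for some constant c >= 0. Integrating, with y(0) = alpha^r and b = c/r:
\[
y(x) \;=\; \frac{c}{r}\,x^{r} - x + \alpha^{r}
     \;=\; b\,x^{r} - x + \alpha^{r}.
\]
```

The zero-cost (typical) trajectory corresponds to c = 1, i.e. b = 1/r, where y′(x) + 1 matches the Poisson mean x^{r−1}.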
Let Δx_i = x_{i+1} − x_i denote the forward difference operator (and similarly Δy_i = y_{i+1} − y_i).

Lemma 5 Fix a, b ∈ R, a function f(u, v) with continuous partial derivatives f_u and f_v, and evenly spaced points x_0 ≤ x_1 ≤ ··· ≤ x_m. Then the maximizer ŷ of the associated discrete functional satisfies a discrete analogue of the Euler–Lagrange equation.

The proof of this result amounts to adding a Lagrange multiplier to constrain ∑_i Δy_i, and then comparing the derivative to 0. A more general version, more closely resembling the usual Euler–Lagrange equation, appears in [20]. That version allows for more complicated functions f(x_i, x_{i+1}, y_i, y_{i+1}, Δy_i/Δx_i) and points x_i that need not be evenly spaced. The proof is analogous to that of its continuous counterpart, using summation by parts in place of integration by parts, for instance.

Upper Bounds When β < ϕ_α

We begin with the simpler case that β < ϕ_α. The opposite case β > ϕ_α follows by an elaboration of these arguments (see Sect. 2.2 below). Since β < ϕ_α, note that P(S_0, β) is simply the probability that |S_{xϑ}| = 0 for some x ≤ β, as this occurs if and only if |I^*| ≤ βϑ.
To begin, we discretize the unit interval [0, 1] as follows. Let m = ϑ/(log ϑ)^2. Consider the points x_i = (i/ϑ)(log ϑ)^2, for i = 0, 1, . . . , m. Note that the points x_iϑ are evenly spaced integers. Also note that x_m ∼ 1, since ϑ ≫ 1. Let Y_n denote the set of trajectories y_i = |S_{x_iϑ}|/ϑ such that (3) all Δy_i/Δx_i ≥ −1, and (4) y_i = 0 for all x_i ≥ β.
Note that we can assume (3) since, as discussed above, |S_t| ≥ |S_{t−1}| − 1 for all t. Since |S_t| is Markov, the probability of following a given trajectory factorizes as a product of transition probabilities. By (3) and (4), it follows that all y_i ≤ β for any y ∈ Y_n. Hence |Y_n| ≤ ϑ^m. Therefore, taking a union bound, P(S_0, β) is at most ϑ^m times the probability of the likeliest trajectory, where ŷ maximizes the product over y ∈ Y_n. Noting that (m/ϑ) log ϑ ≪ 1, we find altogether that the factor ϑ^m is negligible on the relevant exponential scale. We now turn to the issue of identifying ŷ ∈ Y_n. By (5), the transition probabilities are binomial. Hence, using the standard bounds (n choose k) ≤ (en/k)^k and 1 − x ≤ e^{−x}, we obtain an upper bound for the probability of each step. (We have written the upper bound in this way so as to compare with the lower bound at (18) below.) Before substituting this upper bound into (10), we collect the following technical result. The proof is elementary, though somewhat tedious; see Appendix A.5 below. Note that by (1), 1 ≪ ϑ ≪ 1/p.

Lemma 6
We have that r(n/ϑ)π(x_iϑ) = x_i^r (1 + O(pϑ + 1/log ϑ)).

Altogether, since log x − x is increasing for x ∈ (0, 1] and Δ(x_i^r)/r ≤ x_{i+1}^{r−1}Δx_i, it follows by (10) that the scaled log probability is bounded above by the sum in (13), where (cf. (8) above) f is as defined at (14), and ŷ ∈ Y_n maximizes the sum in (13). In order to apply Lemma 5, we lift the restriction that all y_iϑ ∈ Z, and maximize between any two given points where ŷ > 0. By Lemma 5, ŷ takes the form bx^r − x + α^r, for some constant b, between any two points x_j < x_k where ŷ_i > 0 for j < i < k. On the other hand, if both ŷ_j = ŷ_k = 0, then necessarily ŷ_i = 0 for j < i < k. By standard results on the Euler approximation of differential equations (see e.g. Theorems 7.3 and 7.5 in Sect. I.7 of [21]), it follows that, on all segments where ŷ > 0, ŷ converges to a trajectory of this form. Altogether, in the limit, it suffices to consider trajectories that take the form (β′ − α^r)(x/β′)^r − x + α^r (until they hit 0), for some β′ ∈ [α^r, β], since (as discussed in Sect. 1.5) these are the only functions y(x) = bx^r − x + α^r for which (i) y(0) = α^r, (ii) y′(x) ≥ −1 and (iii) y(x) = 0 for some x ≤ β. Hence, by the above considerations and the continuity of f, we find that lim sup (1/ϑ) log P(S_0, β) is bounded above by the corresponding supremum of costs. To conclude, we observe, by Appendices A.1 and A.2, that the right-hand side equates to sup_{β′ ∈ [α^r, β]} ξ(α, β′) = ξ(α, β).
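The resulting variational problem is easy to explore numerically. The following sketch evaluates the cost of the candidate trajectories (β′ − α^r)(x/β′)^r − x + α^r by quadrature, and optimizes over the hitting point β′. The function names, grid sizes, and the sign convention (ξ ≤ 0, as the limiting scaled log probability) are our own assumptions, not formulas taken verbatim from the paper.

```python
import math

def poisson_rate(u, lam):
    """I_lam(u) = u log(u/lam) - u + lam, the Legendre-Fenchel transform
    of the cumulant-generating function of a mean-lam Poisson."""
    if u == 0.0:
        return lam
    if lam == 0.0:
        return math.inf
    return u * math.log(u / lam) - u + lam

def trajectory_cost(alpha, beta_p, r, steps=1000):
    """Cost of following y(x) = (beta' - alpha^r)(x/beta')^r - x + alpha^r
    on [0, beta']: the integral of I_{x^{r-1}}(y'(x) + 1) dx, evaluated
    by the midpoint rule."""
    a = alpha ** r
    b = (beta_p - a) / beta_p ** r      # coefficient of x^r
    h = beta_p / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        u = r * b * x ** (r - 1)        # y'(x) + 1
        total += poisson_rate(u, x ** (r - 1)) * h
    return total

def rate(alpha, beta, r, grid=200):
    """Candidate rate: xi = -(minimal cost) over hitting points
    beta' in (alpha^r, beta] (a numerical sketch)."""
    a = alpha ** r
    best = min(trajectory_cost(alpha, a + (beta - a) * j / grid, r)
               for j in range(1, grid + 1))
    return -best
```

Consistent with the discussion above, the cost vanishes as β′ approaches the typical hitting point ϕ_α (where b = 1/r, so y′ + 1 equals the Poisson mean x^{r−1}), and grows as β′ decreases toward α^r.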

Upper Bounds When β > ϕ_α

The case β > ϕ_α follows by the same method of proof; however, there are two additional technical complications. Specifically, (i) the set of relevant trajectories Y_n in this case (defined below) no longer satisfies |Y_n| ≤ [O(ϑ)]^m, and (ii) to obtain an upper bound for (1/ϑ) log P(S_0, β), as in (15) above, we need to take a supremum over a more complicated set of trajectories. This latter issue is due in part to the fact that it is not a priori clear that the optimal trajectory ŷ should hit 0 before x = 1 (that is, that ŷ is one of the ŷ_{α,β}). This indeed turns out to be the case; however, even so, ŷ_{α,β} is slightly more complicated (defined piecewise) when β > α.
We no longer have that |Y_n| ≤ [O(ϑ)]^m. However, for t ≤ O(ϑ), by (5) and Chernoff's bound, the probability that |S_t| is large decays exponentially. Therefore, for A sufficiently large, the log probability that any |S_t| > Aϑ while t ≤ O(ϑ) is less than ϑξ(α, β). Hence, arguing as in the previous section, we find that lim sup (1/ϑ) log P(S_0, β) is bounded above by the supremum of costs over Y, where Y is the set of non-negative trajectories y(x) that start at y(0) = α^r and take the form bx^r − x + a, for some b ≥ 0, wherever they are positive. However, it suffices to consider a smaller set than Y. Indeed, observe that the maximizer ŷ ∈ Y is non-increasing. This is intuitive, since the process is sub-critical while the total number of infected vertices remains less than ϑ. To see this formally, note that (i) the derivative of any trajectory bx^r − x + a is brx^{r−1} − 1 ≤ 0 for all x ≤ 1 unless b > 1/r, and (ii) we have by (14) that the associated log probability is decreasing in b > 1/r. Hence, it suffices to consider trajectories which take the form (β′ − α^r)(x/β′)^r − x + α^r until they hit 0 at some β′ ∈ [α^r, α], and then, if β′ < β, are 0 thereafter until x = β. (Note that, for any b > 1/(rα^{r−1}), the function bx^r − x + α^r has no zeros and, since α < 1, is eventually increasing on [0, 1].) Therefore, by Appendix A.1, we obtain a supremum over β′ ∈ [α^r, α]. By basic calculus (see Appendix A.3), it can be shown that the right-hand side is bounded by ξ(α, β).
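The Chernoff estimate invoked here is the standard relative-entropy bound for binomial upper tails; a self-contained sketch (the helper names are our own):

```python
import math

def bernoulli_kl(a, q):
    """KL divergence between Bernoulli(a) and Bernoulli(q), 0 < q < a < 1."""
    return a * math.log(a / q) + (1 - a) * math.log((1 - a) / (1 - q))

def binomial_tail_bound(n, q, a):
    """Chernoff bound: P(Bin(n, q) >= a*n) <= exp(-n * KL(a || q)),
    valid for a > q."""
    return math.exp(-n * bernoulli_kl(a, q))
```

Applied to the binomial representation of the chain, this is the shape of estimate behind discarding trajectories with |S_t| > Aϑ: the exponent grows linearly in ϑ, with a rate that increases with A.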

Lower Bounds
The lower bound is much simpler. As discussed above, it essentially suffices to consider any trajectory y(x) ∼ ŷ_{α,β}(x) which contributes to P(S_0, β), and show that the scaled log probability that |S_{xϑ}|/ϑ follows this trajectory is asymptotic to ξ(α, β). Once again, there is some asymmetry in the cases β < ϕ_α and β > ϕ_α, due to the definition of P(S_0, β). For β < ϕ_α, we note that if, for instance, all |S_{x_iϑ}|/ϑ = ỹ_i, where ỹ_i is a suitable discretization of ŷ_{α,β}, then |I^*| ≤ βϑ. The indicator present here ensures that |S_{xϑ}| hits 0 by x = β. On the other hand, if β > ϕ_α, we define ỹ_i similarly, except with an indicator keeping the trajectory positive. Then if all |S_{x_iϑ}|/ϑ = ỹ_i, we have |I^*| ≥ βϑ. The indicator in this case ensures that |S_t| > 0 between increments while t < βϑ. Next, we show that the scaled log probability of this event converges to ξ(α, β). To this end, note the estimates that follow by (11) and the standard bounds (cf. (12)). Therefore, in a similar way as for (13) above (however instead using lower bounds), we bound the log probability from below by a sum involving f, where f, once again, is as defined at (14). Therefore, by the choice of ỹ_i, it can be seen (using Appendix A.1) that this sum is asymptotic to ϑξ(α, β), yielding (17), and thus concluding the proof of Theorem 3.

A.1: Rate Function
We show that the integral of f along ŷ_{α,β} equals ξ(α, β), where ŷ_{α,β} and f are as in (9) and (14) above (that is, the cost of the least-cost trajectory ŷ_{α,β} is ξ(α, β)). First note that, whenever ŷ_{α,β}(x) > 0, it is of the form bx^r − x + α^r, in which case f can be evaluated explicitly. On the other hand, the contribution to the cost from the interval where ŷ_{α,β} = 0 is computed directly. Therefore, if β > α, then we find (after some algebraic simplifications) the claimed expression for ξ(α, β).
Hence, using (1) and (19) (and the standard bounds (n − k)^k ≤ (n choose k) k! ≤ n^k and (1 − x)^y ≥ 1 − xy), we obtain the required lower bound. The upper bound requires slightly more attention. Note that, by the choice of m, log m ≪ x_1ϑ. Therefore log m ≤ x_iϑ for all large n. Hence, for all large n, π(x_iϑ) < P(Bin(x_{i+1}ϑ, p) > log m) plus a sum of point probabilities P(Bin(x_iϑ, p) = ℓ) over r ≤ ℓ ≤ log m. By (1), (19) and the choice of m, the first term is negligible. Next, by (19), it follows that, for all ℓ ≤ log m and large n, each point probability can be estimated. Therefore, by (1) and (19), for all large n, the sum is dominated by its leading term. Altogether, we find that r(n/ϑ)π(x_iϑ) = x_i^r (1 + O(pϑ + 1/log ϑ)), as claimed.