# Near-critical SIR epidemic on a random graph with given degrees


## Abstract

Emergence of new diseases and elimination of existing diseases is a key public health issue. In mathematical models of epidemics, such phenomena involve the process of infections and recoveries passing through a critical threshold where the basic reproductive ratio is 1. In this paper, we study near-critical behaviour in the context of a susceptible-infective-recovered epidemic on a random (multi)graph on *n* vertices with a given degree sequence. We concentrate on the regime just above the threshold for the emergence of a large epidemic, where the basic reproductive ratio is \(1 + \omega (n) n^{-1/3}\), with \(\omega (n)\) tending to infinity slowly as the population size, *n*, tends to infinity. We determine the probability that a large epidemic occurs, and the size of a large epidemic. Our results require basic regularity conditions on the degree sequences, and the assumption that the third moment of the degree of a random susceptible vertex stays uniformly bounded as \(n \rightarrow \infty \). As a corollary, we determine the probability and size of a large near-critical epidemic on a standard binomial random graph in the ‘sparse’ regime, where the average degree is constant. As a further consequence of our method, we obtain an improved result on the size of the giant component in a random graph with given degrees just above the critical window, proving a conjecture by Janson and Luczak.

## Keywords

SIR epidemic, Random graph with given degrees, Configuration model, Critical window

## Mathematics Subject Classification

05C80 60F99 60J28 92D30

## 1 Introduction

Infectious diseases continue to pose a serious threat to individual and public health. Accordingly, health organisations are constantly seeking to analyse and assess events that may present new challenges. These may include acts of bioterrorism, and other events indicating emergence of new infections, which threaten to spread rapidly across the globe facilitated by the efficiency of modern transportation. Likewise, a lot of effort is being directed into suppressing outbreaks of established diseases such as influenza and measles, as well as into eliminating certain endemic diseases, such as polio and rabies.

In an SIR epidemic model, an infectious disease spreads through a population where each individual is either susceptible, infective or recovered. The population is represented by a network (graph) of contacts, where the vertices of the network correspond to individuals and the edges correspond to potential infectious contacts. Different individuals will have different lifestyles and patterns of activity, leading to different numbers of contacts; for simplicity, we assume that each person’s contacts are randomly chosen from among the rest of the population. The degree of a vertex is the number of contacts of the corresponding individual.

We assume that infectious individuals become recovered at rate \(\rho \geqslant 0\) and infect each neighbour at rate \(\beta > 0\). Then the basic reproductive ratio \({\mathcal R}_0\) (i.e. the average number of secondary cases of infection arising from a single case) is given by the average size-biased susceptible degree times the probability that a given infectious contact takes place before the infective individual recovers.

Emergence and elimination of a disease involves the process of infectious transitions and recoveries being pushed across a critical threshold, usually corresponding to the basic reproductive ratio \({\mathcal R}_0\) equal to 1, see Antia et al. (2003), Bull and Dykhuizen (2003), O’Regan and Drake (2013) and Scheffer et al. (2009). For example, a pathogen mutation can increase the transmission rate and make a previously ‘subcritical’ disease (i.e. not infectious enough to cause a large outbreak) into a ‘supercritical’ one, where a large outbreak may occur, see Antia et al. (2003). Moreover, after a major outbreak in the supercritical case, disease in the surviving population is subcritical. However, subsequently, as people die and new individuals are born (i.e. immunity wanes), \({\mathcal R}_0\) will slowly increase, and, when it passes 1, another major outbreak may occur. Equally, efforts at disease control may result in subcriticality for a time, but then inattention may lead to an unnoticed parameter shift to supercriticality. Thus, under certain conditions, one can expect most large outbreaks to occur close to criticality, and so there is practical interest in theoretical understanding of the behaviour of near-critical epidemics.

Critical SIR epidemics have been studied for populations with complete mixing, under different assumptions, by Ben-Naim and Krapivsky (2004), Gordillo et al. (2008), Hofstad et al. (2010) and Martin-Löf (1998); this is equivalent to studying epidemic processes on the complete graph, or on the Erdős-Rényi graph *G*(*n*, *p*). In Ben-Naim and Krapivsky (2004), near-criticality is discussed using non-rigorous arguments. Martin-Löf (1998) studies a generalized Reed–Frost epidemic model, where the number of individuals that a given infective person infects has an essentially arbitrary distribution. The binomial case is equivalent to studying the random graph *G*(*n*, *p*) on *n* vertices with edge probability *p*. The author considers the regime where \({\mathcal R}_0 - 1 = a n^{-1/3}\) and the initial number of infectives is \(b n^{1/3}\), for constant *a*, *b*. A limit distribution is derived for the final size of the epidemic, observing bimodality for certain values of *a* and *b* (corresponding to ‘small’ and ‘large’ epidemics). Further analytical properties of the limit distribution are derived in Hofstad et al. (2010). In Gordillo et al. (2008), a standard SIR epidemic for populations with homogeneous mixing is studied, with vaccinations during the epidemic; a diffusion limit is derived for the final size of a near-critical epidemic.

In the present paper, we address near-critical phenomena in the context of an epidemic spreading in a population of a large size *n*, where the underlying graph (network) is a random (multi)graph with given vertex degrees. In other words, we specify the number of contacts for each individual, and consider a graph chosen uniformly at random from among all graphs with the specified sequence of contact numbers. This random graph model allows for greater inhomogeneity, with a rather arbitrary distribution of the number of contacts for different persons. We study the regime just above the critical threshold for the emergence of a large epidemic, where the basic reproductive ratio is \(1 + \omega (n) n^{-1/3}\), with \(\omega (n)\) growing large as the population size *n* grows. (For example, when the population size is about 1 million, we could consider \({\mathcal R}_0\) of order about 1.01.)

From the theory of branching processes, at the start of an epidemic, each infective individual leads to a large outbreak with probability of the order \({\mathcal R}_0-1\). Roughly, our results confirm the following picture, which is intuitively clear from this observation. If the size *n* of the population is very large, with the initial total infectious degree \(X_{\mathrm {I},0}\) (i.e. total number of potential infectious contacts at the beginning of the epidemic or total number of acquaintances of initially infectious individuals) much larger than \(({\mathcal R}_0-1)^{-1}\), then a large epidemic will occur with high probability. If the initial total infectious degree is much smaller than \(({\mathcal R}_0-1)^{-1}\), then the outbreak will be contained with high probability. In the intermediate case where \(X_{\mathrm {I},0}\) and \(({\mathcal R}_0-1)^{-1}\) are of the same order of magnitude, a large epidemic can occur with positive probability, of the order \(1 - \exp \bigl (-c X_{\mathrm {I},0} ({\mathcal R}_0-1)\bigr )\), for some positive constant *c*. So, if the population size is about a million, and \({\mathcal R}_0\) about 1.01, then \(X_{\mathrm {I},0}\) much larger than 100 will result in a large epidemic with high probability. On the other hand, if \(X_{\mathrm {I},0}\) is less than 10, say, then the outbreak will be contained with high probability.

Furthermore, we determine the likely size of a large epidemic. Here, there are three possible regimes, depending on the size of the initial total infectious degree relative to \(n ({\mathcal R}_0-1)^2\). Broadly speaking, if \(X_{\mathrm {I},0}\) is much larger than \(n ({\mathcal R}_0-1)^2\), then the total number of people infected will be proportional to \((n X_{\mathrm {I},0})^{1/2}\). On the other hand, if \(X_{\mathrm {I},0}\) is much smaller than \(n ({\mathcal R}_0-1)^2\) then, in the event that there is a large epidemic, the total number of people infected will be proportional to \(n ({\mathcal R}_0-1)\). The intermediate case where \(X_{\mathrm {I},0}\) and \(n ({\mathcal R}_0-1)^2\) are of the same order ‘connects’ the two extremal cases.

Note that, if \(X_{\mathrm {I},0}\) is of the same or larger order of magnitude than \(n ({\mathcal R}_0-1)^2\) (the first and third case in the paragraph above), then \(X_{\mathrm {I},0} ({\mathcal R}_0-1)\) is very large, so a large epidemic does occur with high probability. This follows since, by our assumption, \(n({\mathcal R}_0-1)^3 = \omega (n)^3\) is large for large *n*.
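To make these scales concrete, the following minimal sketch computes the quantities that separate the regimes for a given population size and basic reproductive ratio. The function name `regime_scales` and its dictionary keys are illustrative choices, not notation from the paper; the example parameter pairs are population sizes and reproductive ratios mentioned in this introduction.

```python
def regime_scales(n: int, r0: float) -> dict:
    """Return the scales that separate the regimes described above."""
    excess = r0 - 1.0  # R0 - 1
    return {
        # omega(n) = (R0 - 1) * n^(1/3); the results assume this grows with n
        "omega": excess * n ** (1.0 / 3.0),
        # X_{I,0} far above this value: a large epidemic occurs w.h.p.
        "takeoff_scale": 1.0 / excess,
        # X_{I,0} far above this value: final size is of order (n * X_{I,0})^(1/2)
        "size_scale": n * excess ** 2,
        # order of the final size when X_{I,0} is far below size_scale
        "natural_size_order": n * excess,
    }


if __name__ == "__main__":
    for n, r0 in [(10**5, 1.02), (10**6, 1.01), (10**7, 1.005)]:
        print(n, r0, regime_scales(n, r0))
```

For \(n = 10^6\) and \({\mathcal R}_0 = 1.01\), both \(({\mathcal R}_0-1)^{-1}\) and \(n({\mathcal R}_0-1)^2\) come out at about 100, in line with the figure of 100 quoted above.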

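Our results require the third moment of the degree of a random susceptible vertex to stay uniformly bounded; in particular, the maximum degree of a susceptible vertex in a population of size *n* is of the order no larger than \(n^{1/3}\). So in particular, in a population of size 1 million, the super-spreaders (i.e. individuals with largest numbers of contacts) should not be able to infect more than around 100 individuals.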

To demonstrate this behaviour for a particular example, we used stochastic simulations that make use of special Monte Carlo techniques that allow us to consider multiple initial conditions within the same realisation of the process. The algorithm is described in Appendix A. Figure 1 shows our results for the relationship between the epidemic final size \(\mathcal {Z}\) and the initial force of infection \(X_{I,0}\) for 20 realisations of the process, with each realisation involving multiple different initial conditions. The model rate parameters are \(\rho =1\) and \(\beta =1\). The network has Poisson degree distribution with mean \(\lambda \) and was generated as an Erdős–Rényi random graph with edge probability \(\lambda /n\). We implement scaling at different population sizes giving parameter sets \((n = 10^{5}, \lambda = 2.04, R_{0} = 1.02), (n = 10^{6}, \lambda = 2.02, R_{0} = 1.01)\) and \((n = 10^{7}, \lambda = 2.01, R_{0} = 1.005)\). These plots show the emergence of the three epidemic sizes that our results predict as *n* increases, i.e. ‘small’ epidemics of size *O*(1), ‘large’ epidemics of size proportional to \((nX_{I,0})^{1/2}\), and ‘large’ epidemics of size comparable to \(n({\mathcal R}_0-1)\).
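As a quick check on the parameter sets above, the sketch below evaluates \({\mathcal R}_0\) for a fully susceptible network with Poisson degrees. It rests on two assumptions that should be read as such: the size-biased degree factor is taken to be \(\mathbb {E}[D(D-1)]/\mathbb {E}[D]\), which equals \(\lambda \) for a Poisson(\(\lambda \)) degree, and infection and recovery times are independent exponentials, so that a given contact happens before recovery with probability \(\beta /(\beta +\rho )\). The function name `r0_poisson` is hypothetical.

```python
def r0_poisson(lam: float, beta: float = 1.0, rho: float = 1.0) -> float:
    """R0 for a fully susceptible network with Poisson(lam) degrees (assumed form)."""
    # Probability that an exponential(beta) infection clock rings
    # before an independent exponential(rho) recovery clock.
    transmit = beta / (beta + rho)
    # For Poisson(lam) degrees, E[D(D-1)] / E[D] = lam.
    return lam * transmit


if __name__ == "__main__":
    for lam in (2.04, 2.02, 2.01):
        print(lam, r0_poisson(lam))
```

With \(\rho = \beta = 1\), the values \(\lambda = 2.04, 2.02, 2.01\) give \(1.02, 1.01, 1.005\), which agrees with the parameter sets listed above.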

Epidemics on graphs with given degrees have been considered in a number of recent studies, both within the mathematical biology and probability communities. A set of ordinary differential equations approximating the time evolution of a large epidemic was obtained by Volz (2008); see also Miller (2011) and Miller et al. (2012). These papers consider the case where the epidemic starts very small. Differential equations for an epidemic starting with a large number of infectives appear in Miller (2014). Convergence of the random process to these equations in the case where the second moment of the degree of a random vertex is uniformly bounded (both starting with only few infectives and with a large number of infectives) was proven in Janson et al. (2014). (See also Decreusefond et al. 2012; Bohman and Picollelli 2012, where related results are proven in the case where the fifth moment of the degree of a random vertex is uniformly bounded and in the case of bounded vertex degrees respectively. See also Barbour and Reinert 2013 for results in the case of bounded vertex degrees and general infection time distributions.)

However, we appear to be the first to study the ‘barely supercritical’ SIR epidemic on a random graph with given degrees. As a corollary, we determine the probability and size of a large near-critical epidemic on a sparse binomial (Erdős–Rényi) random graph, also to our knowledge the first such results in the literature.

Our approach also enables us to prove the conjecture of Janson and Luczak (2009), establishing their Theorem 2.4 concerning the size of the largest component in the barely supercritical random graph with given vertex degrees under weakened assumptions.

We proceed in the spirit of Janson et al. (2014) and Janson and Luczak (2009), evolving the epidemic process simultaneously with constructing the random multigraph. The main technical difficulties involve delicate concentration of measure estimates for quantities of interest, such as the current total degrees of susceptible, recovered and infective vertices. Also, our proofs involve couplings of the evolution of the total infective degree with suitable Brownian motions.

The remainder of the paper is organised as follows. In Sect. 2, we define our notation and state our main results (Theorems 2.4, 2.5). Section 3 is devoted to the proof of Theorem 2.4; to this end, we define a time-changed version of the epidemic and use the modified process to prove concentration of measure estimates for various quantities of interest. In Sect. 4, we prove Theorem 2.5. In Appendix B, we state and prove a new result concerning the size of the giant component in the supercritical random (multi)graph with a given degree sequence.

## 2 Model, notation, assumptions and results

Let \(n \in \mathbb {N}\) and let \((d_i)_{i = 1}^n= (d_i^{(n)})_{i = 1}^n\) be a given sequence of non-negative integers. Let \(G= G(n, (d_i)_{i = 1}^n )\) be a simple graph (no loops or multiple edges) with *n* vertices, chosen uniformly at random subject to vertex *i* having degree \(d_i\) for \(i=1, \ldots , n\), tacitly assuming there is any such graph at all (\(\sum _{i = 1}^n d_i\) must be even, at least). For each \(k \in {\mathbb Z}^+\), let \(n_k\) denote the total number of vertices with degree *k*.

Given the graph \(G\), the epidemic evolves as a continuous-time Markov chain. Each vertex is either susceptible, infective or recovered. Every infective vertex recovers at rate \(\rho _n \geqslant 0\) and also infects each susceptible neighbour at rate \(\beta _n > 0\).

Let \(n_{\mathrm {S}}\), \(n_{\mathrm {I}}\), and \(n_{\mathrm {R}}\) denote the initial numbers of susceptible, infective and recovered vertices, respectively. Further, let \(n_{\mathrm {S},k}\), \(n_{\mathrm {I},k}\) and \(n_{\mathrm {R},k}\) respectively, be the number of these vertices with degree \(k \geqslant 0\). Thus, \(n_{\mathrm {S}}+ n_{\mathrm {I}}+ n_{\mathrm {R}}= n\) and \(n_{\mathrm {S}}= \sum _{k=0}^\infty n_{\mathrm {S},k}\), \(n_{\mathrm {I}}= \sum _{k=0}^\infty n_{\mathrm {I},k}\), \(n_{\mathrm {R}}= \sum _{k=0}^\infty n_{\mathrm {R},k}\), and \(n_{k}= n_{\mathrm {S},k}+ n_{\mathrm {I},k}+ n_{\mathrm {R},k}\). We assume that this information is given with the degree sequence. Note that all these quantities (as well as many of the quantities introduced below) depend on *n*. To lighten the notation, we usually do not indicate the *n* dependence explicitly.

## Remark 2.1

We allow \(n_{\mathrm {R}}>0\), i.e., that some vertices are “recovered” (i.e., immune) already when we start. It is often natural to take \(n_{\mathrm {R}}=0\), but one application of \(n_{\mathrm {R}}>0\) is to study the effect of vaccination; this was done in a related situation in Janson et al. (2014) and we leave the corresponding corollaries of the results below to the reader. Note that initially recovered vertices are not themselves affected by the epidemic, but they influence the structure of the graph and thus the course of the epidemic, so we cannot just ignore them.

## Remark 2.2

Note that initially susceptible, infective and recovered vertices can have different degree distributions. However, we assume that given the vertex degrees, the connections in the graph are made at random, independently of the initial status of the vertices. Equivalently, if we first construct the connections at random, we assume that the initially infective and recovered vertices are selected at random, where we may select on the basis of their degrees, but not on any other properties of the graph.

For example, if some individuals are vaccinated before the outbreak, and thus are regarded as initially recovered as discussed in Remark 2.1, then our model covers the case when the vaccinated individuals are chosen uniformly at random, as well as the case when vaccination is directed at high-risk groups and individuals are vaccinated with a probability depending on their degree (number of contacts), but the model does not include more complicated vaccination schemes that take into account also, e.g., the degrees of the contacts.

Similarly, if the disease has been spreading for some time before we start our calculations, then there are correlations because an infected vertex that was not initially infected has to be connected to an infected or recovered vertex (the one that infected it); thus our model is not directly applicable. As suggested by an anonymous referee, if we know the history so far of the epidemic, this can be handled by removing those edges that have tried to infect (whether to a susceptible individual or not); the remaining network is uniformly random with given vertex degrees and our model applies to it.

Note that the basic reproductive ratio \({\mathcal R}_0\) determines the approximate geometric growth rate of the disease during the early stages of the epidemic. The value \({\mathcal R}_0 =1\) is therefore the threshold for the epidemic to take off in the population, in the sense that, if \({\mathcal R}_0 > 1\), then a macroscopic fraction of the susceptibles can be infected (Andersson 1999; Newman 2002; Volz 2008; Bohman and Picollelli 2012; Janson et al. 2014). Here we will consider the case where \({\mathcal R}_0 = 1 + \omega (n) n^{-1/3}\), with \(\omega (n)\) tending to infinity slowly (slower than \(n^{1/3}\)) as \(n \rightarrow \infty \).

We consider asymptotics as \(n \rightarrow \infty \), and all unspecified limits below are as \(n \rightarrow \infty \). Throughout the paper we use the notation \(o_{\mathrm p}\) in a standard way, as in Janson (2011). That is, for a sequence of random variables \((Y^{(n)})_1^\infty \) and real numbers \((a_n)_1^\infty \), ‘\(Y^{(n)}= o_p(a_n)\)’ means \(Y^{(n)}/a_n \overset{\mathrm {p}}{\longrightarrow }0\). Similarly, \(Y^{(n)}=O_{\mathrm p}(1)\) means that, for every \(\varepsilon >0\), there exists \(K_\varepsilon \) such that \(\mathbb {P}{}(|Y^{(n)}|>K_\varepsilon )<\varepsilon \) for all *n*. Given a sequence of events \((A_n)_1^\infty \), \(A_n\) is said to hold w.h.p. (with high probability) if \(\mathbb {P}{}(A_n) \rightarrow 1\).

- (D1) \(D_{\mathrm {S},n}\) (the degree of a randomly chosen susceptible vertex) converges in distribution to a probability distribution \((p_k)_{k = 0}^\infty \) with a finite and positive mean \(\lambda :=\sum _{k=0}^\infty kp_k\), i.e.
  $$\begin{aligned} \frac{n_{\mathrm {S},k}}{n_{\mathrm {S}}} \rightarrow p_k, \quad k \geqslant 0. \end{aligned}$$(2.4)
- (D2) The third power \(D_{\mathrm {S},n}^3\) is uniformly integrable as \(n\rightarrow \infty \). That is, given \(\varepsilon > 0\), there exists \(M > 0\) such that, for all *n*,
  $$\begin{aligned} \sum _{k > M}\frac{k^3n_{\mathrm {S},k}}{n_{\mathrm {S}}} < \varepsilon . \end{aligned}$$(2.5)
- (D3) The second moment of the degree of a randomly chosen vertex is uniformly bounded, i.e. \(\sum _{k=0}^\infty k^2n_{k}= O(n)\).
- (D4) As \(n \rightarrow \infty \),
  $$\begin{aligned} \alpha _n \rightarrow 0 \quad \mathrm {and} \quad n_{\mathrm {S}}\alpha _n^3 \rightarrow \infty . \end{aligned}$$(2.6)
- (D5) The total degree \(\sum _{k=0}^\infty kn_{\mathrm {I},k}\) of initially infective vertices satisfies
  $$\begin{aligned} \sum _{k=0}^\infty kn_{\mathrm {I},k}=o(n), \end{aligned}$$(2.7)
  and the limit
  $$\begin{aligned} \nu {:=} \lim _{n \rightarrow \infty }\frac{1}{n_{\mathrm {S}}\alpha _n^2}\sum _{k=0}^\infty kn_{\mathrm {I},k}\in [0,\infty ] \end{aligned}$$(2.8)
  exists (but may be 0 or \(\infty \)). Furthermore, either \(\nu =0\) or
  $$\begin{aligned} d_{\mathrm {I},*}:=\max \{k:n_{\mathrm {I},k}\geqslant 1\}=o\left( \sum _{k=0}^{\infty }kn_{\mathrm{I},k}\right) . \end{aligned}$$(2.9)
- (D6) We have \(p_0 + p_1 + p_2 < 1\).
- (D7) \(\liminf _{{n\rightarrow \infty }} n_S/n>0\).


## Remark 2.3

Let \(G^*=G^* (n, (d_i)_1^n )\) be the random multigraph with given degree sequence \((d_i)_1^n\) defined by the *configuration model*: we take a set of \(d_i\) half-edges for each vertex *i* and combine half-edges into edges by a uniformly random matching (see e.g. Bollobás 2001). Conditioned on the multigraph being simple, we obtain \(G = G (n, (d_i)_1^n )\), the uniformly distributed random graph with degree sequence \((d_i)_1^n\). The configuration model has been used in the study of epidemics in a number of earlier works, see, for example, Andersson (1998), Ball and Neal (2008), Britton et al. (2007), Decreusefond et al. (2012), Bohman and Picollelli (2012). As in many other papers, including Janson et al. (2014), we prove our results for the SIR epidemic on \(G^*\), and, by conditioning on \(G^*\) being simple, we then deduce that these results also hold for the SIR epidemic on \(G\). The results below thus hold for both the random multigraph \(G^*\) and the random simple graph \(G\).

This argument relies on the probability that \(G^*\) is simple being bounded away from zero as \(n \rightarrow \infty \); by the main theorem of Janson (2009b) (see also Janson 2014) this occurs provided condition (D3) holds. Most of the results below are of the “w.h.p.” type (or can be expressed in this form); then this transfer to the simple graph case is routine and will not be commented on further. The exception is Theorem 2.5(iii), where we obtain a limiting probability strictly between 0 and 1, and we therefore need a more complicated argument, see Sect. 4; we also use an extra assumption in this case.
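The construction in this remark can be sketched in a few lines. The code below is a minimal illustration, with hypothetical function names: it builds the multigraph \(G^*\) by shuffling the list of half-edges and pairing consecutive entries, which yields a uniformly random matching, and then obtains a simple graph by resampling until no loop or multiple edge is present. The `max_tries` cap is an assumption added here to keep the loop finite.

```python
import random


def configuration_multigraph(degrees, rng=random):
    """Pair half-edges uniformly at random; returns a list of edges (u, v)."""
    if sum(degrees) % 2:
        raise ValueError("the sum of the degrees must be even")
    # One entry per half-edge, labelled by the vertex it is attached to.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(half_edges)
    # Consecutive entries of a uniform shuffle form a uniform perfect matching.
    return list(zip(half_edges[0::2], half_edges[1::2]))


def is_simple(edges):
    """True if the edge list has no loops and no multiple edges."""
    seen = set()
    for u, v in edges:
        key = (min(u, v), max(u, v))
        if u == v or key in seen:
            return False
        seen.add(key)
    return True


def sample_simple_graph(degrees, rng=random, max_tries=10_000):
    """Condition the multigraph on being simple by rejection sampling."""
    for _ in range(max_tries):
        edges = configuration_multigraph(degrees, rng)
        if is_simple(edges):
            return edges
    raise RuntimeError("no simple graph found within max_tries")
```

Rejection sampling is practical here for the same reason the transfer argument works: the probability that \(G^*\) is simple stays bounded away from zero under (D3).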

We now state our main result, that, under the conditions above, the epidemic is either very small, or of a size at least approximately proportional to \(n\alpha _n\) (and thus to \(n({\mathcal R}_0-1)\)). As just said, the theorem holds for both the multigraph \(G^*\) and the simple graph \(G\).

## Theorem 2.4

Suppose that (D1)–(D7) hold.

- (i) If \(\nu =0\), then there exists a sequence \(\varepsilon _n\rightarrow 0\) such that, for each *n*, w.h.p. one of the following holds.
  - (a) \({\mathcal Z}/n_{\mathrm {S}}\alpha _n < \varepsilon _n \) (the epidemic is small and ends prematurely).
  - (b) \(|{\mathcal Z}/n_{\mathrm {S}}\alpha _n -2\lambda /\lambda _3| < \varepsilon _n\) (the epidemic is large and its size is well concentrated).
- (ii) If \(0<\nu <\infty \), then \({\mathcal Z}/n_{\mathrm {S}}\alpha _n \overset{\mathrm {p}}{\longrightarrow }\lambda (1 + \sqrt{1+2\nu \lambda _3})/\lambda _3\).
- (iii) If \(\nu =\infty \), then
  $$\begin{aligned} \frac{{\mathcal Z}}{\bigl (n_{\mathrm {S}}\sum _{k=0}^\infty kn_{\mathrm {I},k}\bigr )^{1/2}} \overset{\mathrm {p}}{\longrightarrow }\frac{\sqrt{2}\,\lambda }{\sqrt{\lambda _3}} \end{aligned}$$(2.16)

Thus, (2.17) says that, except in the case (i)(a), the total variation distance between the degree distribution \(({\mathcal Z}_k/{\mathcal Z})\) of the vertices that get infected and the size-biased distribution \((kp_k/\lambda )\) converges to 0 in probability.

Note that case (i) of Theorem 2.4 says that, for a range of initial values of the number of infective half-edges (viz. when \(\nu =0\)), if the epidemic takes off at all, then it has approximately the size \((2\lambda /\lambda _3)n_{\mathrm {S}}\alpha _n\). Hence, in this range, the size of the epidemic does (to the first order) not depend on the initial number of infective half-edges (only the probability of a large outbreak does), so this can be seen as the “natural” size of an epidemic. This also means that in this range, most of the outbreak can be traced back to a single initial infective half-edge.

However, when the initial number of infective half-edges gets larger, the many small outbreaks coming from the different initially infective half-edges will add up to a substantial outbreak. So there is a threshold where this bulk of combined small outbreaks is of about the same size as the “natural” size of a large outbreak. The value \(\nu \) is, in the limit as \(n \rightarrow \infty \), the ratio of the initial number divided by this threshold, so it shows, roughly, whether the combined small outbreaks give a large contribution to the outbreak or not. Our theorem then shows that, if the initial number of infective half-edges is larger (to be precise, \(\nu >0\)), then they force a larger outbreak, with a size that is proportional to the square root of the initial number of infective half-edges in the range \(\nu =\infty \). (For \(0<\nu <\infty \), there is a smooth transition between the two extremal cases.)
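The smooth transition can be checked directly from the three limits in Theorem 2.4. The short calculation below is a consistency check only, not part of the proof; the symbol \(f\) is introduced here for the limit in case (ii). Write

$$\begin{aligned} f(\nu ) := \frac{\lambda \bigl (1 + \sqrt{1+2\nu \lambda _3}\bigr )}{\lambda _3}. \end{aligned}$$

At \(\nu = 0\) this gives \(f(0) = 2\lambda /\lambda _3\), the constant in case (i)(b). For large \(\nu \), we have \(f(\nu ) \sim \lambda \sqrt{2\nu /\lambda _3}\), and substituting \(\nu \approx \sum _{k} kn_{\mathrm {I},k}/(n_{\mathrm {S}}\alpha _n^2)\) from (2.8) yields

$$\begin{aligned} n_{\mathrm {S}}\alpha _n f(\nu ) \sim n_{\mathrm {S}}\alpha _n \,\lambda \sqrt{\frac{2}{\lambda _3}}\, \frac{\bigl (\sum _{k} kn_{\mathrm {I},k}\bigr )^{1/2}}{n_{\mathrm {S}}^{1/2}\alpha _n} = \frac{\sqrt{2}\,\lambda }{\sqrt{\lambda _3}} \Bigl (n_{\mathrm {S}}\sum _{k} kn_{\mathrm {I},k}\Bigr )^{1/2}, \end{aligned}$$

which is the scaling in (2.16). The factor \(\alpha _n\) cancels, which is why the size in the range \(\nu = \infty \) depends only on \(n_{\mathrm {S}}\) and the initial number of infective half-edges.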

The following result gives conditions for the occurrence of a large epidemic in Theorem 2.4(i). In anticipation of later notation, let \(X_{\mathrm {I},0}:= \sum _{k=0}^\infty k n_{\mathrm {I},k}\) be the total degree of initially infective vertices (i.e. the total number of initially infective half-edges).

## Theorem 2.5

- (i) If \(\alpha _n X_{\mathrm {I},0} \rightarrow 0\), then \({\mathcal Z}= o_{\mathrm p}(\alpha _n^{-2})= o_{\mathrm p}(n_{\mathrm {S}}\alpha _n)\), and thus case (i)(a) in Theorem 2.4 occurs w.h.p.
- (ii) If \(\alpha _n X_{\mathrm {I},0} \rightarrow \infty \) then case (i)(b) in Theorem 2.4 occurs w.h.p.
- (iii) Suppose that \(\alpha _n X_{\mathrm {I},0}\) is bounded above and below. In the simple graph case, assume also that \(\sum _{k\geqslant 1} k^2 n_{\mathrm {I},k}=o(n)\) and \(\sum _{k\geqslant \alpha _n^{-1}} k^2 n_{\mathrm {R},k}=o(n)\). Then both cases (i)(a) and (i)(b) in Theorem 2.4 occur with probabilities bounded away from 0 and 1. Furthermore, if \(d_{\mathrm {I},*}=o(X_{\mathrm {I},0})\), then the probability that case (i)(a) in Theorem 2.4 occurs is
  $$\begin{aligned} \exp \left( -\frac{\lambda _2+\lambda + \sum _{k=0}^\infty k n_{\mathrm {R},k}/n_{\mathrm {S}}}{\lambda _2\lambda _3} \alpha _n X_{\mathrm {I},0}\right) +o(1). \end{aligned}$$(2.18)
  Moreover, in the case the epidemic is small, \({\mathcal Z}=O_{\mathrm p}\bigl (\alpha _n^{-2}\bigr )\).

Note that \(\sum _{k=0}^\infty k n_{\mathrm {R},k}/n_{\mathrm {S}}\) in (2.18) is bounded because of (D3) and (D7), and that (2.18) holds in cases (i) and (ii) too. A more complicated formula extending (2.18) holds also in the case when the condition \(d_{\mathrm {I},*}=o(X_{\mathrm {I},0})\) fails, see (4.63) in Remark 4.4.
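For a feel of how (2.18) behaves, the following minimal sketch evaluates its leading term, dropping the \(o(1)\) correction. The function name and argument names are hypothetical; `lam`, `lam2` and `lam3` stand for \(\lambda \), \(\lambda _2\) and \(\lambda _3\), and `r` stands for \(\sum _{k} k n_{\mathrm {R},k}/n_{\mathrm {S}}\). The numerical values in the example are illustrative only.

```python
import math


def prob_small_epidemic(alpha_n, x_i0, lam, lam2, lam3, r=0.0):
    """Leading term of (2.18): probability that the epidemic stays small.

    alpha_n -- the near-criticality parameter alpha_n
    x_i0    -- total degree X_{I,0} of the initially infective vertices
    r       -- sum_k k n_{R,k} / n_S (0 if nobody is initially recovered)
    """
    rate = (lam2 + lam + r) / (lam2 * lam3)
    return math.exp(-rate * alpha_n * x_i0)


if __name__ == "__main__":
    # Illustrative values only: the product alpha_n * X_{I,0} drives the answer.
    for x_i0 in (1, 10, 100, 1000):
        print(x_i0, prob_small_epidemic(0.01, x_i0, lam=2.0, lam2=4.0, lam3=8.0))
```

The output tends to 1 as \(\alpha _n X_{\mathrm {I},0} \rightarrow 0\) and to 0 as \(\alpha _n X_{\mathrm {I},0} \rightarrow \infty \), matching parts (i) and (ii) of Theorem 2.5.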

## Remark 2.6

## Remark 2.7

The condition (2.7) that the total degree of initially infective vertices is *o*(*n*) is, by (D3) and the Cauchy–Schwarz inequality, equivalent to \(n_I=o(n)\), at least if we ignore isolated infective vertices. Note that the opposite case, when \(n_I/n\) has a strictly positive limit, is treated in Janson et al. (2014, Theorems 2.6 and 2.7) (under otherwise similar assumptions).

## Remark 2.8

The assumption (2.9) (which is required only when \(\nu >0\)) says that no single infective vertex has a significant fraction of the total infective degree.

## Remark 2.9

## Remark 2.10

We saw in Remark 2.9 that (D1), (D2) and (D4) imply (2.21). Since \(n-n_0\leqslant \sum _{k=0}^\infty kn_k\), it follows that \(n-n_0=O(n_{\mathrm {S}})\). Hence, assumption (D7) is needed only to exclude the rather trivial case that almost all of the population consists of isolated infective vertices, which cannot spread the epidemic. Note also that (D7) implies that it does not matter whether we use \(n_{\mathrm {S}}\) or *n* in estimates such as (2.11).

### 2.1 *G*(*n*, *p*) and *G*(*n*, *m*)

The results above apply to the graphs \(G(n,p)\) and \(G(n,m)\) by conditioning on the sequence of vertex degrees (which are now random), since given the vertex degrees, both \(G(n,p)\) and \(G(n,m)\) are uniformly distributed over all (simple) graphs with these vertex degrees. Moreover, if \({n\rightarrow \infty }\) and \(p\sim \lambda /n\), or \(m\sim n\lambda /2\), for some \(\lambda >0\), then the degree distribution is asymptotically Poisson \({\text {Po}}(\lambda )\). For \(G(n,p)\), this leads to the following result.

## Corollary 2.11

- (i) If \(\mu =0\), then there exists a sequence \(\varepsilon _n\rightarrow 0\) such that for each *n*, w.h.p. one of the following holds.
  - (a) \({\mathcal Z}/(n \gamma _n) < \varepsilon _n \).
  - (b) \(|{\mathcal Z}/(n \gamma _n) -2| < \varepsilon _n\).

  Moreover, the probability that (a) holds is
  $$\begin{aligned} \exp \bigl (-(1+\lambda ^{-1})\gamma _n n_I\bigr ) +o(1). \end{aligned}$$(2.24)
  In particular, (a) holds w.h.p. if \(\gamma _nn_{\mathrm {I}}\rightarrow 0\) and (b) holds w.h.p. if \(\gamma _nn_{\mathrm {I}}\rightarrow \infty \).
- (ii) If \(0<\mu <\infty \), then \({\mathcal Z}/(n \gamma _n) \overset{\mathrm {p}}{\longrightarrow }1 + \sqrt{1 + 2\mu }\).
- (iii) If \(\mu = \infty \), then
  $$\begin{aligned} \frac{{\mathcal Z}}{(n_{\mathrm {S}}n_{\mathrm {I}})^{1/2}} \overset{\mathrm {p}}{\longrightarrow }\sqrt{2}. \end{aligned}$$

## Proof

*k*; for convenience, we use the Skorohod coupling theorem (Kallenberg 2002, Theorem 4.30) so we may assume that this holds a.s. for each *k*; thus (2.4) holds a.s. Similarly we may assume that \(\sum _k k^4n_{k}/n\) converges a.s., and then (D2) and (D3) hold a.s. Furthermore, \(\alpha _n\) is now random, and it is easy to see from (2.2) that

## 3 Proof of Theorem 2.4

### 3.1 Simplifying assumptions

We assume for convenience that \(n_{\mathrm {I}}= o(n)\). In fact, we may assume that \(n_{\mathrm {I},0}=0\) by deleting all initially infective vertices of degree 0, since these are irrelevant; then \(n_{\mathrm {I}}=o(n)\) as a consequence of (2.7). Note that this will not affect \({\mathcal R}_0\), \(\alpha _n\), \(\nu \) or the other constants and assumptions above.

Similarly, we assume that initially there are no recovered vertices, that is \(n_{\mathrm {R}}= 0\). It is easy to modify the proofs below to handle the case \(n_{\mathrm {R}}\geqslant 1\). Alternatively, we may observe that our results in the case \(n_{\mathrm {R}}= 0\) imply the corresponding results for general \(n_{\mathrm {R}}\) by the following argument. (See Janson 2009a for similar arguments in a related situation.) We replace each initially recovered vertex of degree *k* by *k* separate susceptible vertices of degree 1, so there are a total of \(X_{\mathrm {R},0}:=\sum _{k=0}^\infty kn_{\mathrm {R},k}\) additional “fake” susceptible vertices of degree 1; this will not change the course of the epidemic (in the multigraph case) except that some of these fake susceptible vertices will be infected. (Note that they can never infect anyone else.) The alteration will not affect \({\mathcal R}_0\), although \(\alpha _n\) and the asymptotic distribution \((p_k)\) will be modified. Note that \(X_{\mathrm {R},0}=O(n_{\mathrm {S}})\) by (D3) and (D7); by considering suitable subsequences we may thus assume that \(X_{\mathrm {R},0}/n_{\mathrm {S}}\rightarrow r\) for some \(r\in [0,\infty )\). It is easy to see that the modified degree distribution satisfies all the assumptions above and that, if we use a prime to indicate quantities after the replacement, then \(n_{\mathrm {S}}'=n_{\mathrm {S}}+X_{\mathrm {R},0}\sim (1+r)n_{\mathrm {S}}\), \(\alpha _n'\sim \alpha _n/(1+r)\), \(n_{\mathrm {S}}'\alpha _n'=n_{\mathrm {S}}\alpha _n\), \(\nu '=(1+r)\nu \), \(p_1'=(p_1+r)/(1+r)\), \(p_k'=p_k/(1+r)\) for \(k\ne 1\), \(\lambda '=(\lambda +r)/(1+r)\), \(\lambda _2'=\lambda _2/(1+r)\), \(\lambda _3'=\lambda _3/(1+r)\).

If case (i)(a) in Theorem 2.4 occurs for the modified process, it occurs for the original process too, since \({\mathcal Z}\leqslant {\mathcal Z}'\), and there is nothing more to prove.

It is then easy to check that Theorem 2.4 and Theorem 2.5 for the original process both follow from these results in the case with no initially recovered vertices.

*n* if necessary. Finally, recall that in the proofs we first consider the random multigraph \(G^*\).

### 3.2 Time-changed epidemic on a random multigraph

We first study the epidemic on the configuration model multigraph \(G^*\), revealing its edges (i.e. pairing off the half-edges) while the epidemic spreads, as in Janson et al. (2014) (see other variants in Andersson 1998; Ball and Neal 2008; Decreusefond et al. 2012; Bohman and Picollelli 2012). We call a half-edge susceptible, infective or recovered according to the type of vertex it is attached to. Unpaired half-edges are said to be *free*. Initially, each vertex *i* has \(d_i\) half-edges and all of them are free.

Each free infective half-edge chooses a free half-edge at rate \(\beta _n > 0\), uniformly at random from among all the other free half-edges. Together the pair form an edge, and are removed from the set of free half-edges. If the chosen free half-edge belongs to a susceptible vertex then that vertex becomes infective. Infective vertices recover at rate \(\rho _n \geqslant 0\).

We stop the process when no infective free half-edges remain, which is the time when the epidemic stops spreading. Some infective vertices may remain but they trivially recover at i.i.d. exponential times. Some free susceptible and recovered half-edges may also remain. These could be paired uniformly to reveal the remaining edges in \(G^*\), if desired. However, this step is irrelevant for the course of the epidemic.
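The dynamics just described can be simulated directly. The sketch below is an illustration under stated assumptions, not the construction used in the proofs: it tracks only the embedded jump chain (event order, not event times), which is enough to determine how many vertices are infected, and it uses simple list scans rather than efficient data structures. The function name `simulate_sir_half_edges` and its parameters are hypothetical.

```python
import random

def simulate_sir_half_edges(deg_s, deg_i, beta, rho, seed=0):
    """Jump chain of the SIR epidemic on the configuration multigraph.

    deg_s, deg_i: degrees of initially susceptible / infective vertices.
    Returns the number of initially susceptible vertices that get infected.
    """
    rng = random.Random(seed)
    degrees = list(deg_s) + list(deg_i)
    state = ['S'] * len(deg_s) + ['I'] * len(deg_i)
    # One list entry per free half-edge, storing the vertex it is attached to.
    free = [v for v, d in enumerate(degrees) for _ in range(d)]
    infected = 0
    while True:
        inf_half = [j for j, v in enumerate(free) if state[v] == 'I']
        if not inf_half or len(free) < 2:
            break  # no infective free half-edge (or nothing left to pair with)
        pair_rate = beta * len(inf_half)
        rec_rate = rho * state.count('I')
        if rng.random() * (pair_rate + rec_rate) < pair_rate:
            # Pairing: an infective half-edge picks another free half-edge uniformly.
            j = rng.choice(inf_half)
            k = rng.randrange(len(free) - 1)
            if k >= j:
                k += 1
            partner = free[k]
            if state[partner] == 'S':
                state[partner] = 'I'
                infected += 1
            # Remove both half-edges (larger index first, swap-with-last then pop).
            for idx in sorted((j, k), reverse=True):
                free[idx] = free[-1]
                free.pop()
        else:
            # Recovery of a uniformly chosen infective vertex.
            v = rng.choice([u for u, s in enumerate(state) if s == 'I'])
            state[v] = 'R'
    return infected
```

For instance, `simulate_sir_half_edges([3] * 20, [3, 3], beta=1.0, rho=0.5, seed=1)` runs one realisation with 20 susceptible and 2 infective vertices, all of degree 3.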

In order to prove our results, we perform a time change in the process: when in a state with \(x_{\mathrm {I}}\geqslant 1\) free infective half-edges, and a total of *x* free half-edges of any type, we multiply all transition rates by \((x-1)/(\beta _n x_{\mathrm {I}})\) (this multiple is at least \(1/(2\beta _n)\), since \(x_{\mathrm {I}}\geqslant 1\) implies that \(x \geqslant 2\)). Then each free susceptible half-edge gets infected at rate 1, each infective vertex recovers at rate \(\rho _n(x-1)/(\beta _nx_{\mathrm {I}})\), and each free infective half-edge pairs off at rate \((x-1)/x_{\mathrm {I}}\).
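As a quick check of these rates (a sketch using only the quantities defined above): before the time change, a given free susceptible half-edge is chosen by each of the \(x_{\mathrm {I}}\) free infective half-edges at rate \(\beta _n/(x-1)\), each infective vertex recovers at rate \(\rho _n\), and each free infective half-edge initiates a pairing at rate \(\beta _n\). Multiplying by the time-change factor gives

$$\begin{aligned} \frac{\beta _n x_{\mathrm {I}}}{x-1}\cdot \frac{x-1}{\beta _n x_{\mathrm {I}}} = 1, \qquad \rho _n\cdot \frac{x-1}{\beta _n x_{\mathrm {I}}} = \frac{\rho _n (x-1)}{\beta _n x_{\mathrm {I}}}, \qquad \beta _n\cdot \frac{x-1}{\beta _n x_{\mathrm {I}}} = \frac{x-1}{x_{\mathrm {I}}}. \end{aligned}$$

The lower bound on the factor follows from \(x_{\mathrm {I}}\leqslant x\) and \(x\geqslant 2\):

$$\begin{aligned} \frac{x-1}{\beta _n x_{\mathrm {I}}} \geqslant \frac{x-1}{\beta _n x} = \frac{1}{\beta _n}\Bigl (1-\frac{1}{x}\Bigr ) \geqslant \frac{1}{2\beta _n}. \end{aligned}$$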

In the time changed process, let \(S_{t}\), \(I_{t}\) and \(R_{t}\) denote the numbers of susceptible, infective and recovered vertices, respectively, at time \(t \geqslant 0\). Let \(S_{t}(k)\) be the number of susceptible vertices of degree \(k\geqslant 0\) at time *t*. Then \(S_{t} = \sum _{k=0}^\infty S_{t}(k)\) is decreasing and \(R_{t}\) is increasing in *t*. Moreover, \(S_{0}(k) = n_{\mathrm {S},k}\), \(I_{0} = n_{\mathrm {I}}\) and \(R_{0} = n_{\mathrm {R}}= 0\). Also, we let \(X_{\mathrm {S},t}\), \(X_{\mathrm {I},t}\) and \(X_{\mathrm {R},t}\) be the numbers of free susceptible, infective and recovered half-edges, respectively, at time *t*. Then \(X_{\mathrm {S},t} = \sum _{k=0}^\infty k S_{t}(k)\) is decreasing, \(X_{\mathrm {S},0} = \sum _{k=0}^\infty k n_{\mathrm {S},k}\), \(X_{\mathrm {I},0} = \sum _{k=0}^\infty k n_{\mathrm {I},k}\) and \(X_{\mathrm {R},0} = 0\) (by our simplifying assumptions in Sect. 3.1).

### 3.3 Concentration of measure

## Theorem 3.1

The above result establishes concentration on time intervals of length \(O(\tilde{\alpha }_n)\). In Sect. 3.4, we use it to show that, for a suitable choice of \(\tilde{\alpha }_n\), the duration of the epidemic satisfies \(T^*=O(\tilde{\alpha }_n)\) w.h.p. It follows that the theorem then holds also with \(t_0=\infty \), see Remark 3.6.

The remainder of this subsection contains the proof of Theorem 3.1. We first need two lemmas concerning the evolution of the number of susceptible vertices and the total number of free half-edges.

*k* in the modified process. Then \((\tilde{S}_{t\wedge T^*}(k): k \in {\mathbb Z}^+, t \geqslant 0)\) has the same distribution as \((S_{t\wedge T^*}(k): k \in {\mathbb Z}^+, t \geqslant 0)\), and so, to prove (3.9) and (3.10), it suffices to prove that

*t*, let

## Lemma 3.2

## Proof of Lemma 3.2

We enumerate the initially susceptible vertices as \(i = 1,2,\ldots ,n_{\mathrm {S}}\) and denote by \(d_{\mathrm {S},i}\) the degree of initially susceptible vertex *i*. Let \(L_i\) be the time at which initially susceptible vertex *i* becomes infective (in the modified process). Then each \(L_i\) has exponential distribution with rate \(d_{\mathrm {S},i}\), and the \(L_i\) (\(i = 1,2,\ldots ,n_{\mathrm {S}}\)) are all independent of one another. It follows that, for each fixed *t*, the random variables \(F_{i,t} := d_{\mathrm {S},i}^2\bigl (\mathbbm {1}_{\{L_i > t\}} - e^{-td_{\mathrm {S},i}}\bigr )\) each have mean zero and are all independent. Note that \(W_t = \sum _{i = 1}^{n_{\mathrm {S}}} F_{i,t}\).

*n* sufficiently large, \(d_{\mathrm {S},*}\leqslant (n\tilde{\alpha }_n)^{1/2}\), and then for any \(u \geqslant 2c_0 t_0\) and \(a = u (n \tilde{\alpha }_n)^{1/2}d_{\mathrm {S},*}\), by (3.17),

*n* sufficiently large and for each \(t \leqslant t_0 \tilde{\alpha }_n\) and \(u \geqslant 2c_0t_0\),

*t*, each of the sums \(\sum _{k=0}^\infty k^2 \tilde{S}_{t}(k)\) and \(\sum _{k=0}^\infty k^2 n_{\mathrm {S},k}e^{-kt}\) is also decreasing in *t*. Thus, for any \(0 \leqslant l < \omega _n\),

*n* sufficiently large and \(u \geqslant 2c_0t_0\), by (3.19),

*n*, \(2\omega _n\geqslant e^{c_0t_0}\), and then (3.21) holds trivially for \(u<2c_0t_0\) too. Hence, for large *n*,

We now prove a concentration of measure result for the total number \(X_{t}\) of free half-edges.

## Lemma 3.3

## Proof

*t* in the modified process. Then it suffices to prove that

*j* to \(j -2\) at rate

*j*. By Janson and Luczak (2009, Lemma 6.1), with \(d = 2\), \(\gamma = 1\), and \(x = \sum _{k=0}^\infty kn_{k}- 1\),

## Proof of Theorem 3.1

*k*, \((\tilde{S}_{t}(k))\) is a linear death chain starting from \(n_{\mathrm {S},k}\) and decreasing by 1 at rate *kx* when in state *x*, and so

*x* free half-edges, infective vertices recover at rate \(\rho _n (x-1)/(\beta _n x_{\mathrm {I}})\). Also, each free recovered half-edge is chosen to be paired at rate 1, and thus the number of recovered free half-edges decreases by 1 at rate \(x_{\mathrm {R}}\). Hence, for any \(t\geqslant 0\),

*k* free half-edges recovers. Also, each recovered half-edge or vertex was either initially infective or was initially susceptible and then became infected prior to recovery. Hence

Finally, (3.12) follows from (3.10), (3.11), (3.23), and the fact that \(X_{\mathrm {I},t} = X_{t} - X_{\mathrm {S},t} - X_{\mathrm {R},t}\). \(\square \)

### 3.4 Duration of the time-changed epidemic

We stated Theorem 3.1 using a rather arbitrary \(\tilde{\alpha }_n\), but from now on we fix it as follows. We distinguish between the cases \(\nu <\infty \) and \(\nu =\infty \), and introduce some further notation:

*f*, and that \(f(t)>0\) on \((0,\varkappa )\) and \(f(t)<0\) on \((\varkappa ,\infty )\); we have \(f(0)=0\) if \(\nu =0\) but \(f(0)>0\) if \(\nu >0\). Note further that in the case \(\nu =\infty \), \(\tilde{\alpha }_n/\alpha _n\rightarrow \infty \) by (2.8); in particular, \(\tilde{\alpha }_n\geqslant \alpha _n\) except possibly for some small *n* that we will ignore. Moreover, \(\tilde{\alpha }_n\rightarrow 0\) by (D4) (\(\nu <\infty \)) or (D5) (\(\nu =\infty \)).

We have verified that our choice of \(\tilde{\alpha }_n\) satisfies the conditions of Theorem 3.1, so Theorem 3.1 applies. We use this to show a more explicit limit result for \(X_{\mathrm {I},t}\).

## Lemma 3.4

## Proof

The idea is to combine Theorem 3.1 with a Taylor expansion of \(h_{\mathrm {I},n}(t)\) around zero.

We can now find (asymptotically) the duration \(T^*\), except that when \(\nu =0\), we cannot yet say whether the epidemic is very small or rather large.

## Lemma 3.5

- (i) If \(0<\nu \leqslant \infty \), then \(T^*/\tilde{\alpha }_n\overset{\mathrm {p}}{\longrightarrow }\varkappa \).
- (ii) If \(\nu =0\), then for every \(\varepsilon > 0\), w.h.p., either
  - (a) \(0 \leqslant T^*/\alpha _n < \varepsilon \), or
  - (b) \(|T^*/\alpha _n - \varkappa | < \varepsilon \).

## Proof

Take \(t_0=2\varkappa \). Then \(f(t_0)<0\), so (3.43) implies that \(\mathbb {P}{}(T^*/\tilde{\alpha }_n\geqslant t_0)\rightarrow 0\), i.e., \(T^*< t_0 \tilde{\alpha }_n\) w.h.p. Consequently, we may w.h.p. take \(t=T^*/\tilde{\alpha }_n\) in (3.43) and conclude \(\bigl |X_{\mathrm {I},T^*}/n\tilde{\alpha }_n^2 - f(T^*/\tilde{\alpha }_n)\bigr | \overset{\mathrm {p}}{\longrightarrow }0\). Since \(X_{\mathrm {I},T^*}=0\) by definition, this says \(f(T^*/\tilde{\alpha }_n) \overset{\mathrm {p}}{\longrightarrow }0\).

Consider *f*(*t*) for \(t\in [0,\infty )\). If \(\nu >0\), then *f*(*t*) has a unique zero at \(\varkappa \), and is bounded away from 0 outside every neighbourhood of \(\varkappa \); hence \(T^*/\tilde{\alpha }_n\overset{\mathrm {p}}{\longrightarrow }\varkappa \) follows. If \(\nu =0\), \(f(t)=0\) both for \(t=0\) and \(t=\varkappa \), and (ii) follows. \(\square \)

### 3.5 Final size

## Proof of Theorem 2.4

*k* that ever become infected, and \({\mathcal Z}=\sum _k{\mathcal Z}_k\). For each \(k \in {\mathbb Z}^+\),

## 4 Proof of Theorem 2.5

We continue to use the simplifying assumptions in Sect. 3.1. We consider the epidemic in the original time scale and construct it from independent exponential random variables. At time \(t=0\), we allocate each of the \(n_{\mathrm {I}}\) initially infective vertices an \({\text {Exp}}(\rho _n)\) recovery time. We also give each free infective half-edge at time 0 an \({\text {Exp}}(\beta _n)\) pairing time. If the pairing time for a free infective half-edge is less than the recovery time of its parent vertex, then we colour that free half-edge red. Otherwise, we colour it black. We now wait until the first recovery or pairing time. At a recovery time, we change the status of the corresponding vertex to recovered. At a pairing time of a red free half-edge, we choose another free half-edge uniformly at random. If the chosen free half-edge belongs to a susceptible vertex then that vertex becomes infective, is given an \({\text {Exp}}(\rho _n)\) recovery time, and its remaining free half-edges are given independent \({\text {Exp}}(\beta _n)\) pairing times. Then, as above, we colour red any free half-edge with pairing time less than recovery time, and colour black all other free half-edges at the chosen vertex. The process continues in this fashion until no red free half-edges remain. Note that we do nothing at the pairing time of a black free half-edge, since it is no longer infective, and so black free half-edges behave like recovered free half-edges. Also, a red free half-edge will definitely initiate a pairing event at some point (provided it has not been chosen by another red free half-edge first). However, ignoring the colourings we obtain the same process as before.
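For orientation, the probability that a given free infective half-edge is coloured red can be computed directly from the construction. This is a standard computation for two independent exponential clocks, with pairing time \(\sim {\text {Exp}}(\beta _n)\) and recovery time \(\sim {\text {Exp}}(\rho _n)\); identifying the result with the quantity \(\pi _n\) of (4.1), which is not reproduced in this section, is an assumption, although it is consistent with \(\mathbb {E}{}Z_{0,i} = d_{\mathrm {I},i}\pi _n\) in the proof of Lemma 4.1.

$$\begin{aligned} \mathbb {P}(\text {red}) = \int _0^\infty \beta _n e^{-\beta _n s}\, e^{-\rho _n s}\,\mathrm {d}s = \frac{\beta _n}{\beta _n+\rho _n}. \end{aligned}$$

Note that the colours of different half-edges at the same vertex are not independent, since they share the vertex's recovery time.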

Let \(Z_t\) be the number of red free half-edges at time \(t \geqslant 0\). Note that \(Z_t\) changes only at pairing events, but not at recovery times. (The point of the colouring is to anticipate the recoveries, which then can be ignored.) Further, let \(\bar{Z}_{m}:=Z_{T_m}\), where \(T_m\) is the time of the *m*-th pairing event (and \(T_0:=0\)), and let \(\zeta _m:=\Delta \bar{Z}_{m}:=\bar{Z}_{m}-\bar{Z}_{m-1}\). (Note that our processes are all right continuous, so \(\bar{Z}_{m}\) is the number of red free half-edges immediately after the *m*-th pairing has occurred and we have coloured any new infective free half-edges.) Thus the process stops at \(T_{m_*}\), where \(m_*:=\min \{m\geqslant 0:\bar{Z}_{m}=0\}\). (This is not exactly the same stopping condition as used earlier, but the difference does not matter; there may still be some infective half-edges, but they are black and will recover before infecting any further vertices.) Let \(\mathcal F_m =\mathcal F(T_m)\) be the corresponding discrete-time filtration generated by the coloured SIR process up to time \(T_m\).

*k* free half-edges at time \(t \geqslant 0\). Furthermore, define

We begin by showing that a substantial fraction of the initially infective half-edges are red. Recall that \(X_{\mathrm {I},0}=\sum _{k=0}^\infty k n_{\mathrm {I},k}\) is the total degree of the initially infective vertices and that \(d_{\mathrm {I},*}\) is the maximum degree among these vertices.

## Lemma 4.1

- (i) If \(d_{\mathrm {I},*}=o(X_{\mathrm {I},0})\), then \(Z_0=\pi _n X_{\mathrm {I},0}\bigl (1+o_{\mathrm p}(1)\bigr )\).
- (ii) More generally, for any \(d_{\mathrm {I},*}\), we have

$$\begin{aligned} \lim _{\delta \rightarrow 0} \limsup _{n\rightarrow \infty } \mathbb {P}\Bigl (Z_0 \leqslant \delta X_{\mathrm {I},0}\Bigr ) = 0. \end{aligned} \tag{4.2}$$

## Proof

*i*, so that \(X_{\mathrm {I},0} = \sum _{i = 1}^{n_{\mathrm {I}}} d_{\mathrm {I},i}\). We also let \(Z_{0,i}\) be the number of red free half-edges at vertex *i*, so \(Z_0 = \sum _{i = 1}^{n_{\mathrm {I}}} Z_{0,i}\), where the \(Z_{0,i}\) are independent, with \(\mathbb {E}{}Z_{0,i} = d_{\mathrm {I},i}\pi _n \) and \(Z_{0,i} \leqslant d_{\mathrm {I},i}\). It follows that \(\mathbb {E}{}Z_0=\sum _{i = 1}^{n_{\mathrm {I}}}\mathbb {E}{}Z_{0,i} =\pi _nX_{\mathrm {I},0}\) and

(ii) Take any \(\delta > 0\) with \(\delta <\frac{1}{2}\min _n \pi _n\).

## Lemma 4.2

## Proof

## Proof of Theorem 2.5(ii)

*k*, we get on the average \(\pi _n (k-1)\) new red free half-edges; in the second case we instead lose one red free half-edge, in addition to the pairing red free half-edge that we always lose. The probability of pairing with a susceptible half-edge belonging to a vertex of degree *k* is \(kS_{T_m}(k)/X_{T_m}\) and the probability of pairing with another red free half-edge is \(\bar{Z}_{m}/X_{T_m}\). Hence, for \(m+1 \leqslant m_{**}\), using (4.13)–(4.15), (4.1) and the definition (2.1) of \({\mathcal R}_0\),

*n* is large and \(m< m_{**}\), then

*k* at a jump if a red free half-edge pairs with a free susceptible half-edge at a vertex of degree *k*, the expected square of any jump satisfies, for \(m<m_{**}\),

*n*, by assumption (D2).

Let \(W_m = \bar{Z}_{m\wedge m_{**}}-Z_0\). It follows from (4.16) and (4.17) that Lemma 4.2 applies with \(\tau =m_{**}\), \(v=c_{1}\alpha _n\) and \(w=c_{2}\).

*t* before the epidemic dies out, \(\sum _{k=0}^\infty k^2 (n_{\mathrm {S},k}- S_{t}(k)) > \delta n \alpha _n\).

To study the cases (i) and (iii), we analyse the number of red free half-edges more carefully. Let the random variable *Y*(*k*) be the number of new red free half-edges when a vertex of degree *k* is infected. Given the recovery time \(\tau \) of the vertex, \(Y(k)\sim {\text {Bin}}\bigl (k-1, 1-e^{-\beta _n\tau }\bigr )\), and, since \(\tau \sim {\text {Exp}}(\rho _n)\), the probability \(1-e^{-\beta _n\tau }\) has the Beta distribution \(B(1,\rho _n/\beta _n)\). Consequently, *Y*(*k*) has the beta-binomial distribution with parameters \((k-1,1,\rho _n/\beta _n)\). More generally, if *D* is a positive integer valued random variable, then *Y*(*D*) denotes a random variable that conditioned on \(D=k\) has the distribution *Y*(*k*). We have the following elementary result, recalling the notation (4.1).
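The beta-binomial law of *Y*(*k*) is easy to evaluate explicitly. The following sketch computes its probability mass function from the standard beta-binomial formula and checks that the mean equals \((k-1)\beta _n/(\beta _n+\rho _n)\). The function names are hypothetical, and the code assumes \(\rho _n>0\) so that the Beta parameter \(\rho _n/\beta _n\) is positive.

```python
import math

def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def beta_binomial_pmf(j, n, a, b):
    """P(Y = j) when Y | p ~ Bin(n, p) and p ~ Beta(a, b)."""
    log_binom = math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
    return math.exp(log_binom + log_beta(j + a, n - j + b) - log_beta(a, b))

def new_red_half_edges_pmf(k, beta, rho):
    """Distribution of Y(k): beta-binomial with parameters (k - 1, 1, rho / beta).

    Assumes beta > 0 and rho > 0.
    """
    n = k - 1
    return [beta_binomial_pmf(j, n, 1.0, rho / beta) for j in range(n + 1)]

# Illustrative parameter values (made up for this example).
k, beta, rho = 5, 2.0, 1.0
pmf = new_red_half_edges_pmf(k, beta, rho)
mean = sum(j * q for j, q in enumerate(pmf))
expected_mean = (k - 1) * beta / (beta + rho)
```

The mean agrees with the average of \(\pi _n(k-1)\) new red free half-edges used in the drift computation, if one takes \(\pi _n=\beta _n/(\beta _n+\rho _n)\).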

## Lemma 4.3

*D*,

## Proof

*D*. \(\square \)

*A* be a constant, and consider only \(m\leqslant M:=\lfloor A\alpha _n^{-2}\rfloor \). At pairing event *m* for \(m \leqslant M\), the number of free half-edges is at least \(\sum _k kn_{k}-2A\alpha _n^{-2}\geqslant \sum _k kn_{k}\cdot (1-A_1 n^{-1}\alpha _n^{-2})\) for some constant \(A_1\). Thus, the probability that a susceptible vertex with \(\ell \) half-edges is infected is at most, for \(A_2:=2A_1\) and large *n*,

*n*, although we omit this from the notation.) Then \(\zeta _m:=\Delta \bar{Z}_{m}\), conditioned on what has happened earlier, is stochastically dominated by \(\zeta ^+\). Hence, there exist independent copies \((\zeta ^+_m)_1^\infty \) of \(\zeta ^+\) such that \(\zeta _m\leqslant \zeta ^+_m\) for all \(m\leqslant M\) such that the epidemic has not yet stopped; furthermore, \((\zeta ^+_m)_1^\infty \) are also independent of \(Z_0\). If the epidemic stops at \(m_*<M\) (because \(Z_{m_*}=0\) so there are no more pairing events), then we for convenience extend the definition of \(\zeta _m\) and \(\bar{Z}_{m}\) to all \(m\leqslant M\) by defining \(\zeta _m:=\zeta ^+_m\) for \(m>m_*\), and still requiring \(\zeta _m=\Delta \bar{Z}_{m}\). Consequently, \(\zeta _m\leqslant \zeta _m^+\) for all \(m\leqslant M\) and thus the (possibly extended) sequence \((\bar{Z}_{m})_0^M\) is dominated by the random walk \((\bar{Z}_{m}^+)_0^M\) with \(\bar{Z}_{m}^+:=Z_0+\sum _{i=1}^m \zeta ^+_i\).

*t* by \(\bar{Z}_{t}^+:=\bar{Z}_{\lfloor t\rfloor }^+\). (We define \(\bar{Z}_{t}\) and \(\bar{Z}_{t}^-\) below in the same way.) Here the convergence is in distribution in the Skorohod space *D*[0, *A*], but we may by the Skorohod coupling theorem (Kallenberg 2002, Theorem 4.30) assume that the processes for different *n* are coupled such that a.s. (4.35) holds uniformly on [0, *A*].

## Proof of Theorem 2.5(i)

*A*] a.s. For any fixed \(\delta >0\), the right-hand side is a.s. negative for some \(t\in [0,\delta ]\), and thus w.h.p. \(\alpha _n\bar{Z}_{t\alpha _n^{-2}}^+ <0\) for some \(t\in [0,\delta ]\). Since \(Z_m\leqslant \bar{Z}_{m}\) it follows that w.h.p. \(m_*\leqslant \delta \alpha _n^{-2}\), i.e. the epidemic stops with \(Z_m=0\) after at most \(\delta \alpha _n^{-2}\) infections. Hence, w.h.p.

## Proof of Theorem 2.5(iii) in the multigraph case

*m*-th pairing event, when we are to pair an infective red half-edge \(y_m\), if \(x_m\) still is free and \(x_m\ne y_m\), we pair \(y_m\) with \(x_m\); otherwise we resample and pair \(y_m\) with a uniformly chosen free half-edge \(\ne y_m\). Furthermore, we let \(\zeta _m^-:=-1\) if \(x_m\) is initially infective, and if \(x_m\) belongs to an initially susceptible vertex of degree *k*, we let \(\zeta _m^-\) be a copy of \(Y(k)-1\) (independent of the history); if \(x_m\) still is susceptible at the *m*-th pairing event (and thus free, so we pair with \(x_m\)), we may assume that \(\zeta _m^-:=\zeta _m\), the number of new red free half-edges minus 1. Note that \((\zeta _m^-)_{m\geqslant 1}\) is an i.i.d. sequence of random variables with the distribution \(Y(D^-)-1\), where \(D^-\) has the distribution obtained by taking \(A_2=0\) in (4.26); furthermore, \((\zeta ^-_m)_1^\infty \) are independent of \(Z_0\). Let \(\bar{Z}_{m}^-:=Z_0+\sum _{i=1}^m\zeta _i^-\). Note that (4.27)–(4.34) hold for \(D^-\) and \(\zeta _m^-\) too (with some simplifications), and thus, in analogy with (4.36),

Let \(\zeta _m':=\zeta _m^--\zeta _m\). Thus \(\zeta _m'=0\) if \(x_m\) is susceptible at time \(T_m\). If \(x_m\) was initially susceptible, with degree *k*, but has been infected, then \(\zeta _m'\leqslant \zeta _m^-+2\leqslant k\). If \(x_m\) was initially infected, then \(\zeta _m^-=-1\) and thus \(\zeta _m'\leqslant \zeta _m^-+2\leqslant 1\).

Consider as above only \(m\leqslant M:=\lfloor A\alpha _n^{-2}\rfloor \), for some (large) constant \(A>0\). For \(m>m_*\), when the epidemic has stopped, we have defined \(\zeta _m=\zeta _m^+\). Since \(\zeta _m^\pm \overset{\mathrm {d}}{=}Y(D^\pm )-1\) and \(D^-\) is stochastically dominated by \(D^+\), we may in this case assume that \(\zeta _m=\zeta _m^+\geqslant \zeta _m^-\), and thus \(\zeta _m'\leqslant 0\).

For \(m\leqslant M\), the number of initially susceptible half-edges that have been infected is at most, using (2.11) and (2.6), \( m d_{\mathrm {S},*}= O\bigl (\alpha _n^{-2}d_{\mathrm {S},*}\bigr ) = o\bigl (\alpha _n^{-2}n^{1/3}\bigr )=o(n)\). Hence the number of free half-edges at \(T_m\) is at least \(\sum _k kn_{\mathrm {S},k}-m d_{\mathrm {S},*}=\lambda n-o(n)\geqslant {}c_{3}n \) for \(c_{3}:=\lambda /2\) if *n* is large enough. It follows that the probability that a given initially susceptible vertex of degree *k* has been infected before \(T_m\) is at most \(mk/(c_{3}n)\), and the probability that one of its half-edges is chosen as \(x_m\) is at most \(k/(c_{3}n)\) for every \(m\leqslant M\). Similarly, the probability that \(x_m\) is initially infective is at most \(X_{\mathrm {I},0}/(c_{3}n)\).

*t*, and thus by continuity a.s. for all \(t\in [0,A]\).

*D*[0, *A*], it follows that

*Y*. Since the random variable *Y* has a continuous distribution, (4.43) implies that, uniformly in \(x\in \mathbb {R}\),

*a* and *b*, and condition on the event \(\mathcal E_n^{a,b}:=\{a\leqslant \alpha _n Z_0 <b\}\), then for every subsequence such that \(\liminf _{n\rightarrow \infty }\mathbb {P}{}(\mathcal E_n^{a,b})>0\), the arguments above leading to (4.36), (4.38) and (4.42)–(4.44) still hold. (We need \(\liminf _{n\rightarrow \infty }\mathbb {P}{}(\mathcal E_n^{a,b})>0\) in order to get a conditional version of (4.40).) Consequently, \( \mathbb {P}{}\bigl (Y_n \leqslant x\mid \mathcal E_n^{a,b}\bigr )=\mathbb {P}{}( Y\leqslant x)+o(1) \), and thus, recalling that \(B_t\) is independent of \(Z_0\),

*a* and *b*,

*C*; thus \(0\leqslant \alpha _nZ_0\leqslant \alpha _nX_{\mathrm {I},0}\leqslant C\). Let \(\delta >0\) and divide the interval [0, *C*] into a finite number of subintervals \([a_j,b_j]\) with lengths \(b_j-a_j<\delta \). By summing (4.47) for these intervals, we obtain

*Y* from (4.43),

If \(m_*>M\), consider again \(m_{**}\) defined by (4.13) (but taking minimum over \(m \geqslant M\)), for a sufficiently small \(\delta >0\). Note that, as in the proof of (ii), if \(\bar{Z}_{m_{**}}>0\), then (4.19) holds and w.h.p. \(T^*>\varepsilon \alpha _n\) for some small \(\varepsilon >0\), and thus w.h.p. (b) in Theorem 2.4(i) holds. In other words, for some small \(\varepsilon >0\), if \(m_*\leqslant M\), then \({\mathcal Z}<\varepsilon n \alpha _n\), and if \(m_*>M\) and \(\bar{Z}_{m_{**}}>0\), then \({\mathcal Z}>\varepsilon n\alpha _n\) w.h.p.

## Proof of Theorem 2.5(iii) in the simple graph case

As said in Sect. 2, this result for the random simple graph \(G\) does not follow immediately from the multigraph case (as the other results in this paper do). We use here instead the argument for the corresponding result in Janson et al. (2014, Section 6), with minor modifications as follows. We continue to work with the random multigraph \(G^*\). Also, we now allow initially recovered vertices, since our trick in Sect. 3.1 to eliminate them does not work for the simple graph case.

Fix a sequence \(\varepsilon _n\rightarrow 0\) such that Theorem 2.4(i) holds, and let \({\mathcal L}\) be the event that there are less than \(\varepsilon _n^{1/2}n_{\mathrm {S}}\alpha _n\) pairing events; note that if \({\mathcal L}\) occurs, then \({\mathcal Z}<\varepsilon _n^{1/2}n_{\mathrm {S}}\alpha _n\), while if \({\mathcal L}\) does not occur, w.h.p. \({\mathcal Z}>\varepsilon _nn_{\mathrm {S}}\alpha _n\) by a simple argument (using e.g. Chebyshev’s inequality); hence \({\mathcal L}\) says w.h.p. that the epidemic is small.

Furthermore, let *W* be the number of loops and pairs of parallel edges in \(G^*\); thus \(G^*\) is simple if and only if \(W=0\), and we are interested in the conditional probability \(\mathbb {P}{}({\mathcal L}\mid W=0)\). By Janson (2014) (at least if we consider suitable subsequences), \(W\overset{\mathrm {d}}{\longrightarrow }\widehat{W}\) for some random variable \(\widehat{W}\), with convergence of all moments.

*v* that is not initially infected and has degree less than \(\overline{d}\), then the probability that the infection will reach *v* within less than \(\varepsilon _n^{1/2}n_{\mathrm {S}}\alpha _n\) pairing events is \(O\bigl (\overline{d}\varepsilon _n^{1/2}n \alpha _n/n\bigr )=o(1)\), so w.h.p. *v* is not infected before it is determined whether \({\mathcal L}\) occurs or not.

The rest of the proof is exactly as in Janson et al. (2014), to which we refer for details. \(\square \)

## Remark 4.4

*i*-th initially infective vertex. Hence, letting \(\chi \) denote the fraction in (2.18), so \(\chi \sim 2\lambda _2^{-1}\sigma ^{-2}\pi _n\), the probability is

Note also that \(\psi _n(k)\geqslant 0\) by Jensen’s inequality; thus an extremely uneven distribution of the degrees of the initially infective vertices will increase the probability of a small outbreak.

## Notes

### Acknowledgments

We are grateful to anonymous referees for their useful comments that helped us improve the paper.

## References

- Andersson H (1998) Limit theorems for a random graph epidemic model. Ann Appl Probab 8(4):1331–1349
- Andersson H (1999) Epidemic models and social networks. Math Sci 24(2):128–147
- Antia R, Regoes RR, Koella JC, Bergstrom CT (2003) The role of evolution in the emergence of infectious diseases. Nature 426:658–661
- Ball F, Neal P (2008) Network epidemic models with two levels of mixing. Math Biosci 212(1):69–87
- Barbour AD, Reinert G (2013) Approximating the epidemic curve. Electron J Probab 18(54):30
- Ben-Naim E, Krapivsky PL (2004) Size of outbreaks near the epidemic threshold. Phys Rev E 69(5):050901
- Bohman T, Picollelli M (2012) SIR epidemics on random graphs with a fixed degree sequence. Random Struct Algorithms 41(2):179–214
- Bollobás B (2001) Random graphs, 2nd edn. Cambridge University Press, Cambridge
- Boucheron S, Lugosi G, Massart P (2013) Concentration inequalities. Oxford University Press, Oxford
- Britton T, Janson S, Martin-Löf A (2007) Graphs with specified degree distributions, simple epidemics, and local vaccination strategies. Adv Appl Probab 39(4):922–948
- Bull J, Dykhuizen D (2003) Epidemics-in-waiting. Nature 426:609–610
- Decreusefond L, Dhersin J, Moyal P, Tran VC (2012) Large graph limit for an SIR process in random network with heterogeneous connectivity. Ann Appl Probab 22(2):541–575
- Gordillo LF, Marion SA, Martin-Löf A, Greenwood PE (2008) Bimodal epidemic size distributions for near-critical SIR with vaccination. Bull Math Biol 70(2):589–602
- House T, Ross JV, Sirl D (2012) How big is an outbreak likely to be? Methods for epidemic final-size calculation. Proc R Soc A 469(2150):20120436
- Janson S (2009a) On percolation in random graphs with given vertex degrees. Electron J Probab 14(5):87–118
- Janson S (2009b) The probability that a random multigraph is simple. Comb Probab Comput 18(1–2):205–225
- Janson S (2011) Probability asymptotics: notes on notation. arXiv:1108.3924
- Janson S (2014) The probability that a random multigraph is simple, II. J Appl Probab 51A:123–137
- Janson S, Luczak MJ (2009) A new approach to the giant component problem. Random Struct Algorithms 34(2):197–216
- Janson S, Luczak M, Windridge P (2014) Law of large numbers for the SIR epidemic on a random graph with given degrees. Random Struct Algorithms 45(4):724–761
- Kallenberg O (2002) Foundations of modern probability, 2nd edn. Springer-Verlag, New York
- Ludwig D (1975) Final size distributions for epidemics. Math Biosci 23:33–46
- Martin-Löf A (1998) The final size of a nearly critical epidemic, and the first passage time of a Wiener process to a parabolic barrier. J Appl Probab 35(3):671–682
- McDiarmid C (1998) Concentration. In: Habib M, McDiarmid C, Ramirez J, Reed B (eds) Probabilistic methods for algorithmic discrete mathematics. Springer, Berlin, pp 195–248
- Miller JC (2011) A note on a paper by Erik Volz: SIR dynamics in random networks. J Math Biol 62(3):349–358
- Miller JC (2014) Epidemics on networks with large initial conditions or changing structure. PLoS One 9(7):e101421. doi: 10.1371/journal.pone.0101421
- Miller JC, Slim AC, Volz EM (2012) Edge-based compartmental modelling for infectious disease spread. J R Soc Int 9:890–906
- Newman MEJ (2002) Spread of epidemic disease on networks. Phys Rev E 66(1):016128, 11
- O’Regan SM, Drake JM (2013) Theory of early warning signals of disease emergence and leading indicators of elimination. Theor Ecol 6:333–357
- Pellis L, Ferguson N, Fraser C (2008) The relationship between real-time and discrete-generation models of epidemic spread. Math Biosci 216(1):63–70
- Revuz D, Yor M (1999) Continuous Martingales and Brownian motion, 3rd edn. Springer-Verlag, Berlin
- Scheffer M, Bascompte J, Brock WA, Brovkin V, Carpenter SR, Dakos V, Held H, van Nes EH, Rietkerk M, Sugihara G (2009) Early-warning signals for critical transitions. Nature 461(1):53–59
- Sellke T (1983) On the asymptotic distribution of the size of a stochastic epidemic. J Appl Probab 20(2):390–394
- van der Hofstad R, Janssen AJEM, Leeuwaarden JSH (2010) Critical epidemics, random graphs, and Brownian motion with a parabolic drift. Adv Appl Probab 42(3):706–738
- Volz E (2008) SIR dynamics in random networks with heterogeneous connectivity. J Math Biol 56(3):293–310

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.