Susceptibility Sets and the Final Outcome of Collective Reed–Frost Epidemics

This paper is concerned with exact results for the final outcome of stochastic SIR (susceptible → infective → recovered) epidemics among a closed, finite and homogeneously mixing population. The factorial moments of the number of initial susceptibles who ultimately avoid infection by such an epidemic are shown to be intimately related to the concept of a susceptibility set. This connection leads to simple, probabilistically illuminating proofs of exact results concerning the total size and severity of collective Reed–Frost epidemic processes, in terms of Gontcharoff polynomials, first obtained in a series of papers by Claude Lefèvre and Philippe Picard. The proofs extend easily to include general final state random variables defined on SIR epidemics, and also to multitype epidemics.


Introduction
One of Claude Lefèvre's main contributions to epidemic theory is concerned with exact results for the final outcome of stochastic SIR (susceptible → infective → recovered) epidemics among a closed, finite population. An SIR epidemic is one in which there are three types of individuals, namely susceptibles, infectives and recovered. If a susceptible individual is contacted by an infective then it too becomes an infective and remains so for a (possibly random) period of time, called its infectious period, after which it recovers and is immune to further infection. An epidemic is started by some individuals becoming infective and ends when there is no infective present in the population. The total size of such an epidemic is the number of initial susceptibles who ultimately become infected. The severity of the epidemic is the sum of the infectious periods of all individuals infected during the course of the epidemic, including the initial infectives. The distribution of total size and severity has received considerable attention in the literature; see Lefèvre (1990) for a brief review of work prior to 1990.
Frank Ball (frank.ball@nottingham.ac.uk), School of Mathematical Sciences, University of Nottingham, University Park, Nottingham, NG7 2RD, UK
Together with Philippe Picard, Claude Lefèvre introduced in Picard and Lefèvre (1990) an extension of the ordinary Reed-Frost model (see Bailey 1975, Chapters 8 and 14), called the collective Reed-Frost epidemic process. As described in Lefèvre and Picard (1995), for many commonly-studied SIR models, the total size and severity have the same distribution as those of an appropriate collective Reed-Frost process. Moreover, by exploiting a nonstandard family of polynomials first introduced by Gontcharoff (1937), a unified analysis of the total size and severity of collective Reed-Frost epidemics was developed in Picard and Lefèvre (1990), which both generalised a number of previous results and obtained them in a more systematic fashion.
The method of the analysis in Picard and Lefèvre (1990) is as follows. First, a suitable family of martingales is defined on the epidemic process. Then, an optional stopping theorem is used to derive a set of equations satisfied by expectations of certain functions of total size and severity. Finally, Gontcharoff polynomials are used to derive an expression for the joint generating function Laplace transform of total size and severity. Thus, Gontcharoff polynomials are used purely as a tool and they are not given any probabilistic interpretation. For a class of epidemic models that admit a Sellke construction (Sellke 1983), Ball and O'Neill (1999) use direct probabilistic arguments which highlight the connection between the total size distribution and Gontcharoff polynomials. However, they do not give a probabilistic interpretation to the Gontcharoff polynomials and lengthy conditioning arguments are required unless the model is ordinary Reed-Frost.
The notion of general final state random variables for SIR epidemics was introduced in Ball and O'Neill (1999). These random variables are sums over all individuals infected during the course of the epidemic of random quantities of interest associated with an individual, so one example is the severity of an epidemic. The random quantities may be vector-valued. The joint generating function Laplace transform of the total size and general final state random variables is derived in Ball and O'Neill (1999) for models that admit a Sellke construction, though generally the resulting expression is not amenable to calculation as it is in terms of expectations of complicated functions of random variables. Following lengthy algebra, more explicit results are derived in Ball and O'Neill (1999) when the tolerances in the Sellke construction follow an exponential distribution, in which case the model can also be viewed as a special case of the collective Reed-Frost process.
Many stochastic SIR epidemic models, including collective Reed-Frost processes, admit a random graph representation (see, for example, Ball 1983 and Barbour and Mollison 1990), in which for any ordered pair of individuals, $i$ and $j$ say, a directed edge from $i$ to $j$ is present if and only if $i$ will try to infect $j$ if $i$ becomes infected. (The infection will fail if $j$ has previously been infected.) Given such a graph and any subset $A$ of individuals in the population, the susceptibility set $S_A$ of $A$ is defined to be the set of individuals $j \in A^c$ such that if $j$ is infected then at least one member of $A$ will ultimately become infected (see Section 3.1). Thus $j \in S_A$ if and only if there is a chain of directed edges from $j$ to a member of $A$. In this paper it is shown that for collective Reed-Frost processes there is a straightforward connection between the distribution of the size of $S_A$ and factorial moments of the number of susceptibles remaining at the end of an epidemic, which leads to simple, probabilistically illuminating derivations of exact results concerning the distribution of total size, severity and general final state random variables.
The paper is structured as follows. A brief background to symmetric sampling procedures, Gontcharoff polynomials and collective Reed-Frost epidemics is given in Section 2. The final outcome of collective Reed-Frost epidemics is considered in Section 3. The probability mass function for the size of a susceptibility set is shown in Section 3.1 to have a simple representation in terms of Gontcharoff polynomials, which is used in Section 3.2 to derive expressions for the factorial moments and probability generating function of the number of survivors of an epidemic. (The number of survivors of an epidemic is the number of susceptibles remaining at the end of the epidemic, i.e. the difference between the initial number of susceptibles and the total size, so the probability generating function of the total size is obtained easily from that of the number of survivors.) A simple relationship between the probability generating function of the number of survivors and the joint generating function Laplace transform of the number of survivors and severity is proved in Section 3.3 and then used to determine the latter. The final outcome results are extended to the case when initial and non-initial infectives follow different probabilistic laws in Section 3.4. The derivation of the joint generating function Laplace transform of the number of survivors and severity extends easily to that of the number of survivors and general final state random variables. This is outlined in Section 3.5, where the flexibility afforded by the framework of general final state random variables is illustrated by using it to analyse two extensions of the collective Reed-Frost process considered in Picard and Lefèvre (1990), namely epidemics with several types of infectives, in Section 3.5.1, and epidemics in which infectives have different degrees of infectiousness during various stages of their infectious period, in Section 3.5.2.
Most of the results of Section 3 are derived for an extended version of the collective Reed-Frost model, in which individuals can also be infected from outside the population. The arguments of Section 3 extend easily and naturally to multitype collective Reed-Frost processes, in which the population is partitioned into several groups that are homogeneous but different from each other, with results being expressed in terms of an extension of Gontcharoff polynomials to several variables given in Lefèvre and Picard (1990). This extension is outlined in Section 4.

Symmetric Sampling Procedures
This subsection contains a summary of results concerning symmetric sampling procedures, which are required in the sequel. The summary is based on Section 1 of Martin-Löf (1986), which should be consulted for further detail.
Consider a fixed finite population $\mathcal{N} = \{1, 2, \ldots, N\}$ of size $N$. Let $\mathcal{X}$ be a random subset of $\mathcal{N}$ and let $X = |\mathcal{X}|$ denote the size of $\mathcal{X}$. For $A \subseteq \mathcal{N}$, let $p_A = P(\mathcal{X} = A)$ and $r_A = P(\mathcal{X} \supseteq A)$. A symmetric sampling procedure is one in which, for all $A \subseteq \mathcal{N}$, $p_A$ depends only on the size, $|A| = a$, of $A$, so $p_A = p_a / \binom{N}{a}$, where $p_a = P(X = a)$. It follows that $r_A$ also depends only on $a$, so one can write $r_A = r_a$. Note that if $\mathcal{X}$ is a symmetric sampling procedure on $\mathcal{N}$ and $C$ is a fixed subset of $\mathcal{N}$ with $|C| = c$, then $\tilde{\mathcal{X}} = \mathcal{X} \cap C$ is a symmetric sampling procedure on $C$, with $\tilde{r}_a = r_a$ $(a = 0, 1, \ldots, c)$.
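As a numerical illustration of these definitions, the following Python sketch checks the identity $E[X_{[a]}] = N_{[a]} r_a$, where $n_{[a]} = n(n-1)\cdots(n-a+1)$ is a falling factorial; the identity follows by counting ordered $a$-tuples of sampled objects. The function names and the independent-sampling example (each object sampled with probability $p$, so that $r_a = p^a$) are assumptions made for illustration only.

```python
from math import comb

def falling(n, a):
    """Falling factorial n_[a] = n (n - 1) ... (n - a + 1), with n_[0] = 1."""
    out = 1
    for i in range(a):
        out *= n - i
    return out

def inclusion_prob(size_pmf, a):
    """r_a for a symmetric procedure on N objects with size_pmf[k] = P(X = k)."""
    N = len(size_pmf) - 1
    # Given X = k, all C(N, k) subsets are equally likely and C(N - a, k - a)
    # of them contain a fixed set of size a.
    return sum(size_pmf[k] * comb(N - a, k - a) / comb(N, k)
               for k in range(a, N + 1))

def factorial_moment(size_pmf, a):
    """E[X_[a]] computed directly from the size distribution."""
    return sum(pk * falling(k, a) for k, pk in enumerate(size_pmf))

# Illustrative example (an assumption, not from the paper): independent sampling.
N, p = 6, 0.3
binomial_pmf = [comb(N, k) * p**k * (1 - p) ** (N - k) for k in range(N + 1)]
```

For this example `inclusion_prob(binomial_pmf, a)` agrees with `p**a`, and `factorial_moment(binomial_pmf, a)` agrees with `falling(N, a) * p**a`, up to rounding.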

Gontcharoff Polynomials
Let $U = u_0, u_1, \ldots$ be a given sequence of real numbers, with associated Gontcharoff polynomials $G_k(x \mid U)$ $(k = 0, 1, \ldots)$. The following properties of Gontcharoff polynomials, required in the sequel, are proved in Lefèvre and Picard (1990), Section 2.
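For readers who wish to experiment, the sketch below evaluates Gontcharoff polynomials numerically. It assumes the triangular recursion $\sum_{i=0}^{k} k_{[i]} u_i^{k-i} G_i(x \mid U) = x^k$ $(k = 0, 1, \ldots)$, so that $G_k$ has leading term $x^k/k!$; this normalisation and the function names are assumptions of the sketch, and the form of the definition in the cited references should be checked before relying on it.

```python
def falling(n, a):
    """Falling factorial n_[a] = n (n - 1) ... (n - a + 1), with n_[0] = 1."""
    out = 1
    for i in range(a):
        out *= n - i
    return out

def gontcharoff(k_max, x, u):
    """Return [G_0(x|U), ..., G_{k_max}(x|U)] for the sequence U = u[0], u[1], ...

    Assumed recursion: sum_{i=0}^{k} k_[i] * u_i^(k-i) * G_i(x|U) = x^k.
    """
    g = []
    for k in range(k_max + 1):
        # Contribution of the lower-order polynomials G_0, ..., G_{k-1}.
        lower = sum(falling(k, i) * u[i] ** (k - i) * g[i] for i in range(k))
        # The i = k term is k_[k] * G_k = k! * G_k.
        g.append((x ** k - lower) / falling(k, k))
    return g

# Illustrative sequence (an assumption): u_k = 0.5^k.
U = [0.5 ** k for k in range(5)]
values = gontcharoff(4, 0.7, U)
```

Under this recursion $G_0 = 1$ and $G_1(x \mid U) = x - u_0$, each $G_k$ with $k \ge 1$ vanishes at $x = u_0$, and scaling both the argument and the sequence by a constant $a$ multiplies $G_k$ by $a^k$.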
In these properties, $aU$ denotes the sequence $au_0, au_1, \ldots$.

Collective Reed-Frost Epidemics
Picard and Lefèvre (1990) introduced the following collective Reed-Frost epidemic process as a model for the spread of an infectious disease among a closed, homogeneously mixing population partitioned into three types of individuals, viz. susceptibles, infectives and recovered. The epidemic evolves in discrete time $t = 0, 1, \ldots$. It is convenient to suppose that the unit of time corresponds to the (assumed constant) latent period of an infective and that the infectious period of an infective is reduced to a single point in time. For $t = 0, 1, \ldots$, let $X_t$ and $Y_t$ denote respectively the numbers of susceptibles and infectives present at time $t$, and suppose that $(X_0, Y_0) = (n, m)$. For $k = 1, 2, \ldots$, let $q_k$ be the probability that any given infective fails to contact anyone in any given set of $k$ susceptibles, where contact is interpreted as being sufficient to transmit the disease, and let $q_0 = 1$. (Note that the model assumes that $q_k$ is the same for all infectives and for all subsets of $k$ susceptibles.) Then the $q_k$s induce a symmetric sampling procedure for determining the susceptibles who are contacted by a given infective. Different infectives make contacts independently of each other. It follows that $\{(X_t, Y_t); t = 0, 1, \ldots\}$ is a Markov chain, where, for $t = 0, 1, \ldots$, the random variable $X_{t+1}$ is given by the number of the $X_t$ susceptibles at time $t$ who avoid contact with any of the $Y_t$ infectives and $Y_{t+1} = X_t - X_{t+1}$. The epidemic stops as soon as there is no infective present in the population.
Note that by the observation at the end of Section 2.1, there is no need to restrict attention to susceptibles when defining the q k s. Instead, one can consider for each infective, independent symmetric sampling procedures, induced by the same q k s, defined on the population of n + m − 1 individuals excluding that infective. Such a formulation gives the generalized Reed-Frost process introduced by Martin-Löf (1986), so the two models are equivalent.
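The Markov chain description can be turned into an exact computation of the distribution of the final number of susceptibles for small populations. In the sketch below, a given set of $k$ susceptibles avoids all $y$ current infectives with probability $q_k^y$, and the distribution of the number of susceptibles escaping in one generation is recovered from these probabilities by inclusion-exclusion. The inclusion-exclusion step and the function names are assumptions of this sketch, not part of the paper.

```python
from math import comb

def escape_pmf(x, r):
    """pmf of the number of x susceptibles avoiding contact in one generation.

    r[k] is the probability that a given set of k susceptibles all avoid contact.
    """
    return [comb(x, k) * sum((-1) ** (i - k) * comb(x - k, i - k) * r[i]
                             for i in range(k, x + 1))
            for k in range(x + 1)]

def final_susceptible_pmf(n, m, q):
    """Distribution of the final number of susceptibles from (X_0, Y_0) = (n, m).

    q[k] is the probability that one infective avoids a given set of k susceptibles.
    """
    final = [0.0] * (n + 1)
    states = {(n, m): 1.0}
    while states:
        nxt = {}
        for (x, y), prob in states.items():
            if y == 0:                      # no infectives left: epidemic over
                final[x] += prob
                continue
            r = [q[k] ** y for k in range(x + 1)]
            for x_new, p_step in enumerate(escape_pmf(x, r)):
                if p_step != 0.0:
                    key = (x_new, x - x_new)
                    nxt[key] = nxt.get(key, 0.0) + prob * p_step
        states = nxt
    return final

# Ordinary Reed-Frost example with q_k = q^k (illustrative parameter values).
q = 0.6
pmf = final_susceptible_pmf(2, 1, [q ** k for k in range(3)])
```

For $n = 2$, $m = 1$ a direct enumeration gives probability $q^2$ that both susceptibles escape and $2q^2(1 - q)$ that exactly one does, which the function reproduces.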
There is a well known connection between SIR epidemic models and random graphs. For the above generalized Reed-Frost process consider the directed graph on the set $V$ of $N = n + m$ vertices, labelled $1, 2, \ldots, N$, corresponding to the $n + m$ individuals in the population, in which for $i \ne j$ there is a directed edge from $i$ to $j$ if and only if $j$ belongs to the set of individuals contacted by $i$. (More precisely, for each $i \in V$, independently realise a symmetric sampling procedure, $\mathcal{X}_i$ say, on $V \setminus \{i\}$ induced by the $q_k$s to determine who $i$ will try to infect if it becomes infected. Then for $i \ne j$ there is a directed edge from $i$ to $j$ if and only if $j \in \mathcal{X}_i$.) For distinct $i, j \in V$ write $i \rightsquigarrow j$ if and only if there is a chain of directed edges from $i$ to $j$. Let $S_0$ and $I_0$ denote the sets of vertices corresponding to the initial susceptibles and initial infectives, respectively. Then the set of initial susceptibles who are ultimately infected by the epidemic is given by $\{i \in S_0 : j \rightsquigarrow i \text{ for some } j \in I_0\}$. Note that this set does not depend on the times of the contacts. Thus if attention is restricted to the final outcome of the epidemic, the collective Reed-Frost epidemic process can be used to study epidemic models in which the contacts of an infective do not all occur simultaneously, a fact first noted in a slightly different setting by Ludwig (1975) and explored in more detail in Pellis et al. (2008).
Before proceeding some special cases of collective Reed-Frost epidemics are outlined.
1. The ordinary Reed-Frost model (see e.g. Bailey 1975, Chapters 8 and 14). In this model a given infective contacts susceptibles independently, each with probability $p = 1 - q$. Thus $q_k = q^k$ $(k = 0, 1, \ldots)$.
2. Extended general epidemic (see e.g. Ludwig 1975 or Ball 1986). Suppose that infectives have independent and identically distributed infectious periods, each distributed according to a random variable $T_I$ having Laplace transform $\phi(\theta) = E[\exp(-\theta T_I)]$ $(\theta \ge 0)$, during which a given infective contacts a given susceptible at the points of a homogeneous Poisson process with rate $\beta$. All the contact processes are mutually independent and independent of the infectious periods. Then $q_k = \phi(k\beta)$ $(k = 0, 1, \ldots)$.
3. General stochastic epidemic (see e.g. Bailey 1975, Chapter 6). This is obtained by setting $T_I$ to follow an exponential distribution with mean $\gamma^{-1}$, so $q_k = \gamma/(\gamma + k\beta)$ $(k = 0, 1, \ldots)$.
4. Direct epidemic process (Gertsbakh 1977; Jaworski 1999). In this model each individual makes zero contacts with probability $q$ or precisely one contact, with an individual chosen uniformly at random from the other $N - 1$ individuals in the population, with probability $1 - q$.
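The four special cases can be collected as functions returning $q_k$. This is a minimal sketch with hypothetical function names; the expression used for the direct epidemic process is derived here from its description (no contact with probability $q$, otherwise a single uniformly chosen contact that misses a given set of $k$ of the other $N - 1$ individuals with probability $1 - k/(N-1)$) and is not quoted from the paper.

```python
import math

def q_reed_frost(k, q):
    """Ordinary Reed-Frost: independent contacts, q_k = q^k."""
    return q ** k

def q_extended_general(k, beta, laplace):
    """Extended general epidemic: q_k = phi(k * beta), phi = Laplace transform of T_I."""
    return laplace(k * beta)

def q_general_stochastic(k, beta, gamma):
    """General stochastic epidemic: exponential T_I with mean 1 / gamma."""
    return gamma / (gamma + k * beta)

def q_direct(k, q, N):
    """Direct epidemic process (expression derived for this sketch, 0 <= k <= N - 1)."""
    return q + (1.0 - q) * (1.0 - k / (N - 1))

# Illustrative parameter values (assumptions).
beta, gamma, c = 0.4, 1.5, 2.0
exp_laplace = lambda theta: gamma / (gamma + theta)     # exponential T_I
const_laplace = lambda theta: math.exp(-theta * c)      # constant T_I = c
```

With a constant infectious period the extended general epidemic reduces to the ordinary Reed-Frost model with $q = \exp(-\beta c)$, and with an exponential infectious period it reduces to the general stochastic epidemic, which gives two simple consistency checks.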
Consider the above collective Reed-Frost epidemic and suppose that infectives have independent and identically distributed infectious periods. Fix attention on a given infective. Let $T_I$ denote its infectious period and, for $k = 1, 2, \ldots$, let $A_k$ be the event that the given infective fails to infect anyone in a given set of $k$ susceptibles. Associated with $T_I$ and the events $A_k$ are the quantities $q_k(\theta)$ $(k = 0, 1, \ldots;\ \theta \ge 0)$ used in the sequel. Let $N^*$ denote the total size of the epidemic, i.e. the number of initial susceptibles that are ultimately infected, let $S = n - N^*$ be the number of initial susceptibles who survive the epidemic, and let $T_A$ denote the severity of the epidemic, i.e. the sum of the infectious periods of all the $m + N^*$ infectives in the epidemic. Let $\phi_{n,m}(x, \theta) = E[x^S \exp(-\theta T_A)]$ be the joint generating function Laplace transform of $(S, T_A)$. Picard and Lefèvre (1990) obtained an expression, (2.5), for $\phi_{n,m}(x, \theta)$ in terms of the Gontcharoff polynomials $G_i(x \mid U(\theta))$ $(i = 0, 1, \ldots)$.
Addy et al. (1991) generalised the extended general epidemic (see special case 2, above) to incorporate outside infection. Specifically, they assumed that each of the $n$ initial susceptibles has probability $\pi$ of avoiding infection from outside the population during the course of the epidemic, independently of other susceptibles in the population. This generalisation has proved to be very fruitful in analysing the asymptotic final outcome of epidemics among a community of households (see, for example, Ball et al. 1997). The collective Reed-Frost epidemic can be generalised in a similar fashion. For $k = 1, 2, \ldots, n$, let $\pi_k$ be the probability that everyone in any given set of $k$ initial susceptibles avoids infection from outside the population during the course of the epidemic (note that this probability is assumed to be the same for all such sets) and let $\pi_0 = 1$. Suppose also that the external and internal infection processes are independent. Let $\hat{\phi}_{n,m}(x, \theta) = E[x^S \exp(-\theta T_A)]$ be the joint generating function Laplace transform of $(S, T_A)$ for the collective Reed-Frost epidemic with outside infection.
Then, conditioning on the number of initial susceptibles who avoid outside infection and using (2.5) yields, after a little algebra, the corresponding expression (2.6) for $\hat{\phi}_{n,m}(x, \theta)$.
The similarity between (2.5) and (2.6), and in particular the way the π i s enter (2.6), suggest that the Gontcharoff polynomials G i (x | U(θ)) (i = 0, 1, . . . ) admit a probabilistic interpretation in terms of the underlying epidemic process. In this paper, a proof of (2.5) is given which exploits such a probabilistic interpretation for the Gontcharoff polynomials and clarifies the origin of (2.6).

Susceptibility Sets
Consider the random graph representation of the collective Reed-Frost epidemic, without outside infection, given in Section 2.3. For $A \subseteq V$, define the susceptibility set $S_A$ of $A$ by $S_A = \{j \in V \setminus A : j \rightsquigarrow i \text{ for some } i \in A\}$, where $j \rightsquigarrow i$ means that there is a chain of directed edges from $j$ to $i$, with the convention $S_\emptyset \equiv \emptyset$. Note that if $A \subseteq S_0$, so $A$ consists entirely of initial susceptibles, then $A$ avoids infection (i.e. nobody in $A$ is ultimately infected) if and only if $S_A \cap I_0 = \emptyset$, hence the terminology. (In the context of epidemic processes on digraphs of random mappings, $S_A$ is referred to as the set of all predecessors of $A$, see e.g. Jaworski 1999.) The following lemma, which gives the probability mass function (3.1) of the size of $S_A$ in terms of Gontcharoff polynomials, is required in the sequel. An equivalent result for the special case of the extended general epidemic model is given in Ball and Neal (2002), Lemma 3.1, but not in terms of Gontcharoff polynomials.
Proof For the purpose of this proof only, label the elements of V \A, 1, 2, . . . , k. For i = 1, 2, . . . , k, let i denote the set {1, 2, . . . , i} and let 0 denote the empty set. Note that, by symmetry and in obvious notation, and that, by first considering the susceptibility set of A amongst the individuals 1, 2, . . . , , Thus, directly from the definition (2.4) of Gontcharoff polynomials, and Lemma 3.1 follows, using (3.2) and (3.3).
Remark 3.1 Note that, after suitable change of notation, the expression (3.1) for the probability mass function of the size of the susceptibility set $S_A$ is identical to the probability mass function of the total size of the nonstationary Reed-Frost epidemic defined in Lefèvre and Picard (2005); see equation (2.5) of that paper. In that model, susceptibles may be viewed as being exposed to infectives sequentially and, for $k = 1, 2, \ldots$, the probability that any given susceptible avoids infection from the first $k$ infectives that it is exposed to is denoted by $\tilde{q}_k$. (This probability is denoted by $\pi_k$ in Lefèvre and Picard (2005).) Let $\tilde{q}_0 = 1$. A realisation of the total size of a nonstationary Reed-Frost epidemic can be obtained by first constructing a realisation of the directed random graph of the collective Reed-Frost epidemic having $q_k = \tilde{q}_k$ $(k = 0, 1, \ldots)$ and then reversing the direction of all of the edges in that graph. The resulting graph is then interpreted in the usual fashion; i.e. for $i \ne j$, if $i$ is infected then it will try to infect $j$ if and only if there is a directed edge from $i$ to $j$. Observe that if $A$ denotes the set of initial infectives in the nonstationary Reed-Frost epidemic, then the set of susceptibles that are ultimately infected in that epidemic is given by the susceptibility set of $A$ in the collective Reed-Frost epidemic, which explains why the two mass functions are the same.
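The graph-based definitions in this section can be illustrated directly. The sketch below builds a random digraph for the ordinary Reed-Frost case (each directed edge present independently, an illustrative choice), computes the ultimately infected set by forward reachability from the initial infectives, and computes a susceptibility set by reachability in the edge-reversed graph, as in Remark 3.1. All function names are hypothetical.

```python
import random

def random_digraph(N, p, rng):
    """Ordinary Reed-Frost graph: each edge i -> j (i != j) present with probability p."""
    return {i: {j for j in range(N) if j != i and rng.random() < p}
            for i in range(N)}

def reachable_from(graph, sources):
    """Vertices reachable from `sources` by chains of directed edges (sources included)."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        i = stack.pop()
        for j in graph[i]:
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return seen

def reverse(graph):
    """Graph with the direction of every edge reversed."""
    rev = {i: set() for i in graph}
    for i, outs in graph.items():
        for j in outs:
            rev[j].add(i)
    return rev

def susceptibility_set(graph, A):
    """S_A: vertices outside A from which some member of A can be reached."""
    return reachable_from(reverse(graph), A) - set(A)

rng = random.Random(1)
graph = random_digraph(6, 0.25, rng)
I0, A = {0, 1}, {2, 3}          # initial infectives and a set of initial susceptibles
infected = reachable_from(graph, I0)
```

For any realisation, `A & infected` is empty exactly when `susceptibility_set(graph, A) & I0` is empty, which is the characterisation of $A$ avoiding infection given above.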

Total Size
Consider the collective Reed-Frost epidemic with outside infection, defined in Section 2.4. Suppose that initially there are m infectives and n susceptibles and let S be the number of initial susceptibles that ultimately remain uninfected.
Proof Fix $j \in \{1, 2, \ldots, n\}$ and let $A$ be any fixed set of $j$ initial susceptibles. Then, by (2.3), $E[S_{[j]}] = n_{[j]} P(A \text{ avoids infection})$. Given that $S_A$ has size $\ell$, the set $S_A$ can be chosen in $\binom{n+m-j}{\ell}$ ways, all equally likely, of which $\binom{n-j}{\ell}$ satisfy $S_A \cap I_0 = \emptyset$. Thus, since the internal and external infection processes are independent, the probability that $A$ avoids infection is given by (3.6). Hence, using (3.1) and the substitution $i = \ell + j$, Equation (3.4) follows using Property 2.2. The case $j = 0$ is immediate, using Property 2.1. Finally, both sides of (3.4) are clearly zero if $j > n$.

Remark 3.3
In the proof of Proposition 3.1, the probability that $A$ avoids infection can instead be calculated by letting $S_A$ be the susceptibility set of $A$ among the initial susceptibles (i.e. with $V$ replaced by $S_0$) and noting that, given that $S_A$ has size $\ell$, the set $A$ avoids infection if and only if all of the $j + \ell$ individuals in $A \cup S_A$ avoid outside infection and are not contacted by any of the $m$ initial infectives, which happens with probability $\pi_{\ell+j}\, q_{\ell+j}^{m}$. Thus (3.6) may be replaced by (3.7), and (3.4) follows using (3.1) and (3.5).
Let $\hat{f}_{n,m}(x) = E[x^S]$ $(x \in \mathbb{R})$ be the probability generating function of $S$.
Proof The Taylor expansion of $\hat{f}_{n,m}(x)$ about $x = 1$ gives the result, where the third equality in the expansion follows using Proposition 3.1.
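To make the role of the Gontcharoff polynomials concrete, the following sketch computes the distribution of $S$ from a Gontcharoff representation of its probability generating function. It takes as assumptions the form $\hat{f}_{n,m}(x) = \sum_{i=0}^{n} n_{[i]} \pi_i q_i^{\,n+m-i} G_i(x \mid U)$ with $U = q_0, q_1, \ldots$, and the normalisation $\sum_{i=0}^{k} k_{[i]} u_i^{k-i} G_i(x \mid U) = x^k$; neither is quoted from the text above, and the function names are hypothetical. The small cases checked by hand below are consistent with these assumptions.

```python
def falling(n, a):
    """Falling factorial n_[a], with n_[0] = 1."""
    out = 1
    for i in range(a):
        out *= n - i
    return out

def gontcharoff_coeffs(k_max, u):
    """Coefficient lists (ascending powers of x) of G_0, ..., G_{k_max}.

    Assumed recursion: sum_{i=0}^{k} k_[i] * u_i^(k-i) * G_i(x|U) = x^k.
    """
    polys = []
    for k in range(k_max + 1):
        coeffs = [0.0] * (k + 1)
        coeffs[k] = 1.0                                  # the x^k term
        for i in range(k):
            weight = falling(k, i) * u[i] ** (k - i)
            for d, c in enumerate(polys[i]):
                coeffs[d] -= weight * c
        polys.append([c / falling(k, k) for c in coeffs])
    return polys

def survivor_pmf(n, m, q, pi=None):
    """P(S = s) for s = 0..n, read off as the coefficients of the assumed pgf."""
    pi = [1.0] * (n + 1) if pi is None else pi           # no outside infection
    polys = gontcharoff_coeffs(n, q)
    pmf = [0.0] * (n + 1)
    for i in range(n + 1):
        weight = falling(n, i) * pi[i] * q[i] ** (n + m - i)
        for d, c in enumerate(polys[i]):
            pmf[d] += weight * c
    return pmf

# Ordinary Reed-Frost example, q_k = q^k (illustrative parameter values).
q = 0.6
reed_frost = survivor_pmf(2, 1, [q ** k for k in range(3)])
```

For $n = 2$, $m = 1$ without outside infection, direct enumeration gives $P(S = 2) = q^2$ and $P(S = 1) = 2q^2(1 - q)$; for $n = 1$, $m = 0$ with outside infection, $P(S = 1) = \pi_1$. Both agree with the coefficients returned by `survivor_pmf`.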
Remark 3.4 Setting θ = 0 in (3.12) yields a triangular system of equations for the total size distribution of the epidemic. This system was first obtained for the general stochastic epidemic by Whittle (1955), using an algebraic argument. Subsequently, it has been derived for several other special cases of the present model by a number of authors, using a variety of methods.
Remark 3.5 Setting k = 0 in (3.12) yields a Wald's identity for the epidemic, cf. Ball (1986), Theorem 2.1, Picard and Lefèvre (1990), Corollary 3.2, and Addy et al. (1991), Theorem 1. For the collective Reed-Frost epidemic without outside infection, Picard and Lefèvre (1990) derive (3.12) by applying the optional stopping theorem to a suitable family of martingales defined on the epidemic process and then use an Abel expansion of φ n,m (x, θ ) in terms of G i (x | U(θ)) to derive (2.5).

Initial and Non-Initial Infectives Behave Differently
In some settings the behaviour of initial and non-initial infectives may differ. The above results are easily extended to incorporate that phenomenon. For $k = 0, 1, \ldots$ and $\theta \ge 0$, define $q_k(\theta)$ for non-initial infectives, as before, and define $\tilde{q}_k(\theta)$ analogously for the initial infectives. The remainder of the notation remains unchanged.
Proof We prove the corresponding result for $\phi_{n,m}(x, \theta)$, i.e. for the model without outside infection. The result for $\hat{\phi}_{n,m}(x, \theta)$ then follows by conditioning on the number of initial susceptibles who avoid outside infection.
The result follows immediately using the definition (2.4) of Gontcharoff polynomials when $m = 0$, so assume that $m > 0$ and fix $n$. The final outcome of the epidemic can be obtained by first letting the $m$ initial infectives make their contacts and then conditioning on the number of susceptibles that remain, $S^*$ say. Let $T_A^*$ denote the sum of the infectious periods of the $m$ initial infectives. The result then follows using Theorem 3.1, on noting that the resulting weights, indexed by $k = 0, 1, \ldots, n$, form the probability mass function of the number of objects not sampled in the symmetric sampling procedure having inclusion probabilities (see Section 2.1) $r_k = \{\tilde{q}_k(\theta)/\tilde{q}_0(\theta)\}^m$ $(k = 0, 1, \ldots, n)$.
Ball and O'Neill (1999) introduced the notion of general final state random variables for SIR epidemic models, which are defined as sums over all ultimately infected individuals of random quantities of interest associated with an individual. Consider the collective Reed-Frost epidemic with outside infection defined in Section 2.4. Let $R = (R_1, R_2, \ldots, R_p)$ be a random vector associated with a typical infective. The components of $R$ represent quantities of interest associated with that infective; for example, $R_1$ could be the length of its infectious period, $R_2$ could be how many infectious contacts that individual attempts to make, and so on. Some other examples are given below. The components of $R$ may be dependent. The realisations of $R$ for distinct infectives are mutually independent but, for any given infective, $R$ may be dependent on its contact behaviour. The realisations of $R$ for the initial infectives are identically distributed, as are those for non-initial infectives, but the two distributions may differ. Let $T_R$ denote the sum of the $R$-vectors over all of the individuals that are ultimately infected by the epidemic, including the initial infectives. Thus if $R_1$ is the length of the infectious period then the first component of $T_R$ is the severity of the epidemic.
For ease of exposition it is assumed that the components of R are all nonnegative almost surely. The results continue to hold if this condition is relaxed, though the domains of Laplace transforms may change.
Recall that $S$ denotes the number of initial susceptibles who survive the epidemic. For $n, m = 0, 1, \ldots$, consider the joint generating function Laplace transform of $(S, T_R)$ given that initially there are $n$ susceptibles and $m$ infectives. The following theorem generalises Ball and O'Neill (1999), Theorem 4.2, which considers the extended general epidemic without outside infection and does not distinguish between initial and non-initial infectives.
Remark 3.7 An analogous result to Corollary 3.2 follows immediately from Theorem 3.3.
A couple of examples of the use of general final state random variables follow; both are motivated by extensions of the collective Reed-Frost epidemic model considered in Picard and Lefèvre (1990).

Several Types of Infectives
Following Picard and Lefèvre (1990), Section 2.3, suppose that there are $s$ types of infectives, labelled $1, 2, \ldots, s$, who may have different infectious period distributions and different degrees of infectivity. Suppose that there is no outside infection and that initially there are $m_i$ infectives of type $i$ $(i = 1, 2, \ldots, s)$ and $n$ susceptibles. For $i = 1, 2, \ldots, s$, define $q_k^{(i)}(\theta)$ $(k = 0, 1, \ldots;\ \theta \ge 0)$ for a typical type-$i$ infective analogously to $q_k(\theta)$ $(k = 0, 1, \ldots;\ \theta \ge 0)$ in Section 2.3. Each infected susceptible independently becomes an infective of type $i$ with probability $\alpha_i$, irrespective of the type of its infector, where $\sum_{i=1}^{s} \alpha_i = 1$. Distinct infectives behave independently of each other and initial infectives of a given type follow the same probabilistic law as non-initial infectives of that type. As explained in Picard and Lefèvre (1990), the carrier-borne epidemic models of Pettigrew and Weiss (1967) and Downton (1968) are special cases of this model.
Remark 3.8 Theorem 3.4 extends Picard and Lefèvre (1990), Proposition 3.3, which gives the joint generating function Laplace transform of $(S, T_A)$, where $S = n - \sum_{i=1}^{s} (m_i + Z^{(i)})$ is the number of susceptibles remaining at the end of the epidemic. The latter may be obtained from Theorem 3.4 as follows, where $m = \sum_{i=1}^{s} m_i$ and $\mathbf{1}$ is the row vector of $s$ ones: applying Property 2.3 to $G_i(\mathbf{1} \mid \bar{U}(x^{-1}\mathbf{1}, \theta))$ $(i = 0, 1, \ldots)$ and then using Theorem 3.4 yields Picard and Lefèvre (1990), Proposition 3.3.

Different Degrees of Infectiousness
Picard and Lefèvre (1990), Section 3.3, considers models with different types of infectives in which infectives of each type pass through successive stages of infection, having possibly different levels of infectivity, before recovering. For simplicity, we consider here the special case, described in Lefèvre and Picard (1995), where there is only one type of infective. Thus suppose that there are $L$ stages of infection, labelled $1, 2, \ldots, L$. Consider a typical infective and define the random vector $R = (R_1, R_2, \ldots, R_L)$, where $R_i$ is the total time the infective spends in infection stage $i$ $(i = 1, 2, \ldots, L)$. For $\theta = (\theta_1, \theta_2, \ldots, \theta_L)$ and $k = 0, 1, \ldots$, define $\tilde{q}_k(\theta)$ and $q_k(\theta)$ for initial and non-initial infectives as in Section 3.5. Then the joint generating function Laplace transform of the final number of susceptibles, $S$, and the total time spent in the $L$ stages of infection by the infectives during the whole course of the epidemic, $T_R$, follows immediately from Theorem 3.3.
For a more explicit example, suppose that the progress of an infective through the stages of infection is modelled by a homogeneous continuous-time Markov chain $\{W(t) : t \ge 0\}$ having states labelled $0, 1, \ldots, L$, where $W(t)$ gives the stage that the infective is in $t$ time units after it was first infected and state 0, which is absorbing, corresponds to the infective being recovered. In Lefèvre and Picard (1995), the states are necessarily visited sequentially, i.e. an infective starts off in state $L$ and on leaving a state it moves to the next lower state, but here we allow $\{W(t) : t \ge 0\}$ to be any arbitrary but specified transient continuous-time Markov chain with a finite state space and a single absorbing state. Thus the infectious period of an infective follows a phase-type distribution (see Asmussen 1987, Chapter III, Section 6).
Then conditioning on $(\tau, W(\tau))$ and using the strong Markov property of $\{W(t) : t \ge 0\}$ yields (3.15). Expressing (3.15) in vector/matrix form and solving yields (3.14). The matrix $D(\theta) - Q$ is non-singular since the eigenvalues of $Q$ all have strictly negative real parts; see Asmussen (1987), page 77.
Return to the epidemic process and suppose that, given $\{W(t) : t \ge 0\}$, the infective contacts distinct susceptibles independently at the points of a Poisson process having rate $\beta_{W(t)}$ at time $t$, where $\beta_0 = 0$ and, for $i = 1, 2, \ldots, L$, $\beta_i$ is the individual-to-individual infection rate from an infective in state $i$. Assume that the initial states of initial infectives are independent and identically distributed, with $P(W(0) = i) = \xi_{i0}$ $(i = 1, 2, \ldots, L)$, and let $\xi_0 = (\xi_{10}, \xi_{20}, \ldots, \xi_{L0})$. Define $\xi = (\xi_1, \xi_2, \ldots, \xi_L)$ similarly for non-initial infectives. Consider a typical initial infective, with associated infection-stage process $\{W(t) : t \ge 0\}$, and note that $R$ defined above is given by $R = T$. For $k = 1, 2, \ldots$, let $A_k$ be the event that this infective fails to infect anyone in a given set of $k$ susceptibles and note that $P(A_k \mid T) = \exp(-\sum_{i=1}^{L} k\beta_i T_i)$. Thus $\tilde{q}_k(\theta)$ may be obtained using Lemma 3.3, in terms of $\beta = (\beta_1, \beta_2, \ldots, \beta_L)$. A similar argument gives $q_k(\theta)$ for non-initial infectives. The joint generating function Laplace transform of $(S, T_R)$ now follows immediately using Theorem 3.3.

Introduction and Notation
The above theory extends naturally and easily to multitype epidemics. The proofs parallel in an obvious fashion those of corresponding results for single population epidemics, so only very brief outlines are given. The following notation is used throughout this section. For vectors, $x = (x_1, x_2, \ldots, x_J)$ and $y = (y_1, y_2, \ldots, y_J)$ say, belonging to $\mathbb{R}^J$, the product $\prod_{i=1}^{J} x_i^{y_i}$ is denoted by $x^y$ and $x \le y$ denotes that $x_i \le y_i$ for all $i = 1, 2, \ldots, J$. The row vectors of $J$ zeros and of $J$ ones are denoted by $\mathbf{0}$ and $\mathbf{1}$, respectively. For vectors, $k = (k_1, k_2, \ldots, k_J)$ and $n = (n_1, n_2, \ldots, n_J)$ say, belonging to $\mathbb{Z}_+^J$, the multi-index falling factorial is written $n_{[k]}$.

Multitype Symmetric Sampling Procedures
Consider a finite population $\mathcal{N}$ comprising $J$ types of objects, labelled $1, 2, \ldots, J$. For $i = 1, 2, \ldots, J$, let $\mathcal{N}_i$ denote the set of objects of type $i$ and let $N_i = |\mathcal{N}_i|$, and write $N = (N_1, N_2, \ldots, N_J)$. Let $\mathcal{X}$ be a random subset of $\mathcal{N}$ and $X = (X_1, X_2, \ldots, X_J)$, where $X_i = |\mathcal{X} \cap \mathcal{N}_i|$ $(i = 1, 2, \ldots, J)$. For $A \subseteq \mathcal{N}$, let $p_A = P(\mathcal{X} = A)$ and $r_A = P(\mathcal{X} \supseteq A)$. A multitype symmetric sampling procedure is one in which, for all $A \subseteq \mathcal{N}$, $p_A$ depends only on the numbers of the different types in $A$, i.e. on $a = (a_1, a_2, \ldots, a_J)$, where $a_i = |A \cap \mathcal{N}_i|$ $(i = 1, 2, \ldots, J)$. It follows that $p_A = p_a / \binom{N}{a}$, where $p_a = P(X = a)$ and $\binom{N}{a} = \prod_{i=1}^{J} \binom{N_i}{a_i}$, and also that $r_A$ depends only on $a$, so one can write $r_A = r_a$. Further, using the multitype version of (2.2) yields the formula (4.1) for the factorial moments of $X$.

Multivariate Gontcharoff Polynomials
The Gontcharoff family of polynomials was extended to the case of several variables by Lefèvre and Picard (1990) as follows. Let $U = (u_j \in \mathbb{R}^J : j \in \mathbb{Z}_+^J)$ be a collection of real vectors. Then the Gontcharoff polynomials associated with $U$, viz. $G_j(x \mid U)$ $(j \in \mathbb{Z}_+^J)$ where $x = (x_1, x_2, \ldots, x_J) \in \mathbb{R}^J$, are defined recursively by $\sum_{i=0}^{n} n_{[i]}\, u_i^{\,n-i}\, G_i(x \mid U) = x^n$ $(n \in \mathbb{Z}_+^J)$. Thus, for $k \in \mathbb{Z}_+^J$, the polynomial $G_k(x \mid U)$ is of degree $k_1, k_2, \ldots, k_J$ in the variables $x_1, x_2, \ldots, x_J$. For $j \in \mathbb{Z}_+^J$, the partial derivative of $G_i(x \mid U)$ of order $j_1, j_2, \ldots, j_J$ in $x_1, x_2, \ldots, x_J$ is denoted by $G_i^{(j)}(x \mid U)$. Properties 2.1, 2.2 and 2.3 in Section 2.2 generalise to multivariate Gontcharoff polynomials; see Properties 4.2, 4.4 and 4.5, respectively, in Lefèvre and Picard (1990).

Model Definition
The model is defined analogously to the single-population collective Reed-Frost model in Section 2.3, but now there are $J$ types of individuals, labelled $1, 2, \dots, J$. The epidemic again evolves in discrete time $t = 0, 1, \dots$. Let $\mathbf{X}_t = (X_{t1}, X_{t2}, \dots, X_{tJ})$ and $\mathbf{Y}_t = (Y_{t1}, Y_{t2}, \dots, Y_{tJ})$, where $X_{ti}$ and $Y_{ti}$ are respectively the numbers of susceptibles and infectives of type $i$ present at time $t$. Suppose that $(\mathbf{X}_0, \mathbf{Y}_0) = (\mathbf{n}, \mathbf{m})$, where $\mathbf{n} = (n_1, n_2, \dots, n_J)$ and $\mathbf{m} = (m_1, m_2, \dots, m_J)$, so initially there are $m_i$ infectives and $n_i$ susceptibles of type $i$ $(i = 1, 2, \dots, J)$. For $i = 1, 2, \dots, J$ and $\mathbf{k} \in \mathbb{Z}_+^J \setminus \{\mathbf{0}\}$, let $\tilde{q}^{(i)}_{\mathbf{k}}$ be the probability that any given type-$i$ initial infective fails to contact anyone in any given set of $\mathbf{k}$ susceptibles (i.e. comprising $k_1$ type-1 susceptibles, $k_2$ type-2 susceptibles, ..., $k_J$ type-$J$ susceptibles). Let $\tilde{\mathbf{q}}_{\mathbf{k}} = (\tilde{q}^{(1)}_{\mathbf{k}}, \tilde{q}^{(2)}_{\mathbf{k}}, \dots, \tilde{q}^{(J)}_{\mathbf{k}})$, and define $q^{(i)}_{\mathbf{k}}$ and $\mathbf{q}_{\mathbf{k}}$ similarly for non-initial infectives. Different infectives make contacts independently of each other. Thus $\{(\mathbf{X}_t, \mathbf{Y}_t);\ t = 0, 1, \dots\}$ is a Markov chain. The epidemic ends when there is no infective present in the population.
The model can be extended to incorporate outside infection in the obvious fashion. Specifically, for $\mathbf{0} \le \mathbf{k} \le \mathbf{n}$, let $\pi_{\mathbf{k}}$ be the probability that everyone in any given set of $\mathbf{k}$ susceptibles avoids outside infection during the course of the epidemic, with $\pi_{\mathbf{0}} = 1$.
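To make the dynamics concrete, the following minimal sketch simulates one special case of the model: the standard multitype Reed-Frost epidemic, in which a type-$i$ infective contacts each type-$j$ susceptible independently with probability $p_{ij}$, so that the non-contact probability for a set of $\mathbf{k}$ susceptibles is $\prod_j (1 - p_{ij})^{k_j}$ for initial and non-initial infectives alike. Outside infection is omitted, and each infective is assumed to be infectious for a single time step. The matrix `p` and the function name are illustrative assumptions rather than notation from the paper.

```python
import random


def simulate_final_susceptibles(n, m, p, rng):
    """Simulate one multitype Reed-Frost epidemic (illustrative special case).

    n, m : tuples of initial susceptibles / infectives by type.
    p    : p[i][j] is the probability that a type-i infective contacts a
           given type-j susceptible (assumed parametrisation).
    rng  : a random.Random instance.
    Returns the tuple of susceptibles of each type remaining at the end.
    """
    J = len(n)
    susceptibles = list(n)
    infectives = list(m)
    while any(infectives):
        new_infectives = [0] * J
        for j in range(J):
            # Probability that a type-j susceptible escapes every
            # infective present in the current generation.
            escape = 1.0
            for i in range(J):
                escape *= (1.0 - p[i][j]) ** infectives[i]
            infected = sum(1 for _ in range(susceptibles[j])
                           if rng.random() >= escape)
            new_infectives[j] = infected
            susceptibles[j] -= infected
        # Current infectives recover; the newly infected form the next generation.
        infectives = new_infectives
    return tuple(susceptibles)
```

Averaging $\mathbf{x}^{\mathbf{S}}$ over many runs of `simulate_final_susceptibles` gives a Monte Carlo estimate of the joint probability generating function of the final numbers of susceptibles, which can be compared with exact formulae for this special case.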

Susceptibility Sets
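Let $\mathcal{V}$ denote the set of all the individuals in the population and partition $\mathcal{V}$ into $\mathcal{V}_1, \mathcal{V}_2, \dots, \mathcal{V}_J$, where $\mathcal{V}_i$ is the set of type-$i$ individuals. Suppose that $\tilde{\mathbf{q}}_{\mathbf{k}} = \mathbf{q}_{\mathbf{k}}$ for all $\mathbf{k}$, so that initial and non-initial infectives follow the same probabilistic laws. Construct a directed random graph on $\mathcal{V}$, as described in Section 2.3. For $A \subseteq \mathcal{V}$, define the susceptibility set $\mathcal{S}_A$ as in Section 3.1 and let $\mathbf{S}_A = (S_{A1}, S_{A2}, \dots, S_{AJ})$, where $S_{Ai} = |\mathcal{S}_A \cap \mathcal{V}_i|$ is the number of type-$i$ individuals in $\mathcal{S}_A$ $(i = 1, 2, \dots, J)$. Let $j_i = |A \cap \mathcal{V}_i|$ $(i = 1, 2, \dots, J)$, $\mathbf{j} = (j_1, j_2, \dots, j_J)$ and $\mathbf{k} = \mathbf{n} + \mathbf{m} - \mathbf{j}$, and denote $P(\mathbf{S}_A = \boldsymbol{\ell})$ by $P_{\mathbf{j},\mathbf{k}}(\mathbf{S}_A = \boldsymbol{\ell})$ $(\mathbf{0} \le \boldsymbol{\ell} \le \mathbf{k})$. The following lemma is proved analogously to Lemma 3.1.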
where $U$ is given by $\mathbf{u}_{\mathbf{k}} = \mathbf{q}_{\mathbf{k}}$ $(\mathbf{k} \in \mathbb{Z}_+^J)$.
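Given a realisation of the directed random graph, a susceptibility set can be computed by a backward search. The sketch below assumes the usual convention that an arc from $u$ to $v$ means that $u$, if infected, contacts $v$, and that the susceptibility set of $A$ consists of those individuals from which a chain of arcs leads into $A$. Whether the members of $A$ themselves are counted is fixed by the definition in Section 3.1, so the `include_seed` flag is provided to cover either convention. All names are illustrative.

```python
from collections import deque


def susceptibility_set(arcs, seed, include_seed=True):
    """Individuals with a directed path into `seed` (backward reachability).

    arcs : iterable of (u, v) pairs; u contacts v if u becomes infected.
    seed : iterable of individuals forming the set A.
    """
    predecessors = {}
    for u, v in arcs:
        predecessors.setdefault(v, set()).add(u)
    reached = set(seed)
    queue = deque(reached)
    while queue:
        v = queue.popleft()
        for u in predecessors.get(v, ()):
            if u not in reached:
                reached.add(u)
                queue.append(u)
    return reached if include_seed else reached - set(seed)


def type_counts(members, type_of, J):
    """Vector (S_A1, ..., S_AJ) of numbers of each type among `members`.

    type_of maps an individual to its type, coded 0, 1, ..., J-1.
    """
    counts = [0] * J
    for v in members:
        counts[type_of[v]] += 1
    return tuple(counts)
```

Under this convention, no member of $A$ is infected by the epidemic precisely when the susceptibility set of $A$, with $A$ included, contains no initial infective, which is what links susceptibility sets to the final outcome.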

Total Size
Consider the multitype collective Reed-Frost epidemic, including outside infection, described in Section 4.4.1. Let $\mathbf{S} = (S_1, S_2, \dots, S_J)$, where $S_i$ is the number of type-$i$ susceptibles that remain susceptible at the end of the epidemic, and let $\tilde{f}_{\mathbf{n},\mathbf{m}}(\mathbf{x}) = E[\mathbf{x}^{\mathbf{S}}]$ $(\mathbf{x} \in \mathbb{R}^J)$ denote the joint probability generating function of $\mathbf{S}$.
where $U$ is given by $\mathbf{u}_{\mathbf{k}} = \mathbf{q}_{\mathbf{k}}$ $(\mathbf{k} \in \mathbb{Z}_+^J)$.
Proof Arguing as in the proof of Proposition 3.1, generalised to the multitype epidemic and utilising Remarks 3.3 and 3.6, yields that

General Final State Random Variables
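For the above multitype collective Reed-Frost epidemic with outside infection, let $\mathbf{R} = (R_1, R_2, \dots, R_p)$ be a random vector associated with an infective and $\mathbf{T}_R$ be the sum of the $\mathbf{R}$-vectors over all individuals that are ultimately infected by the epidemic, including the initial infectives. As in Section 3.5, the realisations of $\mathbf{R}$ for distinct individuals are independent and, for ease of exposition, it is assumed that the components of $\mathbf{R}$ are all nonnegative almost surely. Examples are numerous and are not considered here, but note that if $p = J$, $\chi$ and $T_I$ denote respectively the type and infectious period of the given infective, and $R_i = T_I 1_{\{\chi = i\}}$ $(i = 1, 2, \dots, J)$, then $\mathbf{T}_R$ is the (multivariate) severity of the epidemic. Consider the joint generating function-Laplace transform of $(\mathbf{S}, \mathbf{T}_R)$, i.e. $E[\mathbf{x}^{\mathbf{S}} \exp(-\boldsymbol{\theta}^\top \mathbf{T}_R)]$. Fix attention on a given type-$i$ initial infective, with associated random vector $\mathbf{R}$, and for $\mathbf{k} \in \mathbb{Z}_+^J$ let $A_{\mathbf{k}}$ be the event that this infective fails to infect anyone in a given set of $\mathbf{k}$ susceptibles, where $A_{\mathbf{k}}$ necessarily occurs if $\mathbf{k} = \mathbf{0}$. For $\mathbf{k} \in \mathbb{Z}_+^J$ and $\boldsymbol{\theta} \in \mathbb{R}_+^p$, let $\tilde{q}^{(i)}_{\mathbf{k}}(\boldsymbol{\theta}) = E[\exp(-\boldsymbol{\theta}^\top \mathbf{R}) 1_{A_{\mathbf{k}}}]$ and $\tilde{\mathbf{q}}_{\mathbf{k}}(\boldsymbol{\theta}) = (\tilde{q}^{(1)}_{\mathbf{k}}(\boldsymbol{\theta}), \tilde{q}^{(2)}_{\mathbf{k}}(\boldsymbol{\theta}), \dots, \tilde{q}^{(J)}_{\mathbf{k}}(\boldsymbol{\theta}))$, and define $q^{(i)}_{\mathbf{k}}(\boldsymbol{\theta})$ and $\mathbf{q}_{\mathbf{k}}(\boldsymbol{\theta})$ similarly for non-initial infectives.
where $U(\boldsymbol{\theta})$ is given by $\mathbf{u}_{\mathbf{k}}(\boldsymbol{\theta}) = \mathbf{q}_{\mathbf{k}}(\boldsymbol{\theta})$ $(\mathbf{k} \in \mathbb{Z}_+^J)$.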
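Proof Consider first the case when initial and non-initial infectives follow the same probabilistic law so, for $\mathbf{k} \in \mathbb{Z}_+^J$, $\tilde{\mathbf{q}}_{\mathbf{k}}(\boldsymbol{\theta}) = \mathbf{q}_{\mathbf{k}}(\boldsymbol{\theta})$ $(\boldsymbol{\theta} \in \mathbb{R}_+^p)$ and $\tilde{\mathbf{q}}_{\mathbf{k}} = \mathbf{q}_{\mathbf{k}}$. Let $Q = (\mathbf{q}_{\mathbf{k}} : \mathbf{k} \in \mathbb{Z}_+^J)$ and write the joint probability generating function $\tilde{f}_{\mathbf{n},\mathbf{m}}(\mathbf{x})$ as $\tilde{f}_{\mathbf{n},\mathbf{m}}(\mathbf{x}; Q)$ to show explicitly its dependence on $Q$. Letq k (θ) = q whereQ(θ) = (q k (θ) : k ∈ Z J + ). Theorem 4.2 for the present case then follows immediately using Theorem 4.1.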
For the case when initial and non-initial infectives follow different probabilistic laws and there is no outside infection, the final outcome of the epidemic can be obtained by first letting the initial infectives make their contacts, as in the proof of Theorem 3.1. Let $\mathbf{S}^* = (S_1^*, S_2^*, \dots, S_J^*)$, where $S_i^*$ is the number of type-$i$ susceptibles that remain after the initial infectives have made their contacts, and let $\mathbf{T}_R^*$ be the sum of the $\mathbf{R}$-vectors of the initial infectives. Theorem 4.2 for this case then follows by conditioning on $(\mathbf{S}^*, \mathbf{T}_R^*)$, as in the proof of Theorem 3.1. Finally, the result for the corresponding model with outside infection is obtained by conditioning on the numbers of initial susceptibles of each type that avoid outside infection.
Remark 4.1 Theorem 4.2 is both very general and very powerful. For example, the joint generating function-Laplace transform for the total size and severity of the multipopulation collective epidemic model with different types of infectives, given in Proposition 4.3 of Picard and Lefèvre (1990), can be obtained using Theorem 4.2 with a suitable choice for the random vector $\mathbf{R}$. Theorem 4.2 also extends Theorem 5.1 in Ball and O'Neill (1999) to a broader class of models, i.e. to multitype collective Reed-Frost models, which do not necessarily admit a Sellke construction having exponentially distributed tolerances. The multipopulation model underlying Theorem 5.1 in Ball and O'Neill (1999) also allows for infectives to move between the populations, but that extra generality can be accommodated using the model in Section 4.4.1 by a suitable choice of the random vector $\mathbf{R}$.