Dynamics of Multi-stage Infections on Networks

This paper investigates the dynamics of infectious diseases with a non-exponentially distributed infectious period. This is achieved by considering a multi-stage infection model on networks. Using pairwise approximation with a standard closure, a number of important characteristics of disease dynamics are derived analytically, including the final size of an epidemic and a threshold for epidemic outbreaks, and it is shown how these quantities depend on disease characteristics, as well as the number of disease stages. Stochastic simulations of dynamics on networks are performed and compared to output of pairwise models for several realistic examples of infectious diseases to illustrate the role played by the number of stages in the disease dynamics. These results show that a higher number of disease stages results in faster epidemic outbreaks with a higher peak prevalence and a larger final size of the epidemic. The agreement between the pairwise and simulation models is excellent in the cases we consider.


Introduction
Mathematical models of infectious diseases are known to provide an invaluable insight into the mechanisms driving disease invasion and spread.In many cases, to obtain the first approximation of the spread of a disease it is sufficient to use a version of the classical SIR model (Kermack and McKendrick 1927) [33].However, major outbreaks of avian and swine influenza (Ferguson et al. 2006) [19], SARS (Donnelly et al. 2003) [15], and more recently, ebola (Chowell, Nishiura 2014) [10], have highlighted the need for a more accurate description of the disease dynamics that would provide predictive power to be used for developing measures for disease control and prevention (Keeling, Rohani, 2008) [32].
One of the major simplifying assumptions often used in mathematical models of disease dynamics is the exponential distribution of infectious periods.Effectively, this means that the chance of an individual recovering during any given time period does not depend on the duration of time that individual has already been infected.Whilst such an assumption may provide significant mathematical convenience and be reasonably realistic in some situations, most often it is violated, and this requires the inclusion of the precise distribution of infectious periods in the model (Bailey 1954; Hope-Simpson 1952) [4,23].There are several methods that can be employed to explicitly include a non-exponential distribution, including a multi-stage approach (Anderson and May 1992;Cox and Miller 1965) [2,11], an integro-differential formulation (Hethcote and Tudor 1980;Keeling and Grenfell 1997) [22,30], and a PDE-based formulation akin to that for age-structured models (Anderson and May 1992) [1].In the multi-stage framework, it is assumed that the infectious stage of a disease is characterised by a number K of distinct stages (Cox and Miller 1965;Lloyd 2000;Lloyd Recovery Time (days) 2001) [11,35,36], with the duration of each stage being an independent exponentially distributed random number.Due to the fact that the sum of independent exponentially distributed random variables obeys a gamma distribution (Durrett 2010) [16], one can replace an exponential distribution with the mean infectious period 1/γ by a gamma distribution Γ(K, 1/(Kγ)) that has the same mean infectious period 1/γ.The so-called linear chain trick (Cox and Miller 1965;MacDonald 1978) [11,38] then consists in replacing a single infectious stage with K identical exponentially distributed sub-stages, each having a mean period 1/(Kγ).These multiple stages of infection can be used to represent periods of increased or decreased risk of transmitting the disease (Ma and Earn 2006) [37].The same approach can be extended to models with multiple classes (Keeling and Grenfell 2002;Nguyen and Rojani 2008) [31,43], as well as non-exponentially distributed latency and temporary immunity periods (Blyuss and Kyrychko 2010;Wearing and Rojani 2005) [6,50].Figure 1 illustrates how the gamma distribution of infectious periods arises through the inclusion of multiple stages of infection.Following the methodology of introducing multi-stage of infection to better represent the distribution of infectious periods, we proceed with dividing the infected population into K identical stages I 1 , I 2 , . . ., I K to create the so-called SI K R model (Lloyd 2000) [35], and we denote the total infected population by I = K i=1 I i .One should note that Kγ is now used as the transition rate between successive infectious stages in order to keep the average where S denotes the proportion of susceptible individuals, R is the proportion of recovered or removed individuals, β is the disease transmission rate taken to be the same for all stages of infection, and the disease is assumed to confer a life-long immunity.The importance of including not just the mean infectious period, but the actual distribution of infectious periods, as achieved by the system (1) is further highlighted by the inspection of actual values of epidemiological parameters for several real diseases as presented in Table 1.This table illustrates that whilst the transmission rate and the average infectious period may vary between different diseases, in all of these cases the number of stages that has to be included in order to correctly represent the disease dynamics may also be quite high, this reinforces an earlier observation about the non-exponential nature of infectious period distribution.
In this paper we analyse the dynamics of multi-stage infections with particular emphasis on considering explicit networks.The paper is organised as follows.The next Section contains a brief summary and discussion of earlier results on the properties of the SI K R model (1).In Section 3 we employ the framework of pairwise approximations to derive a multi-stage infection pairwise model and use this to derive analytical expressions for the probability of transmission of infection along an infected edge in a network, a threshold parameter controlling the onset of epidemic outbreaks, and the final size of an epidemic.In Section 4 numerical simulations of the pairwise and the full network models are performed using realistic parameter values from Table 1 to investigate the accuracy of pairwise approximation and to illustrate the role played by the number of stages in the multi-stage distribution in the disease dynamics.The paper concludes in Section 5 with discussion of results and future outlook.

Dynamics of the well-mixed model
As a first step, we consider the SI K R model (1), which has an implicit assumption that every member of the population has a sufficient level of contact so that the infection can be passed from any individual to any other.This is a natural extension of the basic SIR model (Kermack and McKendrick 1927) [33], and as such it has been well-studied in a number of papers (Lloyd 2000;Ma and Earn 2006; van den Driessche and Watmough 2002) [35,37,49].
Perhaps, one of the most important and commonly used parameters characterising the severity of epidemics and stability of the disease-free equilibrium is the basic reproduction number R 0 defined as the expected number of secondary infections caused by a single typical infectious individual in a wholly susceptible population.The value of R 0 is related to the stability of the disease-free equilibrium, and it is an important threshold parameter signifying that an epidemic will spread when R 0 > 1 and die out otherwise.
The basic reproduction number for the system (1) can be found as follows (Hyman and Stanley 1999; Ma and Earn 2006; van den Driessche and Watmough 2002) [26,37,49] which depends on the average duration of infection 1/γ but is independent of the number of stages in the model.A practically important characteristic of an epidemic outbreak is the final epidemic size (Keeling and Rojani 2008) [32].Since the total population size is closed with no inflow or outflow of individuals, i.e.S(t) + I 1 (t) + I 2 (t) + ... + I K (t) + R(t) = 1, at the end of an outbreak we have a burn-out of the epidemic, i.e.I 1 = I 2 = . . .= I K = 0, and hence which coincides with the final epidemic size in the original SIR model (Kermack and McKendrick 1927) [33].Ma and Earn (2006) [37] have recently discussed various aspects related to the derivation and validity of formula (3), and Andreasen (2011) [3] has studied the effects of population heterogeneity on the size of epidemic.A major implication of the above results is the fact that inclusion of possibly more realistic gamma distribution of infectious periods does not alter the threshold of an epidemic outbreak, nor does it affect the final epidemic size.We see that in Fig. 2 the three curves show that considering multi-stage infectious periods has a significant effect of the dynamics of the epidemic.In order to get a better understanding of the distinction in the dynamics of SIR and SI K R models, it is therefore instructive to look at the development of epidemics.Figure 3 illustrates early dynamics of epidemic outbreaks in the case of different numbers of stages; in each case the exponential curves were fitted, which provide an accurate approximation for the initial growth rate of the infection.In the standard SIR model, an outbreak can only take place if R 0 > 1, and at the initial stage, the number of infected individuals can be approximated as I(t) ≈ I(0) exp(λt), where the growth rate is λ = γ(R 0 − 1).In the case of a multi-stage SI K R model, however, the basic reproduction number R 0 does not depend on the number of stages, hence, it cannot by itself be used to determine the exponential growth rate during an early stage of an outbreak.For this model Wearing et al. (2005) [50] have derived the following relation between the basic reproduction number R 0 and the initial growth rate λ which shows that the largest impact of a gamma-distributed infectious period for disease has in this case is the change in the dynamics of the infection for intermediate times, particularly the early growth rate, peak prevalence and overall time frame of the disease.Besides the basic reproduction number, final epidemic size and the initial growth rate of an epidemic, another practically important characteristic of epidemic outbreaks is the peak prevalence defined as the maximum number or proportion of infected individuals that can be achieved during an outbreak.In the case of an SIR model, the peak prevalence can be found as follows (Feng 2007; House and Keeling 2011b) [18,25] Feng (2007) [18] has recently considered an SEIR model with gamma distributed infectious period and derived an expression for the peak of a weighted average of infectious compartments.This result gives some intuition into how the number of stages affects peak prevalence, but it does not provide a closed form expression for the actual peak prevalence in an SI K R model.Numerical results in Fig. 3 suggest that for the same average infectious period, the overall peak prevalence increases with the number of stages included in the model.

Network dynamics with multiple stages
Inclusion of multiple stages of infection in the SI K R model gives a more realistic representation of the infectious period, but the model still has certain limitations due to its underlying assumptions.
In the model (1) it is assumed that the disease is not fatal, and that transitions between different infected classes, or stages of infection, take place at exactly the same rate Kγ.Ma and Earn (2006) [37] have considered a similar model with possibly different rates of transitions between infected classes, while van den Driessche and Watmough (2002) [49] also included disease-induced deaths, whereas Hyman et al. (1999) [26] analysed the dynamics with different transition rates, no death from disease and vital dynamics in the population.
Another major assumption behind model (1) is that the population is well-mixed, i.e. each individual has equal chances of encountering and transmitting a disease to any other individual in  [32].In each plot the solid black line is the numerical solution of the model ( 1) with an appropriate number of stages, and the dashed line is the exponential growth curve with the rate determined by equation ( 4).(a) One-stage model with λ ≈ 1.2055, (b) Two-stage model with λ ≈ 1.4035, (c) Three-stage model with λ ≈ 1.4762,(d) Five-stage model with λ ≈ 1.534.In each case note that in the earliest stages the exponential approximation is virtually identical to the infection curve.a population.Whilst this may be appropriate in the case of outbreaks in small closed communities, for a large number of communicable diseases, such as SARS, influenza and most sexually transmitted infections, this assumption is a gross simplification of the actual dynamics as it overlooks spatial variability, as well as the complexities of a network structure for infections that are transmitted through direct close contact between individuals (House and Keeling 2011a; Keeling and Eames 2005) [24,29].
Modelling complex contact patterns explicitly via networks and considering epidemic dynamics spreading on networks has completely revitalised mathematical epidemiology.This new modelling framework has led to a myriad of models ranging from exact to mean-field and simulation models  [44,12,29,41,7].The many degrees of freedom in modelling offered by networks however, often comes at the price of increasing levels of complexity, where models can be challenging to evaluate analytically and sometimes even numerically.Nevertheless, many valuable paradigm models have been developed which have furthered our understanding of the impact of contact heterogeneity, preferential mixing and clustering on the outbreak threshold and other epidemic descriptors.A particularly useful way of capturing epidemic dynamics on networks is by using the pairwise model (Keeling 1999) [28].This model is based around deriving in a hierarchical way evolution equations for the expected number of nodes, edges, triples and so on.A closure is then employed that curtails the dependence on ever higher order moments.Its premise is simple and quite intuitive, although it can be also shown rigorously (Taylor et al. 2012) [48] that pairwise models before closure are exact.The basic idea of the model is to recognise that changes at node level depend on the status of the neighbours and thus involves edges, e.g. the rate of change in the number of infectious nodes is proportional to the number of S − I links in the network.Similarly, the number of edges can change due to pair interactions and transitions but also due to interactions induced from outside the edge, e.g. the number of S − S links decrease proportionally to the number of S − S − I triples, where infection from the I node destroys the fully susceptible pair.This framework has been used and extended extensively, to asymmetric (Sharkey et al. 2006) [47] and weighted networks (Rattana et al. 2013) [45] for example, and has proved to be a valuable framework.

Pairwise model
As a first step in the analysis of dynamics of multi-stage epidemics on networks, we re-formulate the SI K R model using the framework of pairwise equations, which allows one to analyse the expected values for the number of nodes and links of each type as a function of time (Keeling 1999; House and Keeling 2011a; Taylor et al. 2012) [28,24,48].The particular strength of pairwise models lies in their analytical tractability and the fact that they provide a more accurate description than well-mixed ODE models but do not go to the level of full individual-based stochastic simulations (House and Keeling 2011a) [24].In this formalism of pairwise models, notations [X], [XY ] and [XY Z] are used to denote the expected numbers of individuals in state X, the expected number of links between nodes of type X and Y and the expected number of triples of the form X − Y − Z, respectively.More precisely, given a 'frozen' network with nodes labels X, Y or Z and subscripts indicating nodes i, j and k then where X, Y, Z ∈ {S, I 1 , I 2 , . . ., I K , R}, and G = (g ij ) i,j=1,2,...,N is the adjacency matrix of the network such that g ii = 0, g ij = g ji and g ij = g ji = 1 if nodes i and j are connected and zero otherwise.Moreover, X i returns one if node i is in state X and zero otherwise.The average degree of each node is denoted by n, and the number of nodes in the network by N .The new pairwise SI K R model with a gamma distributed infectious period can then be written as follows, where τ = β/n is the transmission rate per link.Since we consider a closed population, this immediately implies It is worth noting that although a complete description of the SI K R model at the level of pairs would require a system of O(K 2 ) differential equations, it is sufficient to consider the above equations to fully understand the dynamics.
The system (5) is not closed as additional equations describing the dynamics of triples are needed.To eliminate this dependence on higher moments and close the system, we will use the classical moment closure approximation which assumes that short loops and clusters are excluded from the network and that there is no correlation between nodes with a common neighbour (Keeling 1999) [28].

The probability of transmission across an infected edge
When one considers a stochastic network-based simulation, an important quantity characterising the disease dynamics is the probability τ of disease transmission across a given S−I link.In a simple one-stage model, where both infection and recovery are assumed to be distributed exponentially, the probability of no infection event occurring during time t is given by p 0 (t) = e −τ t ; hence 1 − p 0 (t) is the probability that infection does take place over the same time period.Averaging this via integration for all possible recovery times yields the probability that the susceptible node becomes infected.In a standard SIR model with exponentially distributed infectious and recovery period, this probability is, therefore (Danon et al. 2011; Diekmann De Jong and Metz 1998) [12,13] In the case of an SI K R model, the duration of infection is described by the density function of the appropriate gamma distribution The implication of this fact is the following result for the probability of transmission across an edge.Lemma1.For the stochastic SI K R model with the period of infection following the gamma distribution (8), the probability of disease transmission across a given S − I link is given by The proof of this lemma is given in Appendix A.
By rewriting expression (9) in the form and using the fact that e x = lim n→∞ (1 Figure 4 illustrates the dependence of τ on the number of stages K, as well as a limiting behaviour as K → ∞.This figure illustrates that while τ is growing with the increasing number of stages K, it eventually saturates at a level determined by Eq. (10).In fact, this saturation at higher K is observed not only in the probability of transmission, but also in the peak prevalence rate, as well as in the early growth rate.When compared to an exponential distribution, it is these substantial changes in τ observed for smaller values of K that explain the changes in the profile of the infection curves.As will be shown later, τ is a very important quantity that controls various properties of epidemic dynamics, such as the threshold for an outbreak and the final size of an epidemic.9) for different mean infectious periods with τ = 0.166.Crosses, circles and diamonds correspond to integer values of K on each curve.

R 0 -like threshold parameter
Unlike epidemic models in well-mixed populations, defining an appropriate R 0 for pairwise models is more challenging.This is in part due to the difficulty of identifying the typical infectious individual.In order to derive a value for R 0 , one needs to consider and correctly account for the correlation between susceptible and infected nodes and measure R 0 when this has stabilised, see Keeling (1999) [28] and Eames (2008) [17].Intuitively, this means that the epidemic is allowed to spread in order to become established in the network.This allows for typical' infectious individuals to develop and for R 0 to be measured.In large networks this regime can still be considered to be close to or only a small perturbation away from the disease-free steady state.
We now proceed to derive an R 0 -like threshold parameter R which can be used to predict when the epidemics occur, by allowing outbreaks only when R > 1 (Rattana et al. 2013) [45].To this end, we linearise the model ( 5) with a classic closure (6) near the disease-free equilibrium which has the form [S] = N , [SS] = nN , and all other quantities being zero.As in the standard approach, the condition necessary for the initial growth of an epidemic is that the dominant eigenvalue λ max of the resulting characteristic polynomial is positive, and a threshold parameter is obtained as a condition on system parameters that ensure the stability change, i.e. λ max = 0.In the Appendix B it is shown that the characteristic equation for eigenvalues λ of the linearised system near the disease-free steady state for a K-stage model ( 5) is given by In Appendix B we prove that the largest eigenvalue λ satisfying this equation goes through zero, i.e. λ max = 0, when R := (n − 1)τ = 1.(11) This defines a new R 0 -like threshold parameter with τ introduced in (9).A closer inspection shows that this parameter R describes the probability of spreading the disease across a given link multiplied by the likely number of susceptible contacts of the individual assuming that they are the earliest people being infected, which perfectly agrees with the standard definition of R 0 as the average number of secondary cases produced in a fully susceptible population by a single typical infectious individual.Whilst R does not quantify the early growth rate of an epidemic, through its dependence on τ and K it allows one to better predict epidemic outbreaks in the case of a more realistic gamma distribution of infectious period, where in the case of an exponential distribution with the same mean infectious period.We also note that whilst in the implementation of the classic SI K R model there was no effect of changing the number of stages on R 0 , this more sophisticated model results in a threshold which implicitly accounts for multi-stage infectious periods.

The final size of an epidemic
Since the pairwise model ( 5) is a network representation of an epidemic with life-long immunity and fixed population size, eventually an epidemic will burn out, leaving some proportion of the population unaffected and still susceptible to the disease.
the final size of an epidemic is given by the proportion of people in the removed class, i.e.
[R] ∞ = N − [S] ∞ .As we saw earlier for the SI K R model ( 1) in well-mixed population, the final size of a single epidemic does not change with the number of stages.However, the same conclusion no longer holds for the pairwise model ( 5) with the closure ( 6), in which case we have the following result.
Theorem 1 For a single epidemic outbreak in a closed population with a vanishingly small starting level of infection, the final size of an epidemic in the pairwise model ( 5) with the classical closure ( 6) is given by and τ is defined in (9).
Proof.To prove this statement we extend the methodology developed by Keeling (1999) [28] for one-stage epidemics.We first introduce some new variables and parameters and From ( 5) and the easily derived function it follows that these new variables satisfy the following system of equations Since Integrating the first equation in (15) gives where in the last step we have used the fact that G(0) = 0 and the relation derived in Appendix C together with another relation Substituting these two relations into Eq.( 16) and using the fact that [S](0) = N yields Introducing the fraction of susceptible individuals as S ∞ = [S] ∞ /N , the above equation can be rewritten as follows, Since Using the fact that θ = S a ∞ , equation ( 19) can be rewritten in the form where in the last step we have used the relation a = (n − 1)/n.This completes the proof.
We note that our result in Theorem 1 is functionally identical to the result achieved by Keeling (1999) [28], and it generalises the final size equation by replacing τ /(τ + γ) with the parameter τ .In the case K = 1 these two values are equivalent, thus we have perfect agreement with the existing theory.An equivalent relation has been derived for a static configuration network model with an arbitrary degree distribution (Miller 2012) [40].Figure 5 illustrates Theorem 1 by showing how the final size of an epidemic on a network depends on the number of infectious stages and, hence, the shape of the distribution of infectious period, which makes it different from earlier analytical results for a well-mixed population (Ma and Earn 2006) [37].This suggests that inclusion of a more realistic population structure has effect not only on the intermediate disease dynamics, but also on the final proportion of the population that will be affected by the disease.Furthermore, this Figure suggests that for the same mean infectious period, the final size of an epidemic is increasing with the increasing number of stages K.One should note that the number of stages K has the largest effect on the final size of an epidemic for sufficiently low values of K, and then this dependence saturates.As expected, the average node degree n plays an important role, with the minimum value of τ or K required for an epidemic outbreak decreasing with increasing n in perfect agreement with an earlier result in Eq. (11).Stochastic simulations (not shown) demonstrate excellent agreement with the results in Fig. 5, especially for denser networks.The conclusions of Theorem 1 highlight the importance of collecting accurate and reliable data about the infectivity profile of a disease for predicting the scale of an outbreak.

Impact of a realistic infectious period distribution: case studies
In order to test the accuracy of the pairwise model (5) and to illustrate the role played by the distribution of infectious period, we consider the examples of outbreaks of several diseases mentioned in Table 1 in a population that is initially fully susceptible.We concentrate on two common and fairly simple network structures, namely, homogeneous and Erdős-Rényi networks (Newman 2010) [42], with stochastic simulations being performed using a Gillespie algorithm (Gillespie 1977; Chen and Bokka 2005) [21,9].We restrict our attention to these network types as we have a homogeneous pairwise model and we would not expect it to work well for other networks.Following the derivation of the pairwise model, the per-link transmission rate is taken to be τ = β/n, and we now perform the comparison of an average of 250 stochastic outcomes of serious epidemics on a homogeneous and Erdős-Rényi networks against the results of a pairwise model with gamma distributed infectious period.To highlight the impact of including a realistic distribution for the infectious period, we compare the results of simulations with realistic values of parameters from Table 1 against those obtained using an exponentially distributed infectious period as assumed in many existing models.
Severe Acute Respiratory Syndrome (SARS) is a viral disease characterised by flu-like symptoms which is primarily spread through close contacts with infected individuals that makes it a perfect candidate for deducing some basic parameters from epidemiological observations.Figure 6 illustrates the comparison of SARS dynamics on homogeneous and Erdős-Rényi networks with a pairwise approximation.One can observe that the effects of including more stages in the disease model on intermediate behaviour are similar to those seen earlier, namely, that gamma distribution of infectious period shortens the overall duration an epidemic and increases peak prevalence.It is also worth noting that, in accordance with Theorem 1, the final size of an epidemic also increases with K.
The second example we consider is smallpox, a viral disease that has been eradicated globally except for several stockpiles being used for further research.Several papers have modelled the effectiveness of smallpox when used as a bio-weapon, as well strategies for its containment during possible outbreaks (Ferguson et al. 2003;Kaplan Craft and Wein 2002;Meltzer et al. 2001) [20,27,39].Due to a profound impact smallpox has had on a human population over several centuries, an extensive and quite accurate data has been collected about its transmission.Smallpox is spread through a contact with the mucus of an infected individual, which implies that a close contact is essential for a successful disease transmission.In Fig. 7 we show the simulations of smallpox outbreaks on homogeneous and Erdős-Rényi networks using parameter values from Table 1 compared to results of the numerical solution of the corresponding pairwise model (5).The first important observation that the higher severity of epidemics outbreaks as suggested by these data makes the pairwise model more accurate, as expected.The effect of including the realistic distribution of infectious period is more pronounced in this case as compared to the SARS simulations, which can be attributed to the fact that smallpox model includes four stages of infection, while the SARS model had only three stages.Despite changes in the intermediate behaviour for smallpox being more pronounced compared to SARS, the final size of an epidemic as given by the pairwise model only increases from 96.34% to 97.89%, which is consistent with an earlier observation that the effect of increasing the number of stages on the final epidemic size is less noticeable for higher K.
Figure 8 illustrates the comparison of a pairwise model (5) with the closure (6) and a stochastic simulation on the example of influenza data with different number of stages of infection.Comparison of figures (a) and (b) shows that the heterogeneity introduced by the degree distribution makes the pairwise model less accurate due to the fact that this model only takes into account the mean degree n.This suggests that whilst our model is very helpful for understanding general features of multi-stage disease dynamics on networks, it has to be extended further to deal effectively with wider and more realistic node degree distributions.One should note that the effects of increasing the number of stages on peak prevalence and the duration of epidemics reduce for higher values of K, as can be observed by comparing the minor changes between temporary profiles of the threeand five-stage influenza epidemics presented as shown in Fig. 8.

Discussion
In this paper we have analysed the behaviour of multi-stage infections with particular emphasis on contact networks.Unlike the well-mixed models, for which the number of stages modifies the temporary profile of an outbreak but does not affect the final epidemic size or the condition for disease outbreak, in the case of disease spread on a network, the number of stages, i.e. the precise distribution of infectious period, plays a much more prominent role.
In order to make analytical progress with the analysis of disease dynamics on networks, we have employed the framework of pairwise approximation.This has allowed us to determine the probability of disease transmission across a network edge, find an R 0 -like threshold that controls the onset of epidemics, as well as derive an analytical expression for the final size of an epidemic.All of these quantities depend not only on the basic disease characteristics, such as, the transmission rate and the average infectious period, but also on the distribution of the infectious period as represented by the number of stages in the model.The importance of this result lies in the fact that unlike earlier studies of multi-stage models in well-mixed populations (Anderson and Watson 1980; Ma and Earn 2006) [2,37], for the same average duration of the infection period, the final epidemic size is not constant but increasing with the number of stages.Numerical simulations of epidemic outbreak for several different multi-stage infections demonstrate that while the pairwise model provides a reasonably good approximation of the network dynamics, the agreement with stochastic simulations is affected by inhomogeneity in the node degree distribution, as should be expected from the fact that the pairwise closure only depends on the average node degree.
There are several directions in which the approach presented in this paper could be extended.These include the analysis of SIS and SEIR models, as well as inclusion of multiple stages for both the latent and infected classes (Nguyen and Rojani 2008) [43].Another interesting and important problem would be the consideration of network dynamics for epidemic models with temporary im- munity (Blyuss and Kyrychko 2010) [6].Allowing the level of infectiousness of different nodes to vary depending on the stage of infection they belong to would result in even more realistic models of multi-stage diseases on networks.One of the challenging but practically important generalisations of the present framework would be an extension of a pairwise model that would account for heterogeneity in node degree distribution (House and Keeling 2011a) [24].This would provide deterministic models potentially amenable to analytical treatment that would more accurately represent stochastic disease dynamics.
Since the probability of infection taking place for a given S − I link during time t is given by 1 − e −τ t , the probability of transmission across this link in a K-stage is given by where the final equality is obtained by noting that the first integral is simply the integral of the gamma distribution function over R + , and, hence, is equal to one.Integration by parts yields a recursive relation

Appendix B
Linearisation of the pairwise model (5) with the closure (6) at the disease-free equilibrium yields the stability condition for eigenvalues λ as a (2K + 2)× (2K + 2) matrix.It is useful to first consider it in a block form as follows, where C is a zero (K + 1) × (K + 1) matrix, and the matrix A is lower-diagonal, and therefore, its determinant is the product of the diagonal terms.Hence, the characteristic equation can be written as

Figure 1 :
Figure 1: Comparison between the recovery times for 1000 individuals entered into the first of 5 stages of infection using the mean recovery time of 1/γ = 2.228 days for each stage.The black line representing a fitted gamma distribution with parameters Γ(4.897, 0.091) gives an excellent approximation for the expected distribution Γ(5, 0.0909).

Figure 2 :
Figure2: A comparison of infection dynamics for a one-, three-and five-stage SI K R models with data from (Keeling and Rojani 2008)[32].Each curve represents the sum of all I i in the model.Adding extra stages causes epidemics to occur earlier and result in a higher peak of epidemic, although R 0 and the final size are identical for each curve.The parameter values are N = 1000, β = 1.66/day, γ = 0.4545/day.

Figure 3 :
Figure 3: Proportion of infected individuals during a boarding school influenza outbreak with β = 1.66/day and γ = 0.4545/day (Keeling and Rojani 2008)[32].In each plot the solid black line is the numerical solution of the model (1) with an appropriate number of stages, and the dashed line is the exponential growth curve with the rate determined by equation (4).(a) One-stage model with λ ≈ 1.2055, (b) Two-stage model with λ ≈ 1.4035, (c) Three-stage model with λ ≈ 1.4762,(d) Five-stage model with λ ≈ 1.534.In each case note that in the earliest stages the exponential approximation is virtually identical to the infection curve.

Figure 4 :
Figure4: Dependence of the probability of transmission across an S − I edge τ on the number of stages K as given by Eq. (9) for different mean infectious periods with τ = 0.166.Crosses, circles and diamonds correspond to integer values of K on each curve.

Figure 6 :
Figure 6: Simulation of a SARS outbreak using data from Table 1 with n = 10 and N = 1000.Lines correspond to a numerical solution of the pairwise model (5) (K = 1 solid line, K = 3 dashed line), while symbols represent the average of 250 serious outbreaks (K = 1 filled circles, K = 3 triangles).(a) Homogeneous network.(b) Erdős-Rényi random graph.

Figure 7 :
Figure 7: Simulation of a smallpox outbreak using data from Table 1 with n = 10 and N = 1000.Lines correspond to a numerical solution of the pairwise model (5) (K = 1 solid line, K = 4 dashed line), while symbols represent the average of 250 serious outbreaks (K = 1 filled circles, K = 4 triangles).(a) Homogeneous network.(b) Erdős-Rényi random graph.

Table 1 :
Estimates of epidemiological parameters for different infectious diseases.
K R model takes the form (Anderson and May 1992;lowing implicit equation for the final size of an epidemic that determines the proportion of individuals not affected by the disease(Anderson and May 1992;