Precise Estimates of Persistence Time for SIS Infections in Heterogeneous Populations

For a susceptible–infectious–susceptible infection model in a heterogeneous population, we derive simple and precise estimates of mean persistence time, from a quasi-stationary endemic state to extinction of infection. Heterogeneity may be in either individuals’ levels of infectiousness or of susceptibility, as well as in individuals’ infectious period distributions. Infectious periods are allowed to follow arbitrary non-negative distributions. We also obtain a new and accurate approximation to the quasi-stationary distribution of the process, as well as demonstrating the use of our estimates to investigate the effects of different forms of heterogeneity. Our model may alternatively be interpreted as describing an infection spreading through a heterogeneous directed network, under the annealed network approximation.


Introduction
For models of infectious spread in which long-term quasi-stable endemicity is possible, a random variable of particular interest is the persistence time until infection dies out from the population. For a number of such models, it is known (Andersson and Djehiche 1998;Ball et al. 2016;van Herwaarden and Grasman 1995) that as the typical population size N tends to infinity, the expected persistence time τ for an infection that has become endemic in the population satisfies B Damian Clancy d.clancy@hw.ac.uk where we use ∼ to denote that the ratio of the two sides converges to 1 as N → ∞, and A, C are constants whose values depend upon parameters of the process, but not upon N . We assume here, and from now on, that the basic reproduction number R 0 (the average number of new infections caused by a typical infected individual in an otherwise susceptible population) is greater than one, so that the process is supercritical, or 'above threshold'. A pioneering piece of work in this direction was van Herwaarden and Grasman (1995), where it was shown that the relationship (1) holds for a particular susceptible-infectious-removed (SIR) infection model. Evaluation of A required numerical solution of a system of ordinary differential equations, while no method for evaluating C was given. It should be noted that van Herwaarden and Grasman (1995) studied a diffusion approximation to the infection model, but it is now well known (see, e.g., Clancy and Tjia 2018;Doering et al. 2005) that such a diffusion approximation does not, in general, give correct leading-order asymptotics-that is, the value of the constant A computed from the diffusion approximation is not necessarily equal to the correct A value for the underlying discrete state-space model. Using rather different techniques, Andersson and Djehiche (1998) derived a result of the form (1), together with explicit expressions for the constants A, C, for the classic susceptible-infectious-susceptible (SIS) model of Weiss and Dishon (1971).
In recent years, a number of authors have applied techniques from statistical physics to study persistence times for a range of population models, including infection models. For models which are naturally one-dimensional, including the classic SIS model, a relationship of the form (1) together with explicit formulae for A, C can be obtained Meerson 2010, 2017). For multidimensional models (including most infection models), it is usually only possible to establish results of the cruder form lim N →∞ (ln τ )/N = A, and to evaluate the leading-order constant A via numerical solution of a system of ordinary differential equations (Dykman et al. 1994;Elgart and Kamenev 2004;Hindes and Schwartz 2016;Kamenev and Meerson 2008;Lindley et al. 2014). One reason why the technique has not been more widely exploited is that for a k-dimensional model, it is necessary to solve a system of ordinary differential equations in 2k dimensions subject to boundary conditions at times t = −∞ and t = +∞. Progress has been made by a number of authors in the development of efficient numerical procedures (e.g. Forgoston et al. 2011;Lindley and Schwartz 2013), but implementation for models in dimensions k > 2 remains far from trivial.
Much of the above work on infection models assumes that individuals' infectious periods are exponentially distributed. This is not biologically realistic for most infections and is done purely for reasons of mathematical tractability, so that it is of interest to understand the effect of this simplifying assumption upon the resulting persistence time estimates. Ball et al. (2016) extended the result of Andersson and Djehiche (1998) for the classic SIS model to allow for a quite general infectious period distribution, by applying a result on insensitivity in stochastic networks from Zachary (2007), finding that the leading-order constant A takes the same value regardless of the infectious period distribution (provided only that its mean is held constant), but that the prefactor constant C must be appropriately modified.
A different extension of the result of Andersson and Djehiche (1998) has recently been established in Clancy (2018). For an SIS model incorporating heterogeneity in susceptible individuals' levels of susceptibility or infected individuals' levels of infectiousness, an explicit formula was found for the leading-order constant A in the relationship lim N →∞ (ln τ )/N = A. Infectious periods were allowed to follow an Erlang distribution, and the value of A shown to depend only upon the mean of the distribution, under this assumption. In the current paper, we build on the work of Clancy (2018) to establish results of the much more precise form (1), including simple explicit formulae for the prefactor constant C. At the same time, we extend the model of Clancy (2018) in two ways: firstly, we allow for heterogeneity in individuals' infectious period distributions in addition to the heterogeneities in susceptibility and infectiousness of Clancy (2018); secondly, following the approach of Ball et al. (2016), we allow for infectious periods following quite general (not necessarily Erlangian) distributions. Clancy (2018) showed that for a sufficiently large population, greater heterogeneity (in the sense of majorization ordering, see Marshall et al. 2011), whether in susceptibility or infectiousness, leads to a reduction in mean persistence time of infection in the population. Using our more general model, we are also able to investigate the effect of heterogeneity in infectious period distributions; see Sect. 5.2.
We note that the only infection model for which such explicit formulae for A, C have previously been available is the SIS model in a homogeneously mixing population (with general infectious period distribution); in general, the leading-order constant A must be evaluated via numerical solution of a system of ordinary differential equations (a non-trivial exercise, as noted above), while no general method exists to evaluate the prefactor constant C for multidimensional models.
The remainder of the paper is structured as follows. In Sect. 2, we define our model and state our main result, Theorem 1. Section 3 recalls some standard approximations for infection models and general theory that we will require in the sequel. The proof of our results occupies Sect. 4. In Sect. 5, we demonstrate the accuracy of our approximations, both for mean persistence time and for the quasi-stationary distribution of the process; we apply our results to investigate the effects of different forms of heterogeneity; and we outline the application of our results to network models via the annealed network approximation Schwartz 2016, 2017). Finally, in Sect. 6, we present some concluding discussion and suggestions for further work.

The Model and Asymptotic Persistence Time Formulae
Consider a closed population of N individuals divided into k groups, with group i (i = 1, 2, . . . , k) consisting of N i individuals. Denote by f i = N i /N the proportion of the population belonging to group i, so that i f i = 1. When a group i individual becomes infected, it remains so for a time distributed as a random variable T i of mean α i = E [T i ], after which it returns to the susceptible state. During this infectious period, the group i infective makes contacts with each individual in each group j = 1, 2, . . . , k at the points of a Poisson process of rate βλ i μ j /N , where β is some overall measure of infectiousness, λ i represents the relative infectiousness of group i individuals and μ j represents the relative susceptibility of group j individuals. (The assumption that the group i to group j infection rate factorises in this way is sometimes referred to as 'separable mixing'.) Without loss of generality, we scale the λ i , μ j values so that i λ i f i = j μ j f j = 1. These Poisson processes and infectious periods are all mutually independent. If a contacted individual is susceptible, then it becomes infected (and infectious); if the contacted individual is already infected, then the contact has no effect. We denote by I j (t) the number of infected individuals in group j at time t ≥ 0, the corresponding number of susceptible individuals being S j (t) = N j − I j (t), and write I(t) = (I 1 (t), I 2 (t), . . . , I k (t)). We assume throughout that β > 0, and that f i , α i , λ i , μ i > 0 for all i. The basic reproduction number R 0 for our model is given by and we will assume throughout that parameter values are such that R 0 > 1. In order to state our results, we require the following definitions. Define D(λ, μ) to be the unique positive solution of Denote by ϕ i (θ ) = E e θ T i the moment-generating function of T i , and define the function ψ(θ) to be Finally, define ω to be the unique solution in [0, 1) of We are now in a position to state our main result as follows.
Theorem 1 Consider the heterogeneous population SIS infection model defined above, and recall that τ denotes the expected time from quasi-stationarity to disease extinction.
(i) Suppose that heterogeneity is in susceptibility but not infectiousness (so λ = 1), and that R 0 > 1. Then as N → ∞, (ii) Suppose that heterogeneity is in infectiousness but not susceptibility (so μ = 1) and that R 0 > 1. Provided that individuals' infectious periods are exponentially distributed (so the process I(t) is Markovian with transition rates given in Table 1), then as N → ∞, In the next section, we summarise some standard theory that we will require in order to prove Theorem 1; the proof itself occupies Sect. 4.

General Theory
We will make use of two standard approximations to the process I(t), valid during different phases of an outbreak. First, consider the early stages of an outbreak initiated by a small number of infected individuals in a large susceptible population. So long as the number of infected individuals remains small relative to the total population size, the process I(t) may be approximated by a multitype branching process in which each group i individual lives for a time distributed as T i and during this time gives birth to new group j individuals ( j = 1, 2, . . . , k) according to a Poisson process of rate βλ i μ j f j . Denoting by G i j the number of type j offspring of a typical type i individual, the basic reproduction number R 0 is given by the dominant eigenvalue of the mean offspring matrix M with entries m i j = E G i j = βα i λ i μ j f j , leading to formula (2). Denoting by ω i the probability that the branching process initiated by a single group i individual produces only a finite number of offspring (corresponding to a minor outbreak of infection), then for R 0 ≤ 1 we have ω 1 = ω 2 = · · · = ω k = 1, while for R 0 > 1, ω = (ω 1 , ω 2 , . . . , ω k ) is the unique solution in [0, 1) k of see, for instance, Section 3 of Clancy and Pearce (2013).
We can alternatively treat the above multitype branching process as a single-type branching process in which the number of offspring produced by a typical type i individual is distributed as G i where G i follows a mixture distribution: G i = G i j with probability μ j f j for j = 1, 2, . . . , k. For a process initiated by a single infected individual which belongs to group i with probability μ i f i , the minor outbreak probability ω is given by ω = k i=1 μ i f i ω i , and for R 0 > 1, ω is the unique solution in [0, 1) of ω = ψ(ω), where ψ(θ) is defined by Eq. (4).
Given that a major outbreak occurs, then following the initial (branching process) phase, the growth of the epidemic towards an endemic equilibrium and the longterm quasi-equilibrium behaviour may be approximated by a deterministic process. Denoting byF i (u) = Pr(T i > u) the survival function of T i , then the scaled process I(t)/N is approximated by the deterministic process y(t) which evolves according to, for i = 1, 2, . . . , k, System (8) has a disease-free equilibrium point at y = 0, and for R 0 > 1 a unique nonzero equilibrium point y * with components (Nold 1980;Clancy 2015) where D(λ, μ) is given by Eq. (3).
For the remainder of this section, we focus upon the case that infectious periods T i are exponentially distributed. The process {I(t) : t ≥ 0} is now a continuous-time Markov chain on the finite state space S = k i=1 {0, 1, . . . , N i } with transition rates given in Table 1. The moment-generating function of T i (i = 1, 2, . . . , k) is ϕ i (θ ) = (1 − α i θ) −1 , and it follows that the minor outbreak probability ω is where D(μ, λ) is given by Eq.
(3) with the roles of λ, μ interchanged. The process I(t) is a density-dependent process in the sense of chapter 11 of Ethier and Kurtz 2005; that is, the transition rates are of the form for some functions W l : R k → R + , where L is the set of possible jumps from each state x ∈ S. The scaled process I(t)/N converges almost surely over finite time intervals (Ethier and Kurtz 2005, Theorem 11.2.1), as N → ∞, to the solution y(t) of the ordinary differential equation system That is, Eq. (8) may, in the case of exponentially distributed infectious periods, be written in the form For R 0 > 1, the disease-free equilibrium point y = 0 is unstable and the endemic equilibrium point y * given by (9) is globally asymptotically stable (Lajmanovich and Yorke 1976). Denote by Q the transition rate matrix of the process, with entries given in Table 1. The state space S is made up of an absorbing state at 0 and a single transient communicating class C, and we denote by Q C the restriction of Q to C. The infection will almost surely die out within finite time, and (Darroch and Seneta 1967) there exists a unique quasi-stationary distribution q with elements The distribution q may be found as the unique solution of where −(1/τ ) is the eigenvalue of Q C with largest real part, and the time to extinction from quasi-stationarity is exponentially distributed with mean τ . For a process with transition rates of the form (11), Eq. (13) may be written as with Writing y = x/N , then, following the methodology described in Assaf and Meerson (2017) and references therein, we adopt the WKB (Wentzel, Kramers, Brillouin) ansatz that for some functions V ( y), V 0 ( y) that do not depend upon N , and some K N that does not depend upon y. Without loss of generality, we set V ( y * ) = V 0 ( y * ) = 0. Then and similarly Substituting into Eq. (14), and assuming that τ is sufficiently large for the right-hand side to be neglected, we obtain Collecting together leading-order terms from Eq. (17), we have Collecting together second-order terms in Eq. (17) gives a first-order linear partial differential equation to be solved for V 0 ( y), once V ( y) has been found from Eq. (18). Equation (18) has been previously studied in numerous specific applications (e.g. Assaf and Meerson 2010;Clancy 2018;Dykman et al. 1994;Elgart and Kamenev 2004;Hindes and Schwartz 2016;Kamenev and Meerson 2008;Lindley et al. 2014). Equation (19) has been analysed in some detail in the k = 1dimensional case (Assaf and Meerson 2017), but does not seem to have been much studied in the context of multidimensional problems. With the above standard results and general theory in hand, we now proceed to the proof of Theorem 1.

Heterogeneous Susceptibilities and Exponentially Distributed Infectious Periods: The Main Body of the Quasi-Stationary Distribution
We will start by proving Theorem 1(i), corresponding to the case of heterogeneity in susceptibility but not infectiousness, so λ = 1, and focus initially on the case of exponentially distributed infectious periods. For the Markov chain model with transition rates given in Table 1, we aim to express the asymptotic form of the main body of the quasi-stationary distribution in the form (16). We thus need to find the constant K N and the functions V ( y), V 0 ( y). In this section and Sect. 4.2, we follow the approach described in Assaf and Meerson (2010) for k = 1-dimensional processes, adapted to our multidimensional situation. For the case λ = 1, the solution to Eq. (18) with boundary condition V ( y * ) = 0 is known, by a slight generalisation of formula (20) of Clancy (2018), to be Next, to evaluate K N , we consider the Taylor series expansion of formula (16) for y in the vicinity of y * . Differentiating Eq. (20), we obtain so that in particular, ∂ V ∂ y y= y * = 0. This is as one would expect, since for large N the quasi-stationary distribution has its mode at x = N y * . Recalling the boundary , Taylor series expansion of (16) yields Denoting by S the matrix with entries then (22) represents a multivariate Gaussian distribution with variance matrix N S −1 , normalisation of which requires that To evaluate the determinant of S, we differentiate Eq. (21) to obtain where δ i j is the Kronecker delta. In particular, recalling that the components of y * are given by formula (9) with λ = 1, then S has elements It follows that the last line above following from Eq. (3) with λ = 1. Substituting from (25) into (23), we obtain To find V 0 ( y), substitute the derivatives (21, 24) into Eq. (19) to obtain, after some simplification, the partial differential equation Equation (27) will be satisfied if we can find V 0 ( y) such that and hence we find that the solution of (27) subject to the boundary condition V 0 ( y * ) = 0 is Our analysis of the WKB approximation (16) is now complete, in that we have found explicit formulae for V ( y), K N and V 0 ( y) in Eqs. (20,26,28), respectively, giving an approximation to the main body of the quasi-stationary distribution. The WKB ansatz is thus justified, in that we have been able to exhibit a solution of the assumed form (16). However, V 0 ( y) → −∞ as y → 0, and the asymptotic form (16) is not valid in the tail of the distribution, where |x| = O(1). We address this in Sect. 4.2.

Heterogeneous Susceptibilities and Exponentially Distributed Infectious Periods: The Asymptotic Persistence Time Formula
For the SIS model with heterogeneous susceptibilities and exponentially distributed infectious periods, the mean extinction time τ from quasi-stationarity is given, from , where e i denotes the unit vector with ith element equal to 1. We are not yet in a position to evaluate the quasi-stationary probabilities q e i , since expression (16) is not valid in the range |x| = O(1). We will therefore derive an alternative asymptotic formula for q x that is valid in the range |x| = O(1), but is un-normalised. Normalisation may be achieved by matching our two approximations in the region where their domains of validity overlap. In order that we can carry out this matching, we first consider the Taylor series expansion of our existing approximation (16) about y = 0. Since V 0 ( y) diverges at y = 0, this is quite technical, as we shall see. To deal with this, we define is well behaved at y = 0, and (16) may be rewritten as Taylor series expansion in the range |x| = o( √ N ), together with substitution for K N from (26), gives A further difficulty arises here, since the derivatives ∂ V /∂ y i given by (21) are not well defined at y = 0. We therefore consider approach to y = 0 along a specific trajectory We now seek an approximation for q x valid for |x| = O(1). For the SIS model with heterogeneous susceptibilities, with the convention that q x = 0 for x / ∈ C, the exact balance equation (13) may be written as Assuming, as before, that τ is sufficiently large for the right-hand side of Eq. (30) to be neglected, and taking the linear approximation to the left-hand side, which is valid in the required range |x| = o( √ N ), we obtain the asymptotic balance equation for x ∈ C. Equation (31) corresponds to the (linear) branching process approximation discussed in Sect. 2, and we seek an un-normalised stationary solution. One component of the solution may be found by solving the detailed balance equations (Kelly 2011, chapter 1) corresponding to (31), and the other component by analogy with the solution for the case k = 1 given in Assaf and Meerson (2010). The solution thus obtained may be written in the form where the normalising factor τ remains to be found. For |x| large, noting that D(μ, 1) > 0 and applying Stirling's formula to the factorial terms, we obtain Along the previously considered trajectory with x i =xξ i , expression (32) reduces to and we can now match expressions (29) and (33) to obtain τ ∼ 1  D(1, μ)D(μ, 1) 2π Noting from formula (20) (1, μ)) − (D(1, μ)/β), and recalling that the minor outbreak probability in the case of exponentially distributed infectious periods is given by Eq. (10), we have now established Theorem 1(i) for the case of exponentially distributed infectious periods. Note that our assumption that τ is sufficiently large for the right-hand side of Eq. (14) to be neglected is thus justified in retrospect. Next, in Sect. 4.3, we extend the result to cover the case of heterogeneous susceptibilities with more general infectious period distributions.

Heterogeneous Susceptibilities and General Infectious Period Distributions
To allow for infectious period distributions more general than exponential, we will follow the approach developed in Ball et al. (2016) for the classic SIS model (the case k = 1). That is, we consider a restarted version of our model and apply an insensitivity result for stochastic networks from Zachary (2007). Consider our model of Sect. 2 in the case λ = 1. Following Hernández-Suárez and Castillo-Chavez (1999), we introduce a regeneration step as follows. Whenever the process reaches the state I = 0, it remains there for an exponentially distributed period of mean 1, after which a randomly chosen individual becomes infected, and the process then continues as before. The newly infected individual at the regeneration step is chosen to belong to group i with probability ρ i for some distribution ρ = (ρ 1 , ρ 2 , . . . , ρ k ).
In the terminology of Zachary (2007), each occasion when an individual in group i becomes infected corresponds to a class i arrival with associated workload distributed as T i /α i . When the process is in state x ∈ S, individuals arrive in class i (i = 1, 2, . . . , k) at rate While x i > 0, the workload of each class i individual reduces at rate The framework of Zachary (2007) allows for movement of individuals between classes, which we do not require here, so that in the notation of Zachary (2007) we take, for i = 1, 2, . . . , k, That is, on completion of its workload, an individual of class i leaves the system with probability φ i0 (x)/φ i (x) = 1 and moves to class j = 1, 2, . . . , k with probability In Theorem 2 of Zachary (2007), it is shown that if a distribution π(x) satisfies a certain partial balance condition (Equations (10) of Zachary 2007), together with an integrability condition (Equation (11) of Zachary 2007), then π(x) is the stationary distribution of the numbers of individuals present in each class, regardless of the distributions of T 1 , T 2 , . . . , T k . In order to find the stationary distribution π(x), we return to the case of exponentially distributed infectious periods. If the restarted process is reversible, then it is straightforward to find π(x) from the detailed balance equations (Kelly 2011, chapter 1). In order that the restarted process be reversible, consider Kolmogorov's criterion (Kelly 2011, chapter 1) applied to the sequence of states 0 → e i → e i +e j → e j → 0 for i = j. That is, we require the product of the transition rates in one direction around the loop to equal the corresponding product of transition rates in the opposite direction. This may be achieved by taking ρ i = μ i f i for i = 1, 2, . . . , k. We can then solve the detailed balance equations, thereby verifying that the restarted process with this choice of ρ i values is indeed reversible, with stationary distribution π(x) satisfying so that It is now straightforward to check that the balance equations (10) of Zachary (2007) are satisfied by π(x) given by (35). The integrability condition (11) of Zachary (2007) is trivially satisfied since our state space is finite. It therefore follows from Theorem 2 of Zachary (2007) that π(x) given by (35) is stationary for the numbers of infected individuals in our restarted process, whatever the distributions of the infectious periods T i . Now π(0) is the expected proportion of time spent in state I = 0 in the long term, which is equal to the expected proportion of time spent in state I = 0 during one regenerative cycle. Denoting by τ 0 the expected regeneration time, being the time from one entry into state I = 0 until the following entry into state I = 0, then τ 0 = 1/π(0). For i = 1, 2, . . . , k, denote by τ i the expected time for the process to hit state I = 0 after having been initiated with a single newly infected individual in group i. Notice that the values of τ 1 , τ 2 , . . . , τ k are the same for the restarted process as for the original process and that Recall from Sect. 3 that in the large population limit, the initial stage of an outbreak initiated by a single newly infected individual may be approximated by a multitype branching process. Recall that ω i (i = 1, 2, . . . , k) denotes the probability that this branching process, initiated by a single group i individual, produces only a finite number of progeny. Denote by ζ i the expected time to extinction of the process conditional upon a finite number of progeny being produced, and by σ i the expected time taken to attain quasi-stationarity, starting from a single group i individual, given that a major outbreak occurs. Then, in the large population limit, we have Now ζ i , σ i are negligible in comparison with τ , and so The above somewhat heuristic argument is made rigorous for the case k = 1 in Appendix B of Ball et al. (2014).
For the homogeneous population case (k = 1), Ball et al. (2016) approximated the sum (38) directly, and from this derived an asymptotic approximation to the mean persistence time τ . For k > 1, it does not seem straightforward to approximate (38) directly, so that we have proceeded more indirectly via the methods of Sects. 4.1-4.2.

Heterogeneous Infectiousness and Exponentially Distributed Infectious Periods
As Clancy (2018) points out, see also Wilkinson and Sharkey (2013), it follows from the network duality results of Harris (1976) and Holley and Liggett (1975) that provided infectious periods are exponentially distributed, the value of τ is unchanged if we interchange the roles of λ, μ. Theorem 1(ii) then follows immediately from Theorem 1(i). Figure 1 shows values of ln τ computed from the exact formula (13) compared with values computed from our asymptotic formulae (6, 7), for the case of exponentially distributed infectious periods with heterogeneous susceptibility or infectiousness. We see that the approximation is quite accurate for the range of parameter values considered, and (although the effect is a little hard to make out from Fig. 1) the accuracy of the approximation improves as population size N increases, as expected. It is also apparent that the approximation performs better as R 0 increases. Since our methods are valid under the condition that R 0 > 1, it is perhaps not surprising that the accuracy of the approximation decreases as R 0 approaches 1. We note that the approximation appears to consistently err on the side of slightly underestimating the mean persistence time.

Performance of the Mean Persistence Time Approximation
The effects of different infectious period distributions are illustrated in Fig. 2, where we consider both exponentially distributed infectious periods and infectious periods which are constant (non-random). For the case of exponentially distributed infectious periods, exact values of ln τ are again computed from formula (13). For the case of constant infectious periods, we compare our asymptotic formula (6)  indicate the dependence of the expected persistence time upon parameters, we write τ = τ α (λ, μ) and denote byτ α (1, μ) the approximation to τ α (1, μ) given by the righthand side of formula (6). For the case of heterogeneous susceptibility, we allowed a burn-in period of t 0 = 0.3τ α (1, μ) for the process to attain quasi-stationarity, after which the process was allowed to continue until either infection became extinct or time t max = 1.8τ α (1, μ) was reached. For the case of heterogeneous infectiousness, we took t 0 = 0.3τ α (1, λ) and t max = 1.8τ α (1, λ). We then computed the maximum likelihood estimate of ln τ as described in Section 6 of Clancy (2018). Histograms of the observed extinction times were compared visually with the probability density function of the exponential distribution with rate parameter estimated by maximum likelihood and seen to fit reasonably well, providing reassurance that the burn-in period t 0 was sufficient. Although the network duality results of Harris (1976) and Holley and Liggett (1975) apply only to exponentially distributed infectious periods, we observe that, as noted in Clancy (2018), Monte Carlo estimates of τ are essentially identical for the cases of heterogeneous infectiousness and (corresponding) heterogeneous susceptibility, even with constant infectious periods. We observe that the error in our approximating formula is much the same when infectious periods are constant as when infectious periods are exponentially distributed, with mean persistence time again being consistently somewhat underestimated by our asymptotic formula (6).

Superspreaders and the Effects of Different Forms of Heterogeneity
For many outbreaks of infection, it is thought that a small number of infected individuals, often referred to as 'superspreaders', are responsible for a disproportionate amount of pathogen transmission (e.g. Lau et al. 2017;Plowright et al. 2017;Yates et al. 2006). This may arise through a variety of mechanisms. Two possibilities are that some individuals infect at a much higher rate than others (represented in our model by heterogeneity in λ); or that some individuals remain infectious for much longer than others (represented by heterogeneity in α). Our results allow us to study the effect of either of these forms of superspreading, comparing alternative forms of heterogeneity to one another, and also comparing with a matched (having the same value for R 0 ) homogeneous population, as follows.
That is, for sufficiently large N , heterogeneity in levels of infectiousness leads to a shorter expected persistence time than corresponding heterogeneity in the lengths of infectious periods. It was shown in Theorem 2(i) of Clancy (2018) that when α = μ = 1, the leadingorder constant A in formula (1) is maximised, for a given value of R 0 , when λ = 1. This implies that for sufficiently large N , τ 1 (1, 1) ≥ τ 1 (η, 1), and since D 1 (1, η) = D η (1, 1) it also follows that for sufficiently large N , τ 1 (1, 1) ≥ τ η (1, 1). That is, for a sufficiently large population, heterogeneity in either levels of infectiousness or infectious period durations reduces the expected persistence time of infection in the population, compared to a corresponding homogeneous population.
These effects are illustrated in Fig. 3, in which we take 10% of the population to generate up to 50 times more potentially infectious contacts per infectious period than the remaining 90% of the population. The leftmost point of each curve, at η = (1, 1), corresponds to a homogeneous population. We see that as the degree of heterogeneity increases, the mean persistence time decreases, while the difference between the effects of the two types of heterogeneity increases. With maximal heterogeneity represented  (7) with α = (1, 1), λ = η; dashed line computed from formula (7) with α = η, λ = (1, 1) byη = (1/ f 1 , 0), the limiting ratio is given byτη(1, 1)/τ 1 (η, 1) = (β−1)/Dη(1, 1) = 1/ f 1 . It is worth noting that when heterogeneity is in infectiousness, the fact that we have held R 0 constant across Fig. 3 implies that the endemic equilibrium point y * also remains fixed. In fact, from Eqs. (3,9), it is immediate that when α = μ = 1, we have y * i = (1 − (1/R 0 )) f i for i = 1, 2, . . . , k, for any λ. Now Theorem 11 of Clancy and Pearce (2013) demonstrates, via a multivariate normal approximation, that greater heterogeneity in infectiousness corresponds to greater variability in the quasi-stationary distribution. Thus, in this case, the decrease in mean persistence time observed across Fig. 3 (solid curve) corresponds to an increase in variability of the quasi-stationary distribution leading to larger fluctuations (around the same equilibrium point) and hence faster extinction of infection. However, the solid line in Fig. 3 could equally be interpreted as corresponding to heterogeneous susceptibilities, with exponentially distributed infectious periods and α = λ = 1, μ = η in formula (6). When heterogeneity is in susceptibilities, it has been shown (Clancy and Pearce 2013, Theorem 10) that the overall endemic prevalence level y * = k i=1 y * i decreases with increasing heterogeneity. That is, when heterogeneity is in susceptibilities, the decrease in persistence time observed across Fig. 3 (solid curve) accompanies a corresponding decrease in overall endemic prevalence level. Furthermore, from formulae (3,9), it is apparent that the endemic equilibrium point y * for the case α = η, λ = μ = 1 is the same as for the case μ = η, α = λ = 1. Consequently, when heterogeneity is in infectious period durations, the decrease in persistence time observed across Fig. 3 (dashed curve) again accompanies a corresponding decrease in overall endemic prevalence level.

Approximating the Quasi-Stationary Distribution
En route to our analysis of mean time to extinction, obtained via approximation of the tail of the quasi-stationary distribution, we also obtained an approximation for the main body of the quasi-stationary distribution, at least in the case of heterogeneous susceptibilities (λ = 1) and exponentially distributed infectious periods, from the WKB formula (16) with K N , V ( y), V 0 ( y) given by Eqs. (26,20,28), respectively. This new approximation may be regarded as a refinement of the multivariate Gaussian approximation (22), previously derived via an approximating diffusion (Ornstein-Uhlenbeck) process in Section 6 of Clancy and Pearce (2013). Figure 4 shows contour plots of the exact quasi-stationary distribution q, obtained from Eq. (13), and the two approximations (16, 22) for a population of size N = 200 consisting of two equalsized groups, with group 1 individuals being three times as susceptible to infection as group 2 individuals. Our new approximation is clearly a great improvement upon the Gaussian approximation, particularly away from the endemic equilibrium point N y * , the contours of the WKB approximation being indistinguishable from those of the exact solution.
Note that unlike the mean persistence time τ , the quasi-stationary distribution q for the case of heterogeneous infectiousness cannot be obtained from the heterogeneous susceptibilities solution simply by interchanging the roles of λ, μ. In fact, even the location of the endemic equilibrium point N y * , corresponding to the mode of the quasi-stationary distribution, is not maintained under this transformation (Clancy and Pearce 2013). On the other hand, for the case of heterogeneous susceptibilities, the WKB approximation (16) approximates the stationary distribution of the restarted process studied in Sect. 4.3 and hence may be used to approximate the quasi-stationary distribution of numbers of infected individuals even when infectious periods are not exponentially distributed.

Network Interpretation
The model of Sect. 2 may be interpreted as describing an infection spreading through a directed network under the so-called annealed network approximation, as outlined in Clancy (2018). Briefly, we suppose that individuals are assigned an in-degree d in and out-degree d out according to some joint probability mass function We suppose that there are a finite number k of (d in , d out ) pairs having nonzero probability. We further assume that the network is uncorrelated, that is, with no correlations between degrees of neighbouring individuals. We define a bijective function j (d in , d out ) : Z 2 + → {1, 2, . . . , k}, so that any individual having degrees (d in , d out ) belongs to group j (d in , d out ). Denote by d in ( j), d out ( j) the in and out degrees, respectively, of a group j individual, and by κ the rate at which infection transmits along each link from an infectious individual to a susceptible individual. This network model may be approximated by our multigroup model of Sect. 2 by setting, for j = 1, 2, . . . , k, The undirected version of the above annealed network approximation (with λ = μ) has been studied in Schwartz (2016, 2017) in terms of the leadingorder constant A in expression (1), evaluated via numerical solution of the Hamilton-Jacobi equation (18). Our results (6, 7), as well as being much quicker and more straightforward to evaluate, are thus considerably more precise, although applicable only to rather restricted classes of directed networks. Specifically, the assumption that λ = 1 corresponds to every individual having the same out-degree, whereas μ = 1 corresponds to every individual having the same in-degree. Nevertheless, it is remarkable to be able to obtain such simple and precise results as formulae (6, 7) even for such a restricted class of networks.

Discussion and Further Work
The main contribution of this paper has been to provide simple explicit formulae (6, 7) for the mean persistence time, in the large population limit, of the heterogeneous population SIS infection model described in Sect. 2. The only infection model for which such a result has previously been available is the homogeneous population version of this same model, corresponding to the case k = 1. Explicit formulae are particularly valuable here since numerical solution of the Hamilton-Jacobi equation (18) generally requires the solution of a high-dimensional system of ordinary differential equations subject to boundary conditions at times t = −∞ and t = +∞, so that even to obtain the leading-order constant A in formula (1) can be very challenging, in the absence of an explicit formula. We have shown, in Sect. 5.2, how such explicit formulae may be used to study qualitative features such as the effects of different types of heterogeneity upon the persistence time of infection. Additionally, in the course of our analysis, we have obtained a new and accurate approximation to the quasi-stationary distribution of the process, which determines long-term behaviour prior to eventual extinction of infection; see Sect. 5.3. Our model may also be interpreted as an approximate model for infection spreading on a directed network, as described in Sect. 5.4. While our results thus represent substantial progress, many open questions remain.
Firstly, while Theorem 1(i) for the case of heterogeneous susceptibilities allows for any infectious period distribution, Theorem 1(ii) for heterogeneous infectiousness applies only with exponentially distributed infectious periods. From numerical work, including that presented in Fig. 2, it seems likely that a result corresponding to formula (6), generalising formula (7) to allow for any infectious period distribution, does indeed apply when heterogeneity is in infectiousness, but we have not been able to prove this because the network duality results of Harris (1976) and Holley and Liggett (1975) apply only provided that infectious periods are exponentially distributed.
Secondly, although our model of Sect. 2 allows for heterogeneity in susceptibilities and infectiousness simultaneously, Theorem 1 requires that only one of these forms of heterogeneity be present. It would be of great interest to find a corresponding formula allowing for both forms of heterogeneity simultaneously. In particular, this would allow our result to be applied to infections spreading on a much more general class of networks, including undirected networks, as studied via the annealed network approximation in Hindes and Schwartz (2016), requiring λ = μ. More generally, one could allow some quite general matrix of contact rates β i j , rather than restricting as we have to contact rates that factorise as β i j = βλ i μ j , in order to study phenomena such as assortative/disassortative mixing (Clancy and Pearce 2013). Unfortunately, there is no reason to expect that explicit formulae such as (6, 7) exist at all in such cases, even under the assumption of exponentially distributed infectious periods. In particular, we have only been able to find an explicit solution V ( y) to the Hamilton-Jacobi equation (18) in the case λ = 1 (formula (20); see also Clancy 2018). Consequently, to evaluate the leading-order constant A in formula (1) generally requires numerical solution of Eq. (18), as implemented for the case λ = μ in Hindes and Schwartz (2016). Further, without an explicit formula for V ( y), Eqs. (19, 23) cannot be used to find explicit expressions for V 0 ( y), K N . One could, in principle, evaluate K N and V 0 ( y) numerically, as was done in Black and McKane (2011) for a particular SIR infection model, and thereby obtain the WKB approximation for the main body of the quasi-stationary distribution, corresponding to our result illustrated in Fig. 4. However, our asymptotic formulae for mean persistence time depend upon approximating the tail of the quasi-stationary distribution, and it is not clear how the matching procedure of Sect. 4.2 could be carried through numerically, without explicit formulae to match.
Finally, it would be of great interest to obtain explicit formulae such as (6, 7) for infection models incorporating features such as disease-induced immunity (SIR models), latent periods and demographic processes of birth, death and migration. For such more sophisticated models, a common strategy has been to resort to approximating the quasi-stationary distribution by a multivariate Gaussian distribution (obtained as the stationary distribution of an approximating Ornstein-Uhlenbeck diffusion process) and then substitute this Gaussian approximation into the right-hand side of Eq. (15) to obtain an approximation to the mean persistence time τ . For instance, Nåsell (1999) made use of this approach in studying an infection model incorporating demographic processes and disease-induced immunity (with exponentially distributed infectious periods), following on from which Andersson and Britton (2000) extended the model to include latency, with latent periods and infectious periods each being allowed to follow Erlang distributions. Unfortunately, while this approach can give some rough qualitative indication of the effect of model parameters upon persistence times, the numerical approximation to τ thus obtained is known to be extremely inaccurate (Clancy and Tjia 2018). Indeed, as pointed out in Nåsell (1999), this approximation does not yield correct N -dependence in the large population limit; specifically, the approximation which appears as equation (2.15) of Nåsell (1999) takes the form τ ≈ c √ N exp(a N ) for some constants a, c, in contrast to the asymptotic form (1) obtained in van Herwaarden and Grasman (1995). In fact, even the value of lim N →∞ (ln τ )/N , given by the leading-order constant A in formula (1), is not correctly reproduced via this approach (Clancy and Tjia 2018;Doering et al. 2005). For these reasons, Andersson and Britton (2000) noted that their approximation 'should only serve as a qualitative guidance and not be relied on in detail'. In view of this failure of the Ornstein-Uhlenbeck approximating diffusion approach, the approach that we have employed, via WKB approximation, can be seen to be of great potential value, yielding as it does the correct asymptotic behaviour. However, it is much more difficult to obtain explicit formulae via this approach, and indeed there seems no reason to expect that explicit formulae such as (6, 7) will exist in general. Consequently, much of the work to date employing this approach for models in dimensions k > 1, by many authors, has consisted essentially of numerical evaluation of the leading-order constant A in formula (1). One exception is the SEIS model in a homogeneous population-that is, the classic SIS model of Weiss and Dishon (1971), extended to allow for a latent ('exposed') period. It was shown in Clancy and Tjia (2018) that for this SEIS model, with latent periods and infectious periods each allowed to follow Erlang distributions, the value of lim N →∞ (ln τ )/N is given by A = (1/R 0 ) − 1 + ln R 0 , exactly as for the classic SIS model (Andersson and Djehiche 1998). Thus, while the presence of a latent period may impact substantially upon the mean persistence time, this impact is restricted to the prefactor constant C in formula (1), at least for this particular model. In general, while explicit formulae for the constants A, C in (1) may be too much to hope for, in cases where the leading-order constant A can only be evaluated numerically a natural next step may be to seek general methods for evaluating the prefactor constant C numerically. Even here, as mentioned in the previous paragraph, the difficulties to be overcome remain substantial.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.