We now derive estimates of \(R_0\) based on specific models and compare these with the previously mentioned approximations, which we denote \(R_0^+ = e^{rT_G}\) and \(R_0^- = 1 + rT_G\). We emphasise that \(R_0\) is independent of timescale, whereas \(T_G\) has dimension time and r has dimension time\(^{-1}\). We also emphasise that we do not know r or \(T_G\); we assume that they have been estimated from data in some way, for example by estimating the doubling time of the incidence, D, and writing r = log(2)/D. We do not write \(\hat r\) or \(\hat T_G\) for these estimates, as this would result in far too many hats in one paper.
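As a minimal sketch of the two approximations, the snippet below converts a doubling time into a growth rate and evaluates \(R_0^+\) and \(R_0^-\). The function names and the numerical values are illustrative assumptions, not taken from the paper. The last lines repeat the calculation in hours to show that the result does not depend on the time unit.

```python
import math

def growth_rate_from_doubling_time(doubling_time):
    """Exponential growth rate r = log(2)/D, in units of 1/time."""
    return math.log(2) / doubling_time

def r0_plus(r, t_g):
    """Approximation R0^+ = exp(r * T_G)."""
    return math.exp(r * t_g)

def r0_minus(r, t_g):
    """Approximation R0^- = 1 + r * T_G."""
    return 1.0 + r * t_g

# Illustrative values only: doubling time and generation interval both 3 days.
D_days, T_G_days = 3.0, 3.0
r_per_day = growth_rate_from_doubling_time(D_days)
print(r0_minus(r_per_day, T_G_days), r0_plus(r_per_day, T_G_days))

# Same quantities expressed in hours: r shrinks by 24, T_G grows by 24.
r_per_hour = growth_rate_from_doubling_time(24 * D_days)
print(r0_minus(r_per_hour, 24 * T_G_days), r0_plus(r_per_hour, 24 * T_G_days))
```

Because only the product \(rT_G\) enters either formula, both print statements give the same pair of values.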
Assume for simplicity that the population is mixing homogeneously. The incidence of an emerging infection may be calculated from
$$i(t) = \delta (t) + {{S(t)} \over N}\int\limits_0^\infty {A(\tau )i(t - \tau )d\tau } $$
where δ(t), a unit spike, is the incidence of infection at time zero, and the kernel A(τ) is the expected infectivity of an infected individual as a function of τ, the time since exposure to infection [1,6,19]. The number in the population susceptible at time t is
$$S(t) = N - \int\limits_0^t {i(u){\rm{d}}u} $$
For an emerging infection we assume the entire population to be susceptible at time zero. If this is not the case, we take N to be the size of the susceptible population prior to infection. As a first step towards developing a model, we specify the general form of the kernel A(τ). We write \(A(\tau) = R_0 f(\tau)\), where \(R_0\) is the basic reproduction number that we wish to estimate, and f(τ) is the infectivity kernel, which is also the probability distribution of the generation interval.
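The renewal equation for i(t) can be explored numerically. The following is a rough sketch only: it uses an explicit rectangle-rule discretisation (one of several possible schemes), the function name `simulate_renewal` is hypothetical, and the kernel \(A(\tau) = R_0 \gamma e^{-\gamma\tau}\) with \(R_0 = 2\), γ = 1 and N = 10000 is chosen purely for illustration.

```python
import numpy as np

def simulate_renewal(kernel_A, N, t_max, dt):
    """Explicit discretisation of
    i(t) = delta(t) + S(t)/N * int_0^inf A(tau) i(t - tau) dtau."""
    n_steps = int(round(t_max / dt)) + 1
    t = np.arange(n_steps) * dt
    A = kernel_A(t)                      # A(tau) sampled on the time grid
    i = np.zeros(n_steps)
    S = np.zeros(n_steps)
    i[0] = 1.0 / dt                      # unit spike: one infection at time zero
    S[0] = N
    for k in range(1, n_steps):
        S[k] = N - i[:k].sum() * dt      # susceptibles remaining before step k
        # rectangle rule over tau = dt, 2*dt, ..., k*dt (pairs A[j] with i[k-j])
        force = np.dot(A[1:k + 1], i[k - 1::-1]) * dt
        i[k] = S[k] / N * force
    return t, i, S

# Illustrative parameters (assumptions, not values from the paper).
R0, gamma, N = 2.0, 1.0, 10_000
t, i, S = simulate_renewal(lambda tau: R0 * gamma * np.exp(-gamma * tau),
                           N=N, t_max=40.0, dt=0.05)
total = i.sum() * dt if (dt := t[1] - t[0]) else 0.0
print(total / N)                         # fraction of the population ever infected
```

The scheme is first order in the step size, so the printed fraction is only an approximation to the continuous-time solution; a smaller `dt` tightens it at quadratic cost in run time.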
For an emerging infection we have little information about f. We may have observations of the latent period (the time from exposure to infection to becoming infectious, \(T_E\)); the incubation period (the time from exposure to infection to the onset of symptoms), which we may in some cases assume to equal \(T_E\); or the infectious period \(T_I\). Given these we may wish to impose a particular form on the kernel, and use our limited knowledge to estimate parameter values for the distribution. These estimates may be revised as more information becomes available.
One quantity of interest is the mean generation interval of the epidemic, which is taken here to be the mean time from an individual’s exposure to infection to exposing others to infection (see [10], for an insightful exposition). We refer not to the time to the first occurrence of a secondary infection, but to the average time to all secondary infections. Alternatively, and equivalently, it can be defined as the expected duration of the primary infection at the time that a secondary infection occurs (see [22]). The mean generation interval may be determined from the formula
$${T_G} = \int\limits_0^\infty {tf(t){\rm{d}}t} $$
Given a probability distribution for the generation interval, f(τ), and an estimated initial rate of exponential increase for the epidemic, r, we approximate the initial stages of the epidemic by \(i(t) = e^{rt}\) with S(t) ≃ N. Equation (1) then leads to a model-consistent estimate of the basic reproduction number via the formula
$${R_0}\int\limits_0^\infty {{e^{ - rt}}f(t){\rm{d}}t} = 1$$
(see [6]). If f(t) were a delta function, then Eqs. (2, 3) would lead to the estimate \(R_0 = R_0^+\). For the SIR model, where \(f(t) = \gamma e^{-\gamma t}\), Eqs. (2, 3) lead to the estimate \(R_0 = R_0^-\). We now compute \(R_0\) for three distribution functions which may be used as kernels: those with a fixed, exponentially or trapezoidally distributed infectious period (see Fig. 1). We refer to these as \(R_0^{\rm fix}\), \(R_0^{\exp}\) and \(R_0^{\rm trap}\), respectively. We also compute \(R_0\) for the model with latent and infectious periods that each have gamma distributions, referred to as \(R_0^{(m,n)}\). We have \(R_0^{\exp} = R_0^{(1,1)}\) and \(R_0^{\rm fix} = \lim_{m,n \to \infty} R_0^{(m,n)}\).
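For kernels without a convenient closed form, the relation \(R_0 \int_0^\infty e^{-rt} f(t)\,{\rm d}t = 1\) can be evaluated numerically. The sketch below assumes a hypothetical helper `r0_from_kernel` that truncates the integral at `t_max` and applies the trapezoid rule. It is checked against the two limiting cases mentioned above, using illustrative values γ = 0.5 and r = 0.2, and a narrow normal density standing in for a delta function at \(T_G\).

```python
import numpy as np

def r0_from_kernel(f, r, t_max, n=200_001):
    """R0 = 1 / int_0^t_max exp(-r t) f(t) dt, by the composite trapezoid rule.
    t_max must be large enough that the neglected tail is negligible."""
    t = np.linspace(0.0, t_max, n)
    y = np.exp(-r * t) * f(t)
    dt = t[1] - t[0]
    integral = dt * (y.sum() - 0.5 * (y[0] + y[-1]))
    return 1.0 / integral

r = 0.2                                   # illustrative growth rate

# SIR kernel f(t) = gamma * exp(-gamma t): expect R0 = 1 + r/gamma = 1 + r*T_G.
gamma = 0.5
R0_sir = r0_from_kernel(lambda t: gamma * np.exp(-gamma * t), r, t_max=100.0)

# Narrow normal density around T_G, mimicking a delta function: expect exp(r*T_G).
T_G, sigma = 2.0, 0.01
narrow = lambda t: np.exp(-0.5 * ((t - T_G) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
R0_delta = r0_from_kernel(narrow, r, t_max=100.0)

print(R0_sir, 1 + r / gamma)
print(R0_delta, np.exp(r * T_G))
```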
2.1 Fixed infectious period
Given fixed latent and infectious periods, \(T_E\) and \(T_I\) respectively, and assuming f constant when non-zero, we have \(f(\tau) = 1/T_I\) for \(T_E < \tau < T_E + T_I\) and f(τ) = 0 otherwise. For this distribution \(T_G = T_E + T_I/2\) and
$${R_0} = R_0^{{\rm{fix}}} = {{r\left( {{T_G} - {T_E}} \right)} \over {\sinh r\left( {{T_G} - {T_E}} \right)}}{e^{r{T_G}}}$$
\(R_0^+\) is useful as an estimator for \(R_0^{\rm fix}\) when the latent period may be regarded as fixed and the infectious period is short relative to the timescale 1/r (\(rT_I\) is small). As sinh x > x whenever x > 0, and \({\lim _{x \to 0}}{{\sinh x} \over x} = 1\), we have \(R_0^{\rm fix} \le R_0^+\), and \({\lim _{{T_E} \to {T_G}}}R_0^{{\rm{fix}}} = R_0^ + \). Wallinga and Lipsitch [23] showed that \(R_0^+\) is an upper bound on estimates of \(R_0\) for any distribution f(t).
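The closed form for the fixed infectious period is straightforward to evaluate. In this sketch the function name `r0_fix` and the values r = 0.2, \(T_G = 3\), \(T_E = 1\) are illustrative assumptions. The result is compared with the direct integral \(rT_I e^{rT_E}/(1 - e^{-rT_I})\), obtained by integrating \(e^{-rt}/T_I\) over \((T_E, T_E + T_I)\), and with the upper bound \(R_0^+\).

```python
import math

def r0_fix(r, t_g, t_e):
    """R0^fix = x / sinh(x) * exp(r * T_G), with x = r * (T_G - T_E)."""
    x = r * (t_g - t_e)
    ratio = 1.0 if x == 0.0 else x / math.sinh(x)   # limit of x/sinh(x) at 0 is 1
    return ratio * math.exp(r * t_g)

r, T_G, T_E = 0.2, 3.0, 1.0          # illustrative values only
T_I = 2.0 * (T_G - T_E)              # from T_G = T_E + T_I / 2

closed_form = r0_fix(r, T_G, T_E)
direct = r * T_I * math.exp(r * T_E) / (1.0 - math.exp(-r * T_I))
upper_bound = math.exp(r * T_G)      # R0^+

print(closed_form, direct, upper_bound)
```

Setting `t_e` equal to `t_g` collapses the infectious period to zero and returns \(R_0^+\) exactly.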
2.2 Trapezoidal infection kernel
Consider the kernel
$$f(\tau ) = \left\{ {\matrix{ {{1 \over {{T_I}}}{{\tau - {\tau _a}} \over {{\tau _b} - {\tau _a}}}} & : & {\tau \in \left( {{\tau _a},{\tau _b}} \right)} \cr {{1 \over {{T_I}}}} & : & {\tau \in \left( {{\tau _b},{\tau _c}} \right)} \cr {{1 \over {{T_I}}}{{{\tau _d} - \tau } \over {{\tau _d} - {\tau _c}}}} & : & {\tau \in \left( {{\tau _c},{\tau _d}} \right)} \cr 0 & : & {{\rm{otherwise}}} \cr } } \right.$$
This is a suitable approximation to an infectivity function where nobody is infectious before \(\tau_a\) time units or after \(\tau_d\) time units post-exposure, maximum infectivity occurs between \(\tau_b\) and \(\tau_c\) time units after exposure, and contact rates are constant. The distribution is consistent with a mean latent period of \({T_E} = {{{\tau _a} + {\tau _b}} \over 2}\), a mean infectious period of \({T_I} = \left( {{\tau _d} + {\tau _c} - {\tau _b} - {\tau _a}} \right)/2\) and a mean generation interval of \({T_G} = {T_E} + {{{T_I}} \over 2} + {{{{\left( {{\tau _d} - {\tau _c}} \right)}^2} - {{\left( {{\tau _b} - {\tau _a}} \right)}^2}} \over {12\left( {{\tau _d} + {\tau _c} - {\tau _b} - {\tau _a}} \right)}}\). Hence, if the trapezium is symmetric, \({T_G} = {T_E} + {{{T_I}} \over 2}\), which is the same relationship as that for the fixed infectious period. The basic reproduction number solves \(R_0^{\rm trap}\,\bar f(r) = 1\), where \(\bar f(s)\) is the Laplace transform of f(t) (see Appendix 1).
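The trapezoidal kernel and its summary quantities can be checked numerically. The sketch below is not the Appendix 1 derivation: it evaluates the Laplace transform by quadrature rather than in closed form, and the helper names and the break points \((\tau_a, \tau_b, \tau_c, \tau_d) = (1, 2, 4, 7)\) with r = 0.2 are assumptions made for illustration. It requires \(\tau_a < \tau_b \le \tau_c < \tau_d\).

```python
import numpy as np

def trapezoid_kernel(tau, a, b, c, d):
    """Trapezoidal density f(tau) with break points a < b <= c < d."""
    t_i = (d + c - b - a) / 2.0                  # mean infectious period T_I
    tau = np.asarray(tau, dtype=float)
    rise = (tau - a) / (b - a)                   # linear ramp up on (a, b)
    fall = (d - tau) / (d - c)                   # linear ramp down on (c, d)
    shape = np.clip(np.minimum(np.minimum(rise, fall), 1.0), 0.0, None)
    return shape / t_i

def trapezoid_means(a, b, c, d):
    """T_E, T_I and T_G from the formulas quoted in the text."""
    t_e = (a + b) / 2.0
    t_i = (d + c - b - a) / 2.0
    t_g = t_e + t_i / 2.0 + ((d - c) ** 2 - (b - a) ** 2) / (12.0 * (d + c - b - a))
    return t_e, t_i, t_g

def integrate(y, t):
    """Composite trapezoid rule on a uniform grid."""
    dt = t[1] - t[0]
    return dt * (y.sum() - 0.5 * (y[0] + y[-1]))

a, b, c, d = 1.0, 2.0, 4.0, 7.0                  # illustrative break points
r = 0.2                                          # illustrative growth rate
T_E, T_I, T_G = trapezoid_means(a, b, c, d)

t = np.linspace(0.0, d, 7001)                    # kernel vanishes beyond d
f = trapezoid_kernel(t, a, b, c, d)
area = integrate(f, t)                           # should be close to 1
mean_numeric = integrate(t * f, t)               # should be close to T_G
R0_trap = 1.0 / integrate(np.exp(-r * t) * f, t) # solves R0 * fbar(r) = 1

print(T_E, T_I, T_G, mean_numeric, R0_trap)
```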
2.3 SEIR differential equation models
In an extended SEIR differential equation model the population of size N is made up of S susceptibles, E that have been exposed to infection but are not yet infectious, I infectious and R that have been infected and recovered. If the epidemic processes have a much faster timescale than the demographic processes, we obtain the equations
$$\eqalign{ & {{d{E_1}} \over {dt}} = \beta {S \over N}\sum\limits_{j = 1}^n {{I_j}} - m\nu {E_1} \cr & {{d{E_i}} \over {dt}} = m\nu {E_{i - 1}} - m\nu {E_i} \quad {\rm{for }}\ i = 2,...,m \cr & {{d{I_1}} \over {dt}} = m\nu {E_m} - n\gamma {I_1} \cr & {{d{I_j}} \over {dt}} = n\gamma {I_{j - 1}} - n\gamma {I_j} \quad {\rm{for }}\ j = 2,...,n \cr & {{dR} \over {dt}} = n\gamma {I_n} \cr} $$
The exposed and infectious classes have been subdivided as \(E = \sum_{i=1}^m E_i\) and \(I = \sum_{j=1}^n I_j\), respectively. The times spent in the exposed and infectious classes are gamma distributed with means \(T_E = 1/\nu\) and \(T_I = 1/\gamma\), respectively, and \(R_0 = \beta/\gamma\). The mean generation interval is \({T_G} = {T_E} + {{n + 1} \over {2n}}{T_I}\) (see Appendix 2). If the initial rate of exponential increase of the epidemic is r, then
$$R_0^{(m,n)} = {{{{2nr} \over {n + 1}}\left( {{T_G} - {T_E}} \right){{\left( {1 + {r \over m}{T_E}} \right)}^m}} \over {1 - {{\left( {1 + {{2r} \over {n + 1}}\left( {{T_G} - {T_E}} \right)} \right)}^{ - n}}}}$$
This result is derived in Appendix 2, where it is also shown that, given values of r, \(T_E\) and \(T_G\), \(R_0^{(m,n)}\) is an increasing function of both m and n.
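The expression for \(R_0^{(m,n)}\) can be evaluated directly. In the sketch below the function name `r0_mn` and the values r = 0.2, \(T_G = 3\), \(T_E = 1\) are illustrative assumptions. It spot-checks, for these values only, that the estimate increases with m and n and approaches the fixed-period value \({{r(T_G - T_E)} \over {\sinh r(T_G - T_E)}} e^{rT_G}\) as both grow.

```python
import math

def r0_mn(r, t_g, t_e, m, n):
    """R0^(m,n) for gamma-distributed latent (m stages) and infectious
    (n stages) periods, given r, T_G and T_E; requires r > 0, T_G > T_E."""
    t_i = 2.0 * n / (n + 1) * (t_g - t_e)        # from T_G = T_E + (n+1)/(2n) T_I
    latent = (1.0 + r * t_e / m) ** m
    infectious = r * t_i / (1.0 - (1.0 + r * t_i / n) ** (-n))
    return latent * infectious

r, T_G, T_E = 0.2, 3.0, 1.0                      # illustrative values only

values = [r0_mn(r, T_G, T_E, k, k) for k in (1, 2, 5, 50)]
x = r * (T_G - T_E)
fixed_limit = x / math.sinh(x) * math.exp(r * T_G)
large = r0_mn(r, T_G, T_E, 10**6, 10**6)

print(values, large, fixed_limit)
```

With m = n = 1 the function reduces to \((1 + rT_E)(1 + rT_I)\), the standard SEIR result.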
2.4 Exponentially distributed infectious period
The well-known SEIR differential equation model is the special case of Eqs. (6) with m = n = 1. For this model the times spent in the exposed and infectious classes are exponentially distributed with means \(T_E = 1/\nu\) and \(T_I = 1/\gamma\) respectively, and the appropriate kernel function in Eqs. (2, 3) is \(f(\tau ) = {{\gamma \nu } \over {\gamma - \nu }}\left( {{e^{ - \nu \tau }} - {e^{ - \gamma \tau }}} \right)\) (see [6]). The mean generation interval is \({T_G} = {T_E} + {T_I}\), and given r we have
$$R_0^{\exp } = R_0^{(1,1)} = 1 + r\left( {{1 \over \nu } + {1 \over \gamma }} \right) + {{{r^2}} \over {\nu \gamma }} = 1 + r{T_G} + {r^2}{T_E}\left( {{T_G} - {T_E}} \right)$$
The approximation \(R_0^ - = 1 + r{T_G} \le R_0^{\exp }\) is appropriate for the SIR model, for which ν → ∞, \(T_E \to 0\) and \(T_G \to T_I = 1/\gamma\). Hence \(R_0^-\) performs best as an estimate when either the latent period \(T_E\) or the infectious period \(T_I\) is small compared to \(T_G\), and performs worst when they are equal.
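The last statement follows directly from the expression for \(R_0^{\exp}\). As a short worked step (added here for illustration, not taken from the appendices), hold r and \(T_G\) fixed and regard the gap between the two estimates as a function of \(T_E\):

$$R_0^{\exp } - R_0^ - = {r^2}{T_E}\left( {{T_G} - {T_E}} \right) = {r^2}{T_E}{T_I}, \qquad 0 \le {T_E} \le {T_G}$$

This quadratic vanishes at \(T_E = 0\) and at \(T_E = T_G\) (that is, \(T_I = 0\)), and is maximised at \(T_E = T_I = T_G/2\), where it equals \(r^2 T_G^2/4\).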