## 1 Introduction

Mathematical descriptions of infectious disease outbreaks are fundamental to forecasting and simulating the dynamics of epidemics, as well as to understanding the mechanics of how transmission occurs. Epidemiological quantities of interest include incidence (the number of new infections at a given time point), cumulative incidence (the total number of infections up to a given time point) and prevalence (the number of infected individuals at a given time point). Taking a somewhat reductive perspective, two main frameworks co-exist for modelling an infectious disease outbreak: individual-based models and governing equations. Individual-based models are not only simple to understand in terms of their fundamental assumptions but have also proven extremely impactful (Ferguson et al. 2020). However, their mathematical tractability is limited, reliable estimates of expectations may require millions of simulations given the fat-tailed, multiplicative nature of epidemics, and inference can be challenging, with parameter inter-dependence making sensitivity analysis unreliable. In contrast, governing equations tend to have a stronger physical interpretation, are easier to perform inference over, and can be embedded in complex models easily (Flaxman et al. 2020).

The most widely known set of governing equations was presented in the seminal work of Kermack and McKendrick (1927), where they studied the number and distribution of infections of a transmissible disease as it progresses through a population over time. They constructed classes, called compartments, and modelled the propagation of infectious disease via interactions among these compartments. The result is the popular susceptible–infected–recovered (SIR) model, variants of which are widely used in epidemiology. Stochastic versions of SIR models, formulated either as stochastic differential equations or continuous-time Markov chains, are popular when modelling small populations or stochastic environments (Allen 2017). Deterministic and stochastic SIR models provide an intuitive mechanism for understanding disease transmission, and in the original derivation of Kermack and McKendrick (1927), they were noted to be similar to the Volterra equation (Polyanin and Manzhirov 1998). The Volterra equation (of the second kind), or more commonly, the renewal equation, is another popular governing equation (Cauchemez et al. 2016; Cori et al. 2013; Fraser 2007; Nouvellet et al. 2018). A large body of work in infectious disease epidemiology is based around the renewal equation and many modifications exist (Aldis and Roberts 2005; Champredon et al. 2018; Fraser et al. 2004; Roberts 2004). There is a connection between specific compartmental models and renewal equations (Champredon et al. 2018; Rizoiu et al. 2017) but this link has not been established in full generality. The vast majority of renewal frameworks model only incidence, and the explicit link between prevalence and incidence often requires the use of a latent process for incidence (Brookmeyer and Gail 1988).

Between individual-based and governing equation models are stochastic branching processes. Branching processes are applied in the modelling of epidemics by first constructing a stochastic process where infected individuals transmit disease according to simple rules, and then deriving a governing equation for the average behaviour. For example, Galton–Watson processes, where individuals infect other individuals at generations specified by a fixed time, provide a tractable and intuitive way of modelling the spread of an infectious disease (Bartoszynski 1967; Getz and Lloyd-Smith 2006). Bellman and Harris (1948) elegantly captured a more complex underlying infection mechanism by formulating an age-dependent branching process, where the age-dependence refers to individuals infecting other individuals after a random interval of time. Interestingly, the expectation of the Bellman–Harris process (Bellman and Harris 1948) follows a renewal equation, whereby their framework links the two worlds of individual-based modelling and governing equations (Fig. 1). The age-dependence assumption of Bellman and Harris allows, in particular, for the variable time between exposure to a pathogen and subsequent transmission to be modelled more realistically, and provides a framework encoding useful biological characteristics of the infecting pathogen, such as incubation periods and non-monotonic infectiousness. Crump and Mode (1968, 1969) and (independently) Jagers (1975) further extended the Bellman–Harris process to a general branching process where individuals not only can infect at random times, but can do so randomly over the duration of their infection (as opposed to the Bellman–Harris process where all subsequent infections generated by each infected individual happen at a single random time).

The original formulation by Bellman and Harris (1952), along with subsequent work by Harris (1963), the work of Crump and Mode (1968, 1969), Jagers (1975) as well as the perspective of Bharucha-Reid (1956), with specific application to epidemics, all focused on the simple case of a constant/basic reproduction number $$R_0$$. The form of this renewal equation when only considering $$R_0$$ is exactly what is commonly used in epidemic modelling where the incidence of infections $$\textrm{I}(t)$$ follows a renewal equation given by

\begin{aligned} \textrm{I}(t) = R_0\int _0^\infty \textrm{I}(t-u)g(u) \textrm{d}u, \end{aligned}

where $$g(\cdot )$$ is the probability density function (PDF) of the generation interval. Introducing a time-varying reproduction number R(t) within the Bellman–Harris process does not, in general, simply entail replacing $$R_0$$ with R(t) in the renewal equation. Such a replacement fails because a history of how many secondary infections have been created is needed. While justifications based on heuristic arguments such as Lotka’s (1907) (used in tracking the numbers of females in an age-structured population) or the one given by Fraser (2007) are valid within their respective contexts, these arguments lose their validity when considering a stochastic age-dependent branching process with a time-varying reproduction process (Berah et al. 2021; Kimmel 1983; Kimmel and Axelrod 2002). Indeed, we will demonstrate that these arguments are only valid for the specific case of incidence, not for prevalence or cumulative incidence. Furthermore, to our knowledge, no one has previously investigated a fully time-varying reproduction process under the more general Crump–Mode–Jagers framework. Besides the work by Kimmel (1983) on the time-varying Bellman–Harris process, a generation-dependent life-length distribution within the Bellman–Harris process has been studied by Fildes (1972) and a generation-dependent offspring distribution by Fearn (1976). Moreover, Edler (1978) and later Biggins and Götz (1987) have analysed a generation-dependent reproduction process in the Crump–Mode–Jagers setting.
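As an illustration of how the constant-$$R_0$$ renewal equation above is used in practice, the following sketch iterates its standard discretisation on a daily grid. The generation-interval PMF `g` and all parameter values are illustrative choices, not taken from the text.

```python
import math

def renewal_incidence(R0, g, I0, T):
    """Iterate the discretised renewal equation
    I[t] = R0 * sum_{u=1}^{t} I[t-u] * g[u],
    where g is a PMF over lags 1..len(g) and I[0] = I0 seeds the outbreak."""
    I = [float(I0)]
    for t in range(1, T + 1):
        I.append(R0 * sum(I[t - u] * g[u - 1]
                          for u in range(1, min(t, len(g)) + 1)))
    return I

# Illustrative generation interval: discretised, truncated at 5 days.
w = [math.exp(-0.5 * u) for u in range(1, 6)]
g = [x / sum(w) for x in w]
incidence = renewal_incidence(R0=2.0, g=g, I0=1.0, T=30)
```

With $$R_0 > 1$$ the iteration grows geometrically once the transient induced by the seeding has passed.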

In this paper, we introduce an outbreak model based on a time-varying version of the Crump–Mode–Jagers process, which we formulate using random characteristics (Kimmel and Axelrod 2002). Notably, Bellman–Harris, Galton–Watson and Markov branching processes are all special cases of this process. In our novel time-varying Crump–Mode–Jagers process, we specifically allow the statistical properties of infections, i.e., “offspring”, generated by each individual to vary over time. Building on this model, we lay down a general, stochastic process foundation for incidence, cumulative incidence and prevalence, and characterise the renewal-like integral equations they follow. We show that the equations for prevalence and incidence are consistent with the well-known back-calculation relationship (Brookmeyer and Gail 1988; Crump and Medley 2015) used in infectious disease epidemiology. We also show that the common renewal equation used ubiquitously for modelling incidence (Cori et al. 2013; Fraser 2007) is in fact, under specific conditions, equivalent to the integral equation for incidence in our framework. Additionally, we formulate a novel reproduction process where infections occur randomly over the duration of each individual’s infection according to an inhomogeneous Poisson process. The model thus eschews the common assumption that infections happen instantaneously at a random time, as in the Bellman–Harris process, but still admits analytically tractable integral equations for prevalence and incidence. Finally, we introduce an efficient discretisation algorithm for our newly derived integral equations and use this scheme to estimate rates of transmission from serological prevalence of SARS-CoV-2 in the UK and historical incidence data on Influenza, Measles, SARS and Smallpox.

## 2 Model and theoretical results

### 2.1 Time-varying Crump–Mode–Jagers outbreak model

Throughout the paper, we shall work with an infectious disease outbreak model based on the Crump–Mode–Jagers (CMJ) branching process, which we extend to allow transmission dynamics to vary over time. Our formulation is inspired by Vatutin and Zubkov (1985, 1993), who give an exposition of the corresponding time-invariant CMJ process using random characteristics. In our time-varying CMJ outbreak model, the initial infection occurs at non-random time $$\tau \ge 0$$. All subsequent infections are “progeny” of this index case, and we shall denote the set of these infected individuals by $${\mathcal {I}}^*$$. We denote the set of all infected individuals (i.e., including the index case) by $${\mathcal {I}}$$.

The index case corresponds to an individual endowed with a collection of random elements indexed by the infection time,

\begin{aligned} \{L^\tau , \, \chi ^\tau (\cdot ), \, N^\tau (\cdot )\}_{\tau \ge 0}, \end{aligned}

where, for any $$\tau \ge 0$$,

• $$L^\tau$$ is a (strictly) positive random variable representing the amount of time the individual remains infected,

• $$\chi ^\tau (\cdot )$$ is a stochastic process on $$[0,\infty )$$ which we shall call the random characteristic of the individual, and

• $$N^\tau (\cdot )$$ is a counting process on $$[0,\infty )$$ keeping track of the new infections, i.e., “offspring”, generated by the individual.

For completeness, we set $$N^\tau (u) {:}{=}0 {=}{:}\chi ^\tau (u)$$ for $$u<0$$. (We will explain the precise roles of $$\chi ^\tau (\cdot )$$ and $$N^\tau (\cdot )$$ shortly.) The objects $$L^\tau$$, $$\chi ^\tau (\cdot )$$, and $$N^\tau (\cdot )$$ are typically interdependent, as we shall see below, whilst the interdependence of $$(L^\tau ,\chi ^\tau (\cdot ), N^\tau (\cdot ))$$ and $$(L^{\tau '},\chi ^{\tau '}(\cdot ), N^{\tau '}(\cdot ))$$ for different $$\tau$$ and $$\tau '$$ is in fact immaterial and will be glossed over. We shall moreover endow each individual $$i \in {\mathcal {I}}^*$$ with $$\{L^\tau _i, \, \chi ^\tau _i(\cdot ), \, N^\tau _i(\cdot )\}_{\tau \ge 0}$$, which is an independent copy of $$\{L^\tau , \, \chi ^\tau (\cdot ), \, N^\tau (\cdot )\}_{\tau \ge 0}$$. (By an independent copy we mean a new random element which is equal in distribution to the original one and independent of it.)

Suppose now that individual $$i \in {\mathcal {I}}$$ is infected at (possibly random) time $$\tau _i \ge \tau$$. Intuitively, the infection time $$\tau _i$$ then “selects” $$L^{\tau _i}_i$$, $$\chi ^{\tau _i}_i(\cdot )$$, and $$N^{\tau _i}_i(\cdot )$$ from $$\{L^\tau _i, \, \chi ^\tau _i(\cdot ), \, N^\tau _i(\cdot )\}_{\tau \ge 0}$$, which the subsequent infection dynamics of this individual will “follow.” (Note that the collection $$\{L^\tau _i, \, \chi ^\tau _i(\cdot ), \, N^\tau _i(\cdot )\}_{\tau \ge 0}$$ is independent of the infection time $$\tau _i$$.) More concretely, $$N_i^{\tau _i}(u)$$ now stands for the number of new infections generated by the individual i up to time $$u+\tau _i$$.

### Example 1

(Bellman–Harris process). The Bellman–Harris branching model can informally be characterised, in the context of epidemics, by the principle that each individual generates a random number of new infections which occur simultaneously at a random time. Once these new infections have occurred, the individual immediately ceases to be infectious. Let $$\xi (\cdot )$$ be a stochastic process on $$[0,\infty )$$ with values in $$\mathbb {N}{:}{=}\{0,1,\ldots \}$$, independent of $$\{L^\tau \}_{\tau \ge 0}$$, and then define

\begin{aligned} N^\tau (u) {:}{=}{\left\{ \begin{array}{ll} 0, &{} u < L^\tau , \\ \xi (\tau +L^\tau ), &{} u \ge L^\tau . \end{array}\right. } \end{aligned}

This specification gives rise to the time-varying Bellman–Harris branching process studied by Kimmel (1983). When the distributions of $$L^\tau$$ and $$\xi (t)$$ do not depend on the time parameters $$\tau \ge 0$$ and $$t\ge 0$$, we recover the classical Bellman–Harris process (Bellman and Harris 1948).
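For concreteness, the classical (time-invariant) special case just described can be simulated directly. The sketch below assumes, purely for illustration, exponentially distributed infection durations and a Poisson offspring number with mean `R0`; none of these distributional choices are prescribed by the model.

```python
import math
import random

def poisson(lam, rng):
    # Knuth's multiplication method (dependency-free Poisson sampler).
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate_bellman_harris(R0, mean_L, t_max, rng):
    """One classical Bellman-Harris outbreak: an individual infected at
    time tau produces Poisson(R0) offspring, all simultaneously at time
    tau + L with L ~ Exponential(mean mean_L), and then immediately
    ceases to be infectious.  Returns the total infections by t_max."""
    pending = [0.0]   # infection times of individuals yet to transmit
    total = 1         # index case
    while pending:
        tau = pending.pop()
        t_event = tau + rng.expovariate(1.0 / mean_L)
        if t_event > t_max:
            continue
        offspring = poisson(R0, rng)
        total += offspring
        pending.extend([t_event] * offspring)
    return total

rng = random.Random(1)
sizes = [simulate_bellman_harris(R0=1.5, mean_L=2.0, t_max=10.0, rng=rng)
         for _ in range(200)]
mean_size = sum(sizes) / len(sizes)
```

Averaging `sizes` over many replicates approximates the cumulative incidence whose governing equation is derived in Sect. 2.2.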

### Example 2

(Inhomogeneous Poisson process model). In contrast to the Bellman–Harris process, we can consider a more realistic epidemiological model where each infected individual generates new infections randomly and one by one according to an inhomogeneous Poisson process until they cease to be infectious. This process, with a constant rate of transmission, has previously been studied in the context of the generation time (Svensson 2007). The infinitesimal rate at time t of new infections generated by an individual originally infected at time $$\tau \le t$$ is specified as

\begin{aligned} \rho (t)k(t-\tau ), \end{aligned}

where $$\rho (\cdot )$$ is a non-negative function that models population-level variation in transmissibility while $$k(\cdot )$$ is another non-negative function describing how individual-level infectiousness varies over time (Svensson 2007). For example, specifying k(t) to be low or zero for small t can be used to incorporate an incubation period in the model. Let $$\Phi (\cdot )$$ be a unit-rate, homogeneous Poisson process on $$[0,\infty )$$, independent of $$\{L^\tau \}_{\tau \ge 0}$$. Then we can define this model explicitly by

\begin{aligned} N^\tau (u) {:}{=}{\left\{ \begin{array}{ll} \Phi \big (\int _0^u \rho (v+\tau ) k(v) \textrm{d}v\big ), &{} u < L^\tau , \\ \Phi \big (\int _0^{L^\tau } \rho (v+\tau ) k(v) \textrm{d}v\big ), &{} u \ge L^\tau . \end{array}\right. } \end{aligned}

(If $$\rho (t) \equiv \rho$$ and $$k(t) \equiv k$$, both constant, then new infections follow a homogeneous Poisson process with rate $$\rho k$$ until the individual is no longer infected.)
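The time-change construction $$N^\tau (u) = \Phi \big (\int _0^u \rho (v+\tau ) k(v) \textrm{d}v\big )$$ also suggests a direct way to sample the offspring times of a single individual: generate the unit-rate arrivals of $$\Phi (\cdot )$$ and map them back through the cumulative intensity. The sketch below does this on a grid; the particular $$\rho (\cdot )$$ and $$k(\cdot )$$ are illustrative choices only.

```python
import math
import random

def sample_offspring_times(tau, L, rho, k, rng, dv=0.01):
    """Sample the times (relative to the infection time tau) of new
    infections generated by one individual with infectious period L,
    under N^tau(u) = Phi(int_0^u rho(v + tau) k(v) dv).  Unit-rate
    exponential gaps give the arrivals of Phi; each arrival is mapped
    back through a gridded cumulative intensity (left-endpoint rule)."""
    times = []
    cum = 0.0
    next_arrival = rng.expovariate(1.0)
    v = 0.0
    while v < L:
        cum += rho(v + tau) * k(v) * dv
        v += dv
        while next_arrival <= cum:
            times.append(v)          # grid point where the arrival lands
            next_arrival += rng.expovariate(1.0)
    return times

rng = random.Random(0)
rho = lambda t: 1.2            # illustrative population-level transmissibility
k = lambda v: math.exp(-v)     # illustrative decaying infectiousness profile
offspring_times = sample_offspring_times(tau=0.0, L=5.0, rho=rho, k=k, rng=rng)
```

In contrast to the Bellman–Harris construction, the resulting infection times are spread over the whole infectious period rather than concentrated at a single random instant.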

### Example 3

(Lévy and Cox process models). In the inhomogeneous Poisson process model of Example 2, tractability does not hinge on the assumption that $$\Phi (\cdot )$$ is a Poisson process. We could in fact replace it with a more general, integer-valued Lévy process (i.e., a process with independent and identically distributed increments), where jumps need not be of unit size (e.g., a compound Poisson process). Similarly, replacing the deterministic function $$\rho (\cdot )$$ with a stochastic process, as long as it is independent of $$\Phi (\cdot )$$, would be straightforward. In the Poisson case, this would turn $$N^\tau (\cdot )$$ into a doubly-stochastic Cox process. However, for simplicity and concreteness, we shall stick to the simpler setting of Example 2.

### Remark 4

(Epidemiological interpretation of $$L^\tau$$ and k). In the Bellman–Harris process of Example 1, $$L^\tau$$ is directly interpreted as the generation interval (Svensson 2007), that is, the time taken for the secondary cases to be infected by a primary case. In the Bellman–Harris process all infections happen at the same time—for example in Fig. 2 we have $$\xi (\tau + L^\tau )=3$$, after $$L^\tau$$ time units have elapsed since the index case was infected at time $$\tau$$. In contrast, in the inhomogeneous Poisson process model of Example 2 (and also the Lévy and Cox process models of Example 3), $$L^\tau$$ corresponds to how long an individual remains infected (the duration of infection). During this period, an individual can infect others at a rate that depends on $$\rho (\cdot )$$, which describes the calendar-time variation of overall infectiousness in the population, and on $$k(\cdot )$$, which in turn describes how the infectiousness of each infected individual varies over the course of their infection. The individual’s infectiousness profile $$k(\cdot )$$ can be set as constant, i.e., variation in the individual’s infectiousness is only due to calendar-time variation in overall infectiousness. If $$k(\cdot )$$ is specified to vary significantly, by contrast, then it is advisable to ensure that infections are most likely to end when infectiousness is low. Concretely, this means that $$k(\cdot )$$ should then be paired with $$L^\tau$$ such that the bulk of its distribution coincides with low values of $$k(\cdot )$$, as in the empirical application in Sect. 3.2 below.

The random characteristic is used merely as a book-keeping device, to keep track of an individual’s infection status in two ways—whether they have been infected in the past or, alternatively, whether they are infected at the moment. It is fundamental to obtaining a unified derivation of both cumulative incidence and prevalence in what follows.

### Example 5

(Cumulative incidence and prevalence). The random characteristic (in fact non-random!)

\begin{aligned} \chi ^\tau (u) {:}{=}{\left\{ \begin{array}{ll} 0, &{} u < 0,\\ 1, &{} u \ge 0, \end{array}\right. } \end{aligned}
(6)

determines whether the individual has been infected by time $$u+\tau$$ and is therefore used to derive cumulative incidence. The random characteristic

\begin{aligned} \chi ^\tau (u) {:}{=}{\left\{ \begin{array}{ll} 0, &{} u < 0,\\ 1, &{} u \in [0,L^\tau ),\\ 0, &{} u \ge L^\tau , \end{array}\right. } \end{aligned}
(7)

determines whether the individual remains infected at time $$u+\tau$$ and is used to derive prevalence.
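To make the book-keeping role of (6) and (7) concrete: given a realised outbreak, i.e., a list of infection times and durations, summing the two characteristics over individuals yields cumulative incidence and prevalence, respectively. A minimal sketch, with made-up data:

```python
def cumulative_incidence(t, outbreak):
    """Sum of characteristic (6): individuals infected by time t.
    `outbreak` is a list of (tau_i, L_i) pairs."""
    return sum(1 for tau, L in outbreak if t - tau >= 0)

def prevalence(t, outbreak):
    """Sum of characteristic (7): individuals still infected at time t,
    i.e., those with 0 <= t - tau_i < L_i."""
    return sum(1 for tau, L in outbreak if 0 <= t - tau < L)

# Made-up outbreak: (infection time tau_i, duration L_i) per individual.
outbreak = [(0.0, 3.0), (1.0, 2.0), (1.5, 4.0), (4.0, 1.0)]
```

At $$t=2$$, three individuals have been infected and all three are still infected; by $$t=10$$ cumulative incidence is 4 while prevalence has dropped to 0.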

### 2.2 Cumulative incidence and prevalence

We will now derive integral equations for cumulative incidence and prevalence under this model. To this end, we study the stochastic process

\begin{aligned} Z(t,\tau ) {:}{=}\sum _{i \in {\mathcal {I}}} \chi ^{\tau _i}_i(t-\tau _i),\quad t \ge \tau \ge 0, \end{aligned}

recalling that $$\tau$$ is the infection time of the index case. For the random characteristic (6), $$Z(t,\tau )$$ counts the number of infections that have occurred by time t, and for (7) the number of infected individuals at time t, respectively. Our goal is to derive an equation for the expectation of $$Z(t,\tau )$$, covering both cases.

Before embarking on the derivation of the equation governing $$\mathbb {E}[Z(t,\tau )]$$, we shall first introduce technical assumptions ensuring $$\mathbb {E}[Z(t,\tau )]<\infty$$, which a fortiori guarantees that $$Z(t,\tau )$$ is finite with probability one, a property known as regularity in the branching process literature (Sevastyanov 1967). Regarding $$N^\tau (\cdot )$$, we write

\begin{aligned} \Lambda ^{\tau }(u) {:}{=}\mathbb {E}[N^\tau (u)], \quad \tau \ge 0, \quad u \ge 0, \end{aligned}

and henceforth assume that there is a non-decreasing, right-continuous function $$\overline{\Lambda }: [0,\infty ) \rightarrow [0,\infty )$$ such that

\begin{aligned} \overline{\Lambda }(0) < 1 \quad \text {and} \quad \Lambda ^\tau (u) \le \overline{\Lambda }(u) \quad \text {for any } \tau \ge 0 \text { and } u \ge 0. \end{aligned}
(8)

(We will give sufficient conditions that imply this assumption in the context of Examples 1 and 2 below in Examples 13 and 16, respectively.) Moreover, we assume that the random characteristic $$\chi ^\tau (\cdot )$$ satisfies $$0 \le \chi ^\tau (u) \le 1$$ for any $$\tau \ge 0$$ and $$u \ge 0$$, which evidently accommodates both (6) and (7) from Example 5. Under these assumptions, a straightforward adaptation of the proof of Lemma 4.2 in Crump and Mode (1968) (cf. the proof of Theorem 2.1 in Edler 1978) yields $$\mathbb {E}[Z(t,\tau )]<\infty$$ for any $$t \ge \tau \ge 0$$.

Now, singling out the index case, we can write

\begin{aligned} Z(t,\tau ) = \chi ^\tau (t-\tau ) + \sum _{i \in {\mathcal {I}}^*} \chi ^{\tau _i}_i(t-\tau _i). \end{aligned}
(9)

The key insight in the analysis of (9) is to stratify the infected individuals in $${\mathcal {I}}^*$$ according to their (unique) “ancestor” among the individuals infected by the index case. More concretely, let $$i_1,i_2,\ldots \in {\mathcal {I}}^*$$ label the “offspring” of the index case in chronological order, i.e., so that $$\tau \le \tau _{i_1} \le \tau _{i_2} \le \cdots$$, and let $${\mathcal {I}}_{k} \subset {\mathcal {I}}^*$$ for each $$k = 1,2,\ldots$$ denote the set consisting of $$i_k$$ and its “progeny.” We can then write

\begin{aligned} \sum _{i \in {\mathcal {I}}^*} \chi ^{\tau _i}_i(t-\tau _i) = \sum _{k :\, \tau _{i_k} \le t} \underbrace{\sum _{i \in {\mathcal {I}}_k} \chi ^{\tau _i}_i(t-\tau _i)}_{{=}{:}Z_k(t)}. \end{aligned}

This is an analogue of the principle of first generation for the Bellman–Harris process (Harris 1963, Theorem 6.1); see also Kimmel (1983, p. 5).

Conditional on the random times $$\tau _{i_1},\tau _{i_2},\ldots$$, the random variables $$Z_1(t),Z_2(t),\ldots$$ can be shown to be mutually independent, with $$Z_k(t)$$ equal in distribution to $$\widetilde{Z}(t,\tau _{i_k})$$, where $$\big \{\widetilde{Z}(\cdot ,\tau )\big \}_{\tau \ge 0}$$ is an independent copy of $$\{Z(\cdot ,\tau )\}_{\tau \ge 0}$$ (independent of $$\tau _{i_1},\tau _{i_2},\ldots$$, in particular). Thus,

\begin{aligned} f(t,\tau ) {:}{=}\mathbb {E}[Z(t,\tau )] = \mathbb {E}[\chi ^\tau (t-\tau )] + \mathbb {E}\Bigg [ \sum _{k :\, \tau _{i_k} \le t} Z_k(t)\Bigg ], \end{aligned}

where, using the law of total expectation,

\begin{aligned} \mathbb {E}\Bigg [ \sum _{k :\, \tau _{i_k} \le t} Z_k(t)\Bigg ] &= \mathbb {E}\Bigg [\mathbb {E}\Bigg [ \sum _{k :\, \tau _{i_k} \le t} Z_k(t)\,\Bigg | \,\tau _{i_1},\tau _{i_2},\ldots \Bigg ]\Bigg ] \\ &= \mathbb {E}\Bigg [ \sum _{k :\, \tau _{i_k} \le t} \mathbb {E}[ Z_k(t)\,| \,\tau _{i_1},\tau _{i_2},\ldots ]\Bigg ] \\ &= \mathbb {E}\Bigg [ \sum _{k :\, \tau _{i_k} \le t} \mathbb {E}\big [\widetilde{Z}(t,\tau )\big ]_{\tau = \tau _{i_k}}\Bigg ]. \end{aligned}

Since $$\big \{\widetilde{Z}(t,\tau )\big \}_{\tau \ge 0}$$ is equal in distribution to $$\{Z(t,\tau )\}_{\tau \ge 0}$$, we get

\begin{aligned} \mathbb {E}\Bigg [ \sum _{k :\, \tau _{i_k} \le t} \mathbb {E}\big [\widetilde{Z}(t,\tau )\big ]_{\tau = \tau _{i_k}}\Bigg ] &= \mathbb {E}\Bigg [ \sum _{k :\, \tau _{i_k} \le t} f(t,\tau _{i_k})\Bigg ] \\ &= \mathbb {E}\Bigg [ \sum _{v \in (\tau ,t]} f(t,v) \Delta N^\tau (v-\tau ) \Bigg ] \\ &= \mathbb {E}\Bigg [ \sum _{u \in (0,t-\tau ]} f(t,u+ \tau ) \Delta N^\tau (u) \Bigg ] \\ &= \mathbb {E}\Bigg [ \int _{(0,t-\tau ]} f(t,u+\tau ) \textrm{d}N^\tau (u) \Bigg ] \\ &= \int _{(0,t-\tau ]} f(t,u+ \tau ) \mathbb {E}[\textrm{d}N^\tau (u)] \\ &= \int _{(0,t-\tau ]} f(t,u+\tau ) \textrm{d}\Lambda ^\tau (u), \end{aligned}

where $$\Delta N^\tau (u) {:}{=}N^\tau (u)-\lim _{v \rightarrow u-}N^\tau (v)$$ denotes the jump size of $$N^\tau (\cdot )$$ at time $$u \ge 0$$. Therefore, the function $$(t,\tau ) \mapsto f(t,\tau )$$ is governed by the integral equation

\begin{aligned} f(t,\tau ) = \mathbb {E}[\chi ^\tau (t-\tau )] + \int _{(0,t-\tau ]} f(t,u+\tau ) \textrm{d}\Lambda ^\tau (u), \quad t \ge \tau \ge 0. \end{aligned}
(10)

For the random characteristic (6), $$f(t,\tau )$$ is the cumulative incidence at time t, and we shall denote it by $$\textrm{CI}(t,\tau )$$. Since $$\mathbb {E}[\chi ^\tau (t-\tau )] = 1$$ in this case for $$t \ge \tau$$, the equation (10) transforms into

\begin{aligned} \textrm{CI}(t,\tau ) = 1 + \int _{(0,t-\tau ]} \textrm{CI}(t,u+\tau ) \textrm{d}\Lambda ^\tau (u). \end{aligned}
(11)

In the case (7), $$f(t,\tau )$$ is the prevalence at time t, which we henceforth denote by $$\textrm{Pr}(t,\tau )$$. In this case,

\begin{aligned} \mathbb {E}[\chi ^\tau (t-\tau )] = \mathbb {P}[t-\tau < L^\tau ] = 1 - G^\tau (t-\tau ),\quad t \ge \tau , \end{aligned}

where $$G^\tau (\cdot )$$ denotes the cumulative distribution function (CDF) of $$L^\tau$$. Writing $$\overline{G}^\tau (\cdot ) {:}{=}1 - G^\tau (\cdot )$$ for the survival function associated with $$G^\tau (\cdot )$$, we have then

\begin{aligned} \textrm{Pr}(t,\tau ) = \overline{G}^\tau (t-\tau ) + \int _{(0,t-\tau ]} \textrm{Pr}(t,u+\tau ) \textrm{d}\Lambda ^\tau (u). \end{aligned}
(12)

### Example 13

(Bellman–Harris process, cont’d). Let us consider the Bellman–Harris case of Example 1 and write $$R(t) {:}{=}\mathbb {E}[\xi (t)]$$ for the (time-varying) reproduction number at time $$t \ge 0$$. Let us also denote the indicator function of a set A by $$\textbf{1}_A$$. Using the law of total expectation and the independence between $$\xi (\cdot )$$ and $$\{ L^\tau \}_{\tau \ge 0}$$, we get then

\begin{aligned} \Lambda ^\tau (u) = \mathbb {E}[N^\tau (u)]&= \mathbb {E}[\xi (L^\tau +\tau )\textbf{1}_{\{ u \ge L^\tau \}}] \nonumber \\&= \mathbb {E}[ \mathbb {E}[\xi (L^\tau +\tau )\, | \, L^\tau ]\textbf{1}_{\{ u \ge L^\tau \}}] \nonumber \\&= \int _{(0,u]} R(u'+\tau ) \textrm{d}G^\tau (u'). \end{aligned}
(14)

We will henceforth assume that the maximal reproduction number $$\overline{R} {:}{=}\sup _{t \ge 0} R(t)$$ is finite and that the function $$\widehat{G}(u) {:}{=}\lim _{v \rightarrow u+} \sup _{\tau \ge 0} G^\tau (v)$$, $$u \ge 0$$, satisfies

\begin{aligned} \widehat{G}(0) < \overline{R}^{-1}. \end{aligned}
(15)

Intuitively, this condition ensures that the distribution of $$L^\tau$$ does not become too concentrated near zero over time. We can then define a non-decreasing, right-continuous function $$\overline{\Lambda }(u) {:}{=}\overline{R} \widehat{G}(u)$$, $$u \ge 0$$, which, in view of (14) and (15), satisfies the assumption (8). Hence, we deduce $$\mathbb {E}[Z(t,\tau )]<\infty$$, and the regularity of the branching process follows. Inserting (14) into (11) and (12), respectively, we obtain

\begin{aligned} \textrm{CI}(t,\tau )&= 1 + \int _{(0,t-\tau ]} \textrm{CI}(t,u+\tau ) R(u+\tau ) \textrm{d}G^\tau (u),\\ \textrm{Pr}(t,\tau )&= \overline{G}^\tau (t-\tau ) + \int _{(0,t-\tau ]} \textrm{Pr}(t,u+\tau ) R(u+\tau ) \textrm{d}G^\tau (u), \end{aligned}

which agree with Kimmel (1983, Theorem 5.1). When $$G^\tau (\cdot )$$ admits a PDF $$g^\tau (\cdot )$$, the most relevant case in practice, we can simplify the equations further to

\begin{aligned} \textrm{CI}(t,\tau )&= 1 + \int _0^{t-\tau } \textrm{CI}(t,u+\tau ) R(u+\tau ) g^\tau (u) \textrm{d}u,\\ \textrm{Pr}(t,\tau )&= \overline{G}^\tau (t-\tau ) + \int _0^{t-\tau } \textrm{Pr}(t,u+\tau ) R(u+\tau ) g^\tau (u) \textrm{d}u. \end{aligned}
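Since, for fixed t, these equations express $$\textrm{CI}(t,\tau )$$ and $$\textrm{Pr}(t,\tau )$$ in terms of their values at later infection times $$u+\tau$$, they can be solved numerically backwards in $$\tau$$, starting from $$\tau = t$$. The sketch below uses a simple right-endpoint rule; the constant R and Exponential(1) generation interval are illustrative choices for which the process reduces to a Markov branching process with known expected prevalence $$e^{(R-1)t}$$, giving a check on the scheme.

```python
import math

def solve_backwards(t, h, R, g, Gbar, quantity):
    """Solve f(t, tau) = boundary(t - tau)
         + int_0^{t - tau} f(t, u + tau) R(u + tau) g(u) du
    on the grid tau_j = j*h, j = 0..n with n*h = t, iterating backwards
    from tau = t (right-endpoint rule for the integral).
    quantity = 'CI' uses boundary 1 (cumulative incidence);
    quantity = 'Pr' uses boundary Gbar (prevalence).  Returns f(t, 0)."""
    n = int(round(t / h))
    Rv = [R(j * h) for j in range(n + 1)]    # R at grid points
    gv = [g(m * h) for m in range(n + 1)]    # generation-interval PDF
    boundary = (lambda s: 1.0) if quantity == "CI" else Gbar
    f = [0.0] * (n + 1)
    for j in range(n, -1, -1):
        acc = boundary(t - j * h)
        for m in range(1, n - j + 1):
            acc += f[j + m] * Rv[j + m] * gv[m] * h
        f[j] = acc
    return f[0]

# Illustrative: constant R(t) = 1.5, Exponential(1) generation interval.
R = lambda t: 1.5
g = lambda u: math.exp(-u)
Gbar = lambda u: math.exp(-u)
ci = solve_backwards(t=5.0, h=0.005, R=R, g=g, Gbar=Gbar, quantity="CI")
pr = solve_backwards(t=5.0, h=0.005, R=R, g=g, Gbar=Gbar, quantity="Pr")
```

The numerical prevalence recovers $$e^{(R-1)t} = e^{2.5}$$ up to discretisation error, and cumulative incidence dominates prevalence, as it must.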

### Example 16

(Inhomogeneous Poisson process model, cont’d). To analyse the Poisson process model of Example 2, we note first that for any $$u \ge 0$$,

\begin{aligned} \begin{aligned} \Lambda ^\tau (u) = \mathbb {E}[N^\tau (u)]&= \mathbb {E}\bigg [\Phi \bigg (\int _0^u \rho (v+\tau )k(v) \textrm{d}v\bigg )\textbf{1}_{\{ u < L^\tau \}}\bigg ] \\&\quad + \mathbb {E}\bigg [\Phi \bigg (\int _0^{L^\tau } \rho (v+\tau )k(v) \textrm{d}v\bigg )\textbf{1}_{\{ u \ge L^\tau \}}\bigg ], \end{aligned} \end{aligned}

whence

\begin{aligned} \Lambda ^\tau (u) \le \mathbb {E}\bigg [\Phi \bigg (\int _0^u \rho (v+\tau )k(v) \textrm{d}v\bigg )\bigg ] \le \int _0^u \sup _{\tau \ge 0}\rho (v+\tau )k(v) \textrm{d}v {=}{:}\overline{\Lambda }(u). \end{aligned}

If we assume, say, that $$\rho (\cdot )$$ and $$k(\cdot )$$ are bounded, then $$\overline{\Lambda }(\cdot )$$ is non-decreasing, continuous and satisfies (8), implying $$\mathbb {E}[Z(t,\tau )]<\infty$$ and the regularity of the branching process. (The CDF $$G^\tau (\cdot )$$ does not play a role in regularity for this model since, unlike in the Bellman–Harris process, the random variable $$L^\tau$$ cannot precipitate secondary infections.) To work out an expression for $$\Lambda ^\tau (u)$$, we shall further assume that $$G^\tau (\cdot )$$ admits a PDF $$g^\tau (\cdot )$$, as above. Invoking the independence between $$\Phi (\cdot )$$ and $$\{ L^\tau \}_{\tau \ge 0}$$ and the law of total expectation, we obtain

\begin{aligned} \mathbb {E}\bigg [\Phi \bigg (\int _0^u \rho (v+\tau )k(v) \textrm{d}v\bigg )\textbf{1}_{\{ u< L^\tau \}}\bigg ]&= \mathbb {E}\bigg [\Phi \bigg (\int _0^u \rho (v+\tau )k(v) \textrm{d}v\bigg )\bigg ] \mathbb {P}[u < L^\tau ] \nonumber \\&= \overline{G}^\tau (u) \int _0^u \rho (v+\tau )k(v) \textrm{d}v \end{aligned}
(17)

and

\begin{aligned} \begin{aligned} \mathbb {E}\bigg [\Phi \bigg (\int _0^{L^\tau } \rho (v+\tau )k(v) \textrm{d}v\bigg )\textbf{1}_{\{ u \ge L^\tau \}}\bigg ]&= \mathbb {E}\bigg [\mathbb {E}\bigg [\Phi \bigg (\int _0^{L^\tau } \rho (v+\tau )k(v) \textrm{d}v\bigg )\textbf{1}_{\{ u \ge L^\tau \}}\bigg | \, L^\tau \,\bigg ]\bigg ] \\&= \mathbb {E}\bigg [\mathbb {E}\bigg [\Phi \bigg (\int _0^{\ell } \rho (v+\tau )k(v) \textrm{d}v\bigg )\bigg ]_{\ell = L^\tau }\textbf{1}_{\{ u \ge L^\tau \}} \bigg ] \\&= \int _0^u \int _0^{u'} \rho (v+\tau )k(v) \textrm{d}v \, g^\tau (u') \textrm{d}u'. \end{aligned} \end{aligned}

Integrating (17) by parts,

\begin{aligned} \overline{G}^\tau (u) \int _0^u \rho (v+\tau )k(v) \textrm{d}v &= \int _0^u \int _0^{u'} \rho (v+\tau )k(v) \textrm{d}v \, \textrm{d}\overline{G}^\tau (u') \\ &\quad +\int _0^u \overline{G}^\tau (u') \rho (u'+\tau )k(u') \textrm{d}u' \\ &= -\int _0^u \int _0^{u'} \rho (v+\tau )k(v) \textrm{d}v \, g^\tau (u') \textrm{d}u' \\ &\quad + \int _0^u \rho (u'+\tau )k(u') \overline{G}^\tau (u') \textrm{d}u', \end{aligned}

since $$\frac{\textrm{d}\overline{G}^\tau (u)}{\textrm{d}u} = -\frac{\textrm{d}G^\tau (u)}{\textrm{d}u} = -g^\tau (u)$$. Therefore,

\begin{aligned} \Lambda ^\tau (u) = \int _0^u \rho (u'+\tau )k(u') \overline{G}^\tau (u') \textrm{d}u', \end{aligned}

whereby the equations for cumulative incidence and prevalence read as

\begin{aligned} \textrm{CI}(t,\tau )&= 1 + \int _0^{t-\tau } \textrm{CI}(t,u+\tau ) \rho (u+\tau ) k(u) \overline{G}^\tau (u) \textrm{d}u, \end{aligned}
(18)
\begin{aligned} \textrm{Pr}(t,\tau )&= \overline{G}^\tau (t-\tau ) + \int _0^{t-\tau } \textrm{Pr}(t,u+\tau ) \rho (u+\tau )k(u) \overline{G}^\tau (u) \textrm{d}u, \end{aligned}
(19)

respectively, in this case.
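The expression $$\Lambda ^\tau (u) = \int _0^u \rho (u'+\tau )k(u') \overline{G}^\tau (u') \textrm{d}u'$$ derived above can be checked by Monte Carlo: simulate $$N^\tau (u)$$ directly from its definition in Example 2 and compare the empirical mean with the closed form. The sketch below uses illustrative constant $$\rho$$ and $$k$$ and an Exponential(1) duration, for which $$\Lambda (u) = \rho k (1 - e^{-u})$$ independently of $$\tau$$.

```python
import math
import random

def poisson(lam, rng):
    # Knuth's multiplication method (dependency-free Poisson sampler).
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

# Illustrative constants: rho(t) = rho0, k(v) = k0, L ~ Exponential(1).
rho0, k0, u = 1.0, 1.5, 2.0
closed_form = rho0 * k0 * (1.0 - math.exp(-u))

# By the definition in Example 2, N^tau(u) = Phi(rho0 * k0 * min(u, L)),
# so each realisation is a Poisson draw with that (random) intensity.
rng = random.Random(42)
n_sims = 20000
total = 0
for _ in range(n_sims):
    L = rng.expovariate(1.0)
    total += poisson(rho0 * k0 * min(u, L), rng)
mc_estimate = total / n_sims
```

The agreement between `mc_estimate` and `closed_form` reflects exactly the integration-by-parts computation above.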

### Example 20

(Alternative Poisson process model). An alternative version of the inhomogeneous Poisson process model, suggested by an anonymous referee, can be formulated by assuming that an infected individual’s infectiousness evolves at a time scale determined by $$L^\tau$$ via

\begin{aligned} \tilde{k}\bigg (\frac{u}{L^\tau }\bigg ), \quad u \ge 0, \end{aligned}

where $$\tilde{k}(\cdot )$$ is a continuous function, (strictly) positive on the interval [0, 1] and zero elsewhere. Concretely, we then define

\begin{aligned} N^\tau (u)\, {:}{=}\, \Phi \bigg ( \int _0^u \rho (v+\tau ) \tilde{k}\bigg (\frac{v}{L^\tau }\bigg ) \textrm{d}v \bigg ), \quad u \ge 0, \end{aligned}

where $$\rho (\cdot )$$ and $$\Phi (\cdot )$$ are as in Examples 2 and 16. Note that, by the properties of $$\tilde{k}(\cdot )$$, we have $$N^\tau (u) = N^\tau (L^\tau )$$ for $$u \ge L^\tau$$. A straightforward computation shows that

\begin{aligned} \Lambda ^\tau (u) = \int _0^u \rho (u'+\tau ) \tilde{g}^\tau (u') \textrm{d}u', \end{aligned}

where

\begin{aligned} \tilde{g}^\tau (u)\, {:}{=}\, \int _0^\infty \tilde{k}\bigg (\frac{u}{v}\bigg ) g^\tau (v) \textrm{d}v, \quad u \ge 0. \end{aligned}

Since $$\tilde{k}(\cdot )$$ is necessarily bounded, the regularity of the resulting branching process is guaranteed. Moreover, we note that cumulative incidence and prevalence for the model can be analysed along the lines of the original Poisson process model simply by substituting $$k(u) \overline{G}^\tau (u)$$ with $$\tilde{g}^\tau (u)$$ in the integral equations (18) and (19).
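The kernel $$\tilde{g}^\tau (\cdot )$$ is straightforward to evaluate numerically. Purely as an illustration (relaxing the continuity requirement on $$\tilde{k}(\cdot )$$), taking $$\tilde{k}(\cdot )$$ to be the indicator of [0, 1] and $$g^\tau (\cdot )$$ the Exponential(1) density gives $$\tilde{g}^\tau (u) = \int _u^\infty e^{-v} \textrm{d}v = e^{-u}$$, which the quadrature sketch below reproduces.

```python
import math

def g_tilde(u, k_tilde, g, v_max=50.0, dv=0.001):
    """Evaluate g~(u) = int_0^inf k_tilde(u / v) g(v) dv by a left-endpoint
    rule.  Since k_tilde vanishes outside [0, 1], the integrand is zero for
    v < u, so integration starts there (and is truncated at v_max)."""
    total = 0.0
    v = max(u, dv)
    while v < v_max:
        total += k_tilde(u / v) * g(v) * dv
        v += dv
    return total

# Illustrative (and, unlike in the model above, non-continuous) choice:
# k_tilde = indicator of [0, 1], g = Exponential(1) density.
k_tilde = lambda x: 1.0 if 0.0 <= x <= 1.0 else 0.0
g = lambda v: math.exp(-v)
value = g_tilde(1.0, k_tilde, g)   # exact answer: exp(-1)
```

The same routine, with $$\tilde{g}^\tau$$ substituted for $$k \overline{G}^\tau$$, plugs directly into the integral equations (18) and (19).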

### Remark 21

(Probability generating functions). In the Bellman–Harris case of Examples 1 and 13, we can also analyse the distribution of $$Z(t,\tau )$$ via its generating function $$\phi (s;t,\tau ) \, {:}{=}\, \mathbb {E}[s^{Z(t,\tau )}]$$, $$s \in [-1,1]$$, letting us study, e.g., higher moments. Concretely, one can show that $$\phi (\,\cdot ;t,\tau )$$ satisfies the integral equations

\begin{aligned} \phi (s;t,\tau )&= s \,\overline{G}^\tau (t-\tau ) + s\int _{(0,t-\tau ]} \psi \big (\phi (s;t,u+\tau );u+\tau \big ) \textrm{d}G^\tau (u), \\ \phi (s;t,\tau )&= s \,\overline{G}^\tau (t-\tau ) + \int _{(0,t-\tau ]} \psi \big (\phi (s;t,u+\tau );u+\tau \big ) \textrm{d}G^\tau (u), \end{aligned}

for random characteristics (6) and (7), respectively, where $$\psi (s;t) \, {:}{=}\, \mathbb {E}[s^{\xi (t)}]$$, $$s \in [-1,1]$$. These are special cases of Kimmel (1983, Equations (3.3) and (3.4)), whilst self-contained re-derivations in the case where $$G^\tau (\cdot )$$ does not depend on the infection time $$\tau$$ are given in Bellman and Harris (1948).

### Remark 22

(Relationship between $$\rho (t)$$ and R(t)). The quantity R(t) in the context of the Bellman–Harris process (Examples 1 and 13) is more precisely the instantaneous reproduction number, i.e., the expected number of secondary cases arising from a primary case when those infections occur at time t. In the context of a real-time epidemic, R(t) is generally interpreted as the average number of secondary cases that would arise from a primary case infected at time t if conditions remained the same after time t (Fraser 2007). The quantity $$\rho (t)$$ in the Poisson process model (Examples 2 and 16), in contrast, is a time-varying transmission rate, i.e., a rate per unit time, and therefore exists on a different scale. An alternative way of analysing $$R(\cdot )$$ is to use the case reproduction number $${\mathcal {R}}(t)$$ (Gostic et al. 2020; Wallinga and Teunis 2004), which represents the average number of secondary cases arising from a primary case infected at time t, i.e., transmissibility after time t. It is similarly possible to analyse $$\rho (\cdot )$$ through the case reproduction number and therefore compare the rates of transmission in both models commensurably. Namely, $$\rho (\cdot )$$ and $$R(\cdot )$$ can each be transformed into $${\mathcal {R}}(\cdot )$$, and thereby compared on the same scale, via

\begin{aligned} {\mathcal {R}}_{\textrm{Pois}}(t)&=\int _t^\infty \rho (u)k(u-t)\overline{G}^t(u-t) \textrm{d}u,\\ {\mathcal {R}}_{\textrm{BH}}(t)&=\int _t^\infty R(u)g^t(u-t) \textrm{d}u. \end{aligned}
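These two transforms are simple quadratures over the infection age $$s = u - t$$. Below is a minimal numerical sketch, not part of the original analysis; all distributional choices are illustrative assumptions (an exponential duration with rate `mu`, for which $$g(s) = \mu e^{-\mu s}$$ and $$\overline{G}(s) = e^{-\mu s}$$, independently of $$\tau$$).

```python
import numpy as np

# Quadrature grid for the infection age s = u - t.
ds, s_max = 0.001, 40.0
s = np.arange(0.0, s_max, ds)

mu = 0.5                          # rate of an exponential duration L (illustrative)
g = mu * np.exp(-mu * s)          # generation-time PDF g(s), tau-independent
G_bar = np.exp(-mu * s)           # survival function of L

def case_R_BH(R, t):
    # R_BH(t) = int_t^inf R(u) g(u - t) du, by rectangle quadrature.
    return np.sum(R(t + s) * g) * ds

def case_R_Pois(rho, k, t):
    # R_Pois(t) = int_t^inf rho(u) k(u - t) G_bar(u - t) du.
    return np.sum(rho(t + s) * k(s) * G_bar) * ds
```

As sanity checks, a constant $$R(\cdot ) \equiv R_0$$ gives $${\mathcal {R}}_{\textrm{BH}}(t) = R_0$$ for every t, and taking $$\rho \equiv 1$$ with $$k(\cdot )$$ equal to the constant hazard $$\mu$$ of the exponential duration gives $${\mathcal {R}}_{\textrm{Pois}}(t) = 1$$, in line with Remark 23 below.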

### Remark 23

(When do the Bellman–Harris process and the Poisson process model agree?). The fundamental difference between the Bellman–Harris (Example 13) and the Poisson process model (Example 16) integral equations is that the Bellman–Harris integral equations are parameterised by $$g^\tau (\cdot )$$, and the Poisson process model equations by $$k(\cdot )\overline{G}^\tau (\cdot )$$. Within the Bellman–Harris process, the precise interpretation of $$g^\tau (\cdot )$$ is the PDF of the time between an individual becoming infected and occurrence of all subsequent infections generated by the individual, i.e., the generation time or interval (Svensson 2007). In contrast, the Poisson process model is parameterised by the product of the infectiousness profile $$k(\cdot )$$, which broadly corresponds to the generation time (Cori et al. 2012), and the survival function $$\overline{G}^\tau (\cdot )$$ of the duration of the infection. Generally, these two models differ in terms of their behaviour. That said, they give rise to equivalent cumulative incidence and prevalence provided

\begin{aligned} k(u) = \frac{g^\tau (u)}{\overline{G}^\tau (u)}, \quad u \ge 0, \quad \tau \ge 0. \end{aligned}
(24)

Hence, cumulative incidence and prevalence roughly agree between the two models when the infectiousness profile $$k(\cdot )$$ approximates the hazard function of $$L^\tau$$, i.e., the right-hand side of (24). Even in this case, however, the higher moments of the two models typically do not agree.
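Condition (24) is easy to inspect numerically. A small sketch, with two illustrative duration distributions that are our own choices: for an exponential duration the hazard, and hence the matching infectiousness profile, is constant, whilst for a Weibull duration with shape 2 it grows linearly in the infection age.

```python
import numpy as np

u = np.arange(0.0, 10.0, 0.01)

# Exponential duration: g(u) = mu exp(-mu u), G_bar(u) = exp(-mu u),
# so the hazard g / G_bar is the constant mu.
mu = 0.5
k_exp = (mu * np.exp(-mu * u)) / np.exp(-mu * u)

# Weibull duration with shape 2: g(u) = 2 a^2 u exp(-(a u)^2) and
# G_bar(u) = exp(-(a u)^2), so the hazard is 2 a^2 u, increasing in u.
a = 0.3
k_wei = (2 * a**2 * u * np.exp(-(a * u) ** 2)) / np.exp(-(a * u) ** 2)
```

A Poisson process model with $$k(\cdot )$$ equal to `k_exp` thus reproduces the cumulative incidence and prevalence of a Bellman–Harris process with an exponential generation time of rate `mu`, although, as noted above, the higher moments still differ.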

### 2.3 Incidence

Incidence is defined as the time-derivative of cumulative incidence. To derive an integral equation for incidence à la (11) and (12), we shall assume that the function $$\Lambda ^\tau (\cdot )$$ is continuously differentiable, that is,

\begin{aligned} \Lambda ^\tau (u) = \int _0^u \lambda ^\tau (u') \textrm{d}u', \end{aligned}
(25)

for some continuous function $$\lambda ^\tau (\cdot )$$. The function $$\lambda ^\tau (\cdot )$$ is necessarily non-negative since $$N^\tau (\cdot )$$ is a counting process. The assumption (25) rules out infections occurring in a discrete time grid. It is satisfied with $$\lambda ^\tau (u) = \rho (u+\tau )k(u) \overline{G}^\tau (u)$$ in Example 16 provided $$\rho (\cdot )$$ and $$k(\cdot )$$ are continuous, and with $$\lambda ^\tau (u) = R(u+\tau ) g^\tau (u)$$ in Example 13 provided $$R(\cdot )$$ is continuous and $$G^\tau (\cdot )$$ has a continuous PDF $$g^\tau (\cdot )$$.

Cumulative incidence, by definition, equals zero before the index case is infected at time $$\tau$$, whilst it then jumps to one. Hence, cumulative incidence, when understood as a function on the entire real line, satisfies

\begin{aligned} \textrm{CI}(t,\tau ) = \textbf{1}_{[0,\infty )}(t-\tau ) + \int _0^{t-\tau } \textrm{CI}(t,u+\tau ) \lambda ^\tau (u) \textrm{d}u, \quad t \in \mathbb {R}. \end{aligned}
(26)

(When $$t < \tau$$ we will interpret the integral, and similar integrals in what follows, as zero.) Incidence is then defined as the time-derivative

\begin{aligned} \textrm{I}(t,\tau ) := \frac{\partial }{\partial t} \textrm{CI}(t,\tau ). \end{aligned}

Before deriving incidence in full generality, let us however study the time-derivative of a related quantity

\begin{aligned} \widetilde{\textrm{CI}}(t,\tau ) := \textrm{CI}(t,\tau ) - \textbf{1}_{[0,\infty )}(t-\tau ), \quad t \in \mathbb {R}, \end{aligned}

which omits the initial jump and, in view of (26), satisfies

\begin{aligned} \widetilde{\textrm{CI}}(t,\tau ) = \int _0^{t-\tau } \lambda ^\tau (u) \textrm{d}u + \int _0^{t-\tau } \widetilde{\textrm{CI}}(t,u+\tau ) \lambda ^\tau (u) \textrm{d}u. \end{aligned}
(27)

Applying the Leibniz integral rule to the second integral on the right-hand side of (27) formally (see Remark 33 below), we obtain

\begin{aligned} \frac{\partial }{\partial t} \widetilde{\textrm{CI}}(t,\tau ) = \lambda ^\tau (t-\tau ) + \int _0^{t-\tau } \frac{\partial }{\partial t} \widetilde{\textrm{CI}}(t,u+\tau ) \lambda ^\tau (u) \textrm{d}u - \underbrace{\widetilde{\textrm{CI}}(t,t)}_{=0} \lambda ^\tau (t-\tau ).\nonumber \\ \end{aligned}
(28)

Since $$\textrm{I}(t,\tau ) = \frac{\partial }{\partial t} \textrm{CI}(t,\tau ) = \frac{\partial }{\partial t} \widetilde{\textrm{CI}}(t,\tau )$$ for $$t > \tau$$, we deduce that

\begin{aligned} \textrm{I}(t,\tau ) = \lambda ^\tau (t-\tau ) + \int _{(0,t-\tau )} \textrm{I}(t,u+\tau ) \lambda ^\tau (u) \textrm{d}u, \quad t > \tau . \end{aligned}
(29)

We have taken $$(0,t-\tau )$$ as the integration domain since $$\frac{\partial }{\partial t} \widetilde{\textrm{CI}}(t,u+\tau )$$ and $$\textrm{I}(t,u+\tau )$$ do not agree at $$u=t-\tau$$ for reasons that will become clear in the next paragraph.

Whilst (29) already describes incidence for $$t > \tau$$, for further developments in Sects. 2.4 and 2.5 it is essential that we have an equation characterising incidence for any $$t \ge \tau$$. Thus, we need to also deal with the case $$t = \tau$$ where the time-derivative $$\frac{\partial }{\partial t} \textrm{CI}(t,\tau )$$ cannot be defined in the classical sense due to the jump in cumulative incidence. To this end, it is helpful to note that the derivative of $$t \mapsto \textbf{1}_{[0,\infty )}(t-\tau )$$ may be understood as a Dirac delta function $$\delta (\,\cdot \, - \tau )$$ in a distributional sense. We recall that the Dirac delta function is a generalised function with the characteristic property $$\int _{\mathbb {R}} f(x) \delta (y-x) \textrm{d}x = f(y)$$. Now,

\begin{aligned} \textrm{I}(t,\tau ) = \delta (t-\tau ) + \frac{\partial }{\partial t} \widetilde{\textrm{CI}}(t,\tau ), \quad t \in \mathbb {R}. \end{aligned}

In particular, formally

\begin{aligned} \textrm{I}(\tau ,\tau ) = \delta (0) + \lambda ^\tau (0). \end{aligned}
(30)

Note that

\begin{aligned} \begin{aligned} \lambda ^\tau (t-\tau )&= \int _{\{t-\tau \}} \delta \big (t-(u+\tau )\big ) \lambda ^\tau (u) \textrm{d}u \\&= \int _{\{t-\tau \}} \bigg (\delta \big (t-(u+\tau )\big ) + \frac{\partial }{\partial t} \widetilde{\textrm{CI}}(t,u+\tau )\bigg ) \lambda ^\tau (u) \textrm{d}u \\&= \int _{\{t-\tau \}} \textrm{I}(t,u+\tau ) \lambda ^\tau (u) \textrm{d}u, \end{aligned} \end{aligned}

since $$\int _{\{t-\tau \}} \frac{\partial }{\partial t} \widetilde{\textrm{CI}}(t,u+\tau ) \lambda ^\tau (u) \textrm{d}u = 0$$. Thus, we can write the right-hand side of (29) as a single integral over $$(0,t-\tau ]$$, i.e.,

\begin{aligned} \textrm{I}(t,\tau ) = \int _{(0,t-\tau ]} \textrm{I}(t,u+\tau ) \lambda ^\tau (u) \textrm{d}u, \quad t > \tau . \end{aligned}

Consequently, we find that incidence is generally governed by the equation

\begin{aligned} \textrm{I}(t,\tau ) = \delta (t-\tau ) + \int _{[0,t-\tau ]} \textrm{I}(t,u+\tau ) \lambda ^\tau (u) \textrm{d}u, \quad t \ge \tau . \end{aligned}
(31)

### Remark 32

In (31), we have adjusted the integration domain from $$(0,t-\tau ]$$ to $$[0,t-\tau ]$$ to ensure that the equation agrees with (30) for $$t = \tau$$. (This adjustment is immaterial for $$t > \tau$$.) To see why this is the case, note that the right-hand side of (31) consists of the generalised function $$\delta (t-\tau )$$ and the integral $$\int _{[0,t-\tau ]} \textrm{I}(t,u+\tau ) \lambda ^\tau (u) \textrm{d}u$$, the latter of which is an ordinary function in t regardless of what the nature of $$\textrm{I}(t,u+\tau )$$ is. Once we integrate $$\textrm{I}(t,u+\tau ) \lambda ^\tau (u)$$ with respect to u over the singleton $$\{0\}$$ in the case $$t = \tau$$, integration will only pick up the generalised function part of $$\textrm{I}(\tau ,u+\tau )$$, i.e., $$\delta \big (\tau -(u+\tau )\big )= \delta (u)$$, producing the term $$\lambda ^\tau (0)$$, as intended.

### Remark 33

When applying the Leibniz integral rule in (28), we have not attempted to verify its assumptions. In fact, doing so would be difficult since we do not know a priori that cumulative incidence is differentiable with respect to time. Proving its differentiability from first principles using Lebesgue’s dominated convergence theorem would similarly be difficult since it is not straightforward to derive sufficiently sharp a priori estimates for the increments of $$t \mapsto \textrm{CI}(t,\tau )$$. However, there is an alternative way of proving (29) and (31) rigorously, which can be outlined as follows. We first treat these equations as an educated guess and show they have a (unique) solution. We can then show that the time-integral of the solution satisfies Eq. (11) for cumulative incidence. Finally, it is straightforward to prove uniqueness of solutions for (11) using Grönwall’s lemma (cf. Appendix B), which then lets us conclude that the time-derivative of cumulative incidence indeed follows (31). We will elaborate on the remaining mathematical details of this argument, including rigorous treatment of the Dirac delta function as a generalised function, in a separate paper.

### Example 34

(Incidence for the Bellman–Harris process and Poisson process model). Under the aforementioned assumptions, Eqs. (29) and (31) read as

\begin{aligned} \textrm{I}(t,\tau )&= R(t)g^\tau (t-\tau ) + \int _{(0,t-\tau )} \textrm{I}(t,u+\tau ) R(u+\tau ) g^\tau (u) \textrm{d}u,&t > \tau \ge 0,\nonumber \\ \textrm{I}(t,\tau )&= \delta (t-\tau )+\int _{[0,t-\tau ]} \textrm{I}(t,u+\tau ) R(u+\tau ) g^\tau (u) \textrm{d}u,&t \ge \tau \ge 0, \end{aligned}
(35)

respectively, for the Bellman–Harris process of Examples 1 and 13, and as

\begin{aligned} \textrm{I}(t,\tau )&= \rho (t)k(t-\tau )\overline{G}^\tau (t-\tau ) + \int _{(0,t-\tau )} \textrm{I}(t,u+\tau ) \rho (u+\tau )k(u) \overline{G}^\tau (u) \textrm{d}u,&\\&\qquad t > \tau \ge 0,\\ \textrm{I}(t,\tau )&= \delta (t-\tau )+\int _{[0,t-\tau ]} \textrm{I}(t,u+\tau ) \rho (u+\tau )k(u) \overline{G}^\tau (u) \textrm{d}u,&\\&\qquad t \ge \tau \ge 0, \end{aligned}

respectively, for the Poisson process model of Examples 2 and 16.

### 2.4 Consistency with back-calculation

Back-calculation is a standard method to recover prevalence from incidence by convolving the survival function of the generation interval with incidence (Brookmeyer and Gail 1988; Crump and Medley 2015). We will now show that the equations we have obtained for prevalence and incidence are consistent with the back-calculation relationship under the assumption (25) and the additional assumption that the CDF $$G^\tau (\cdot )$$ does not depend on the infection time $$\tau$$, in which case we write $$G(\cdot )$$ and $$\overline{G}(\cdot )$$ in lieu of $$G^\tau (\cdot )$$ and $$\overline{G}^\tau (\cdot )$$, respectively.

Let f and $$\tilde{f}$$ be two functions, one of which may be a generalised function, such that $$f(t) = 0$$ for any $$t <0$$ and $$\tilde{f}(t) = 0$$ for any $$t < \tau$$. Their convolution can be expressed as

\begin{aligned} (f * \tilde{f})(t) {:}{=} \int _{[\tau ,t]} f(t-s) \tilde{f}(s) \textrm{d}s \end{aligned}

for any $$t \ge \tau$$ and equals zero otherwise. We proceed now to show that the back-calculation relationship

\begin{aligned} \big (\overline{G} * \textrm{I}(\,\cdot ,\tau )\big )(t) = \textrm{Pr}(t,\tau ), \quad t \ge \tau \ge 0, \end{aligned}
(36)

holds, with the convention $$\overline{G}(t) := 0$$ for any $$t<0$$. Starting from (31), we have

\begin{aligned} \overline{G} * \textrm{I}(\,\cdot ,\tau ) = \overline{G} * \delta (\,\cdot \,-\tau ) + \overline{G} * \int _{[0,\,\cdot \,-\tau ]} \textrm{I}(\,\cdot ,u+\tau ) \lambda ^\tau (u) \textrm{d}u, \end{aligned}
(37)

where the first term on the right-hand side can be computed as

\begin{aligned} \big (\overline{G} * \delta (\,\cdot \,-\tau )\big )(t) = \int _{\mathbb {R}} \overline{G}(t-s) \delta (s-\tau ) \textrm{d}s = \overline{G}(t-\tau ), \quad t \ge \tau . \end{aligned}
(38)

The second term on the right-hand side of (37) vanishes for any argument $$t \le \tau$$, so it suffices to consider $$t > \tau$$. In this case, switching the order of integration, we obtain

\begin{aligned} \begin{aligned} \bigg (\overline{G} * \int _{[0,\,\cdot \,-\tau ]} \textrm{I}(\,\cdot ,u+\tau ) \lambda ^\tau (u) \textrm{d}u\bigg )(t)&= \int _{[\tau ,t]} \overline{G}(t-s) \int _{[0,s-\tau ]} \textrm{I}(s,u+\tau ) \lambda ^\tau (u) \textrm{d}u \, \textrm{d}s \\&= \int _{[0,t-\tau ]} \int _{[u+\tau ,t]} \overline{G}(t-s) \textrm{I}(s,u+\tau ) \textrm{d}s \, \lambda ^\tau (u) \textrm{d}u \\&= \int _0^{t-\tau } \big (\overline{G} * \textrm{I}(\, \cdot ,u+\tau )\big )(t)\lambda ^\tau (u) \textrm{d}u. \end{aligned}\nonumber \\ \end{aligned}
(39)

Combining (38) and (39), we have altogether

\begin{aligned} \big (\overline{G} * \textrm{I}(\,\cdot ,\tau )\big )(t) = \overline{G}(t-\tau ) + \int _0^{t-\tau } \big (\overline{G} * \textrm{I}(\, \cdot ,u+\tau )\big ) (t) \lambda ^\tau (u) \textrm{d}u, \quad t \ge \tau \ge 0. \end{aligned}

Matching this with Eq. (12) under the assumption (25), we deduce

\begin{aligned} \big |\big (\overline{G} * \textrm{I}(\,\cdot ,\tau )\big )(t) - \textrm{Pr}(t,\tau )\big | \le \int _0^{t-\tau } \big |\big (\overline{G} * \textrm{I}(\, \cdot ,u+\tau )\big ) (t) -\textrm{Pr}(t,u+\tau )\big | \lambda ^\tau (u) \textrm{d}u. \end{aligned}

By an application of Grönwall’s inequality, as outlined in Appendix B, we can finally conclude that the back-calculation relationship (36) holds.
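The consistency result (36) can also be verified numerically, using the discretisation scheme described in Sect. 3.1. The following self-contained sketch uses illustrative parameter choices of our own: a Bellman–Harris model with constant $$R = 1.5$$ and an exponential infection duration. We discretise the prevalence and incidence equations on a $$(t,\tau )$$ grid and compare $$\textrm{Pr}(t,0)$$ with $$\overline{G}(t) + (\overline{G} * \textrm{I})(t)$$, where the leading $$\overline{G}(t)$$ term accounts for the Dirac mass of $$\textrm{I}(\,\cdot ,0)$$ at zero.

```python
import numpy as np

def solve_vec(H, L, dt):
    # Column-by-column discretisation of the integral equations (cf. Sect. 3.1):
    # H[n, i] = h(n dt, (n - i) dt), L[m, j] = lam^{m dt}(j dt).
    N = H.shape[0] - 1
    F = np.zeros_like(H)
    F[:, 0] = H[:, 0]
    for i in range(1, N + 1):
        conv = (F[i:, i - 1::-1][:, :i] * L[: N + 1 - i, 1 : i + 1]).sum(axis=1)
        F[i:, i] = H[i:, i] + dt * conv
    return np.diag(F).copy()

# Illustrative Bellman-Harris model: constant R, exponential duration.
R, mu, dt, N = 1.5, 0.5, 0.02, 200
u = np.arange(N + 1) * dt
g = mu * np.exp(-mu * u)              # generation-time PDF g(u)
Gbar = np.exp(-mu * u)                # survival function G_bar(u)

lam_row = R * g                       # lam^tau(u) = R g(u), tau-independent
L = np.tile(lam_row, (N + 1, 1))
pr = solve_vec(np.tile(Gbar, (N + 1, 1)), L, dt)      # Pr(n dt, 0)
inc = solve_vec(np.tile(lam_row, (N + 1, 1)), L, dt)  # a.c. part of I(n dt, 0)

# Back-calculation (36): the Dirac mass of I(., 0) at 0 contributes G_bar(t),
# and the absolutely continuous part enters through the convolution.
bc = Gbar + dt * np.convolve(Gbar, inc)[: N + 1]
print(np.max(np.abs(bc - pr)))        # small, O(dt)
```

In this example the two discretised quantities agree up to the O(dt) quadrature error, illustrating (36).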

### Remark 40

(Modelling HIV incidence from prevalence). HIV is an example of a disease where, due to long incubation times, routine surveillance generally returns prevalence, not incidence (Eaton et al. 2014). However, what is of interest to policy makers is incidence, not prevalence (Brown et al. 2014). Common approaches all make use of the back-calculation relationship by convolving a latent function for incidence with the survival function $$\overline{G}(\cdot )$$ (Brown et al. 2014; Nishiura et al. 2004; Salomon et al. 1999). Our argument above shows that there is no need to model incidence as a latent function; rather, one can fit $$\rho (\cdot )$$ or $$R(\cdot )$$ directly to prevalence data using the integral equation for $$\textrm{Pr}(t,\tau )$$, after which $$\textrm{I}(t,\tau )$$ can be computed directly, without the need for a latent incidence function. This relationship can therefore facilitate simpler and more pragmatic modelling choices.

### 2.5 Consistency with a common renewal equation model for incidence

The key difference between our newly derived integral equations and the common renewal equation used (Cori et al. 2013; Fraser 2007; Nouvellet et al. 2018) is the inclusion of the parameter $$\tau$$ that initially arises due to the timing of the index case. The inclusion of $$\tau$$ means that we need to work with $$\textrm{I}(t,\tau )$$, not simply $$\textrm{I}(t)$$, and also gives rise to terms outside of the integral depending on whether one is interested in incidence, cumulative incidence or prevalence.

As in Sect. 2.4, we assume that $$G^\tau (\cdot )$$ does not depend on $$\tau$$, i.e., we work with $$G(\cdot )$$, and we moreover assume that $$G(\cdot )$$ has a PDF $$g(\cdot )$$. In this context, when extended to accommodate the general initial infection time $$\tau$$, the common renewal equation for incidence is tantamount to the integral equation

\begin{aligned} \textrm{I}_{\textrm{Ren}}(t,\tau ) = \delta (t-\tau ) + R(t) \int _{[0,t-\tau ]} \textrm{I}_{\textrm{Ren}}(t-u,\tau ) g(u) \textrm{d}u, \quad t \ge \tau . \end{aligned}
(41)

We show that the renewal equation (41) in fact agrees with the integral equation (31) in the Bellman–Harris case, that is,

\begin{aligned} \textrm{I}(t,\tau )=\textrm{I}_{\textrm{Ren}}(t,\tau ), \quad t \ge \tau \ge 0. \end{aligned}

While we focus on the Bellman–Harris process (Examples 1 and 13) here for notational simplicity, the argument also applies to the Poisson process model (Examples 2 and 16) simply by replacing $$R(\cdot )$$ with $$\rho (\cdot )$$ and $$g(\cdot )$$ with $$k(\cdot ) \overline{G}(\cdot )$$, respectively, throughout.

To this end, we first introduce

\begin{aligned} J(t,\tau ) {:}{=}R(t) \int _{[0,t-\tau ]} \textrm{I}_{\textrm{Ren}}(t-u,\tau ) g(u) \textrm{d}u, \quad t \ge \tau \ge 0, \end{aligned}
(42)

so that, given (41),

\begin{aligned} \textrm{I}_{\textrm{Ren}}(t,\tau ) = \delta (t-\tau ) + J(t,\tau ). \end{aligned}
(43)

Applying (43) to the integrand in (42) yields

\begin{aligned} J(t,\tau )&= R(t) \int _{[0,t-\tau ]} \big (\delta (t-u-\tau ) + J(t-u,\tau )\big ) g(u) \textrm{d}u \nonumber \\&= R(t) g(t-\tau ) + R(t) \int _0^{t-\tau } J(t-u,\tau ) g(u) \textrm{d}u. \end{aligned}
(44)

Similarly, we define

\begin{aligned} \widetilde{J}(t,\tau ) {:}{=}\int _{[0,t-\tau ]} \textrm{I}_{\textrm{Ren}}(t,u+\tau ) R(u+\tau ) g(u) \textrm{d}u, \quad t \ge \tau \ge 0. \end{aligned}
(45)

Subsequently, by applying (41) to the integrand in (45) and switching the order of integration, we obtain

\begin{aligned} \widetilde{J}(t,\tau )&= \int _{[0,t-\tau ]} \bigg (\delta (t-\tau -u) + R(t) \int _{[0,t-\tau -u]} \textrm{I}_{\textrm{Ren}}(t-s,u+\tau ) g(s) \textrm{d}s \bigg ) \nonumber \\&\qquad \times R(u+\tau ) g(u) \textrm{d}u \nonumber \\&= R(t)g(t-\tau ) + R(t) \int _{[0,t-\tau ]} \int _{[0,t-\tau -u]} \textrm{I}_{\textrm{Ren}}(t-s,u+\tau ) \nonumber \\&\qquad \times R(u+\tau ) g(u) g(s) \textrm{d}s \textrm{d}u \nonumber \\&= R(t)g(t-\tau ) + R(t) \int _{[0,t-\tau ]} \int _{[0,t-\tau -s]} \textrm{I}_{\textrm{Ren}}(t-s,u+\tau ) \nonumber \\&\qquad \times R(u+\tau ) g(u) \textrm{d}u \, g(s) \textrm{d}s \nonumber \\&= R(t)g(t-\tau ) + R(t) \int _0^{t-\tau } \widetilde{J}(t-s,\tau ) g(s) \textrm{d}s. \end{aligned}
(46)

The integral equations (44) and (46) then imply the bound

\begin{aligned} \big |J(t,\tau )-\widetilde{J}(t,\tau )\big | \le R(t) \int _0^{t-\tau } \big |J(t-u,\tau )-\widetilde{J}(t-u,\tau )\big | g(u) \textrm{d}u, \end{aligned}

and applying Grönwall’s inequality as outlined in Appendix B we deduce that

\begin{aligned} J(t,\tau ) = \widetilde{J}(t,\tau ), \quad t \ge \tau \ge 0. \end{aligned}
(47)

Finally, by (43) and (47),

\begin{aligned} \textrm{I}_{\textrm{Ren}}(t,\tau ) = \delta (t-\tau ) + \widetilde{J}(t,\tau ) = \delta (t-\tau ) + \int _{[0,t-\tau ]} \textrm{I}_{\textrm{Ren}}(t,u+\tau ) R(u+\tau ) g(u) \textrm{d}u. \end{aligned}

Given (31) in the Bellman–Harris case, we then have the bound

\begin{aligned} |\textrm{I}(t,\tau ) - \textrm{I}_{\textrm{Ren}}(t,\tau )| \le \int _0^{t-\tau } |\textrm{I}(t,u+\tau ) - \textrm{I}_{\textrm{Ren}}(t,u+\tau )| R(u+\tau ) g(u) \textrm{d}u \end{aligned}

for any $$t \ge \tau \ge 0$$. Applying the result in Appendix B again we conclude that, indeed, $$\textrm{I}(t,\tau ) = \textrm{I}_{\textrm{Ren}}(t,\tau )$$ holds for any $$t \ge \tau \ge 0$$.
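This equivalence can be checked numerically for a time-varying $$R(\cdot )$$. The following self-contained sketch uses illustrative parameter choices of our own: we discretise our incidence equation (35) over the $$(t,\tau )$$ grid as in Sect. 3.1, and compare its diagonal with the standard one-dimensional renewal recursion obtained from (44) with $$\tau = 0$$; both are O(dt)-accurate discretisations of the same function.

```python
import numpy as np

def solve_vec(H, L, dt):
    # Column-by-column discretisation over the (t, tau) grid (cf. Sect. 3.1):
    # H[n, i] = h(n dt, (n - i) dt), L[m, j] = lam^{m dt}(j dt).
    N = H.shape[0] - 1
    F = np.zeros_like(H)
    F[:, 0] = H[:, 0]
    for i in range(1, N + 1):
        conv = (F[i:, i - 1::-1][:, :i] * L[: N + 1 - i, 1 : i + 1]).sum(axis=1)
        F[i:, i] = H[i:, i] + dt * conv
    return np.diag(F).copy()

mu, dt, N = 0.5, 0.02, 200
t = np.arange(2 * N + 1) * dt          # R must be known up to 2 N dt
R = 1.5 + 0.5 * np.sin(t)              # an illustrative time-varying R(.)
g = mu * np.exp(-mu * t[: N + 1])      # exponential generation-time PDF

# Our equation (35), a.c. part: lam^{m dt}(j dt) = R((m + j) dt) g(j dt)
# and h(n dt, (n - i) dt) = R(n dt) g(i dt).
idx = np.arange(N + 1)
L = R[idx[:, None] + idx[None, :]] * g[None, :]
H = R[: N + 1][:, None] * g[None, :]
inc = solve_vec(H, L, dt)

# Common renewal equation (44) with tau = 0, discretised in one dimension.
J = np.zeros(N + 1)
J[0] = R[0] * g[0]
for n in range(1, N + 1):
    J[n] = R[n] * g[n] + R[n] * dt * np.dot(J[n - 1::-1], g[1 : n + 1])

print(np.max(np.abs(inc - J)))         # small, O(dt)
```

The two discretised solutions agree up to the quadrature error, as the equivalence result predicts.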

### Remark 48

(Equivalence does not extend beyond incidence). In the case of prevalence or cumulative incidence, the equivalence between the common renewal equation and our newly derived integral equations breaks down. This is easy to see by examining the derivations leading to (44) and (46). If we considered cumulative incidence, for example, the constant 1 would appear instead of a Dirac delta function, and the leading terms in (44) and (46) would no longer agree, rendering the rest of the argument impossible to carry through. This illustrates why the common renewal equation is a special case of our integral equations only when considering incidence with the index case infected at time $$\tau = 0$$. Simpler renewal equations for prevalence or cumulative incidence that avoid the dependence on the parameter $$\tau$$ are not possible.

## 3 Numerical implementation and empirical application

### 3.1 Discretisation of integral equations

The integral equations for cumulative incidence, prevalence and incidence under the assumption (25) are all special cases of a generic equation

\begin{aligned} f(t,\tau ) = h(t,\tau ) + \int _{0}^{t-\tau } f(t,u+\tau ) \lambda ^{\tau }(u) \textrm{d}u, \quad t \ge \tau \ge 0, \end{aligned}
(49)

with the choices

\begin{aligned} h(t,\tau ) \,{:}{=}\,{\left\{ \begin{array}{ll} 1, &{} f = \textrm{CI}, \\ \overline{G}^\tau (t-\tau ), &{} f = \textrm{Pr}, \\ \lambda ^{\tau }(t-\tau ), &{} f = \textrm{I} \ \ (\text {for } t > \tau ). \end{array}\right. } \end{aligned}
(50)

Recall that for the Bellman–Harris process of Examples 1 and 13, we substitute $$\lambda ^\tau (u) \textrm{d}u {:}{=}R(u+\tau ) g^\tau (u) \textrm{d}u$$ and for the Poisson process model of Examples 2 and 16, $$\lambda ^\tau (u) \textrm{d}u {:}{=}\rho (u+\tau )k(u) \overline{G}^\tau (u) \textrm{d}u$$.

A key hurdle in solving Eq. (49) is that the right-hand side involves f(t, u) for $$\tau \le u\le t$$, and not $$f(u,\tau )$$ for $$\tau \le u \le t$$. This means that in order to solve f(t, 0) for $$t \ge 0$$, say, we actually need to solve $$f(t,\tau )$$ for every pair $$(t,\tau )$$ such that $$t \ge \tau \ge 0$$. This is in fact why we left the initial infection time $$\tau$$ as a free parameter. (Alternatively, we could view (49) as a system of coupled integral equations, indexed by $$\tau$$, that need to be solved simultaneously.)

Solving Eq. (49) numerically is greatly facilitated if we introduce the auxiliary quantity $$f_c(t) {:}{=}f(c,c-t)$$ for any $$c \ge t \ge 0$$. From (49) we can deduce that, for fixed $$c \ge 0$$, the single-argument function $$f_c(\cdot )$$ is governed by the renewal-like integral equation

\begin{aligned} f_c(t) = h(c,c-t) + \int _0^t f_c(t-u) \lambda ^{c-t}(u) \textrm{d}u, \quad c \ge t \ge 0. \end{aligned}

We then recover f(t, 0) for $$t \ge 0$$ via $$f(t,0) = f_t(t)$$. In practice, we are interested in solving f(t, 0) discretely for $$t = 0,\Delta ,\ldots ,N\Delta$$ for some $$N \in \mathbb {N}$$ and $$\Delta >0$$. To this end, we approximate $$f_{n\Delta }(\cdot )$$ recursively by

\begin{aligned} \widehat{f}_{n\Delta }(i\Delta )\, {:}{=}\, {\left\{ \begin{array}{ll} f_{n\Delta }(0) = f(n\Delta ,n\Delta ) = h(n\Delta ,n\Delta ), &{} i = 0, \\ {\displaystyle h\big (n\Delta ,(n-i)\Delta \big ) + \sum _{j=1}^i \widehat{f}_{n\Delta }\big ((i-j)\Delta \big ) \lambda ^{(n-i)\Delta } (j\Delta ) \Delta ,} &{} i = 1,\ldots ,n, \end{array}\right. } \end{aligned}

for any $$n = 0,\ldots ,N$$. For clarity, we present the entire procedure in pseudo-code in Algorithm 1.
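The recursion translates directly into code. The following is a minimal Python sketch in which the function names and the test kernel are our own choices (the repository linked below contains the documented reference implementations); the sanity check uses a constant kernel $$\lambda \equiv c$$ with $$h \equiv 1$$, for which cumulative incidence has the closed form $$\textrm{CI}(t,0) = e^{ct}$$.

```python
import numpy as np

def solve(h, lam, N, dt):
    """Discretise f(t, 0) for t = 0, dt, ..., N dt via the recursion for
    f_hat_{n dt}(i dt); h(t, tau) and lam(tau, u) are ordinary functions."""
    f0 = np.zeros(N + 1)
    for n in range(N + 1):
        f_hat = np.zeros(n + 1)
        f_hat[0] = h(n * dt, n * dt)
        for i in range(1, n + 1):
            tau = (n - i) * dt
            acc = sum(f_hat[i - j] * lam(tau, j * dt) for j in range(1, i + 1))
            f_hat[i] = h(n * dt, tau) + acc * dt
        f0[n] = f_hat[n]            # f(n dt, 0) = f_hat_{n dt}(n dt)
    return f0

# Sanity check with a known closed form: a constant kernel lam = c together
# with h = 1 (cumulative incidence) gives CI(t, 0) = exp(c t).
c, dt, N = 1.0, 0.01, 200
ci = solve(lambda t, tau: 1.0, lambda tau, u: c, N, dt)
print(ci[-1])   # about 7.32; the exact value exp(2) is about 7.39, error O(dt)
```

The double loop makes this transcription O(N^3); the vectorised variant discussed next in the text reduces the Python-level work substantially.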

### Example 51

Concretely, in the Bellman–Harris case, we set

\begin{aligned} \lambda ^{(n-i)\Delta } (j\Delta ) {:}{=}R\left( (n-i+j)\Delta \right) g^{(n-i)\Delta }(j\Delta ), \end{aligned}

while in the case of the Poisson process model,

\begin{aligned} \lambda ^{(n-i)\Delta } (j\Delta ) {:}{=}\rho \left( (n-i+j)\Delta \right) k(j\Delta )\overline{G}^{(n-i)\Delta }(j\Delta ). \end{aligned}

A simplified version of the algorithm for cumulative incidence in the Bellman–Harris case is found in Appendix A.

In practice, the double for-loop in Algorithm 1 may lead to computational inefficiency when N is large and an interpreted language is used, so it is useful to refine it by vectorisation. To this end, for an $$m \times n$$ matrix A and $$1 \le s \le m$$ and $$1 \le t \le n$$, we denote by A[s, t] the s-th row, t-th column element of A. Moreover, for $$1 \le i \le j \le m$$ and $$1 \le k \le l \le n$$, we write A[i : j, k : l] for the sub-matrix consisting of each element A[s, t] where $$i \le s \le j$$ and $$k \le t \le l$$. (If $$i=j$$, we simply write i in lieu of i : j.) We also denote by $$\odot$$ element-wise (Hadamard) multiplication of matrices. The vectorised version of Algorithm 1 is given as Algorithm 2. This matrix computation is possible because all relevant values of the functions $$h(\cdot ,\cdot )$$ and $$\lambda ^{\cdot }(\cdot )$$ can be stored in the matrices H and L, respectively. Algorithm 2 can be further vectorised with respect to parameters to produce discretisations for multiple parameter values simultaneously. Additional computational savings could be attained in Algorithm 2 by observing that the top-left corner of the matrix L typically contains very small values, since $$g^{\tau }(u)$$ and $$\overline{G}^{\tau }(u)$$ are small for large u. The matrices L and F could therefore in practice be truncated with only a small error in the computation of $$\textrm{diag}(F)$$.
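In Python, the column-by-column update can be sketched with NumPy as follows. This is our own sketch rather than Algorithm 2 verbatim (the linked notebooks contain the documented implementations); the matrices are laid out as $$H[n,i] = h(n\Delta ,(n-i)\Delta )$$ and $$L[m,j] = \lambda ^{m\Delta }(j\Delta )$$, and a reversed column slice implements the Hadamard product and row-sum in one step.

```python
import numpy as np

def solve_vec(H, L, dt):
    """Vectorised discretisation: H[n, i] = h(n dt, (n - i) dt) and
    L[m, j] = lam^{m dt}(j dt); returns f(n dt, 0) = diag(F)."""
    N = H.shape[0] - 1
    F = np.zeros_like(H)
    F[:, 0] = H[:, 0]
    for i in range(1, N + 1):
        # F[n, i] = H[n, i] + dt * sum_{j=1}^{i} F[n, i - j] * L[n - i, j]
        conv = (F[i:, i - 1::-1][:, :i] * L[: N + 1 - i, 1 : i + 1]).sum(axis=1)
        F[i:, i] = H[i:, i] + dt * conv
    return np.diag(F).copy()

# Sanity check with a known closed form: a constant kernel c with h = 1
# (cumulative incidence) gives CI(t, 0) = exp(c t).
c, dt, N = 1.0, 0.01, 400
ci = solve_vec(np.ones((N + 1, N + 1)), np.full((N + 1, N + 1), c), dt)
print(ci[-1])   # about 53.5; the exact value exp(4) is about 54.6, error O(dt)
```

Only the outer loop over columns remains at the Python level; each iteration is a single vectorised slice operation, mirroring the $$\odot$$ and row-sum notation above.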

We illustrate the use of these algorithms in Fig. 1, where we compute prevalence using Algorithm 2 and compare the results with statistical estimates of prevalence from a Monte Carlo simulation. Python implementations of Algorithms 1 and 2, including a version of the latter vectorised over parameters, are provided as fully documented Jupyter notebooks at https://github.com/mspakkanen/integral-equations.

### 3.2 Bayesian inference on empirical data

We perform Bayesian inference to estimate the time-varying case reproduction number $${\mathcal {R}}(t)$$, as defined in Remark 22, for historical incidence data for Influenza (Frost and Sydenstricker 1919), Measles (Groendyke et al. 2011), SARS (Lipsitch et al. 2003) and Smallpox (Gani and Leach 2001), and for recent SARS-CoV-2 serological prevalence data in the United Kingdom (Pouwels et al. 2021).

#### 3.2.1 Historical incidence data

Historical incidence data for Influenza (Frost and Sydenstricker 1919), Measles (Groendyke et al. 2011), SARS (Lipsitch et al. 2003) and Smallpox (Gani and Leach 2001) have been extensively used in validating renewal equation frameworks (Cori et al. 2013). We fit an integral equation for the Bellman–Harris process, working with a CDF $$G^\tau (\cdot ) = G(\cdot )$$ that does not depend on $$\tau$$. As demonstrated in Sect. 2.5, the corresponding integral equation agrees with the common renewal equation ubiquitously used in the modelling of incidence (Cori et al. 2013).

We first introduce a probabilistic model for the function $$R(\cdot )$$ through a random walk process. To aid comparability with alternative methods (Wallinga and Teunis 2004), we transform $$R(\cdot )$$ to the case reproduction number $${\mathcal {R}}(t)$$, which represents the average number of secondary cases arising from a primary case infected at time t, i.e., transmissibility after time t. In Table 1, the Negative Binomial likelihood is re-parameterised to the mean–variance formulation, y is the observed count data (number of infections), $$\phi$$ is the overdispersion parameter and $$\sigma$$ is the random walk variance parameter. Therein, we write $$\text {Normal}^+(0,a)$$ for a normal distribution $$\text {Normal}(0,a)$$ constrained to the positive real axis. The observed count data and generation intervals were obtained from Cori et al. (2020, 2013). The priors were selected to be weakly informative, and the fits were generally robust to changes in them.

Algorithm 2 was used to discretise and solve $$t \mapsto \textrm{I}(t,0)$$; recall that $$\tau$$ is a parameter that is intrinsically involved in the solution of the integral equation, although we can ultimately restrict our attention to $$t \mapsto \textrm{I}(t,0)$$ only, having assumed that the first infection occurs at time $$\tau =0$$. For all data sets, an arbitrary seeding period of 10 days was used to correct for poor surveillance in the early epidemic. The seeding period was not included in the likelihood, and we found our fits to be robust to different choices of seeding duration. Posterior sampling was performed using Hamiltonian Monte Carlo (1000 warmup and 1000 sampling iterations with multiple chains) in the Bayesian probabilistic programming language Numpyro (Bingham et al. 2019; Phan et al. 2019). Convergence and sampling diagnostics were assessed by examining R-hat and Pareto k-hat distributions. Figure 3 shows the estimated case reproduction numbers $${\mathcal {R}}(t)$$, which, as expected, match those previously estimated (Cori et al. 2013).

#### 3.2.2 Serological prevalence data

The ONS infection survey is a weekly, household cross-sectional survey of blood samples that are used to test for the presence of COVID-19 antibodies, led by the Office for National Statistics (ONS) and the Department of Health and Social Care of the United Kingdom. At any point in time, the ONS infection survey provides an estimate of the number of individuals currently infected with SARS-CoV-2, i.e., the prevalence of infection/positivity rates. Estimation of incidence from the ONS infection survey is done using a bespoke deconvolution approach, and estimating R(t) or incidence directly from prevalence has, to our knowledge, not been attempted.

We study estimates of prevalence from the ONS infection survey over the period 5th April 2021 to 15th November 2021. Our choice of this period arose from the requirement of widespread, easily accessible SARS-CoV-2 PCR testing in the general population, which is needed to ensure comparability between the ONS infection survey and the reported case data (to which we compare our estimates). Prevalence estimates are reported weekly, and we therefore use smoothing splines to interpolate these weekly estimates to daily estimates through a log linear generalised additive spline model (Hastie et al. 2009). ONS infection survey results are generally reported on the Friday of any given week, with the sampling period covering Wednesday to Wednesday, a period of 10 days. We therefore incorporate this observation lag by convolving daily prevalence with a $$\text {Normal}(10,0.3)$$ distribution, adjusting for the reporting lag while incorporating some uncertainty in it. We fit a Poisson process model, detailed in Table 2, to these lagged prevalence data, assuming the infectiousness profile to be analogous to the generation time, so that $$k(\cdot )$$ is given by the PDF of a $$\text {Gamma}(4.84,1.73)$$ distribution (Sharma et al. 2021). The CDF $$G(\cdot )$$ of the infection duration was assumed to follow the CDF of a $$\text {Normal}(10,1.5)$$ distribution (Wölfel et al. 2020). We also fitted an aggregated likelihood in which a daily Poisson process was aggregated to weekly averages, but found little difference in the results.
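To make the fitted quantities concrete, the discretised kernel of this Poisson process model can be assembled as follows. This is a sketch under two stated assumptions: we read Gamma(4.84, 1.73) as a shape–scale parameterisation (the text does not spell this out), and we evaluate the Normal(10, 1.5) survival function via the complementary error function.

```python
import numpy as np
from math import gamma as gamma_fn, erfc

dt = 1.0                                # daily grid, as in the ONS analysis
u = np.arange(0.0, 60.0, dt)

# Infectiousness profile k from a Gamma(4.84, 1.73) distribution, read here
# as shape 4.84 and scale 1.73 (an assumption about the parameterisation).
shape, scale = 4.84, 1.73
k = u ** (shape - 1) * np.exp(-u / scale) / (gamma_fn(shape) * scale ** shape)

# Survival function G_bar(u) of the Normal(10, 1.5) infection duration.
G_bar = np.array([0.5 * erfc((x - 10.0) / (1.5 * np.sqrt(2.0))) for x in u])

# Discretised kernel k(u) * G_bar(u) of the Poisson process model; multiplying
# by rho(u + tau) gives the integrand of the prevalence/incidence equations.
kernel = k * G_bar
```

With this `kernel` in hand, the prevalence equation can be discretised exactly as in Sect. 3.1 and fitted to the lagged prevalence data.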

The top left panel of Fig. 4 shows the estimated case reproduction number $${\mathcal {R}}(t)$$, which fluctuates around 1 over the period of study. The top right panel exhibits an excellent posterior fit to the daily smoothed ONS infection survey prevalence. The bottom left panel shows infection incidence computed using the R(t) estimated from fitting prevalence, with bars indicating the reported case data. Note that we do not fit directly to the case data, only to the prevalence as estimated by the ONS survey; however, adding a second likelihood for the case data would be trivial. Lagging the time series and estimating the maximum cross-correlation suggests a lag of approximately 7 days between infections and reported cases, in line with previous studies (Sharma et al. 2021). Finally, correcting for this lag between infections and cases, we see a reasonably stable (aside from weekly reporting cycles) infection ascertainment ratio (bottom right panel of Fig. 4) with a mean of approximately 2.5, implying that for most of the study period there were around 2.5 times more infections than reported cases, and that this ratio was relatively stable given the testing policies over the period. This example demonstrates how our framework can fit prevalence directly, without the need for deconvolution-type approaches.
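The infection-to-case lag mentioned above can be estimated by maximising the cross-correlation over candidate lags. A minimal sketch on synthetic data follows; the series, the 7-day shift, and the helper function are all invented for illustration, not the study data.

```python
import numpy as np

def lag_of_max_cross_correlation(infections, cases, max_lag=21):
    """Return the lag (in days) at which cases correlate best with infections.

    A positive lag means cases trail infections by that many days.
    """
    best_lag, best_corr = 0, -np.inf
    for lag in range(max_lag + 1):
        a = infections[: len(infections) - lag] if lag else infections
        b = cases[lag:]
        corr = np.corrcoef(a, b)[0, 1]
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag

# Synthetic example: cases are a scaled copy of infections shifted by 7 days
# (an ascertainment ratio of 2.5 gives the scaling factor 1 / 2.5).
t = np.arange(150)
infections = 1000 + 500 * np.sin(t / 15)
cases = np.roll(infections, 7) / 2.5
# Drop the first few points to discard the wrap-around introduced by np.roll.
lag = lag_of_max_cross_correlation(infections[20:], cases[20:])
```

On this synthetic pair the recovered lag is the 7 days built into the example; on real data the ratio of the aligned series gives the ascertainment ratio.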

## 4 Discussion

Our primary goal in this paper is to bridge the worlds of individual-based models and mechanistic models to gain from the best of both. To this end, we began by choosing the most general branching process available, the Crump–Mode–Jagers process (Crump and Mode 1968, 1969; Jagers 1975). In the Crump–Mode–Jagers process, an epidemic is created at an individual level where, from a single infected individual, subsequent infections occur at random times according to their level of infectiousness. To our knowledge, we are the first to generalise the Crump–Mode–Jagers process to allow for a fully time-varying reproduction process for new infections. Rather than assuming the distribution of new infections to be constant (corresponding to a basic reproduction number), we allow it to change over time, which is essential when modelling real outbreaks (Gostic et al. 2020) beyond their early phase. We find that under this generalisation, a general integral equation arises from the Crump–Mode–Jagers process. Our framework also allows us to specify the dynamics of how new infections arise (in addition to their rate changing over time). Studying first the case where each infected individual produces all of their secondary cases, or “offspring”, at the same random time, we recover the well-known Bellman–Harris process (Bellman and Harris 1948). Studying the more complex assumption that each infection can give rise to its offspring throughout the duration of its infection (an inhomogeneous Poisson process), we derive a new integral equation which, to our knowledge, has not been previously presented. Remarkably, we find that despite the Poisson process model being much more complex than the simple Bellman–Harris assumption, the resultant integral equation has exactly the same form as the Bellman–Harris integral equation, except that the survival probability is used in place of the generation interval CDF.

By starting from a stochastic process, we are able to define prevalence, incidence and cumulative incidence as summary statistics (via moments) of an individual-based infection process. The benefit of defining these well-known epidemiological quantities from a single stochastic process is that they are, by design, consistent with one another, i.e., they are parameterised with the same generation interval and transmission rate (either $$\rho (t)$$ or R(t)). This allows practitioners to fit to prevalence, for example, and easily recover incidence with no additional fitting. We show mathematically that this is the case and prove that our equations for prevalence and incidence are consistent under the back-calculation technique commonly used in epidemiology (Brookmeyer and Gail 1988). Given the ever-increasing amount of infectious disease surveillance data, being able to model prevalence and incidence simultaneously under the same process can greatly improve estimates of the reproduction number. A recent example is the COVID-19 pandemic, during which several countries collected high-quality data on both cases (incidence) and serology (prevalence) (Flaxman et al. 2020).
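The back-calculation identity referred to here expresses prevalence as incidence convolved with the probability of still being infected. A minimal numerical sketch, reusing the Normal(10, 1.5) infection-duration CDF from the ONS example above but with a purely illustrative incidence series:

```python
import numpy as np
from scipy.stats import norm

# Back-calculation: prevalence is incidence convolved with the probability of
# still being infected at each age of infection. The Normal(10, 1.5) duration
# CDF matches the ONS example; the incidence series is purely illustrative.
days = np.arange(120)
incidence = 200.0 * np.exp(0.03 * days)             # hypothetical growth

ages = np.arange(120)
survival = 1.0 - norm.cdf(ages, loc=10, scale=1.5)  # P(duration > age)

# prevalence[t] = sum_{s <= t} incidence[s] * survival[t - s]
prevalence = np.convolve(incidence, survival)[: len(days)]
```

The discretised survival function sums to roughly the 10-day mean duration, so prevalence at any time is, to a first approximation, the infections accrued over the preceding ten days.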

We also show that the incidence integral equations we recover from the Bellman–Harris process and from the Poisson process model are in fact in agreement with the renewal equation commonly used in the modelling of incidence (Cori et al. 2013). Specifically, the common renewal equation is a special case of our incidence equations under the scenario where the first infection occurs at a specific, non-random, time. We also show that our equations are more general, accommodating the modelling of prevalence, cumulative incidence, complex importation functions, and time-varying generation times (Kimmel 1983). The common renewal equation is computationally simpler, as it does not involve the time $$\tau$$ of the first infection, which reduces the problem from two dimensions to one. However, we have introduced an efficient algorithm that relies on straightforward matrix algebra to compute our more general integral equations. Given the ability of modern computers to perform matrix operations efficiently, we do not believe the computational overhead of our integral equations is meaningfully greater than that of the simple renewal equation, while they allow for a far greater range of modelling choices with explicitly stated assumptions.
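To give a flavour of the matrix computations involved, the sketch below discretises the simple renewal equation, the one-dimensional special case, as a forward solve over a lower-triangular kernel; the same pattern extends to the two-dimensional equations. Parameter values are illustrative, not those fitted in the paper.

```python
import numpy as np
from scipy.stats import gamma

# Discretised renewal equation I[t] = R[t] * sum_{s < t} g[t - s] * I[s],
# written as a forward solve over a lower-triangular kernel.
T = 100
R = np.full(T, 1.2)                                  # illustrative constant R(t)
g = gamma.pdf(np.arange(1, T + 1), a=4.84, scale=1.73)
g /= g.sum()                                         # generation interval on lags 1..T

# Lower-triangular kernel: K[t, s] holds g at lag (t - s) for s < t.
K = np.zeros((T, T))
for t in range(1, T):
    K[t, :t] = g[:t][::-1]

# One initial infection at t = 0; each step is a single matrix-vector product.
I = np.zeros(T)
I[0] = 1.0
for t in range(1, T):
    I[t] = R[t] * (K[t, :t] @ I[:t])
```

With R fixed above 1, incidence settles into exponential growth after the initial generation waves die out; the whole solve is a handful of vectorised operations, which is why the extra dimension in the more general equations carries little practical cost.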

In this work, we have attempted to put the modelling of infectious diseases using renewal equations on firm mathematical ground. These mathematical foundations are broad enough to cover a variety of model specifications for transmission dynamics, and from them we can extract information about a wide range of relevant epidemiological quantities. In doing so, we have once again made explicit the connection between branching processes (Bellman and Harris 1948; Crump and Mode 1968) and renewal equations. Explicit links between renewal equations and SEIR models (Champredon et al. 2018) and Hawkes processes (Rizoiu et al. 2017) have been previously noted. It is likely that other such relationships exist, and this is an interesting area of further study. It would also be of interest to use our framework to study the more complex Lévy and Cox process models, which may produce renewal equations with even more realistic dynamics. Equally, recent frameworks (Gomez-Rodriguez et al. 2012; Routledge et al. 2018) have extended the seminal work of Wallinga and Teunis (2004) to estimate the case reproduction number on graphs; connecting these two approaches is an interesting area of future research. Finally, our framework, like the vast majority of previous frameworks, considers only the mean integral equation and ignores the dynamics of higher-order moments. Using our framework, we can recover these moments from our stochastic process and formulate more accurate likelihoods for model fitting.