1 Introduction

Up to high age, Fred Brauer has been very active in the field of Mathematical Epidemiology. His books Brauer and Castillo-Chavez 2012; Brauer et al. 2019, lecture notes (Brauer et al. 2008) and many papers constitute a valuable heritage. In his paper Brauer (2005), Fred recognizes that Kermack and McKendrick (KM) introduced in Kermack and McKendrick (1927) an age-of-infection model that is mathematically represented by a renewal equation (RE). Only for very special kernels does the RE reduce to a finite system of ODEs. Or, in other words, compartmental models form a rather restricted subclass of the general KM model. The paper Diekmann et al. (2018) provides necessary and sufficient conditions for when a delay equation [i.e., a delay differential equation (DDE) or a RE] allows a finite dimensional reduction. Most of that paper concentrates on linear equations, but Section 9.3 is devoted to the nonlinear KM model.

In Diekmann et al. (2012) and in work in progress (Bootsma et al. 2022), we formulate the abstract RE that incorporates static heterogeneity of the host population into the general KM model, see Section 5 below. From a modeling point of view, everything is straightforward. But when it comes to analysing the equation, the infinite dimensional character is a great stumbling block. In Section 8.4 of Diekmann et al. (2012), it is shown that under the assumption of separable mixing, [a slightly less restrictive version of (5.3) below], we are back to scalar quantities. This facilitates both the computational and the analytical aspects tremendously, see for instance (Tkachenko et al. 2021a; Neipel et al. 2020; Novozhilov 2008).

The interpretation of the separability condition can be informally described as follows: whenever the trait/type at the moment of becoming infected is following an a priori given distribution (in particular independently of the trait/type of the infecting individual), newly infected individuals are identical in a stochastic sense and therefore we know how to take averages. From a mathematical point of view, the key point is that various operators have a one-dimensional range.

After the reduction of the abstract RE to a scalar RE, we can further reduce to an ODE system, provided the time-since-infection kernel has the required form. We claim that this seemingly circuitous derivation of compartmental models, that incorporate separable static host heterogeneity, is, due to its systematic character, far more powerful and efficient than a direct approach that starts from the compartmental model describing a homogeneous host population and then adds heterogeneity. Stated differently: the RE formulation is much more amenable to generalization than the ODE formulation. The aim of the present paper is to establish this systematic procedure and to demonstrate its effectiveness.

We restrict to models of an outbreak in a closed population, i.e., we ignore both demographic turnover and loss of immunity. This is quite essential. Indeed, we shall first reconsider the homogeneous KM model and reduce it somewhat differently from the manner described in Section 9.3 of Diekmann et al. (2018). This new reduction involves an integration step that only works in the outbreak situation. And it is this new reduction that easily generalizes to the heterogeneous KM model.

The organization of the paper is as follows. In Sect. 2, we introduce the original KM model, characterized by a kernel \(A(\tau )\), with \(\tau \) the time elapsed since infection and A the expected contribution to the force-of-infection. We derive, as in Breda et al. (2012), a RE for the cumulative force-of-infection. In Sect. 3, we focus on the special case of a kernel A which is a matrix exponential sandwiched between two vectors. For this special case we deduce the corresponding ODE compartmental system. In Sect. 4, we explain how the form of the compartmental model derived in Sect. 3 relates to the standard form. As concrete examples, we consider the elementary SIR and SEIR models (the treatment of more complicated examples is postponed till Sect. 8). In Sect. 5, we turn to heterogeneity. Individuals are characterized by a trait x taking values in a measurable space \(\Omega \). The kernel is now a function of three variables, \(\tau \), x and \(\xi \), with \(\tau \) the time elapsed since an individual with trait \(\xi \) became infected and A being the expected contribution to the force-of-infection on individuals with trait x. Assuming that A is the product of functions a(x), \(b(\tau )\) and \(c(\xi )\), we derive a scalar renewal equation for the function w such that the cumulative force-of-infection on individuals with trait x equals a(x)w(t). Next we assume that b is a matrix exponential sandwiched between two vectors and reduce to an ODE system. This heterogeneous ODE system only differs from the corresponding homogeneous ODE system in the definition of a function \(\Psi : {\mathbb {R}} \rightarrow {\mathbb {R}}\). The definition of \(\Psi \) involves the functions a, c and the measure \(\Phi \) describing the distribution of the trait in the host population. The upshot is that one can incorporate separable static heterogeneity into compartmental models by appropriately choosing \(\Psi \). Section 6 is devoted to taking heterogeneity into account in the standard form of a compartmental model. In Sect. 7, inspired by Gomes et al. (2022), Montalbán et al. (2022), Neipel et al. (2020), Novozhilov (2008), Novozhilov (2012), Tkachenko et al. (2021a), we elaborate the special case that \(\Omega = (0, \infty )\), \(\Phi \) is the Gamma Distribution, \(a(x)=x\) and \(c(\xi )\) is either identically equal to one or equal to \(\xi \). Section 8 is devoted to examples and in the final Sect. 9 we collect some concluding remarks.

2 The general Kermack–McKendrick model

Let S be the size of the subpopulation of susceptible individuals, and let F denote the force-of-infection. In a closed population, the incidence is equal to both FS and to \(- dS/dt\), so we have

$$\begin{aligned} \frac{dS}{dt} = - F S. \end{aligned}$$
(2.1)

The essence of the KM model is the constitutive equation that expresses F in terms of contributions by individuals that became infected before the current time:

$$\begin{aligned} F(t) = \int _{0}^{\infty }A(\tau ) F(t-\tau )S(t-\tau ) d\tau . \end{aligned}$$
(2.2)

Here the one-and-only (apart from N, the total host population size) model ingredient A describes the expected contribution to the force-of-infection as a function of the time \(\tau \) elapsed since infection took place. In this top-down approach we postpone a specification of the stochastic processes that underly the word expected (in general, these concern both within host processes, in particular the struggle between the pathogen and the immune system, and the between host contact process). Indeed, KM wanted to know what general conclusions can be drawn without providing such a specification.

Now imagine that F was negligible in the infinite past. We introduce the cumulative force-of-infection w defined by

$$\begin{aligned} w(t) := \int _{-\infty }^{t} F(\sigma ) d\sigma . \end{aligned}$$
(2.3)

Suppose that \(S(-\infty )=N\). Then integration of (2.1) yields

$$\begin{aligned} S(t) = N e^{-w(t)}. \end{aligned}$$
(2.4)

So note, incidentally, that one can equivalently characterize w by the relation \(w = - \log (S/N)\).

Integration of (2.2) with respect to time over \((-\infty ,t]\) leads, upon replacing FS by \(-S'\) and a change of the order of the integrals, to the RE

$$\begin{aligned} w(t) = \int _{0}^{\infty } A(\tau ) \Psi (w(t - \tau )) d\tau , \end{aligned}$$
(2.5)

where

$$\begin{aligned} \Psi (w) := N ( 1 - e^{-w}), \end{aligned}$$
(2.6)

which corresponds to the subpopulation of no longer susceptible individuals, given the cumulative force of infection.

As a side remark, we mention that (2.5) can be considered as a deterministic version of the Sellke construction, as described in, for instance, Section 3.5.2 of Diekmann et al. (2012).

3 The special case in which reduction to a compartmental model is possible

Suppose there exist an integer n, \(1 \times n\), \(n \times 1\)-matrices U, V and an \(n \times n\)-matrix \(\Sigma \) such that

$$\begin{aligned} A(\tau ) = U e^{\tau \Sigma } V, \end{aligned}$$
(3.1)

then, in a sense, we have a state representation for the (one-time) input-(distributed-time) output map A. The matrix \(\Sigma \) generates the autonomous Markov chain dynamics of the state, the scalar quantity incidence is an input along the fixed vector V and the i-th component of the vector U measures the output, as a contribution to the force-of-infection, of the i-th state. System (4.1) below and the concrete examples (4.5) and (4.8) should clarify this somewhat vague description.

The claim is that the RE (2.5) reduces to an ODE system when (3.1) holds. To substantiate the claim, we define the n-vector valued function Q of time by

$$\begin{aligned} Q(t)&:= \int _{0}^{\infty } e^{\tau \Sigma } V \Psi (w(t-\tau )) d\tau \nonumber \\&= \int _{-\infty }^{t} e^{(t-\sigma ) \Sigma } V \Psi (w(\sigma )) d\sigma . \end{aligned}$$
(3.2)

It follows that

$$\begin{aligned} \frac{dQ}{dt} = \Sigma Q + V \Psi (w). \end{aligned}$$
(3.3)

On the other hand, it follows from (3.1), (3.2) and (2.5) that

$$\begin{aligned} w(t) = U Q(t). \end{aligned}$$
(3.4)

Combining (3.3) and (3.4) we obtain the closed system of ODE

$$\begin{aligned} \frac{dQ}{dt} = \Sigma Q + V \Psi (U Q). \end{aligned}$$
(3.5)

Note that, conversely, given a solution of (3.5), we can define w by (3.4) and verify that w satisfies (2.5). For reasons explained in the next section, we call (3.5) the integrated form of the compartmental model corresponding to \(\Sigma \), V and U.

For completeness, we like to mention that the assumption (3.1) immediately allows us to calculate basic indices for the KM model as follows:

  1. 1.

    the basic reproduction number is given by

    $$\begin{aligned} R_0 = N\int _{0}^{\infty }A(\tau )d\tau =- N U \Sigma ^{-1} V. \end{aligned}$$
    (3.6)
  2. 2.

    the Euler–Lotka equation is

    $$\begin{aligned} 1 = N\int _{0}^{\infty }e^{-\lambda \tau }A(\tau )d\tau =N U (\lambda I - \Sigma )^{-1} V, \end{aligned}$$
    (3.7)

    and the intrinsic growth rate r is given by its real root.

  3. 3.

    the generation time, here denoted by T, is given by

    $$\begin{aligned} T:=\frac{\int _{0}^{\infty }\tau A(\tau ) d\tau }{\int _{0}^{\infty }A(\tau )d\tau } =-\frac{U \Sigma ^{-2} V}{ U \Sigma ^{-1} V}. \end{aligned}$$
    (3.8)

4 Two ways of formulating compartmental models

We start from (2.1)–(2.2), make assumption (3.1), and introduce

$$\begin{aligned} Y(t) := \int _{0}^{\infty } e^{\tau \Sigma } V F(t-\tau ) S(t-\tau ) d\tau , \end{aligned}$$
(4.1)

to count the individuals, that were infected before time t, on the basis of their state at time t. Then we can deduce, as detailed in Section 9.3 of Diekmann et al. (2018), the ODE system

$$\begin{aligned}&\frac{dS}{dt} = - F S,\nonumber \\ {}&\frac{dY}{dt} = \Sigma Y + (F S)V, \end{aligned}$$
(4.2)

with

$$\begin{aligned} F := U Y. \end{aligned}$$
(4.3)

There are good reasons to call this the standard form of the compartmental model corresponding to \(\Sigma \), V and U.

We claim that Q, as introduced in Sect. 3, is the integral of Y, i.e.,

$$\begin{aligned} Q(t) = \int _{-\infty }^{t} Y(\sigma ) d\sigma . \end{aligned}$$
(4.4)

In fact, from (4.1) and (4.4), we obtain

$$\begin{aligned} Q(t)&= \int _{-\infty }^{t} Y(\sigma ) d\sigma =\int _{-\infty }^{t}d\sigma \int _{0}^{\infty } e^{\tau \Sigma } V (-\dot{S}(\sigma -\tau )) d\tau \nonumber \\ {}&= \int _{0}^{\infty } e^{\tau \Sigma } V (N-S(t-\tau )) d\tau \nonumber \\&= \int _{0}^{\infty } e^{\tau \Sigma } V \Psi \left( w(t-\tau )\right) d\tau = \int _{-\infty }^{t} e^{(t-\tau ) \Sigma } V \Psi (w(\tau )) d\tau , \end{aligned}$$
(4.5)

where we used (2.6). Note that convergence of the integral in (4.4) requires that Y converges to zero at \(-\infty \). Differentiating the above equation (4.5) with respect to t, we recover (3.3), i.e.,

$$\begin{aligned} \frac{dQ}{dt} = \Sigma Q + V \Psi (w). \end{aligned}$$
(4.6)

Since \(w(t)=UQ(t)\), we arrive at the integrated compartmental model (3.5).

A minor advantage of the integrated form is that the dimension is n, not \(n+1\). In the present context, the key favourable feature is that the integrated form extends seamlessly to the separable heterogeneous setting (while the standard form does not).

To illustrate the integrated formalism, we consider the two most basic examples. If we write the standard form of the SIR model as

$$\begin{aligned}&\frac{dS}{dt} = - \beta I S, \nonumber \\ {}&\frac{dI}{dt} = -\alpha I + \beta I S, \end{aligned}$$
(4.7)

the integrated form reads

$$\begin{aligned} \frac{dQ}{dt} = -\alpha Q + \Psi (\beta Q), \end{aligned}$$
(4.8)

where

$$\begin{aligned} Q(t)=\int _{-\infty }^{t}I(\sigma )d\sigma , \end{aligned}$$

and \(\Psi \) is defined in (2.6). So here \(n=1\), \(V=1\), \(U=\beta \) and \(\Sigma =-\alpha \).

If we write the standard form of the SEIR model as

$$\begin{aligned}&\frac{dS}{dt} = - \beta I S, \nonumber \\ {}&\frac{dE}{dt} = - \gamma E + \beta I S, \nonumber \\ {}&\frac{dI}{dt} = \gamma E - \alpha I, \end{aligned}$$
(4.9)

the integrated form reads

$$\begin{aligned}&\frac{dQ_1}{dt} = - \gamma Q_1 + \Psi (\beta Q_2), \nonumber \\&\frac{dQ_2}{dt} = \gamma Q_1 - \alpha Q_2, \end{aligned}$$
(4.10)

where

$$\begin{aligned}\begin{pmatrix} Q_1(t) \\ Q_2(t)\end{pmatrix}=\begin{pmatrix}\int _{-\infty }^{t}E(\sigma )d\sigma \\ \int _{-\infty }^{t}I(\sigma )d\sigma \end{pmatrix}. \end{aligned}$$

So here \(n=2\) and we have

$$\begin{aligned} V =\begin{pmatrix}1 \\ 0 \end{pmatrix}, \;U = \begin{pmatrix}0&\beta \end{pmatrix} , \;\Sigma = \begin{pmatrix}-\gamma &{} 0 \\ \gamma &{} -\alpha \end{pmatrix}. \end{aligned}$$
(4.11)

5 Incorporating heterogeneity

Let host individuals be characterized by a trait x, with x ranging in a measurable space \(\Omega \) (the advantage of this somewhat abstract formulation is that x may be a discrete variable, a continuous variable or a mixture of these two possibilities, in the sense that x has both a discrete and a continuous component). The kernel A now has three arguments, \(\tau \), x and \(\xi \). The variable x specifies the trait of the individual that is at risk of becoming infected, while \(\xi \) and \(\tau \) specify, respectively, the trait and the time-since-infection of an infected individual.

We want a formalism that captures both the case where \(\Omega \) is discrete (for instance a finite set) and the case where \(\Omega \) is continuous. To achieve this, we describe the population composition by a measure \(\Phi \) on \(\Omega \) and, for any t, a bounded measurable function \(s(t, \cdot )\) such that of the individuals with trait x, a fraction s(tx) is still susceptible at time t (in other words: s(tx) is the probability that an individual with trait x is susceptible at time t and \(s(-\infty ,x)=1\)).

We replace (2.1)–(2.2) by, respectively

$$\begin{aligned} \frac{\partial s}{\partial t} (t,x) = - F(t,x) s(t,x), \end{aligned}$$
(5.1)

and

$$\begin{aligned} F(t,x) =N\int _{0}^{\infty } \int _\Omega A(\tau , x, \xi ) F(t-\tau , \xi ) s(t-\tau , \xi ) \Phi (d\xi ) d\tau . \end{aligned}$$
(5.2)

When functions a, b and c exist such that

$$\begin{aligned} A(\tau , x, \xi ) = a(x) b(\tau ) c(\xi ), \end{aligned}$$
(5.3)

we deduce from (5.2) that F is the product of a(x) and a function of time. This motivates us to introduce a function w such that

$$\begin{aligned} \int _{-\infty }^{t} F(\sigma ,x) d\sigma = a(x) w(t). \end{aligned}$$
(5.4)

Next we integrate (5.2) with respect to time over \((-\infty , t]\), while using (5.1) to replace the product Fs by the time partial derivative of s. After dividing out the factor a(x) that occurs at both sides, we obtain

$$\begin{aligned} w(t) = \int _{0}^{\infty } b(\tau ) \Psi (w(t-\tau )) d\tau , \end{aligned}$$
(5.5)

with \(\Psi \) now defined by

$$\begin{aligned} \Psi (w) := N \int _{\Omega } c(\xi ) (1 - e^{- a(\xi ) w} ) \Phi (d\xi ), \end{aligned}$$
(5.6)

with \(\Phi \) the measure that describes the probability distribution of the trait in the host population. Note that from (5.6) we recover the earlier definition (2.6) in the special (homogeneous) case that both a and c are identically equal to 1.

Also note that when a is identically equal to one, (5.6) just states that we can simply work with the average value of c. So, as indeed emphasized in Chapter 2 of Diekmann et al. (2012), heterogeneity of infectiousness is, in the large numbers deterministic limit, simple : just take the expected value.

Essentially, (2.5) and (5.5) are the same. When a is not constant, (5.6) differs from (2.6). When

$$\begin{aligned} b(\tau ) = U e^{\tau \Sigma } V, \end{aligned}$$
(5.7)

we can define Q as before, see (3.2), and retrieve both (3.5) and (3.4). We conclude that in terms of the integrated formulation of a compartmental model, we can incorporate separable heterogeneity by simply redefining the function \(\Psi \). The fact that (5.6) involves an integral is of little to no importance for the theory. But when one wants to study (3.5) numerically, it is a nuissance. As noted before by various authors (cf. Gomes et al. 2022; Montalbán et al. 2022; Neipel et al. 2020; Novozhilov 2008, 2012; Tkachenko et al. 2021a), in a special case one can replace the integral by an explicit expression. In Sect. 7 we shall provide the details.

6 The standard form: a recipe

In Sect. 8 we present a model for a heterosexually transmitted disease. In that case we have to distinguish between newly infected males and newly infected females. So there are TWO states-at-infection. Here we focus on models in which there is only one state-at-infection, represented by the vector V (but note that V may be a probability vector with several non-zero components, see Example 8.1).

Our starting point is the compartmental system

$$\begin{aligned}&\frac{dS}{dt} = - F S, \nonumber \\ {}&\frac{dY}{dt} = \Sigma Y + (F S) V,\nonumber \\ {}&F = U Y. \end{aligned}$$
(6.1)

We want to incorporate separable static heterogeneity as described by the trait space \(\Omega \), the trait distribution \(\Phi \), the relative trait-specific susceptibility a and the relative trait-specific infectiousness c. Here relative means that we single out a representative \(\bar{x}\) in \(\Omega \) and normalise a and c by requiring that \(a(\bar{x}) = 1\), \(c(\bar{x}) = 1\) (note that (5.3) offers the possibility to thus normalise a and c; if \(\Omega \) is a continuum, it makes sense to choose the mean trait as \(\bar{x}\)).

Then \(\bar{s}(t):=s(t,\bar{x})\) denotes the fraction of the individuals with the trait \(\bar{x}\) that escaped infection up to the current time. For arbitrary trait x, the corresponding probability equals

$$\begin{aligned} s(t,x)=\bar{s}(t)^{a(x)}, \end{aligned}$$
(6.2)

because \(s(t,x)=\exp (-a(x)w(t))\). Hence the fraction of the total population that is still susceptible is given by

$$\begin{aligned} s_{\textrm{tot}} = \int _\Omega \bar{s}^{a(\xi )} \Phi (d\xi ). \end{aligned}$$
(6.3)

Moreover, since \(a(\bar{x})=1\), we have \(w(t)=-\log \bar{s}(t)\). Thus knowledge of \(\bar{s}\) is sufficient to determine both s(tx) and w(t).

On the other hand, in analogy with (4.1), let us define the trait-specific y-variable by

$$\begin{aligned} y(t,\xi ):=\int _{0}^\infty e^{\tau \Sigma }VF(t-\tau ,\xi )s(t-\tau ,\xi )d\tau , \end{aligned}$$
(6.4)

and let Y now denote the weighted y-variable:

$$\begin{aligned} Y(t)&=N\int _{\Omega }c(\xi )y(t,\xi )\Phi (d\xi ) \nonumber \\&=\int _{-\infty }^{t} e^{(t-\sigma ) \Sigma }V N\int _{\Omega }c(\xi )F(\sigma ,\xi )s(\sigma ,\xi )\Phi (d\xi ) d\sigma . \end{aligned}$$
(6.5)

Note that, since the dynamics of infected individuals is independent of the trait and linear, it does not matter whether we apply the weight factor c when computing the output by applying U or right at the start, i.e., immediately after the infection took place.

Now we formulate the standard compartmental system which takes into account the static heterogeneity:

Claim

The heterogeneous compartmental system consisting of (3.5) with \(\Psi \) defined by (5.6), has the standard form representation

$$\begin{aligned}&\frac{d\bar{s}}{dt} = - \bar{F} \bar{s},\nonumber \\ {}&\frac{dY}{dt} = \Sigma Y + (\bar{F} \Psi '(- \log \bar{s}) ) V, \nonumber \\ {}&\bar{F} = U Y, \end{aligned}$$
(6.6)

where \(\bar{s}(t)=s(t,\bar{x})\), \(\bar{F}(t)=F(t,\bar{x})\) and explicitly we have

$$\begin{aligned} \Psi '(w) = N \int _\Omega c(\xi ) a(\xi ) e^{- a(\xi ) w } \Phi (d\xi ). \end{aligned}$$
(6.7)

Proof

It follows from (5.2), (5.3), (6.4) and (6.5) that

$$\begin{aligned} \bar{F}(t)&=N\int _{0}^{\infty } \int _\Omega a(\bar{x})b(\tau )c(\xi ) F(t-\tau , \xi ) s(t-\tau , \xi ) \Phi (d\xi ) d\tau \nonumber \\&=\int _{0}^{\infty }Ue^{\tau \Sigma }V N \int _\Omega c(\xi ) F(t-\tau , \xi ) s(t-\tau , \xi ) \Phi (d\xi ) d\tau \nonumber \\&=U Y(t). \end{aligned}$$
(6.8)

Differentiation of (3.5) with respect to time yields

$$\begin{aligned} \frac{dY}{dt} = \Sigma Y + (\Psi '(U Q) U Y ) V. \end{aligned}$$
(6.9)

We identify UQ with \(- \log \bar{s}\) , since this is what we obtain when we integrate

$$\begin{aligned} \frac{d\bar{s}}{dt} = - U Y \bar{s}. \end{aligned}$$
(6.10)

\(\square \)

7 The gamma distribution

Let \(\Omega = [0, \infty )\) and let \(a(x) = x\), i.e., let the trait correspond directly to relative susceptibility. Let \(\Phi \) be the Gamma Distribution with mean 1 and variance \(p^{-1}\) . In other words, let \(\Phi \) have density

$$\begin{aligned} x \rightarrow \frac{p^p}{\Gamma (p)} x^{p-1} e^{-px}. \end{aligned}$$
(7.1)

The key feature is that under these assumptions we can evaluate the integral in (5.6) when c is a (low order) polynomial and thus obtain an explicit expression for \(\Psi \). The underlying reason is that we deal with (a derivative of) the Laplace Transform of \(\Phi \), which is itself explicitly given by

$$\begin{aligned} \hat{\Phi }(\lambda ) = \left( \frac{\lambda }{p} + 1\right) ^{-p}. \end{aligned}$$
(7.2)

If the trait has no influence on infectiousness, i.e., c is identically equal to 1, we have

$$\begin{aligned} \Psi (w) =N\left[ 1 - \left( \frac{w}{p} + 1\right) ^{-p} \right] , \end{aligned}$$
(7.3)

while if infectiousness too is proportional to the trait, i.e., \(c(\xi )=\xi \), we obtain

$$\begin{aligned} \Psi (w) = N\left[ 1 - \left( \frac{w}{p} + 1\right) ^{-p-1}\right] . \end{aligned}$$
(7.4)

In Bootsma et al. (2022), we compare and contrast these special cases in terms of \(R_0\) , the Herd Immunity Threshold and the final size. Here we limit ourselves to the observation that (3.5), with either (7.3) or (7.4), enables modelers to study rather easily the impact of separable heterogeneity on the dynamics of their favourite compartmental model, cf. Neipel et al. (2020).

Now recall that the standard form involves \(\bar{s}\), the value of s in a representative point \(\bar{x}\). In the special case of the Gamma Distribution, we can work with \(s_{\textrm{tot}}\) instead of \(\bar{s}\) and replace (6.6) by

$$\begin{aligned}&\frac{d s_{\textrm{tot}}}{dt} = - \bar{F} s_{\textrm{tot}}^{1 + \frac{1}{p}} \nonumber \\ {}&\frac{dY}{dt} = \Sigma Y + (\bar{F} H(s_{\textrm{tot}}) ) V\nonumber \\ {}&\bar{F} = U Y \end{aligned}$$
(7.5)

with

$$\begin{aligned} H(s)={\left\{ \begin{array}{ll} N s^{1+\frac{1}{p}} &{} \textrm{if}\;c\;\mathrm{is\,identically\,equal\,to\,1},\\ N \left( 1+\frac{1}{p}\right) s^{1+\frac{2}{p}} &{} \textrm{if}\;c\,(\xi ) = \xi . \end{array}\right. } \end{aligned}$$
(7.6)

The derivation of (7.5) starts by choosing \(\bar{x} = 1\) (= mean trait), \(a(\bar{x})=1\) and rewriting (6.3) in this special case as

$$\begin{aligned} s_{\textrm{tot}} = \hat{\Phi }( - \log \bar{s}). \end{aligned}$$
(7.7)

Since \(\hat{\Phi }\) is given explicitly by (7.2), one can express derivatives of \(\hat{\Phi }\) in terms of \(\hat{\Phi }\) itself. And derivatives correspond to incorporating powers of the variable in the function that is Laplace transformed. We refer to Novozhilov (2008) for an early derivation of (7.5) and for references to still earlier work.

8 Some examples

Example 1

The diagram depicted in Fig. 1 gives a concise representation of the compartmental model that we consider in this section as a slightly more complex example. Our first aim is to illustrate how one obtains the model ingredients U, \(\Sigma \), V from such a diagram.

Fig. 1
figure 1

SEIR model with asymptomatic infection and quarantine

The index 1 denotes asymptomatic individuals, the index 2 symptomatic individuals. As the symptoms get noticed, a diagnosis is possible for symptomatic individuals and subsequently they may be put into quarantaine, i.e., enter Q. The two types of individuals occur with ratio p : \(1-p\) , with p a parameter. So the state-at-infection/birth is a probability vector V with two non-zero components. Immediately following infection an individual is Exposed but not yet Infectious. The sojourn times of the various compartments are all exponentially distributed with a parameter specified in the diagram (in the form of a name label, so as a parameter, not as a number with dimension 1/time). The compartments R and Q aid the bookkeeping, but their contents is irrelevant for future dynamics, so we do not incorporate them into the population state vector.

As is customarily done, we use the same characters to denote a compartment and the contents of this compartment. Define the 4-vector Y by

$$\begin{aligned} Y =\begin{pmatrix} E_1 \\ E_2 \\ I_1 \\ I_2 \end{pmatrix}, \end{aligned}$$
(8.1)

the vector V by

$$\begin{aligned} V = \begin{pmatrix} p \\ 1-p \\ 0 \\ 0 \end{pmatrix}, \end{aligned}$$
(8.2)

the vector U by

$$\begin{aligned} U=\begin{pmatrix} 0&0&\beta _1&\beta _2 \end{pmatrix}, \end{aligned}$$
(8.3)

and the matrix \(\Sigma \) by

$$\begin{aligned} \Sigma =\begin{pmatrix} -\gamma _1 &{} 0 &{} 0 &{} 0 \\ 0 &{} -\gamma _2 &{} 0 &{} 0 \\ \gamma _1 &{} 0 &{} -\alpha _1 &{} 0 \\ 0 &{} \gamma _2 &{} 0 &{} -(\alpha _2 + \theta ) \end{pmatrix}, \end{aligned}$$
(8.4)

then (4.1)–(4.2) is the standard representation of the compartmental model specified by the diagram. So if we define Q by (4.4) then we obtain the integrated representation (3.5).

Either by a direct consideration, or by verifying that

$$\begin{aligned} -\Sigma ^{-1}=\begin{pmatrix} \frac{1}{\gamma _1}&{} 0 &{} 0 &{} 0 \\ 0 &{} \frac{1}{\gamma _2} &{} 0 &{} 0\\ \frac{1}{\alpha _1} &{} 0 &{} \frac{1}{\alpha _1} &{} 0\\ 0 &{} \frac{1}{\alpha _2 + \theta } &{} 0 &{} \frac{1}{\alpha _2 + \theta } \end{pmatrix}, \end{aligned}$$
(8.5)

and applying the formula (3.6), we obtain

$$\begin{aligned} R_0 = N \left\{ p \frac{\beta _1}{\alpha _1} + (1-p) \frac{\beta _2}{\alpha _2 + \theta } \right\} . \end{aligned}$$
(8.6)

Similarly we obtain from (3.8) that the generation time is given by

$$\begin{aligned} T=\frac{ p \frac{\beta _1}{\alpha _1} \left( \frac{1}{\gamma _1} + \frac{1}{\alpha _1}\right) + (1-p)\frac{\beta _2}{\alpha _2 + \theta } \left( \frac{1}{\gamma _2} + \frac{1}{\alpha _2 + \theta }\right) }{ p \frac{\beta _1}{\alpha _1} + (1-p) \frac{\beta _2}{\alpha _2 + \theta } }. \end{aligned}$$
(8.7)

Example 1, continued

Next let us introduce immune system related heterogeneity in the sense that we distinguish between standard individuals, which we label 1, and partly immune individuals, which we label 2. The relative susceptibility of type 2 individuals is given by the parameter \(\epsilon _1\) and the relative infectiousness by the parameter \(\epsilon _2\). In the present context it does not matter whether the immunity results from an earlier outbreak or from vaccination. But we assume it exists before the outbreak that we model is initiated or, in other words, that it does not result from control measures during the outbreak.

Let \(N = N_1 + N_2\) with \(N_1\) and \(N_2\) the size of the subpopulation of individuals of type, respectively, 1 and 2. Then \(\Phi \) has components \(N_1/N\) and \(N_2/N\). We may choose 1 as \(\bar{x}\) and let a have components 1 and \(\epsilon _1\) and let c have components 1 and \(\epsilon _2\). It follows that

$$\begin{aligned} \Psi (w)= & {} N_1 (1-e^{-w}) + N_2 \epsilon _2 (1- e^{- \epsilon _1 w}) \end{aligned}$$
(8.8)
$$\begin{aligned} \Psi '(- \log \bar{s})= & {} N_1 \bar{s} + N_2 \epsilon _1 \epsilon _2 \bar{s}^{\epsilon _1}. \end{aligned}$$
(8.9)

One can now consider (3.5) or (6.6) with the above definitions of U, \(\Sigma \), V and \(\Psi \) (resp. \(\Psi '\)) and study, for instance, numerically how peak size is influenced by the parameters \(\epsilon _1\), \(\epsilon _2\) and \(N_1\) (for given N).

Note that when we choose \(\epsilon _1=\epsilon =\epsilon _2\), this example allows for an alternative interpretation: as a result of a control measure, individuals reduce their social activity with a factor \(\epsilon \), but only a fraction \(N_2/N\) complies, the complementary fraction does not reduce its social activity.

Example 2

In the formulation of our results, we have restricted to the situation in which one vector V and one vector U suffice. In Diekmann et al. (2018), the generic result actually concerns systems of REs for which one needs as many vectors V and U as the number of components of the system. Here we illustrate how this works by concentrating on the epidemiologically relevant example of a heterosexually transmitted disease (or a disease transmitted by a vector).

Let \(S_i\) \((i=1,2)\) be the size of the susceptible population, where the index 1 denotes males and the index 2 females (or the host and the vector, respectively). We postulate that

$$\begin{aligned} \frac{dS_i}{dt}&= - F_i S_i,\nonumber \\ \begin{pmatrix} F_1(t) \\ F_2(t)\end{pmatrix}&= \int _{0}^{\infty }A(\tau ) \begin{pmatrix}F_1(t-\tau )S_1(t-\tau )\\ F_2(t-\tau )S_2(t-\tau ) \end{pmatrix} d\tau , \end{aligned}$$
(8.10)

where

$$\begin{aligned} A(\tau ):=\begin{pmatrix} 0 &{} A_{12}(\tau ) \\ A_{21}(\tau ) &{} 0 \end{pmatrix}. \end{aligned}$$
(8.11)

Define the cumulative force of infection as

$$\begin{aligned} w(t)=\begin{pmatrix} \int _{-\infty }^{t}F_1(\sigma )d\sigma \\ \int _{-\infty }^{t}F_2(\sigma )d\sigma \end{pmatrix}. \end{aligned}$$
(8.12)

Then we obtain the renewal equation:

$$\begin{aligned} w(t)=\int _{0}^{\infty }A(\tau )\Psi (w(t-\tau ))d\tau , \end{aligned}$$
(8.13)

with

$$\begin{aligned} \Psi (w)=\begin{pmatrix} N_1(1-e^{-w_1}) \\ N_2(1-e^{-w_2})\end{pmatrix}, \end{aligned}$$
(8.14)

where \(N_1\) denotes the total size of the male population and \(N_2\) is the total size of the female population.

For the compartmental case, we have

$$\begin{aligned} A(\tau ):=\begin{pmatrix} 0 &{} U_2e^{\tau \Sigma _2}V_2 \\ U_1e^{\tau \Sigma _1}V_1 &{} 0 \end{pmatrix}. \end{aligned}$$
(8.15)

Here \(U_i\), \(V_i\) are \(1 \times n_i\), \(n_i \times 1\) matrices and \(\Sigma _i\) is an \(n_i \times n_i\) matrix. It seems to make sense to assume that \(n_1=n_2\) and perhaps \(\Sigma _1=\Sigma _2\). But if within host processes are different for males and females, we should allow for \(n_1 \ne n_2\) and \(\Sigma _1 \ne \Sigma _2\). This is anyhow a reasonable assumption for the host-vector situation.

Define vectors \(Q_1\) and \(Q_2\) by

$$\begin{aligned}w_1=U_2 Q_2, \quad w_2=U_1 Q_1,\end{aligned}$$

then

$$\begin{aligned} \frac{dQ_1}{dt}&=\Sigma _1 Q_1+\Psi _1(U_2Q_2)V_1,\nonumber \\ \frac{dQ_2}{dt}&=\Sigma _2 Q_2+\Psi _2(U_1Q_1)V_2. \end{aligned}$$
(8.16)

This is the integrated version of the following standard form:

$$\begin{aligned} \frac{dS_i}{dt}&= - F_i S_i, \quad (i=1, 2),\nonumber \\ \frac{dY_i}{dt}&=\Sigma _i Y_i+F_iS_iV_i,\nonumber \\ F_1&=U_2Y_2, \nonumber \\ F_2&=U_1Y_1. \end{aligned}$$
(8.17)

Example 2, continued

We use the same trait \(x \in \Omega \) to characterize males and females. The trait x may represent promiscuity. We start from (5.2) but now F is a 2-vector and A a \(2 \times 2\)-matrix with zero’s on the diagonal. We assume

$$\begin{aligned} A_{ij}(\tau ,x,\xi )=a_i(x)b_j(\tau )c_j(\xi ). \end{aligned}$$
(8.18)

Let

$$\begin{aligned} \int _{-\infty }^{t}F_i(\sigma ,x)d\sigma =a_i(x)w_i(t), \end{aligned}$$
(8.19)

then

$$\begin{aligned} w_1(t)&=N_2\int _{0}^{\infty }b_2(\tau )\int _{\Omega } c_2(\xi )(1-e^{-a_2(\xi )w_2(t-\tau )})\Phi _2(d\xi )d\tau , \nonumber \\ w_2(t)&=N_1\int _{0}^{\infty }b_1(\tau )\int _{\Omega } c_1(\xi )(1-e^{-a_1(\xi )w_1(t-\tau )})\Phi _1(d\xi )d\tau , \end{aligned}$$
(8.20)

which is of the form (8.13) with \(A=\begin{pmatrix} 0 &{} b_2 \\ b_1 &{} 0 \end{pmatrix}\) and

$$\begin{aligned} \Psi (w)=\begin{pmatrix} N_1\int _{\Omega } c_1(\xi )(1-e^{-a_1(\xi )w_1(t-\tau )})\Phi _1(d\xi ) \\ N_2\int _{\Omega } c_2(\xi )(1-e^{-a_2(\xi )w_2(t-\tau )})\Phi _2(d\xi ) \\ \end{pmatrix}. \end{aligned}$$
(8.21)

Now we assume that (8.15) holds. Then (8.16) holds, with now \(\Psi \) as defined above.

In our presentation of this example we avoided to discuss the consistency requirement that there are, in total, as many contacts of males with females as there are contacts of females with males. Partly we did so because transmission risk may be asymmetric, which obviously impairs the symmetry requirement. But the more important reason is that our aim here is just to illustrate the flexibility of the bookkeeping framework (and not to analyse a model of a heterosexually transmitted disease).

9 Concluding remarks

On November 13, 2022, the KM paper (Kermack and McKendrick 1927) had, according to Google Scholar, 12.590 citations. On that same date the Royal Society listed 60.773 downloads since the beginning of 1997, when the paper became available online. No doubt there are among the authors that cite (Kermack and McKendrick 1927) some who actually read the paper and who refer to the general age-of-infection model, see e.g. Brauer (2005), Breda et al. (2012), Diekmann et al. (2012), Inaba (2017), Montalbán et al. (2022), Novozhilov (2008, 2012), Thieme (2003) and Tkachenko et al. (2021a). In the big majority of cases, however, it is explicitly stated that Kermack and McKendrick (1927) introduced the SIR compartmental model and implicitly suggested that that is it. This both reflects and reinforces an incessant misconception in the math-epi community at large, viz., that Kermack and McKendrick (1927) is just about the SIR compartmental model.

In fact, as shown in Diekmann et al. (2018), any compartmental model in which the (probability distribution of the) state-at-infection is described by a given fixed vector V corresponds to a (very) special case of the renewal equation. The compartmental system is coded by the triple \((V, \Sigma , U)\), with the vector V describing the initial state, the matrix \(\Sigma \) specifying the rates at which state transitions occur and the vector U describing the contribution of the various states to the force of infection. (When there are several possibilities for the state-at-infection, one needs a system of renewal equations and a corresponding number of vectors V and U, see Diekmann et al. 2018 and Example 8.2.)

It has long been recognized that host heterogeneity can have a big impact on epidemic dynamics, see, e.g., Diekmann et al. 1990, 2012, Inaba (2017), Katriel (2011), Novozhilov (2008, 2012), Veliov and Widder (2016), Tsachev et al. (2017) and Hickson and Roberts (2014). In general, the incorporation of heterogeneity necessitates the introduction of kernels and leads to infinite dimensional dynamical systems. In this context too, the question arises whether or not one can reduce to a finite system of ODE. In Novozhilov (2008, 2012) and in the more recent Covid-triggered papers (Gomes et al. 2022; Montalbán et al. 2022; Neipel et al. 2020; Rose et al. 2021) a restricted form of static heterogeneity, in which the traits of the two individuals involved in a contact are assumed to have independent influence on the likelihood of contact and/or transmission, is considered. The impact of such heterogeneity on the outbreak dynamics, is, of course, most easily studied if the heterogeneity is captured by a modification of the compartmental model one is interested in.

Here we have shown that it is straightforward to derive the desired modification if one first formulates the heterogeneous version of the KM RE and relates it to an integrated version of the compartmental model. Once the modified integrated version is available, it is easy to derive the modification of the standard form by differentiation. The end result only involves (the derivative of) a function \(\Psi \) from \({\mathbb {R}}\) to \({\mathbb {R}}\) defined in terms of the distribution of the trait and the functions that describe how susceptibility and infectiousness depend on the trait (so one can use the end result without any reference to renewal equations).

From a more general theoretical point of view, our methodology is in the spirit of Diekmann et al. (2020a, 2020b) and the much older references in there. In a similar manner, Novozhilov (2008, 2012) build on the ideas of G.P. Karev as described in Karev (2010), Karev and Novozhilov (2019) and the much older references in there.

In Breda et al. (2012), it is shown how to extend the RE formulation to models incorporating demographic turnover and/or temporary immunity. Such models allow for endemic steady states. They do not have an integrated version, for the simple reason that the integrals are bound to diverge. At present it remains unclear whether or not one can include separable static heterogeneity in such models via a simple modification (the authors are not very optimistic...).

We refer to Tkachenko et al. (2021b) for an up to date Covid inspired account of the role of heterogeneity in outbreak dynamics. In there it is emphasized that, on top of persistent static heterogeneity, dynamic heterogeneity may play a major role in damping the overshoot of the herd immunity threshold (in much the same way as a prey refuge dampens prey-predator oscillations). An additional variable h is introduced to capture the dynamic heterogeneity.

Our message in this paper is a bit equivocal. On the one hand, our aim was to show how separable static heterogeneity can be easily incorporated into compartmental models. On the other hand, we want to emphasize that the RE formulation is far more general and flexible and that the predominance of compartmental models is rather detrimental for epidemic modeling. No doubt the fact, that user friendly tools for the numerical study of RE are lacking, contributes to their unpopularity. In order to end at a positive note, we call attention to two recent developments: (i) discrete time models, see Diekmann et al. (2021), Kreck and Scholz (2022), Sofonea et al. (2021) and (ii) pseudospectral approximation, see Scarabel et al. (2021).