1 Introduction

Understanding and predicting the dynamics and control of infectious diseases relies on representative models, whether conceptual or mathematical. Mathematical modelling was established in infectious diseases over a century ago, with the seminal works of Ross (1916), Ross and Hudson (1917), Kermack and McKendrick (1927) and others. Propelled by the discovery of aetiological agents for infectious diseases, and the germ theory, models have focused on the complexities of pathogen transmission and evolution (Heesterbeek 2015). It has recurrently been noted for over a century, however, that these models tend to overpredict transmission potential and overestimate the impact of control measures which may be explained by limitations in capturing the effects of heterogeneity (Kermack and McKendrick 1927; McKendrick 1939; Gart 1968, 1971; Ball 1985; Anderson et al. 1986; Pastor-Satorras and Vespignani 2001; Miller et al. 2012; Gomes et al. 2022).

Here we analyze a set of susceptible-exposed-infected-recovered (SEIR) models presented in Gomes et al. (2022), Aguas et al. (2021) where each of the compartments \({\mathsf {S}}\), \({\mathsf {E}}\), \({\mathsf {I}}\) and \({\mathsf {R}}\) is expanded into continuum many compartments S(x), E(x), I(x) and R(x), where \(x\in {{\mathbb {R}}}^+\) is a trait that varies among individuals. Specifically we model a situation where each individual has a level of susceptibility or exposure (connectivity) x, starting in compartment S(x) and staying within the compartments S(x), E(x), I(x) and R(x) the whole time. This individual may infect or be infected by others irrespective of their trait value x assuming random mixing (Anderson and May 1991; Diekmann et al. 2013). We will consider two types of models:

A variable susceptibility case where the susceptibility of an individual at level x is proportional to x, or, in other words, if we compare an individual at level x and an individual at level y, the one at level x is x/y times more likely to get infected than the one with susceptibility y. We may interpret this as variation in biological susceptibility which may be due to genetics, epigenetics or life history.

A variable connectivity case where the propensities for an individual at level x to acquire infection and transmit to others are both proportional to x, or, in other words, if we compare an individual at level x and an individual at level y, the one at level x is x/y times more likely to get infected than the one in level y and also x/y times more likely to infect someone else once infected. This is interpreted as individuals with many contacts being both more likely to get infected and to infect others.

For each x, we have a system of the form:

where \(\lambda \) is the force of infection which is formulated differently in the variable susceptibility or the variable connectivity cases:

$$\begin{aligned} \text{ Variable } \text{ susceptibility: }&\quad&\lambda = \beta \int I(x) \ dx, \\ \text{ Variable } \text{ connectivity: }&\quad&\lambda = \beta \int x \ I(x) \ dx. \end{aligned}$$

Note that \(\lambda \) varies with time, as it depends on the time-dependent infected population.

The dynamics of compartments S(x), E(x), I(x) and R(x) are governed by the infinite system of ordinary differential equations:

$$\begin{aligned} \frac{dS(x)}{dt}= & {} - \lambda \ x\ S(x), \end{aligned}$$
(1)
$$\begin{aligned} \frac{dE(x)}{dt}= & {} \lambda \ x\ S(x) -\delta \ E(x), \end{aligned}$$
(2)
$$\begin{aligned} \frac{dI(x)}{dt}= & {} \delta \ E(x) - \gamma \ I(x), \end{aligned}$$
(3)
$$\begin{aligned} \frac{dR(x)}{dt}= & {} \gamma \ I(x). \end{aligned}$$
(4)

We assume that the system has been scaled such that the total population is 1. The initial conditions for variables S(xt), E(xt), I(xt) and R(xt), satisfy \(S(x,0)=(1-\epsilon )\ q(x)\), \(E(x,0)=\epsilon \ q(x)\) and \(I(x,0)=R(x,0)=0\), where \(0<\epsilon \ll 1\) is a small scalar to seed the epidemic, and q(x) is a probability density function with mean 1 and coefficient of variation \(\nu \):

$$\begin{aligned} \int x q(x)\ dx = 1 \qquad&\text{ and }&\qquad \sqrt{\int (x-1)^2 q(x)\ dx} = \nu . \end{aligned}$$
(5)

We use \({{\mathsf {S}}}(t)\) to denote the integral over all susceptibility levels, S(xt), for \(x\in {{\mathbb {R}}}^+\). We thus have \({{\mathsf {S}}}(t)=\int _0^{+\infty } S(x,t) dx\). Same with \({{\mathsf {E}}}(t)\), \({{\mathsf {I}}}(t)\) and \({{\mathsf {R}}}(t)\).

We will use the first three moments of S(xt), that we denote \({{\mathsf {S}}}(t)\), \(\bar{{{\mathsf {S}}}}(t)\) and \(\overline{\overline{{\mathsf {S}}}}(t)\):

$$\begin{aligned} {{\mathsf {S}}}(t) = \int S(x,t)\ dx, \quad \quad \bar{{{\mathsf {S}}}}(t) = \int x S(x,t)\ dx \quad \text{ and }\quad \overline{\overline{{\mathsf {S}}}}(t) = \int x^2 S(x,t)\ dx. \end{aligned}$$
(6)

When infection is absent (\(\epsilon =0\)), we have \({{\mathsf {S}}}(0)=1\), \(\bar{{{\mathsf {S}}}}(0)=1\) and \(\overline{\overline{{\mathsf {S}}}}(0) = 1 + \nu ^2\). But note that S(xt) is not a probability density function for \(\epsilon >0\) as \({{\mathsf {S}}}(t)\) becomes less than 1. The quotient \(S(x,t)/{{\mathsf {S}}}(t)\) as a function of x for fixed t will be a probability density function for \(\epsilon >0\) and all t with first and second moments \(\bar{{{\mathsf {S}}}}(t)/{{\mathsf {S}}}(t)\) and \(\overline{\overline{{\mathsf {S}}}}(t)/{{\mathsf {S}}}(t)\) which decrease over time. When the initial configuration q(x) is a gamma distribution, all the distributions \(S(x,t)/{{\mathsf {S}}}(t)\) are also gamma with the same coefficient of variation \(\nu \) but with lower mean (see Appendix A), an argument which enables mathematical derivations to advance further when traits are assumed to be gamma distributed (Novozhilov 2008).

Similarly, we define the moments \({{\mathsf {R}}}(t)\), \(\bar{{{\mathsf {R}}}}(t)\) and \(\overline{\overline{{\mathsf {R}}}}(t)\) for the recovered compartment, and the same with \({\mathsf {E}}\) and \({\mathsf {I}}\). Notice for instance that \(\lambda (t)\) is written as \(\beta \ {\mathsf {I}}(t)\) and \(\beta \ \bar{{{\mathsf {I}}}}(t)\) in the variable susceptibility and variable connectivity cases, respectively.

Here we describe key epidemiological quantities when system of equations (Eqs. 14) is adopted. The basic reproduction number is the average number of secondary infections generated by an infected individual in a totally susceptible population. It depends on characteristics of both the pathogen and the host population. When this number is below 1 no epidemics are expected. When is above 1, however, the introduction of infection in a virgin population is expected to generate an epidemic. This is followed by almost exponential growth in cumulative infections which decelerates gradually as susceptibles are depleted. The effective reproduction number is a time-dependent quantity loosely defined as the number of secondary infections generated by a typical infected individual when the susceptibility of the population is as at time t. coincides with at the beginning of an epidemic (when the population is totally susceptible) but declines as individuals are removed from the susceptible pool by infection and immunity. As crosses 1 towards lower values, the epidemic subsides and future reintroductions of infection are not expected to generate new outbreaks as long as population immunity is maintained.

We derive formulas for the effective reproduction number and the herd immunity threshold \({\mathcal {H}}\) in terms of moments \({{\mathsf {S}}}\), \(\bar{{{\mathsf {S}}}}\) and \(\overline{\overline{{\mathsf {S}}}}\), of the susceptible population. When q(x) is a gamma distribution, \(\bar{{{\mathsf {S}}}}\) and \(\overline{\overline{{\mathsf {S}}}}\) can be formulated in terms of \({{\mathsf {S}}}\) and we get an exact formula for \({\mathcal {H}}\) in terms only of the basic reproduction number and the coefficient of variation \(\nu \). In this case we can also reduce the infinite system (Eqs. 14) to a finite system of ordinary differential equations in \({\mathsf {S}}\), \({\mathsf {E}}\), \({\mathsf {I}}\) and \({\mathsf {R}}\) with nonlinear transmission (exactly when the variable trait is susceptibility and approximately in the case of variable connectivity). In the case of variable connectivity, we provide an exact derivation of a finite system in the variables \(\bar{{{\mathsf {S}}}}\), \(\bar{{{\mathsf {E}}}}\), \(\bar{{{\mathsf {I}}}}\) and \(\bar{{{\mathsf {R}}}}\).

In the variable susceptibility case we will get that:

and consequently, . This implies that the population is above the herd immunity threshold when . When we assume that q(x) is a gamma distribution, the proportion of individuals that have been infected by the time the herd immunity threshold is reached is deduced as:

(7)

and system (Eqs. 14) can be reduced to:

$$\begin{aligned} \frac{d{\mathsf {S}}}{dt}= & {} - \beta \ {\mathsf {I}}\ {\mathsf {S}}^{1+\nu ^2}, \end{aligned}$$
(8)
$$\begin{aligned} \frac{d{\mathsf {E}}}{dt}= & {} \beta \ {\mathsf {I}}\ {\mathsf {S}}^{1+\nu ^2} -\delta \ {\mathsf {E}}, \end{aligned}$$
(9)
$$\begin{aligned} \frac{d{\mathsf {I}}}{dt}= & {} \delta \ {\mathsf {E}}- \gamma \ {\mathsf {I}}, \end{aligned}$$
(10)
$$\begin{aligned} \frac{d{\mathsf {R}}}{dt}= & {} \gamma \ {\mathsf {I}}. \end{aligned}$$
(11)

These equations are exact.

In the variable connectivity case we will get that:

and consequently, . This implies that the population is above the herd immunity threshold when . When we assume that q(x) is a gamma distribution, the proportion of individuals that have been infected by the time the herd immunity threshold is reached is deduced as:

(12)

In (Eqs. 2831), we derive a closed system for the variable connectivity model on the variables \(\bar{{{\mathsf {S}}}}\), \(\bar{{{\mathsf {E}}}}\), \(\bar{{{\mathsf {I}}}}\) and \(\bar{{{\mathsf {R}}}}\). But this does not directly allow the model to be fitted to incidence data provided by routine surveillance. For finding equations that determine the system only on the variables \({\mathsf {S}}\), \({\mathsf {E}}\), \({\mathsf {I}}\) and \({\mathsf {R}}\), in the variable connectivity case we need to make an approximation as detailed in Appendix B. The resulting system (Eqs. 2225) is shown to approximate the original (Eqs. 14) when the infectious period is small as in acute infectious diseases.

In Fig. 1 we provide graphical representations for the \({\mathcal {H}}\) formulas corresponding to the basic model introduced in this section (variable susceptibility in green and variable connectivity in blue), showing a monotonic decrease as the coefficient of variation \(\nu \) increases (Gomes et al. 2022).

Fig. 1
figure 1

Herd immunity threshold. Curves generated using formulas (Eq. 7) for gamma distributed susceptibility and (Eq. 12) for gamma distributed connectivity, with

In Sects. 2 and 3 we provide general derivations for effective reproduction numbers and herd immunity thresholds, while Sect. 4 is focussed on special cases when traits x are gamma distributed. Towards the end of the paper, we analyze two extensions of the basic model. In Sect. 5, we consider a model with reinfection where immunity after recovery is not fully protective but only partially. In Sect. 6, we consider a model with a carrier state, which (Gomes et al. 2022; Aguas et al. 2021) apply to the coronavirus disease (COVID-19) pandemic. There, the exposed compartments are not simply latent but a carrier state where individuals are infectious but to a lesser degree than individuals in the fully infectious compartment. In each case we derive formulas for herd immunity thresholds especially when the initial trait distribution is gamma.

2 Effective reproduction number

The effective reproduction number at time t is defined as the number of secondary infections caused by a typical infected individual over their entire infectious period in an idealized situation, as described below.

Before proceeding with its derivation we need to make two considerations. The first concerns the evolving susceptible pool. We assume that during the period in which an individual is contagious, the density of susceptibles is frozen in time. This means that we disregard the fact that since the susceptible population declines, this individual infects less at the end than at the beginning of their infection. In acute infections, the decline in the susceptible population is usually slow compared to the rate of recovery from infection so, in practice, the impact of the assumption is negligible. Moreover, when is used to analyze stable configurations, such as in the derivation of herd immunity thresholds, the assumption holds and hence has no effect on the results. This consideration pertains to both variable susceptibility and variable connectivity models.

The second concerns the infectivity profile of the infected population at time t. We define:

at time t as the average number of secondary infections generated by an individual who becomes infected at time t. This average is taken over the pool of individuals that go from \({\mathsf {S}}\) to \({\mathsf {E}}\) at time t.

When , infection is not expected to invade an infection-free population. Further details on this concept are discussed in Appendix B. In the remaining of this section we derive explicit formulas for .

First, the variable susceptibility case: Consider an individual who gets infected (more precisely, exposed and consequently infected) at time t (i.e., moves from \({\mathsf {S}}\) to \({\mathsf {E}}\) at time t). This individual will eventually move to \({\mathsf {I}}\) and spend on average \(1/\gamma \) days there. While in \({\mathsf {I}}\), the individual will infect an average of \(\beta \int y\ S(y,t)\ dy\) others per day. We thus get:

(13)

In particular, we get:

(14)

and consequently:

(15)

Second, the variable connectivity case: Consider again an individual who gets infected at time t. It now matters what trait value x this individual has because it determines how many others they will infect.

Let p(xt) be the density function measuring the probability at time t that this individual has connectivity level x. The probability of becoming infected (i.e., of entering the \({\mathsf {E}}\) compartment) is \(x \lambda (t)\). Thus, the value of p(xt) is proportional to xS(xt):

$$\begin{aligned} p(x,t)= & {} x\ \frac{S(x,t)}{\bar{{{\mathsf {S}}}}(t)}. \end{aligned}$$

As above, an individual who enters \({\mathsf {E}}\) will eventually move to \({\mathsf {I}}\), spend on average \(1/\gamma \) days there, and infect an average of \(\beta \int y\ S(y,t)\ dy\) others per day. We thus get:

(16)

In particular, we get:

(17)

and consequently:

(18)

Expressions for , such as (Eq. 14) and (Eq. 17), have been known for decades (Anderson and May 1991; Diekmann et al. 2013; Woolhouse et al. 1997). Worth highlighting, however, is that disproportionately less attention has been given to variable susceptibility than to variable connectivity due to the coefficient of variation \(\nu \) not affecting the formula explicitly in the former case but only in the latter. It will be important to realise, however, that variation in susceptibility is just as impactful when we consider quantities such as herd immunity thresholds and inferences of from observational data (Gomes et al. 2022; Aguas et al. 2021).

3 Herd immunity threshold

Suppose we have a population with no infected individuals, so that all individuals are either susceptible or recovered. The population is said to be at or above the herd immunity threshold for a pathogen if its susceptibility profile to that pathogen is such that a new introduction of infection (i.e., a small increase in \({\mathsf {E}}\) or \({\mathsf {I}}\)) does not trigger an outbreak. By inspection on the differential equations (Eqs. 14), we see that a configuration with \(E(x)=I(x)=0\) satisfies this condition if and only if

In the variable susceptibility case it is equivalent to formulate the herd immunity threshold in terms of suppression of future outbreaks (as adopted here) or in terms of an unmitigated epidemic passing its peak infection prevalence. With variable connectivity, however, this equivalence does not hold as explained in Sect. 2.

A configuration with no infected individuals is then said to be at the herd immunity threshold if and only if . In SEIR models with no individual variation, configurations with no infected individuals are determined by the values of \({\mathsf {S}}\) and \({\mathsf {R}}=1-{\mathsf {S}}\), and the herd immunity threshold is defined as the value of \(1-{\mathsf {S}}\) at the unique configuration with . This value is well-known to be equal to (Anderson and May 1991; Diekmann et al. 2013). With individual variation in susceptibility or exposure to infection, however, there are many configurations which satisfy . One such configuration is given by for all x. This could be obtained, for instance, by vaccinating a proportion of the total population randomly without taking into account susceptibility or exposure levels (Fine et al. 2011). When immunity is acquired naturally, however, individuals with higher susceptibility or exposure tend to be infected earlier and the herd immunity threshold is reached before the susceptible population is as low as of the total. For now, let us retain that for the basic heterogeneous models considered here the herd immunity threshold is reached when in the variable susceptibility case (Eq. 15) and with variable connectivity (Eq. 18).

Next, we will see how we can derive \(\bar{{{\mathsf {S}}}}(t)\) and \(\overline{\overline{{\mathsf {S}}}}(t)\) under the assumption that the initial distribution is gamma.

4 Case of the gamma distribution

Here we study how the distribution of the trait x within the susceptible compartment evolves generally, and then specify to the case where individual variation in susceptibility or connectivity is gamma distributed. We also refer to related work in Neipel et al. (2020), Novozhilov (2008).

4.1 Evolution of the susceptible compartment

From the SEIR equation for dS(xt)/dt we get:

$$\begin{aligned} \frac{1}{S(x,t)} \frac{d S(x,t)}{d t}= & {} - x \ \lambda (t). \end{aligned}$$

Integrating with respect to t we get:

$$\begin{aligned} S(x,t) = q(x) \ e^{-x\ k_t} \quad \text{ where } \quad k_t= \int _0^t \lambda (u)\ du. \end{aligned}$$
(19)

This holds in both variable susceptibility and variable connectivity models (with different values for \(k_t\)). It also holds in the cases with reinfection and with carrier state considered in Sects. 5 and 6.

4.2 Gamma distributed traits

The key observation here is that since \(S(x,t) = q(x) \ e^{-x\ k_t}\), we have that \(S(x,t)/{{\mathsf {S}}}(t)\) remains a gamma distribution at all values of t. This enables the derivations in Appendix A of explicit formulas for the moments \(\bar{{{\mathsf {S}}}}\) (Eq. A.1) and \(\overline{\overline{{\mathsf {S}}}}\) (Eq. A.2) in terms of \({{\mathsf {S}}}\) and the shape parameter \(\alpha \), when susceptibility or connectivity are gamma distributed.

Using that the coefficient of variation is \(\nu = 1/\sqrt{\alpha }\) we rewrite the respective formulas as:

$$\begin{aligned} \bar{{{\mathsf {S}}}}(t)= & {} {{\mathsf {S}}}(t)^{1+\nu ^2}, \end{aligned}$$
(20)
$$\begin{aligned} \overline{\overline{{\mathsf {S}}}}(t)= & {} (1+\nu ^2)\ {{\mathsf {S}}}(t)^{1+2 \nu ^2}. \end{aligned}$$
(21)

To derive the reduced system in the variable susceptibility case (Eqs. 811) we integrate the equations in (Eqs. 14) over the susceptibility domain and apply (Eqs. 13 and 20) to get:

$$\begin{aligned} \frac{d{\mathsf {S}}}{dt}= & {} - \beta \ {\mathsf {I}}\ \bar{{{\mathsf {S}}}}= - \beta \ {\mathsf {I}}\ {\mathsf {S}}^{1+\nu ^2}, \end{aligned}$$
(22)
$$\begin{aligned} \frac{d{\mathsf {E}}}{dt}= & {} \beta \ {\mathsf {I}}\ \bar{{{\mathsf {S}}}}-\delta \ {\mathsf {E}}= \beta \ {\mathsf {I}}\ {\mathsf {S}}^{1+\nu ^2} -\delta \ {\mathsf {E}}, \end{aligned}$$
(23)
$$\begin{aligned} \frac{d{\mathsf {I}}}{dt}= & {} \delta \ {\mathsf {E}}- \gamma \ {\mathsf {I}}, \end{aligned}$$
(24)
$$\begin{aligned} \frac{d{\mathsf {R}}}{dt}= & {} \gamma \ {\mathsf {I}}. \end{aligned}$$
(25)

This closed system in \({\mathsf {S}}\), \({\mathsf {E}}\), \({\mathsf {I}}\) and \({\mathsf {R}}\) has been used to fit epidemic curves of COVID-19 (Gomes et al. 2022; Aguas et al. 2021). Recalling that the herd immunity threshold \({\mathcal {H}}\) is \(1-{{\mathsf {S}}}(t)\) at the time t when and that (Eq. 15), we get that at the herd immunity threshold when susceptibility is gamma distributed. Hence:

(26)

In the variable connectivity case we have that (Eq. 18). Thus, when , we get when connectivity is gamma distributed. Hence:

(27)

Multiplying the original (Eqs. 14) by x, integrating over the connectivity domain and applying (Eqs. 16 and 21) we get:

$$\begin{aligned} \frac{d\bar{{{\mathsf {S}}}}}{dt}= & {} - \beta \ \bar{{{\mathsf {I}}}}\ \overline{\overline{{\mathsf {S}}}}= - (1+\nu ^2)\ \beta \ \bar{{{\mathsf {I}}}}\ \bar{{{\mathsf {S}}}}^\frac{1+2\nu ^2}{1+\nu ^2} , \end{aligned}$$
(28)
$$\begin{aligned} \frac{d\bar{{{\mathsf {E}}}}}{dt}= & {} \beta \ \bar{{{\mathsf {I}}}}\ \overline{\overline{{\mathsf {S}}}}-\delta \ \bar{{{\mathsf {E}}}}= (1+\nu ^2)\ \beta \ \bar{{{\mathsf {I}}}}\ \bar{{{\mathsf {S}}}}^\frac{1+2\nu ^2}{1+\nu ^2} -\delta \ \bar{{{\mathsf {E}}}}, \end{aligned}$$
(29)
$$\begin{aligned} \frac{d\bar{{{\mathsf {I}}}}}{dt}= & {} \delta \ \bar{{{\mathsf {E}}}}- \gamma \ \bar{{{\mathsf {I}}}}, \end{aligned}$$
(30)
$$\begin{aligned} \frac{d\bar{{{\mathsf {R}}}}}{dt}= & {} \gamma \ \bar{{{\mathsf {I}}}}. \end{aligned}$$
(31)

Mathematically this is a tractable closed system in \(\bar{{{\mathsf {S}}}}\), \(\bar{{{\mathsf {E}}}}\), \(\bar{{{\mathsf {I}}}}\) and \(\bar{{{\mathsf {R}}}}\). However, these variables are not convenient for practical data fitting and parameter estimation. In Appendix B we propose an approximation in the variables \({\mathsf {S}}\), \({\mathsf {E}}\), \({\mathsf {I}}\) and \({\mathsf {R}}\).

5 Model with reinfection

Here we consider an extension of the model considering that immunity after recovery is not fully protective, but only partially. A factor \(\sigma \), with \(0\le \sigma \le 1\), is added to represent the quotient of the probability of getting reinfected after recovery over the probability of getting infected while fully susceptible.

The model is represented diagrammatically as:

with \(\lambda \) as in the basic models (without reinfection) studied above. The extended model is given by the equations:

$$\begin{aligned} \frac{d S(x)}{dt}= & {} - \lambda \ x\ S(x), \end{aligned}$$
(32)
$$\begin{aligned} \frac{d E(x)}{dt}= & {} \lambda \ x\ (S(x)+\sigma \ R(x)) -\delta \ E(x), \end{aligned}$$
(33)
$$\begin{aligned} \frac{d I(x)}{dt}= & {} \delta \ E(x) - \gamma \ I(x), \end{aligned}$$
(34)
$$\begin{aligned} \frac{d R(x)}{dt}= & {} \gamma \ I(x) - \sigma \ \lambda \ x\ R(x). \end{aligned}$$
(35)

The system exhibits newer dynamics in comparison with the basic case. Depending on whether \(\sigma \) is below or above (known as the reinfection threshold (Gomes et al. 2004, 2016)) we get that either the disease dies out after a while and a certain proportion of the population never gets infected, or continues endemically and every individual is eventually infected and reinfected repeatedly.

5.1 Effective reproduction number

The basic reproduction number is calculated exactly, as in the absence of reinfection, but the effective reproduction now depends not only on the distribution of S(xt), but also on the distribution of R(xt). When we consider configurations with no infected individuals, we will have that \(R(x,t)=q(x)-S(x,t)\) and will be able to express in terms of S(xt) only.

The formulas for the effective reproduction number at time t are:

  • in the variable susceptibility case, and

  • in the variable connectivity case.

The derivation of these formulas is essentially as the derivations in Sect. 2, with two differences. First, each individual with trait value x infects \(\beta (\bar{{{\mathsf {S}}}}(t)+ \sigma \bar{{{\mathsf {R}}}}(t))\) or \( \beta x(\bar{{{\mathsf {S}}}}(t)+ \sigma \bar{{{\mathsf {R}}}}(t))\) others per day spent in \({\mathsf {I}}\), in the respective cases, instead of \(\beta \bar{{{\mathsf {S}}}}(t)\) or \(\beta x\bar{{{\mathsf {S}}}}(t)\). Second, when we consider an individual that gets infected in the variable connectivity case, the probability that this individual has trait value x is proportional to \(x(S(x,t)+\sigma R(x,t))\) instead of xS(xt).

5.2 Herd immunity threshold

Recall that a configuration with no infected individuals is at or above the herd immunity threshold if and only if Assuming that no one is infected, that is \(R(x)=q(x)-S(x)\), we get \(\bar{{{\mathsf {R}}}}=1-\bar{{{\mathsf {S}}}}\) and \(\overline{\overline{{\mathsf {R}}}}=\overline{\overline{{\mathsf {q}}}}-\overline{\overline{{\mathsf {S}}}}\), where \(\overline{\overline{{\mathsf {q}}}}=\int x^2 q(x)\ dx = 1+\nu ^2\). We can then understand the configurations at the herd immunity threshold in terms of \(\bar{{{\mathsf {S}}}}\) and \(\overline{\overline{{\mathsf {S}}}}\).

In the variable susceptibility case a configuration with no infected individuals is at the herd immunity threshold if and only if , and hence:

In the variable connectivity case a configuration with no infected individuals is at the herd immunity threshold if and only if , and hence:

5.3 Reinfection threshold

The formulas above require:

That is, the reinfection factor \(\sigma \) has to be below , a critical value known as the reinfection threshold (Gomes et al. 2004, 2016). If this is verified, then all configurations with no infected individuals and satisfying the conditions above (either or depending on the case) are herd immunity threshold configurations in the sense that when susceptibility is at that level or lower, infection reintroductions will not trigger new outbreaks as will not increase above 1.

If the reinfection factor is so high that the population is above the reinfection threshold, will be greater than 1 in any such configurations, so there will not be any configuration with no infected individuals which is at the herd immunity threshold. This implies that there will always be a portion of the population infected, and hence that the population of susceptible individuals will eventually be completely depleted. The infection becomes endemic. The equilibrium configuration will now have \(S(x)=0\) for all x. In these situations the level of endemicity will depend on how much resistance the population is able to mount and maintain.

5.4 Case of the gamma distribution

Recall from Sect. 3 that, when the initial distribution q(x) is gamma, we get that \(S(x,t)/{{\mathsf {S}}}(t)\) remains a gamma distribution for all t, and that \(\bar{{{\mathsf {S}}}}(t)={{\mathsf {S}}}(t)^{1+\nu ^2}\) and \(\overline{\overline{{\mathsf {S}}}}(t)=(1+\nu ^2)\ {{\mathsf {S}}}(t)^{1+2\nu ^2}\). We can then obtain the values of \({{\mathsf {S}}}(t)\) at the time when the herd immunity threshold is reached, and then calculate \({\mathcal {H}}\) as \(1-{{\mathsf {S}}}(t)\) for that particular t.

In the variable susceptibility case we get:

(36)

In the variable connectivity case we get:

(37)

Curves assuming a selection of values for \(\sigma \) are represented graphically in Fig. 2. Note the critical behaviour at the reinfection threshold () in red, which separates the regime where individual immunity is sufficiently potent for the herd immunity threshold to be achievable from the regime where endemicity will establish.

Fig. 2
figure 2

Herd immunity threshold with reinfection. Curves correspond to the SEIR model with reinfection (Eqs. 3235) assuming and gamma distributed susceptibility (left, Eq. 36) or connectivity (right, Eq. 37). Efficacy of naturally acquired immunity is captured by a reinfection parameter \(\sigma \), potentially ranging between \(\sigma =0\) (100% efficacy) and \(\sigma =1\) (0 efficacy). Five values of the reinfection parameter are depicted: \(\sigma =0\) (black); \(\sigma =0.1\) (green); \(\sigma =0.2\) (blue); \(\sigma =0.3\) (magenta); and (red). Above (reinfection threshold Gomes et al. 2004, 2016) the infection becomes stably endemic and there is no herd immunity threshold (colour figure online)

6 Model with a carrier state

In applications to the COVID-19 pandemic (Gomes et al. 2022; Aguas et al. 2021) the exposed compartments are not simply a latent state but a carrier state where individuals are infectious but to a lesser degree than those in the fully infectious compartment. Building on the reinfection model (Eqs. 3232) we now introduce parameter \(\rho \le 1\) to denote the ratio of infectiousness between exposed and fully infectious individuals. What changes is the force of infection \(\lambda \):

$$\begin{aligned} \text{ Variable } \text{ susceptibility: }&\quad&\lambda (t) = \beta \int \rho E(x,t)+ I(x,t)\ dx = \beta \ (\rho {\mathsf {E}}(t)+ {\mathsf {I}}(t)), \\ \text{ Variable } \text{ connectivity: }&\quad&\lambda (t) = \beta \int x (\rho E(x,t)+ I(x,t))\ dx = \beta \ (\rho \bar{{{\mathsf {E}}}}(t)+ \bar{{{\mathsf {I}}}}(t)). \end{aligned}$$

The basic and effective reproduction numbers become:

The calculation of the effective reproduction number is slightly different. The difference is that now we have to add the time an individual is incubating the infection in \({\mathsf {E}}\) to the infectious period, multiplied by the factor \(\rho \). Since the average time an individual spends in \({\mathsf {E}}\) is \(1/\delta \) we get:

  • in the variable susceptibility case, and

  • in the variable connectivity case,

where \(\bar{{{\mathsf {S}}}}(t)\) and \(\overline{\overline{{\mathsf {S}}}}(t)\) are the moments of S(xt) defined above, and the same with R(xt).

In particular, we get and , respectively, and and . Finally, we obtain the same formulas for the herd immunity threshold in terms of and \(\nu \) (Sect. 3, without reinfection) or more generally , \(\nu \) and \(\sigma \) (Sect. 5, with reinfection).

7 Discussion

After completion of this work, similar ideas for capturing individual variation with mean-field epidemic models were elaborated (Britton et al. 2020; Di Lauro et al. 2021; Kawagoe et al. 2021; Neipel et al. 2020; Rose et al. 2021; Tkachenko et al. 2021). These recent developments were largely prompted by the COVID-19 pandemic. In agreement with our results, Rose et al. (2021), Neipel et al. (2020) find that when susceptibility is initially gamma distributed it remains so through the course of the epidemic, leading naturally to power-law behaviour in the force of infection (Novozhilov 2008). In addition the authors show that other initial distributions converge towards gamma through the process of contagion. Other authors (Kawagoe et al. 2021) derive epidemic final sizes assuming alternative distributions of susceptibility, considering in addition that infectivity may exhibit some correlation with susceptibility (such as in the variable connectivity models analyzed here). They compare numerical results for gamma and lognormal distributions with those obtained using an empirical distribution of individual contacts derived from cell phone geolocation data. Focussing on variable connectivity (Britton et al. 2020; Di Lauro et al. 2021) address herd immunity thresholds using age-structured compartmental models. Additionally (Di Lauro et al. 2021) consider a variety of non-pharmaceutical intervention scenarios, emphasizing subtle results when interventions change the contact network. Another group of authors (Tkachenko et al. 2021) distinguishes between persistent and transient individual variation to highlight that only the former is subject to the kind of selection that lowers epidemic final sizes and herd immunity thresholds.

Despite being prompted by COVID-19 none of the above references attempted to quantify the individual variation which was under selection by the force of infection and hence contributed to lower epidemic final sizes and herd immunity thresholds. This is done in our associated work (Gomes et al. 2022; Aguas et al. 2021), where the problem is inverted and selectable variation is inferred from its effects on epidemic patterns. Although we and others had previous adopted similar approaches to other infectious diseases (Finkenstadt and Grenfell 2000; Smith et al. 2005; Bellan et al. 2015; Corder et al. 2020) the work has been processed more cautiously during the pandemic due to the greater implications that estimating lower herd immunity thresholds and epidemic final sizes might have for policies and behaviors in this context.

The concept of herd immunity was originally developed in the context of vaccination programs (Gonçalves 2008; Fine et al. 2011). Defining the percentage of the population that must be immunised to cause infection prevalence to decline, the concept has provided a useful target for vaccination coverage. In hypothetical scenarios of vaccines delivered at random and individuals mixing at random, the herd immunity threshold is given by the simple expression () when immunity is fully protective. Concretely, for between 2.5 and 5, this would indicate that 60-80% random subjects would need be immunized to prevent spread of infection. This formula would not apply, however, if vaccination programmes were designed to prioritize more connected individuals, for instance (Elbasha and Gumel 2021). Similarly, it does not apply when immunity is acquired in response to natural infection, which does not occur at random. Individuals who are more susceptible or more exposed to infection are prone to be infected and become immune earlier. As a result, earlier episodes contribute disproportionately to herd immunity as they remove highly susceptible and exposed subjects from the susceptible pool (Ferrari et al. 2006; Britton et al. 2020; Kawagoe et al. 2021; Neipel et al. 2020; Rose et al. 2021; Gomes et al. 2022; Aguas et al. 2021). In our basic models, the herd immunity threshold becomes in the case of gamma distributed susceptibility, and with gamma distributed connectivity (exposure), which decline sharply when coefficients of variation (\(\nu \)) increase from 0 to 2, remaining below 20% for more variable populations in a particular illustration where (Fig. 1). The magnitude of the decline depends on what property is heterogeneous and how it is distributed among individuals, but the downward trend is robust provided that acquired immunity is efficacious enough to keep transmission below the reinfection threshold (Fig. 2) (, where \(\sigma \) is the susceptibility of individuals who have recovered relative to their respective susceptibility prior to infection) (Gomes et al. 2004, 2016). In our reinfection models, herd immunity thresholds are derived as in the case of gamma distributed susceptibility, and with gamma distributed connectivity, when . If immunity is not potent enough to keep the system below the reinfection threshold then a herd immunity threshold is not attainable and the disease persists in stable endemicity, irrespective of individual variation.

Finally, we stress that the herd immunity threshold (\({\mathcal {H}}\)) is a theoretical framework to assess epidemic potential to the same extent that is a theoretical framework. Their interdependence shows that if increases due to evolution of the infectious agent, for example, so does \({\mathcal {H}}\). Also, if new susceptibles enter the population through birth or other processes, or if immunity wanes or is evaded by pathogen lineages, a previously acquired herd immunity status may be lost leaving the population prone to subsequent outbreaks. Furthermore, if transmission has a marked seasonal pattern the same level of immunity may place the population above threshold in low season and below threshold in high season, in a cyclical manner. Although \({\mathcal {H}}\) is not as immediately applicable as often implied, it is a more informative measure of epidemic potential than given that it accounts for variation in susceptibility or exposure (\(\nu \)) in addition to average transmissibility . The more accurately we know \({\mathcal {H}}\) the better we can assess trade-offs and inform public health policy.