1 Introduction

The trajectory of an epidemic can be dramatically changed by the implementation of a vaccination program, as has been shown in the case of COVID-19 (Bloom et al. 2021). These vaccination programs are most effective when they target specific groups in a population (Fitzpatrick and Galvani 2021), although the optimal targeting strategy is dependent on the properties of the disease and vaccine (Moore et al. 2021). Thus, it is important to have robust methods to determine the optimal strategy whenever a new epidemic emerges.

In recent years, the epidemiological literature has grown rapidly, and a wide range of models have been developed and analysed. These include branching-process models (Pakkanen et al. 2021); network-based models (Bedson et al. 2021); and machine-learning-based models (Muhammad et al. 2021), among many others (Brauer et al. 2019).

However, despite these innovations, compartmental models, where the population is split into a number of subgroups and disease transmission is modelled by a system of differential equations (Abou-Ismail 2020), remain a popular choice for epidemiologists and have been widely used for modelling the COVID-19 pandemic (Kong et al. 2022). As discussed in (Kong et al. 2022), a number of different compartment structures have been used, while many authors have also sought to model the effect of government interventions and quarantining procedures (Vardavas et al. 2021; de Camino-Beck 2020; Adhikari et al. 2020).

One such compartmental model that is widely used (Ram and Schaposnik 2021; Acemoglu et al. 2021; Kuniya 2019) is the multi-group SIR (susceptible–infected–recovered) model. This is an extension of the classical SIR model (Kermack and McKendrick 1927) and has been used to model a range of diseases such as measles (Sattenspiel and Dietz 1995), influenza (Brauer 2008) and COVID-19 (Ellison 2020). It provides a general framework with which to assess the effectiveness of different vaccination policies, while also remaining mathematically tractable, allowing theorems about its behaviour to be rigorously proved (Penn and Donnelly 2022). It splits a population up into a number of interconnected subgroups (such as age groups (Longini Jr et al. 1978) and captures the different transmission dynamics between each group. This construction highlights the dual benefit that vaccination can have—vaccines that are infection-reducing directly protect the individuals that are vaccinated while transmission-reducing vaccines can also indirectly protect unvaccinated individuals (Eichner et al. 2017).

This dual benefit can significantly complicate the optimal vaccination problem when there is a negative correlation between the infectiousness of a group and the vulnerability of its members to the disease. Examples of this occur when the population is divided by age for diseases such as COVID-19 (Miura et al. 2021) and seasonal influenza (Molinari et al. 2007). In such cases, the optimal strategy may not be obvious and could be highly dependent on uncertain parameters (Saadi et al. 2021), while the seemingly intuitive solution may be significantly sub-optimal (Delmas et al. 2021). Moreover, the complicated methods used to find the optimal solution, involving solving the adjoint equations derived via Pontryagin’s maximum principle (Boutayeb et al. 2021; Lee et al. 2012), mean that the optimal solution may be difficult to understand or qualitatively justify to policy-makers.

When attempting to understand a complicated problem such as finding the optimal vaccination policy, it is often helpful to look at cases with extreme parameter values via asymptotic analysis, which helps the problem to be analytically solvable (at least to leading order). This can help from general principles for optimal vaccination policies. These principles can then be used both to form heuristics for finding the true optimal policy in a more general setting and also to explain the resultant optimal solution, as it is often comprised of a mixture of policies resulting from these principles.

There have been a number of recent papers that have used asymptotic analysis to derive general principles. Gavish and Katriel (2022) discusses a model with high reproduction numbers and shows that in this case, it is often optimal to vaccinate the less infectious groups in a population. Moreover, Rao and Brandeau (2021), building on the work of Zaric and Brandeau (2001), linearises the model equations and derives a simple knapsack problem, although the solution to this problem is only optimal when considering the short-term evolution of the epidemic. Other special cases are investigated in Duijzer et al. (2018) (which looks at a population with disconnected subgroups) and Duijzer et al. (2016) (which examines the critical vaccination fraction for a population with separable mixing).

Two cases will be considered in this paper, which both provide novel contributions to the literature. Firstly, the case of a population with a small vulnerable subgroup will be analysed, and it will be shown that, in the asymptotic limit (as the size of this population group tends to zero and its vulnerability tends to infinity), any vaccination policy is eventually outperformed by one where this group is vaccinated first. Of course, the concept that vaccinating vulnerable groups is important has been raised in many previous papers, such as Moore et al. (2021) and Dushoff et al. (2007), but the mathematically rigorous asymptotics presented here provide new evidence for the importance of this principle.

The second case to be discussed is that of a small total vaccination supply. The key novel result that will be shown is that (to leading order) the optimal vaccination problem reduces to a linear knapsack problem, which can be easily solved. This knapsack problem differs from the one in Rao and Brandeau (2021) because, by linearising the final size equations rather than the model ODEs (ordinary differential equations), the optimal solutions and predictions of their behaviour are valid for the full evolution of the epidemic, rather than just in the short term. Again, the case of a small vaccine supply has been examined in many papers such as Shim (2021, 2011) and Medlock and Meyers (2009), but these papers have simply analysed the optimisation problem in the standard way, without deriving the explicit leading-order solution as is done in this paper.

In order to prove these results, it is necessary to build on previous literature. A number of results from Penn and Donnelly (2022) (found in Appendix D) are used in the course of the proof alongside some well-established results, such as the final size of an epidemic in SIR-type models (Anderson and May 1992). However, the theorems presented in the main text are completely novel, with their proofs requiring a significant extension of the current literature. In particular, the various propositions in the proofs (found in Appendices A–C) are, to the best of the authors’ knowledge, new to the literature. Some of these results, such as, for example, the proof that epidemic final size is continuously dependent on initial conditions and the vaccination policy found in Proposition 5 may also be helpful to those seeking to prove similar results.

The main analytic results will be further investigated through examples, and, in particular, the small supply case will be used to show that it is not always optimal to vaccinate the most infectious group, even when all groups are equally vulnerable. The UK population’s age structure will be used to relate these results to a realistic example, and optimal small-supply vaccination policies will be approximated for diseases with different age-dependent case fatality ratios.

The paper is structured as follows: Firstly, the multi-group SIR model will be introduced. Then, analytic results will be presented in the case of a small vulnerable subgroup, which will be explored through numerical examples. Finally, analytic results related to a small vaccination supply will be presented and again, examples will be used to illustrate the findings.

2 Modelling

2.1 Disease Transmission and Vaccination Model

The model used in this paper is identical to the model presented in Penn and Donnelly (2022), and this section is simply a summary of the modelling section in Penn and Donnelly (2022). The population is divided into n subgroups, and each subgroup i is further divided into six compartments:

$$\begin{aligned} S_i :=&\text {Number of people that are in group}\ i, \text {are susceptible, and are unvaccinated} \end{aligned}$$
(1)
$$\begin{aligned} I_i:=&\text {Number of people that are in group}\ i, \text {are currently infected, and}\nonumber \\&\text {were infected while unvaccinated }\end{aligned}$$
(2)
$$\begin{aligned} R_i:=&\text {Number of people that are in group}\ i, \text {are recovered, and}\nonumber \\&\text {were infected while unvaccinated} \end{aligned}$$
(3)
$$\begin{aligned} S^V_i :=&\text {Number of people that are in group}\ i, \text {are susceptible and are vaccinated} \end{aligned}$$
(4)
$$\begin{aligned} I^V_i :=&\text {Number of people that are in group}\ i, \text {are infected}\nonumber \\&\text {and were infected after being vaccinated}\end{aligned}$$
(5)
$$\begin{aligned} R^V_i :=&\text {Number of people that are in group}\ i, \text {are recovered and were infected} \nonumber \\&\text {after being vaccinated.} \end{aligned}$$
(6)

Using SIR principles, the model becomes

$$\begin{aligned} \frac{{\textrm{d}}S_i}{{\textrm{d}}t}&= -\sum _{j=1}^n(\beta ^1_{ij}I_j+ \beta ^2_{ij} I^V_j)S_i - \frac{U_i(t) S_i}{N_i-W_i(t)} \end{aligned}$$
(7)
$$\begin{aligned} \frac{{\textrm{d}}I_i}{{\textrm{d}}t}&= \sum _{j=1}^n(\beta ^1_{ij} I_j+ \beta ^2_{ij} I^V_j)S_i - \mu ^1_i I_i\end{aligned}$$
(8)
$$\begin{aligned} \frac{{\textrm{d}}R_i}{{\textrm{d}}t}&= \mu ^1_i I_i\end{aligned}$$
(9)
$$\begin{aligned} \frac{{\textrm{d}}S^V_i}{{\textrm{d}}t}&= -\sum _{j=1}^n(\beta ^3_{ij}I_j + \beta _{ij}^4 I^V_j)S^V_i + \frac{U_i(t) S_i}{N_i-W_i(t)}\end{aligned}$$
(10)
$$\begin{aligned} \frac{{\textrm{d}}I^V_i}{{\textrm{d}}t}&= \sum _{j=1}^n(\beta ^3_{ij}I_j + \beta _{ij}^4I^V_j)S^V_i -\mu ^2_i I^V_i \end{aligned}$$
(11)
$$\begin{aligned} \frac{{\textrm{d}}R^V_i}{{\textrm{d}}t}&= \mu ^2_i I^V_i \end{aligned}$$
(12)

where

$$\begin{aligned} W_i(t) := \int _0^t U_i(s){\textrm{d}}s, \end{aligned}$$
(13)

and

$$\begin{aligned} N_i = S_i(t) + I_i(t) + R_i(t) + S^V_i(t) + I^V_i(t) + R^V_i(t) \end{aligned}$$
(14)

is the size of group i. Moreover, the \(\beta ^{\alpha }_{ij}\) terms represent transmission from group j to group i and the \(\mu _i^{\alpha }\) terms give the infectious period of the relevant individuals in group i.

Here, \(U_i(t){\textrm{d}}t\) gives the number of individuals in group i that are vaccinated in the small time interval \([t,t+{\textrm{d}}t]\), and hence, \(W_i(t)\) is the number of individuals that have been vaccinated in group i in [0, t]. It is assumed that these vaccinations are assigned randomly to the unvaccinated members of group i, so that each vaccine is given to a susceptible member of group i with probability

$$\begin{aligned} \frac{\text {number of susceptible members}}{\text {number of unvaccinated members}} = \frac{S_i}{N_i - W_i(t)} \end{aligned}$$
(15)

Thus, the total rate of susceptibles being vaccinated is \(\frac{U_i(t) S_i}{N_i-W_i(t)}\).

Note that there is a slight difference between this model and the one commonly found in the literature (in Hansen and Day 2011; Zaman et al. 2008; Kar and Batabyal 2011 among many others) which set the vaccination term equal to \(S_iU_i(t)\) instead of \(\frac{U_i(t) S_i}{N_i-W_i(t)}\). As discussed in Penn and Donnelly (2022), this corresponds to vaccines that are randomly distributed to the whole population, which can be seen by rewriting the vaccination term as:

$$\begin{aligned} S_iU_i(t){\textrm{d}}t = \frac{S_i}{N_i} \times N_iU_i(t){\textrm{d}}t \end{aligned}$$
(16)

The first term on the right-hand side is then the probability of a randomly chosen member of group i being susceptible, while the second term is the total number of vaccines assigned in a small time interval \([t,t+{\textrm{d}}t]\), noting that here the dimension of \(U_i(t)\) is 1/time (compared to the model used in this paper where the dimension of \(U_i(t)\) is population/time), and hence, it is necessary to scale by \(N_i{\textrm{d}}t\) to convert \(U_i(t)\) into a number of vaccines.

This is in contrast to the model in this paper which corresponds to vaccines that are randomly distributed only to the unvaccinated population. Penn and Donnelly (2022) provides justification for the use of this “unvaccinated-only model”, which is therefore the one that will be used in this paper. However, they are structurally very similar, and so it would be possible to apply the results in this paper to the more commonly found model.

To deal with the (removable) singularity that can occur when \(W_i = N_i\), it is assumed that

$$\begin{aligned} W_i(t)\le N_i \quad \forall t \ge 0 \quad \text {and} \quad W_i(t) = N_i \Rightarrow \frac{U_i(t) S_i}{N_i-W_i(t)} = 0 \end{aligned}$$
(17)

To capture the benefits of vaccination, there are additional constraints put on the \(\beta _{ij}^{\alpha }\) and \(\mu _j^{\alpha }\) terms which are

$$\begin{aligned} \beta ^1_{ij} \ge \beta _{ij}^2, \beta _{ij}^3 \ge \beta _{ij}^4 \quad \text {and} \quad \mu _i^1 \le \mu _i^2. \end{aligned}$$
(18)

Finally, it will be assumed throughout the remainder of this paper that the population sizes are normalised so that

$$\begin{aligned} \sum _{i=1}^n N_i = 1 \end{aligned}$$
(19)

Further details are given in Penn and Donnelly (2022).

2.2 Optimisation Problem

The optimal vaccination problem considered in this paper aims to find the vaccination policy, \({\varvec{U}}\), which minimises a weighted sum of the total number of infections in each group. Thus, the problem is:

$$\begin{aligned} \min \bigg \{\sum _{i=1}^np_i\bigg (R_i(\infty ) + \kappa _i R^V_i(\infty )\bigg ) :&\sum _{i=1}^nU_i(t) \le A(t),\quad \sum _{i=1}^n W_i(t) \le B(t),\nonumber \\&U_i(t)\ge 0, \quad W_i(t)\le N_i \quad \forall t \ge 0\bigg \}. \end{aligned}$$
(20)

Here, A(t) represents the maximal vaccination rate, B(t) represents the maximal vaccine supply and \(R_i(\infty )\) and \(R_i^V(\infty )\) are the limiting values of \(R_i(t)\) and \(R_i^V(t)\) as \(t \rightarrow \infty \). The weights \(p_i\) and \(p_i\kappa _i\) could be interpreted in a number of ways, depending on the quantity of interest. For example, \(p_i = \kappa _i = 1\) if one wanted to minimise infections, or \(p_i\) and \(p_i\kappa _i\) could be the case fatality ratio of unvaccinated and vaccinated members of group i, respectively, if one wanted to minimise deaths. However, it is important to note that \(\kappa _i \le 1\) for each i as vaccinated members of the population should be no more vulnerable to the disease that their unvaccinated counterparts.

It is helpful to define \(H({\varvec{U}})\) to be the objective function—that is

$$\begin{aligned} H({\varvec{U}}) = \sum _{i=1}^np_i\bigg (R_i(\infty ) + \kappa _i R^V_i(\infty )\bigg ), \end{aligned}$$
(21)

where \(R_i\) and \(R^V_i\) are found from solving the model equations with vaccination policy given by \({\varvec{U}}\).

It will be assumed throughout this paper that all “feasible” \({\varvec{U}}\) are sufficiently smooth for all the quoted theorems to hold. In general, this does not significantly restrict \({\varvec{U}}\)—for example, the results in Penn and Donnelly (2022) simply require that each \(U_i(t)\) is bounded and Lebesgue integrable, while Theorems 1 and 2 require only that \({\varvec{U}}\) has finite support. Moreover, it is assumed that B(t) is non-decreasing (as total supply should not decrease over time) and piecewise differentiable.

3 Results

3.1 A Small, Vulnerable Subgroup

Consider the case where one of the groups in the population (which, without loss of generality, will be assumed to be group 1) is very small and vulnerable. That is, the population \(N_1\) satisfies

$$\begin{aligned} N_1(\epsilon ) =\epsilon<< 1 \end{aligned}$$
(22)

while the weights satisfy

$$\begin{aligned} p_1(\epsilon ) = p_1 \quad \text {and} \quad p_i(\epsilon ) = p^*_i\epsilon \quad \forall i \ne 1 \end{aligned}$$
(23)

for some constants \(p_1\) and \(p_i^*\). It will be assumed that all \(\kappa _i\) are constant. In this setting, group 1 contains a very small proportion of the population, but each member of group 1 is much more vulnerable than the rest of the population.

Thus, this case is practically valid when there is a small subsection of the population that carries the majority of the vulnerability to a disease. As will be discussed further in Section 3.2.3, this has applicability to diseases such as COVID-19, where the majority of the deaths occur significantly older people, while it could also apply to diseases where there are rare conditions that cause a minority of people to be much more vulnerable.

It is mathematically convenient to rescale the parameters \(p_i\) so that only \(p_1\) depends on \(\epsilon \). This can be done by multiplying all the \(p_i\) terms by \(\frac{1}{p_1\epsilon }\) so that

$$\begin{aligned} {\tilde{p}}_1(\epsilon ) = \frac{1}{\epsilon } \quad \text {and} \quad {\tilde{p}}_i(\epsilon ) = \frac{p^*_i}{p_1} := {\tilde{p}}_i \quad \forall i \ne 1. \end{aligned}$$
(24)

This leads to an equivalent optimisation problem in the sense that the optimal vaccination policy will be the same. This occurs because the only change to the objective function is a scalar multiplication of \(\frac{1}{p_1\epsilon }\) to each of the terms. Note that while this multiplicative factor tends to infinity as \(\epsilon \) tends to 0, this system is only analysed for nonzero values of \(\epsilon \), and hence, this rescaling is valid.

3.1.1 Analytic Results

The first result presented in this section shows that, in the limit of a group with small size and large vulnerability (with the total cost of the whole group being infected, \(N_1{\tilde{p}}_1\), remaining constant) any fixed vaccination policy where the vulnerable group is not vaccinated first will eventually (that is, for sufficiently small \(\epsilon \)) be outperformed by a similar policy where the vulnerable group is vaccinated first.

Group 1 will be given a population size \(N_1 = \epsilon \) and an infection cost \({\tilde{p}}_1 = \frac{1}{\epsilon }\) (recall that the \({\tilde{p}}_i\) represent the rescaled values of \(p_i\), and so it is acceptable that \({\tilde{p}}_1 > 1\) for small \(\epsilon \)). It will be assumed that the initial conditions in the group are proportional to \(\epsilon \), so that there exists some \(\sigma \in (0,1]\) such that the initial susceptible population is \(\sigma \epsilon \) and the initial infected population is \((1-\sigma )\epsilon \).

Before stating the full theorem, it is helpful to explain the various constraints and variables that will be introduced. Define, for each value of \(\epsilon \ge 0\), \({\varvec{U}}(t;\epsilon )\) to be the “fixed” vaccination policy where group 1 is not vaccinated first. Of course, the vaccination policy cannot be completely fixed, as the size, \(\epsilon \), of group 1 is decreasing, and so it will simply be assumed that the vaccines given out to each group satisfies

$$\begin{aligned} |W_i(t;\epsilon ) - W_i(t;0) |< \epsilon \quad \forall t \ge 0 \quad \text {and} \quad \forall i \in \{1,\ldots ,n\} \end{aligned}$$
(25)

Note that all groups are allowed to have small changes in the number of vaccinations they receive—this allows, for example, for vaccinations that would have been given to group 1 being reassigned as group 1’s population shrinks.

Moreover, to reduce the lengths of the proofs, it will be assumed that \({\varvec{U}}\) has uniformly bounded finite support—that is, there is some constant \(t_U\) such that for each \(i \in \{1,\ldots ,n\}\),

$$\begin{aligned} t > t_U \Rightarrow U_i(t;\epsilon ) = 0 \quad \forall t,\epsilon \ge 0 \end{aligned}$$
(26)

In order for group 1 to not be vaccinated first in the limit as \(\epsilon \rightarrow 0\), there must be some time \(\tau \) at which some fixed proportion w of the other groups have been vaccinated, while at least some fixed proportion \((1-\alpha )\) of group 1 has not been vaccinated. That is,

$$\begin{aligned} W_1(\tau ;\epsilon ) < \alpha \epsilon \quad \text {and} \quad \sum _{i=1}^n W_i(\tau ;\epsilon ) > w. \end{aligned}$$
(27)

One can also define a vaccination policy \(\tilde{{\varvec{U}}}(t;\epsilon )\) where group 1 is vaccinated first. This will be done by re-directing all vaccinations from the \({\varvec{U}}(t;\epsilon )\) policy to group 1 until it is fully vaccinated, and keeping the same vaccination policy after group 1 is fully vaccinated (ignoring any vaccines that \({\varvec{U}}(t;\epsilon )\) assigns to group 1 after this time).

To ensure convergence of the model at \(\epsilon = 0\), given \(\Pi (\epsilon )\) defined by

$$\begin{aligned} \Pi (\epsilon ) := \bigg \{ i : \exists t \ge 0 \quad \text {s.t.} \quad I_i(t;\epsilon ) > 0\bigg \}, \end{aligned}$$
(28)

it will be assumed that \(\Pi (\epsilon ) = \{1,\ldots ,n\}\) for all \(\epsilon > 0\) (as any groups which never suffer any infections can be ignored) and that \(\Pi (0) = \{2,\ldots ,n\}\). While this second condition may not be strictly necessary for the theorem to hold, it is unrestrictive and ensures convergence—if this were not the case, then it would be possible that infection in some set of groups were seeded only by group 1. Thus, when \(\epsilon = 0\), these groups would suffer no infections, while for any \(\epsilon > 0\), they would have an epidemic of size independent (at leading order) of \(\epsilon \).

The final condition on the model is that the people in group 1 can be infected by other groups and that vaccinated members of group 1 gain protection from this infection. That is, there is some \(i \in \{1,\ldots ,n\}\) such that

$$\begin{aligned} \beta ^1_{1i} > \beta _{1i}^3 \ge 0. \end{aligned}$$
(29)

This is an important condition, as if people group 1 could only be infected by other members of group 1 the total number of infections in group 1 would decay as \(\epsilon \rightarrow 0\), meaning that it would no longer necessarily be optimal to vaccinate group 1 first (as most people in group 1 would not catch the disease anyway for small \(\epsilon \)).

With these considerations, Theorem 1 can now be stated.

Theorem 1

Suppose that for all \(\epsilon > 0\),

$$\begin{aligned} N_1(\epsilon ) = \epsilon , \quad S_1(0;\epsilon ) = \epsilon \sigma , \quad I_1(0;\epsilon ) = (1-\sigma )\epsilon \quad \text {and} \quad {\tilde{p}}_1(\epsilon ) = \frac{1}{\epsilon } \end{aligned}$$
(30)

for some \(\sigma \in (0,1)\) and that all other parameter values and initial conditions are independent of \(\epsilon \).

Consider any vaccination policy with uniformly bounded finite support given by \({\varvec{U}}(t;\epsilon )\) and suppose that there exists fixed \(\alpha ,\tau , w > 0\) such that

$$\begin{aligned} W_1(\tau ;\epsilon ) < \alpha \epsilon \quad \text {and} \quad \sum _{i=1}^n W_i(\tau ;\epsilon )> w \quad \forall \epsilon >0. \end{aligned}$$
(31)

Define a new policy, \(\tilde{{\varvec{U}}}(t;\epsilon )\), given by

$$\begin{aligned} {\tilde{U}}_1(t;\epsilon ) = \left\{ \begin{matrix} \sum \nolimits _{i=1}^n U_i(t) &{} \text {if } \sum \nolimits _{i=1}^n W_i(t;\epsilon ) \le \epsilon \\ 0 &{} \text {otherwise} \\ \end{matrix} \right. \end{aligned}$$
(32)

and, for \(i \ne 1\),

$$\begin{aligned} {\tilde{U}}_i(t;\epsilon ) = \left\{ \begin{matrix}0 &{} \text {if } \sum \nolimits _{i=1}^n W_i(t;\epsilon ) \le \epsilon \\ U_i(t;\epsilon ) &{} \text {otherwise} \\ \end{matrix} \right. . \end{aligned}$$
(33)

Suppose that for each \(i \in \{1,\ldots ,n\}\) and \(t \ge 0\),

$$\begin{aligned} |W_i(t;0) -W_i(t;\epsilon ) |< \epsilon . \end{aligned}$$
(34)

Define

$$\begin{aligned} \Pi (\epsilon ) := \{ i : \exists t \ge 0 \quad \text {s.t.} \quad I_i(t;\epsilon ) > 0\} \end{aligned}$$
(35)

and suppose that \(\Pi (\epsilon ) = \{1,\ldots ,n\}\) for any \(\epsilon > 0\) and that \(\Pi (0) = \{2,\ldots ,n\}\). Finally, suppose that there exists an \(i \in \{2,\ldots ,n\}\) such that

$$\begin{aligned} \beta _{1i}^1>\beta _{1i}^3 \ge 0. \end{aligned}$$
(36)

Then, the policy \(\tilde{{\varvec{U}}}\) is feasible and for sufficiently small \(\epsilon \),

$$\begin{aligned} H({\varvec{U}}(t;\epsilon )) > H(\tilde{{\varvec{U}}}(t;\epsilon )). \end{aligned}$$
(37)

For the second theorem, it is helpful to note that, using the results in Penn and Donnelly (2022), if one defines

$$\begin{aligned} \chi (t) := \left\{ \begin{matrix} A(t) &{} \text {if} \quad \int _0^tA(s){\textrm{d}}s < B(t) \\ \min (A(t), B'(t)) &{} \text {if} \quad \int _0^tA(s){\textrm{d}}s \ge B(t) \end{matrix}\right. , \end{aligned}$$
(38)

then (assuming that there is an optimal solution, and under mild smoothness conditions on \({\varvec{U}}\), A and B) there must be an optimal solution satisfying

$$\begin{aligned} \sum _{i=1}^nW_i(t) =\max \bigg (\int _0^t \chi (s){\textrm{d}}s,1\bigg ). \end{aligned}$$
(39)

The following theorem then proves that the limiting optimal vaccination policy vaccinates the vulnerable group as quickly as possible. To reduce the length of the proof, it will be assumed that \(\sigma = 1\), so that (in the small \(\epsilon \) limit) all members of group 1 can be vaccinated before being infected.

Theorem 2

With the definitions of Theorem 1, suppose additionally that

$$\begin{aligned} \sum _{j=2}^n (\beta _{1j}^1 - \beta _{1j}^3)I_j(0;\epsilon ) > 0. \end{aligned}$$
(40)

That is, the initial difference between the infective force on vaccinated and unvaccinated members of the population is positive. Suppose further that

$$\begin{aligned} \sigma = 1. \end{aligned}$$
(41)

Suppose an optimal vaccination policy for each \(\epsilon \) is given by \(\overline{{\varvec{U}}}(t;\epsilon )\) and suppose that \(\overline{{\varvec{U}}}(t;\epsilon )\) has uniformly bounded finite support. Then, there exists an \(\eta \) depending only on \(\alpha \), \(\tau \), w and the model parameters such that, for any \({\varvec{U}}\) satisfying the condition (31) as defined in Theorem 1

$$\begin{aligned} \epsilon \in (0,\eta ) \Rightarrow H({\varvec{U}}) > H(\overline{{\varvec{U}}}). \end{aligned}$$
(42)

Moreover, there is a sequence of optimal vaccination policies, \(\overline{{\varvec{U}}}(t;\epsilon )\), which satisfies

$$\begin{aligned} \lim _{\epsilon \rightarrow 0}\left( \frac{{\overline{W}}_1(t;\epsilon )}{\epsilon }\right) = 1 \quad \forall t \quad \text {s.t.}\quad \int _0^t \chi (s){\textrm{d}}s > 0. \end{aligned}$$
(43)

Note that the existence of an optimal vaccination policy has been assumed in the statement of this theorem. The authors believe that an optimal policy should exist, as Proposition 5 in the appendices can be used to show that \(H({\varvec{U}})\) is continuous. However, more care would need to be taken with the smoothness assumptions on \({\varvec{U}}\) to create a rigorous proof of this.

Theorems 1 and 2 are proved in the appendices.

3.1.2 Examples

To illustrate these analytic results, consider a simple two-group example. Suppose that group 1 is small, vulnerable, and non-infectious, while group 2 is large, invulnerable and infectious. These groups could be interpreted as “old” and “young”, respectively, although there is no specific physical situation being modelled here.

Suppose the transmission matrices are given by

$$\begin{aligned} \beta ^{1}&= \begin{pmatrix} 1 &{} 2 \\ 2 &{} 4 \\ \end{pmatrix}, \quad \beta ^{2} = \chi \beta ^1 \quad \beta ^{3} = \rho \beta ^1 \quad \text {and} \quad \beta ^{4} = \chi \rho \beta ^1 \end{aligned}$$
(44)

for some parameters \(\chi \) and \(\rho \) which will be varied. This corresponds to the case of vaccination having (independently) an effectiveness \(\chi \) at stopping people being infected and \(\rho \) at stopping infected people transmitting the disease. Moreover, suppose that

$$\begin{aligned} \mu _i^{\alpha } = 1 \quad \forall i, \alpha \end{aligned}$$
(45)

and

$$\begin{aligned} N_1 = \epsilon , \quad {\tilde{p}}_1 = \frac{1}{\epsilon },\quad \kappa _1 = 1 \quad N_2 = 1, \quad {\tilde{p}}_2 = p^* \quad \text {and} \quad \kappa _2 = 1, \end{aligned}$$
(46)

for some parameter \(p^*\) that will be varied. Finally, suppose that the initial conditions are

$$\begin{aligned} S_1(0;\epsilon ) = \epsilon , \quad I_1(0;\epsilon ), = 0 \quad S_2(0;\epsilon ) = 1-I^* \quad \text {and} \quad I_2(0;\epsilon ) = I^*, \end{aligned}$$
(47)

for some parameter \(I^*\) that will be varied and that the vaccination constraints are given by:

$$\begin{aligned} A(t) = 1\quad \text {and} \quad B(t) = \max (t,1). \end{aligned}$$
(48)

Consider therefore a vaccination policy where group 2, the infectious group, is vaccinated first (and hence, as \(B(\infty ) = N_2\), it is the only group that is vaccinated). That is,

$$\begin{aligned} U_1(t;\epsilon ) = 0 \quad \text {and} \quad U_2(t;\epsilon ) =\left\{ \begin{matrix} 1 &{} \text {if } t \le 1\\ 0 &{} \text {otherwise} \\ \end{matrix}\right. . \end{aligned}$$
(49)

Hence, with \(\tilde{{\varvec{U}}}\) defined as in Theorem 1, one has

$$\begin{aligned} {\tilde{U}}_1(t;\epsilon ) = \left\{ \begin{matrix} 1 &{} \text {if } t \le \min (1,\epsilon )\\ 0 &{} \text {otherwise} \\ \end{matrix}\right. \quad \text {and} \quad {\tilde{U}}_2(t;\epsilon ) = \left\{ \begin{matrix} 1 &{} \text {if } t \in (\epsilon ,1]\\ 0 &{} \text {otherwise} \\ \end{matrix}\right. . \end{aligned}$$
(50)

Figure  shows a comparison of the objective values \(H({\varvec{U}}(t;\epsilon ))\) and \(H(\tilde{{\varvec{U}}}(t;\epsilon ))\) for different values of \(\epsilon \). As expected, when \(\epsilon = 1\), vaccinating the more infectious group first is optimal (as they have the same vulnerability in this case), while for \(\epsilon \) smaller than around 0.1, it becomes more effective to vaccinate the vulnerable group first, illustrating the results of Theorem 1.

Fig. 1
figure 1

(Color Figure Online) A comparison of the two vaccination policies, \({\varvec{U}}(t;\epsilon )\) (where the infectious group is vaccinated first) and \(\tilde{{\varvec{U}}}(t;\epsilon )\) (where the vulnerable group is vaccinated first) for different values of \(\epsilon \). Note that here, \(I^* = 0.01\), \(\chi = \rho = 0.5\) and \(p^* = 1\)

It is useful to consider the approximate smallness of \(\epsilon \) required in Theorem 1. That is, how small \(\epsilon \) needs to be in order for \(\tilde{{\varvec{U}}}(t;\epsilon )\) to be the better vaccination policy. To explore this, define, for each value of \(I^*\) and \(p^*\),

$$\begin{aligned} \epsilon ^*(I^*,p^*) := \inf \bigg (\left\{ \epsilon : H(\tilde{{\varvec{U}}}(t;\epsilon )) > H({\varvec{U}}(t;\epsilon ))\right\} \cup \{1\}\bigg ). \end{aligned}$$
(51)

That is, \(\epsilon ^*(I^*,p^*)\) is the smallest value of \(\epsilon \) such that vaccinating group 2 first is better that the \(\tilde{{\varvec{U}}}\) policy, with a cut-off value at 1 (as it is possible that for some parameter values, the \(\tilde{{\varvec{U}}}\) policy is always better).

Figure  shows the behaviour of \(\epsilon ^*(I^*,p^*)\). As expected, \(\epsilon ^*\) is decreasing in \(I^*\)—this is because when there are fewer initial infectives, there is more time to vaccinate the infectious group before the epidemic has a chance to grow, reducing the peak of the epidemic. Moreover, \(\epsilon ^*\) is decreasing in \(p^*\), as higher values of \(p^*\) mean that the number of infections in group 2 is valued higher.

Fig. 2
figure 2

(Color Figure Online) A plot of \(\epsilon ^*(I^*,p^*)\), the highest value of \(\epsilon \) for which \({\varvec{U}}\), is a better vaccination policy that \(\tilde{{\varvec{U}}}\). Note that \(\epsilon ^*\) is capped at 1, so that a value of 1 indicates that there were no values found of \(\epsilon ^*\) such that \({\varvec{U}}\) was the better policy. Note that here, \(\chi = \rho = 0.5\)

Moreover, Fig. 2 suggests that, for each fixed \(p^*\), \(\epsilon ^*\) is uniformly bounded below for all \(I^*\). Indeed, this is expected as when \(I^*\) is very small, there are negligible infections within the interval \(t \in [0,1]\) and so the vaccination policies \({\varvec{U}}\) and \(\tilde{{\varvec{U}}}\) are in effect being carried out in a completely uninfected population. As the \(R_0\) (that is, the initial growth rate of the disease) number of a fully vaccinated population (in this case) is greater than 1, \(I(t;\epsilon )\) will reach an O(1) value regardless of the vaccination policy. Thus, while decreasing \(I^*\) will increase the time to reach this O(1) value, it will not significantly change the final infections in the epidemic, and hence, \(\epsilon ^*\) should converge to a fixed value for small \(I^*\).

When the fully vaccinated population has an \(R_0\) lower than 1, the difference between \({\varvec{U}}\) and \(\tilde{{\varvec{U}}}\) is more distinct. Indeed, provided \(I^*\) is small enough for vaccination to be completed before many infections have occurred, one would expect \(O(I^*)\) infections in group 2 in either of the two vaccination policies (for sufficiently small \(\epsilon \)), as in both policies, the size of the infected compartment will be decreasing after the vaccination has been completed. However, in the \({\varvec{U}}\) case, one would expect \(O(I^* \epsilon )\) infections in total in group 1 (as there is an \(O(I^*)\) infection force on a group of size \(O(\epsilon )\) for O(1) time), while in the \(\varvec{{\tilde{U}}}\) case, one would expect \(O(I^*\epsilon ^2)\) infections in total in group 1, as the population of this group is only of size \(O(\epsilon )\) for \(O(\epsilon )\) time. This behaviour is illustrated in Fig. , which shows that \(\epsilon ^*\) converges to significantly higher values than in Fig. 2—indeed, in the case that \(p^* = 0\), it appears that \({\varvec{U}}\) is never optimal for any \(\epsilon \le 1\).

Fig. 3
figure 3

(Color Figure Online) A plot of \(\epsilon ^*(I^*,p^*)\), the highest value of \(\epsilon \) for which \({\varvec{U}}\) is a better vaccination policy that \(\tilde{{\varvec{U}}}\), in the case of complete vaccination effectiveness (so \(\chi = \rho = 0\)). Note that because the values of the objective function are \(O(I^*)\), there is some numerical instability which has caused some non-smoothness of the plot

3.2 A Small Vaccination Supply

In this section, the case of a small, immediately available vaccine supply will be considered. In this case, it will be possible to analytically derive the optimal vaccination policy (in the limit of small supply).

This case may be particularly relevant if there was an outbreak of a disease where a vaccine already existed (so that some vaccinations are available immediately), but where supplies were limited, and scaling production would take a significant amount of time. An example of this can be found in the recent monkeypox outbreak (Mahase 2022) where the UK initially purchased 20 000 smallpox vaccines. This small figure—not even enough to vaccinate 0.1% of the UK population (UN 2019)—would certainly fall within the small vaccination supply case.

Moreover, one can use the results in this section regardless of the time at which vaccinations become available (that is, they are not only relevant at the start of an epidemic). This would be of practical use whenever vaccine production is slow, or when the disease is sufficiently mild (or vaccine production is sufficiently expensive) that a large-scale vaccination program is not deemed economically feasible.

3.2.1 Analytic Results

To state the analytic result from this section, it is helpful to define

$$\begin{aligned} \beta '_{ij} = \left\{ \begin{matrix} \beta ^1_{ij} &{} \text {if } i,j\le n\\ \beta ^2_{i(n-j)} &{} \text {if } i \le n< j \le 2n\\ \beta ^3_{(n-i)j} &{} \text {if } j \le n< i\le 2n \\ \beta ^4_{(n-i)(n-j)} &{} \text {if } n < i,j \le 2n\\ \end{matrix}\right. ,. \end{aligned}$$
(52)

This large transmission matrix captures the dynamics of all 2n susceptible and infectious groups in the model (both vaccinated and unvaccinated). Indeed, after vaccination has been completed, there is no movement from \(S_i\) to \(S^V_i\) so \(\beta '\) allows for the model to be considered as a 2n-group SIR model without vaccination. Thus, in particular, one can derive a simple final size relation for the total number of infections in the epidemic. Similarly, define

$$\begin{aligned} \mu '_i = \left\{ \begin{matrix} \mu _i^1 &{} \text {if } i \le n\\ \mu _{(i-n)}^2 &{} \text {if } n < i \le 2n \end{matrix}\right. \end{aligned}$$
(53)

and

$$\begin{aligned} p'_i = \left\{ \begin{matrix} p_i &{} \text {if } i \le n\\ \kappa _{(i-n)} p_{(i-n)} &{} \text {if } n<i\le 2n\end{matrix}\right. . \end{aligned}$$
(54)

In this case of small supply, it is possible to effectively differentiate the final size of the epidemic with respect to the vaccination policy and use the resultant linear approximation to form a simple knapsack problem for the optimal vaccination policy. This will involve writing the objective in the form:

$$\begin{aligned} H({\varvec{U}}(t;\epsilon )) = H({\varvec{0}}) + {\varvec{y}}^T{\varvec{W}}(\tau (\epsilon );\epsilon ) + o(\epsilon ) \end{aligned}$$
(55)

where \({\varvec{W}}\) is the final vaccination amounts in each group. To define the gradient, \({\varvec{y}}\), it is necessary to use the inverse of a matrix \({\varvec{Q}}\) given by

$$\begin{aligned} Q_{ij} = \frac{1}{1-e^{-\sum \nolimits _{j=1}^{2n}\frac{\beta '_{ij}}{\mu '_j}R_j(\infty ;0)}}\bigg [\delta _{ij} +\frac{ S_i(0;0)e^{-\sum \nolimits _{j=1}^{2n}\frac{\beta '_{ij}}{\mu '_j}R_j(\infty ;0)}\beta '_{ij}}{\mu '_j}\bigg ], \end{aligned}$$
(56)

where as before, the variables \(f_i(t;\eta )\) indicate the value of the relevant model variable at time t, given that the parameter \(\epsilon \) is equal to \(\eta \), and \(\delta _{ij}\) is the Kronecker delta. Then, \({\varvec{y}}\) is defined by:

$$\begin{aligned} {\varvec{x}} = {\varvec{Q}}^{-T}{\varvec{p}}' \quad \text {and} \quad y_i = \frac{S_i(0;0)}{N_i}(x_{i+n} - x_i) \quad \forall i \in \{1,\ldots ,n\} . \end{aligned}$$
(57)

These definitions allow for the theorem to be stated.

Theorem 3

Suppose that, for all \(\epsilon > 0\)

$$\begin{aligned} B(t;\epsilon ) = \epsilon \quad \forall t \ge 0. \end{aligned}$$
(58)

and that all other parameter values and initial conditions are independent of \(\epsilon \). Suppose that A(t) is a continuous function with

$$\begin{aligned} A(0) > 0 \end{aligned}$$
(59)

and that the matrix M is invertible. For sufficiently small \(\epsilon \), define

$$\begin{aligned} \tau (\epsilon ) := \inf \bigg \{t : \int _0^t A(s){\textrm{d}}s = \epsilon \bigg \}. \end{aligned}$$
(60)

Suppose that \({\varvec{U}}\) satisfies the condition

$$\begin{aligned} \sum _{i=1}^nU_i(s) = \min \bigg (\int _0^t\chi (s){\textrm{d}}s,1\bigg ), \end{aligned}$$
(61)

where \(\chi \) is defined in (B169). Then, for sufficiently small \(\epsilon \), the objective function is given by:

$$\begin{aligned} H({\varvec{U}}(t;\epsilon )) = H({\varvec{0}}) + {\varvec{y}}^T{\varvec{W}}(\tau (\epsilon );\epsilon ) + o(\epsilon ). \end{aligned}$$
(62)

Moreover, if there is a unique element of \({\varvec{y}}\) equal to the minimum of \({\varvec{y}}\) then the optimal vaccination policy (to leading order in \(\epsilon \)) is uniquely given by:

$$\begin{aligned} U_i(t;\epsilon ) = \left\{ \begin{matrix} A(t) &{}\text {if } i = \min \{ y_i : i \in \{1,\ldots ,n\} \} &{} \text {and } \int _0^t A(s){\textrm{d}}s < \epsilon \\ 0 &{} \text {otherwise} \\ \end{matrix}\right. . \end{aligned}$$
(63)

The second part of the theorem assumes a unique minimal element of \({\varvec{y}}\). This is not guaranteed to happen, and if there were multiple groups with equal values of \({\varvec{y}}\), this would mean that the effectiveness of vaccinating these groups would be equal to \(O(\epsilon )\). However, any sets of parameters satisfying this condition would be unstable to small perturbations (as a trivial example, consider perturbing the initial susceptible populations \(S_i(0,0)\) of the groups with a minimal values of \(y_i\)). Thus, in any practical scenario, the probability that the best estimates of the parameters give multiple minimal values of \(y_i\) is very small.

Theorem 3 is proved in the appendices.

3.2.2 Vaccinating a Homogeneous Population

To illustrate the effectiveness of this approximation, consider first an example of a homogeneous population (so \(n=1\)). Consider the case where \(\beta ^1 = \beta \), \(\beta ^2 = \beta ^3 = 0.5\beta \) and \(\beta ^4 = 0.25 \beta \) for some parameter \(\beta \) that will be varied. Suppose moreover that

$$\begin{aligned} N_1= \mu ^1_1 = \mu ^2_1 = p_1 = \kappa _1 = A(t) = 1, \quad S_1(0) = 1-10^{-4} \quad \text {and}\quad I_1(0) = 10^{-4}.\nonumber \\ \end{aligned}$$
(64)

Finally, suppose \(B(t) = \epsilon \) where \(\epsilon \) will be varied.

Figure  shows a comparison of the predicted and actual change in number of deaths, \(\rho _1\) for two values of \(\epsilon \). It illustrates that, even when \(\epsilon = 0.1\), a relatively large value, \({\varvec{y}}\) gives a good approximation of the true value (found by simulation). Moreover, when \(\epsilon = 0.01\), the two lines are almost indistinguishable. This is useful validation for the approximation, as the correction term was simply proved to be \(o(\epsilon )\) rather than, for example, \(O(\epsilon ^2)\), and so it is encouraging that the predictions are so close.

Fig. 4
figure 4

(Color Figure Online) A comparison of the predicted and actual values of the change in deaths, \(\rho _1\), in the case of a homogeneous population for two different values of vaccination supply, \(\epsilon \) and for different values of infectivity, \(\beta \). Note the different scales on the two y axes

An interesting property of Fig. 4 is that the value of \(\beta \) for which vaccination is most effective appears to be very close to \(S(0)\beta = 1\) (as \(S(0) \sim 1\)). Note that here, as \(\mu =1\), this is equal to the initial reproduction number of the disease. This has the perhaps surprising consequence that if one has a set of disconnected, equally vulnerable subgroups, a small vaccination supply should be assigned to a group with initial reproduction number close to 1, rather than giving it to the group with the highest value of \(\beta \) (that is, the most group with the most infectious individuals). This result is in line with the findings of Gavish and Katriel (2022), which showed that vaccinating less infectious groups can be more effective, and is an important consideration for vaccination policy planning.

3.2.3 Application to Age-Structured Populations

Consider assigning a small quantity of vaccinations to an age-structured population, using the example of the UK. The disease model has been estimated using the inter-age-group contact matrices \(\varvec{\Lambda }\) from Prem et al. (2017), alongside population estimates \({\varvec{N}}\) from UN (2019). As in Prem et al. (2017), this gives a transmission matrix of

$$\begin{aligned} \beta ^1_{ij} = \beta \frac{\Lambda _{ij}}{N_j} \end{aligned}$$
(65)

for some scalar parameter \(\beta \). As in the previous section, it will be assumed that

$$\begin{aligned} \mu _i^{\alpha } = 1 \quad \forall i , \alpha \end{aligned}$$
(66)

and

$$\begin{aligned} \beta ^2 = 0.5\beta ^1, \quad \beta ^3 = 0.5 \beta ^1\quad \text {and} \quad \beta ^4 = 0.25 \beta ^1. \end{aligned}$$
(67)

It will also be assumed that the initial infected population is small, so that, for each i

$$\begin{aligned} S_i(0;\epsilon ) = (1-10^{-4})N_i \quad \text {and} \quad I_i(0;\epsilon ) = 10^{-4}N_i. \end{aligned}$$
(68)

In the following examples, \(\beta \) will be chosen so that the disease-free next-generation matrix of a completely unvaccinated population, given by

$$\begin{aligned} R_{ij} = \frac{N_i\beta ^1_{ij}}{\mu ^1_j} = \beta ^1_{ij} \end{aligned}$$
(69)

has a spectral radius (that is, largest eigenvalue) equal to 4. This sets the \(R_0\) number in the overall population to be 4. To illustrate the population structure, Fig.  shows a heatmap of the matrix \(R_{ij}\). This highlights the strongly assortative nature of the contacts (that is, members of a subgroup are most likely to be contacts with members of their own subgroup), while also showing that contacts are lower for older age groups.

Fig. 5
figure 5

(Color Figure Online) A heatmap of the next-generation matrix for the age-structured UK population

Now, two different age-dependent case-fatality ratios will be considered—uniform case-fatality and approximate COVID-19 case fatality, taken from Dyer (2021). In both cases, it will be assumed that vaccination reduces the case fatality ratio by 90% (following the results of Dyer (2021) for the COVID-19 vaccines) so that \(\kappa _i = 0.1\) for all i. However, it is worth emphasising that this model is simply based on real-world data and does not seek to accurately model the COVID-19 pandemic.

Figure  shows the effectiveness of vaccinating each age group in the two different cases, as a proportion of the optimal effectiveness. Note that here the proportion of effectiveness of assigning vaccine to group i is given by \(\frac{y_i}{\min _j(y_j)}\), as each \(y_j\) is non-positive. It highlights that the significantly higher mortality rates for COVID-19 for the older age groups mean that vaccinating them is much more effective than vaccinating the other age groups. This is an example of Theorems 1 and 2, as the oldest age group makes up a relatively small percentage (around 9%) of the population, but, if one scales p such that it has median value 1, the \(p_iN_i\) value for the oldest age group is approximately 20, and so is definitely O(1) rather than \(O(\epsilon )\).

Fig. 6
figure 6

(Color Figure Online) The effectiveness of assigning a small quantity of vaccines to each age group as a proportion of the optimal effectiveness

A perhaps surprising exception to the general correlation between effectiveness and mortality is the relatively low effectiveness of vaccinating the 55–59-year-old age group, which is lower than the 45–49-year-old and 50–54-year-old groups. This illustrates the non-intuitive nature that optimal vaccination policies can take, and the importance of investigating their behaviour fully. The main reason for this low effectiveness is that, while the 55–59-year-old age group is more vulnerable to COVID-19 than the younger groups, according to Prem et al. (2017), they have much less contact with the 75+-year-old age group, and thus, vaccinating this group provides significantly less secondary protection to most vulnerable members of the population. The authors speculate that this could be due to a significant number of the parents of the 55–59-year-old age group having died (particularly in comparison with the younger groups), reducing their links with the 75+-year-old age group. Moreover, those in the 55–59-year-old age group may also not be old enough to have many 75+-year olds in their social circles (in comparison with members of older groups). However, further investigation would be needed to justify this claim.

In the case of uniform mortality, the vaccination policy becomes even less intuitive, as Fig. 6 shows that the optimal age group to vaccinate is the 40–44-year olds. Indeed, from Fig. 5, it may seem that the 15–19-year-old group would be the best group to vaccinate, as they have the highest overall transmission—that is, the maximum value of

$$\begin{aligned} \hbox { Total infectious force of group}\ j := \sum _{i=1}^{16}R_{ij}. \end{aligned}$$
(70)

However, if instead, one considers

$$\begin{aligned} \hbox { Total external infectious force of group}\ j := \sum _{i=1,i \ne j}^{16}R_{ij}, \end{aligned}$$
(71)

then it is the 35–39 and the 40–44 age groups which have the highest values. This can be considered in conjunction with the results of the previous subsection, which showed that vaccinating groups with \(R_0\) numbers close to 1 is optimal for disconnected populations. Indeed, the “secondary effect” of vaccinations (that is, the number of people who are not vaccinated, but are protected from the disease because of vaccines given to others) can be higher for groups with lower internal infectious force, particularly when their external infectious force is higher.

Finally, it is useful to again explore the range of values for \(\epsilon \) for which \({{\textbf {y}}}\) gives a good approximation of the true number of infections. As the minimum (scaled so that the total population size is 1) value of \(N_i\) is 0.0498 in this case, \(\epsilon \) will be tested at 0.0498. The results of this are shown in Fig. , which again illustrates the effectiveness of this approximation. Indeed, the largest error across either case is of order \(10^{-4}\), which in turn is of order \(\epsilon ^2{\varvec{y}}\). This suggests that the \(o(\epsilon )\) correction term in Theorem 3 is significantly smaller than \(\epsilon \), which increases the usefulness of this approximation. However, further investigation is needed to determine whether this correction is of \(O(\epsilon ^2 {\varvec{y}})\) for all parameter values.

Fig. 7
figure 7

(Color Figure Online) A comparison of the predicted and simulated change in the objective function when vaccinating each individual group at \(\epsilon = 0.0498\). Both the cases of a COVID-19 case fatality ratio (in a) and a uniform case fatality ratio (in b) are presented

4 Discussion

This paper has shown two general principles for optimal vaccination policies by looking at the asymptotic behaviour of the optimal policy in the case of extreme parameters. Firstly, it has shown that small, vulnerable groups should in general be vaccinated first, regardless of the overall timetable of vaccination. This is an important result as it requires very little data on the population—merely the case fatality ratios and populations of the different subgroups—and in particular needs no forecasting of future transmission trends or vaccine supply.

The analytically derived results (in the limiting case) also show that the effect of vaccinating this small group far outweighs the effect of vaccinating any of the other groups. Indeed, if the size of the vulnerable group is \(O(\epsilon )\) and the case fatality ratio of the other groups is \(O(\epsilon )\), then Theorem 1 shows that vaccinating the vulnerable group will lead to an \(O(\epsilon )\) decrease in the number of fatalities, while vaccinating the same number of people from another group will only decrease this by \(O(\epsilon ^2)\). As discussed in Sect. 3.2.3, this result is of practical importance for diseases such as COVID-19, where the majority of the fatalities would be from certain age groups within the population. In particular, it provides strong evidence for the importance of sharing vaccines on a global scale, as this is the only way to ensure that vaccinations can be given to all people who are most vulnerable to the disease.

However, this result should be used with caution, as it certainly does not imply that a population should always be vaccinated in order of decreasing vulnerability to the disease. The optimal vaccination policy is, in general, a balance between directly protecting the vulnerable by vaccinating them and by indirectly protecting them by vaccinating those groups with the highest infectiousness. This is shown in Fig. 6 by the fact that, when a COVID-19 case fatality ratio is used, the relative effectiveness of vaccinating each age group does not decrease everywhere with age. The results of Theorems 1 and 2 simply provide a principle that in the asymptotic limit, the optimal strategy is to vaccinate small, vulnerable groups first. In the absence of data on vaccination effectiveness (which is crucial in determining whether indirectly protecting the more vulnerable population may be better), this provides a mathematically sound justification for beginning with the most vulnerable members of a population while gathering data to determine the rest of the vaccination policy.

The second principle derived in this paper was a linear approximation to the change in number of fatalities from a disease, which allows for the estimation of the optimal vaccination policy in the case of a small total supply. Again, this principle is flexible, applying for any set of parameters and provides a computationally cheap way of the approximating the optimal solution, even for large numbers of groups, as it merely requires the solution of a linear system involving the same number of variables as the number of groups.

A useful feature of this approximation is that it appears to have high accuracy even for reasonably large values of the total supply, such as when 10% of the population can be vaccinated. Figures 4 and 7 show that there is very little deviation between the predicted and actual values of the objective function and so suggest that this is a flexible and widely applicable method of approximation, even when the population contains a large number of subgroups. However, it would be helpful to strengthen the results of Theorem 3 to get a stronger bound on the error for small \(\epsilon \) to ensure that this similarity holds for all models.

The results of the examples presented in Sect. 4 are also informative for vaccination policy. As shown in Fig.  4, in a completely homogeneous population, vaccination has the most effect when the reproduction number (\(\frac{\beta }{\mu }\) in this case) is slightly bigger than 1, with a steep decline in effectiveness for reproduction numbers below 1 and a more gradual decline for large reproduction numbers. This result allows one to consider the “vaccination leverage” of a population—that is, the effectiveness that a small quantity of vaccination can have—and shows that, even in the case of homogeneous case fatality ratios, vaccinating in order of infectiousness may be far from optimal, as it is much more difficult to reduce infections in a highly infectious population.

Indeed, a similar idea was shown to apply when the UK age structure was considered. In the case of uniform case fatality, the optimal group to assign a small amount of vaccinations was the 40–44 age group which, as shown in Fig. 5, is not the most infectious group. This perhaps counter-intuitive result highlights the importance of mathematically justifying the principles one uses to decide on optimal vaccination policies, as “common-sense” arguments may in fact give false conclusions. Communicating such principles to governments and policy-makers will be crucial in future pandemics, particularly ones with more homogeneous case fatality ratios where the optimal policy is not as intuitive as for diseases like COVID-19.

An important limitation of Theorem 3 is that the optimal policies for small vaccination supplies do not necessarily generalise to give the beginning of the optimal vaccination policy in the case of a much larger vaccination supply. Indeed, it is possible to have bifurcations in the optimal vaccination policy as the supply increases—for example, it can become possible to completely avoid an epidemic by vaccinating a large quantity of an infectious group. Thus, while the linear approximation can be a useful starting point when attempting to estimate the optimal strategy, it is important to consider alternatives when a large proportion of the population can be vaccinated.

The results of this paper are only applicable if the trajectory of the disease in question can be well-approximated by multi-group SIR dynamics. In particular, this requires there to be reasonably high levels of the disease in a population [otherwise stochastic dynamics change the epidemic behaviour (Ball and Neal 2002)], and for population subgroups to be sufficiently large (again to prevent stochasticity dominating). Moreover, the model assumptions would not hold if individuals could be re-infected, or if the effect of vaccination was not eternal (though if the timescale of the epidemic was sufficiently shorter than the timescale of immunity decay, then the model would still provide a good approximation).

A final barrier to using the results in this paper is that estimation error in the model parameters could lead to the optimal solutions being incorrectly calculated. Estimating the \(\beta _{ij}^{\alpha }\) parameters is particularly complicated, especially in a multi-group setting where it is difficult to establish the chain of transmission between different groups. Because of this, building models based on contact rates between groups [estimated using surveys (Prem et al. 2017)] or proxies such as commuting patterns (Keeling and White 2011) may be the best method, at least to provide priors on the parameters. Theorems 1 and 2 are significantly less susceptible to errors in parameters, as they do not require any of the \(\beta _{ij}^{\alpha }\) or \(\mu _j^{\alpha }\) parameters to be known, although the level of “smallness” of \(\epsilon \) would vary depending on the disease in question. Theorem 3 is significantly more susceptible to error, as all the model parameters are needed. However, while there may be bifurcations in the optimal strategy, the optimal value of the objective function should depend continuously on the parameters (a fact which could be proved by extending the results of Proposition 5), limiting the effect of small estimation errors.

Despite this, the authors expect that similar results to those presented in Theorem 3 will hold for a very wide class of deterministic models. Essentially, the only necessary characteristic of the model that is required by Theorem 3 is that the objective function, \(H({\varvec{U}})\), is a continuously differentiable function of the vaccination policy \({\varvec{U}}\) in some neighbourhood of \({\varvec{0}}\). Indeed, \({\varvec{y}}\) in Theorem 3 can be replaced by \(\nabla {\varvec{H}}({\varvec{0}})\) in a general setting. Certainly, it should be conceptually simple (though perhaps algebraically complicated) to generalise this result to other compartmental models such as SEIR (Susceptible–Exposed–Infected–Recovered) and even those modelling vector-transmitted diseases.

The authors also expect that Theorem 1 will hold for general models where the effect of vaccination is eternal. The essential points in the proof of Theorem 1 are that vaccinating the small group does not affect the overall vaccination program (to leading order) and that it does have an O(1) effect on the objective function. Both of these should still hold in a wide range of models, although it may be difficult to define the meaning of “very small group” and “very vulnerable group”—particularly in more complicated settings such as individual-based models.

This work could be extended by deriving more principles for extreme parameter values and investigating whether they generalise to realistic model parameters. By combining the existing results in this paper and others such as Gavish and Katriel (2022) with potential new ones, one could create an algorithm that creates good heuristics of optimal vaccination policies that could be used as starting points for accurately approximating the optimal policy for a general parameter set. This could have significant implications for the design of vaccination policies, as it would enable the optimisation problem to be estimated for very complex models, as the time taken to converge to an optimal solution would significantly decrease given good initial heuristics.

5 Conclusion

The results of this paper are summarised below:

  • If a sufficiently vulnerable, sufficiently small population exists in a multi-group SIR model, it is optimal to vaccinate this group first.

  • For small overall vaccination supplies, the optimal vaccination problem can be well approximated by a simple knapsack problem.

  • This linearisation appears to be a good approximation even for relatively large vaccination supplies (such as 10% of the population).

  • This linearisation shows that, in the case of uniform case fatality, it is not necessarily optimal to vaccinate the most infectious group.