1 Introduction

The COVID-19 pandemic has illustrated the importance of quickly implementing vaccination policies which target particular groups within a population (Fitzpatrick and Galvani 2021). The difference in final infections between targeted policies and uniform distribution to the entire population can be significant (Castro and Singer 2021; Estadilla et al. 2021) and so it is important that the models underlying these decisions provide realistic predictions of the outcomes of different policies.

One of the most commonly used models to forecast epidemics is the multi-group SIR (Susceptible–Infected–Recovered) model (Acemoglu et al. 2021; Kuniya 2019; Ram and Schaposnik 2021). This model divides the population into different groups based on characteristics such as age or occupation. Each group is then further sub-divided into categories of susceptible, infected and recovered. Where vaccination does not give perfect immunity, further sub-categorization based on vaccination status can also be used (Kuga and Tanimoto 2018), as will be done in this paper.

While many other approaches have been developed either by adding compartments to the SIR framework (Moore et al. 2021) or using completely different models such as networks (Chen and Sun 2014) or stochastic simulations (Ball and Lyne 2002), the multi-group SIR model remains popular because of its comparatively small number of parameters and its relatively simple construction and solution. In this paper, attention will thus be restricted to the multi-group SIR model, although it would be beneficial for future work to consider a wider range of disease models.

There are two general frameworks that are used to model optimal vaccination policies in a resource-limited setting. The first, used in papers such as Hill and Longini Jr (2003) and Becker and Starczak (1997), seeks to reduce the reproduction number, \(R_0\) of the epidemic as much as possible by vaccinating before infections arrive in a population. It is simple to show that in this case, one should use all of the vaccinations available, and so this problem will not be considered further in this paper.

The second framework, used in papers such as Acemoglu et al. (2021) and Hansen and Day (2011) aims to minimise the total cost of an epidemic. This is the framework that will be discussed in this paper. The “cost” of an epidemic is, in general, defined to be the number of deaths (or equivalently, infections), with many papers also considering the cost of vaccination alongside the cost of other control measures, such as isolation, lockdown or treatment (Fu et al. 2022).

One important principle which underlies all of these vaccination policies is the acceptance that giving people their first dose of vaccine as soon as possible reduces the number of infections. Of course, this only holds when the timescale considered is sufficiently short for effects such as waning immunity and disease seasonality to be negligible, and a more complicated framework would be needed to model these effects. However, the acceptance of at least short-term optimality of maximal vaccination effort has been highlighted in the COVID-19 pandemic response, as countries began their vaccination programs as soon as vaccines became available (Mathieu et al. 2021).

To the best of the authors’ knowledge, no one has provided a mathematical proof that in a general, multi-group SIR model with imperfect vaccination, it is always best to vaccinate people as early as possible. Of course, it is not difficult to create a conceptually sound justification—vaccinating more people means that fewer people will catch the disease which will reduce the total number of infections. However, the SIR model is an approximation of the process of a disease spreading, and so it is important that it obeys this principle for all physical parameter values and vaccination policies.

Some special cases of the theorem presented in this paper have been previously proved in the literature. In particular, a significant number of papers have considered the optimal vaccination policy for a homogeneous population, with Abakuks (1972) first proving that, in this case, it is optimal to vaccinate at maximal effort (if one ignores the cost of vaccination). This proof held for vaccination policies that were finite sums of point mass “impulse” vaccinations, and has been generalised by papers such as Hansen and Day (2011), Zaman et al. (2008), Morton and Wickwire (1974) and Zhou et al. (2014) to a much wider class of vaccination policies, although the proof was still restricted to a single group and to perfect vaccination. Moreover, Hansen and Day (2011) notes that the case of imperfect vaccination (where vaccinated individuals can still get infected, although at a lower rate) remained a topic of open investigation, and so it can not easily be solved using the same methods presented in these papers. A slight extension is made in Duijzer et al. (2018) where it is shown that maximal effort is optimal in the case of perfect vaccination of any number of disconnected groups, but the full problem is still far from understood.

The general method of proof in the literature relies on Pontryagin’s Maximum Principle, which is difficult to apply to multi-group models due to the more complex structure of the equations. It is simple to characterise the solution in terms of the adjoint variables, as is done in Zhang et al. (2020) and Zavrakli et al. (2021) for a two-group model with imperfect vaccination, in Boutayeb et al. (2021) for a general n-group model with perfect vaccination and in Lee et al. (2012) for a six-group model with imperfect vaccination. However, determining whether this solution corresponds to the maximal effort solution in the case of zero vaccination cost requires the analysis of the adjoint ODE system, which is often just as complicated as the original disease model. In particular, the fact that vaccinated people need to be no more infectious, no more susceptible and be infected for no more time than unvaccinated people means that any analysis of the adjoint system would be complicated, as the properties of all the constituent parameters would need to be used.

Thus, in this paper, a novel approach is developed. Rather than attempting to use the general optimal control theory methodology, the specific structure of the SIR equation system is exploited. Using this, an inequality is derived which shows that if a given vaccination policy, \(\tilde{{\varvec{U}}}\) vaccinates each individual at least as early as another vaccination policy, \({\varvec{U}}\), then the latter policy will lead to at least as many deaths (or equivalently, infections) as the former. As well as providing a constraint on the optimal solution, this theorem also highlights important structural properties of the model, as it shows that the number of deaths is everywhere non-increasing in the vaccination rates, rather than this just holding near the optimal solutions.

Also introduced in this paper is a more general model of vaccination than the one normally used in the literature. The one that is typically used (in almost all papers cited in this work such as Hansen and Day (2011), Zaman et al. (2008) and Kar and Batabyal (2011)) models decreasing vaccination uptake by assuming that the total number rate of people being vaccinated is the product of a vaccination rate and the proportion of susceptible people in the population. The model introduced in this paper allows for more flexibility in modelling the demand. However, the standard vaccination model is a special case of the general model introduced here, and so the theoretical results proved in this paper can be used by those following the standard model.

Alongside proving that the final infected, recovered and dead populations are non-increasing with increased and earlier vaccination effort, some cautionary contradictions to perhaps intuitive conjectures are also provided which show the importance of mathematical proof instead of simply intuition. In particular, it is shown that increased vaccination (under this model) can lead to, at a fixed finite time of the simulation, higher infection rates or a higher death count, despite the longer-term better performance of this policy. Indeed, it is results similar to these which make the proof of the optimality of maximal effort difficult, as it means that one must be very careful when constructing the inequalities that do hold for all models.

2 Modelling

2.1 Disease Transmission and Vaccination Model

Suppose that the population is divided into n subgroups, such that population of people in group i is \(N_i\) and define

$$\begin{aligned} N:= \sum _{i=1}^n N_i. \end{aligned}$$

Define the compartments of people as follows, for \(i = 1,...,n\):

$$\begin{aligned} S_i&:= \text {Number of people that are in group }i\text {, are susceptible, and are unvaccinated},\\ I_i&:= \text {Number of people that are in group }i\text {, are currently infected, and}\\&\qquad \quad \text {were infected while unvaccinated },\\ R_i&:= \text {Number of people that are in group }i\text {, are recovered or dead, and}\\&\quad \qquad \text {were infected while unvaccinated}, \\ S^V_i&:= \text {Number of people that are in group }i\text {, are susceptible and are vaccinated}, \\ I^V_i&:= \text {Number of people that are in group }i\text {, are infected}\\&\quad \qquad \text {and were infected after being vaccinated},\\ R^V_i&:= \text {Number of people that are in group }i\text {, are recovered or dead} \\&\quad \qquad \text {and were infected after being vaccinated}. \end{aligned}$$

This paper introduces a more general and flexible framework for vaccination, which is motivated as follows. It is assumed that there is a record of people who have received a vaccination and that protection from vaccination does not decay over time, so that no one is vaccinated more than once. Thus, if a total number, \(U_i(t)dt\), of people in group i are given vaccines in a small time interval \((t,t+dt)\), and these vaccines are distributed randomly to the unvaccinated population in group i, the total population of susceptibles given vaccines in group i is

$$\begin{aligned} U_i(t)dt \times \text {P}\bigg (\text {A person in group }i\text { is in }S_i \vert \text { A person is in group }i\text { is unvaccinated}\bigg ) \end{aligned}$$

which is equal to

$$\begin{aligned} \frac{U_i(t)dt S_i(t)}{N_i - \int _0^tU_i(s)ds}, \end{aligned}$$

as \(\int _0^tU_i(s)ds\) is the total population that are in group i and have been vaccinated before time t. For the remainder of this section, this vaccination model will be referred to as the “general” model

This results in the following model, based on SIR principles

$$\begin{aligned} \frac{dS_i}{dt}&= -\sum _{j=1}^n(\beta ^1_{ij}I_j+ \beta ^2_{ij} I^V_j)S_i - \frac{U_i(t) S_i}{N_i-W_i(t)}, \end{aligned}$$
(1)
$$\begin{aligned} \frac{dI_i}{dt}&= \sum _{j=1}^n(\beta ^1_{ij} I_j+ \beta ^2_{ij} I^V_j)S_i - \mu ^1_i I_i,\nonumber \\ \frac{dR_i}{dt}&= \mu ^1_i I_i,\nonumber \\ \frac{dS^V_i}{dt}&= -\sum _{j=1}^n(\beta ^3_{ij}I_j + \beta _{ij}^4 I^V_j)S^V_i + \frac{U_i(t) S_i}{N_i-W_i(t)},\nonumber \\ \frac{dI^V_i}{dt}&= \sum _{j=1}^n(\beta ^3_{ij}I_j + \beta _{ij}^4I^V_j)S^V_i -\mu ^2_i I^V_i, \nonumber \\ \frac{dR^V_i}{dt}&= \mu ^2_i I^V_i, \end{aligned}$$
(2)

where

$$\begin{aligned} W_i(t):= \int _0^t U_i(s)ds. \end{aligned}$$

Here, \(\beta ^{1}_{ij}\) represents transmission from the unvaccinated members of group j to the unvaccinated members of group i, \(\beta ^{2}_{ij}\) represents transmission from vaccinated members to unvaccinated members, \(\beta ^{3}_{ij}\) represents transmission from vaccinated members to unvaccinated members and \(\beta ^{4}_{ij}\) represents transmission from vaccinated members to vaccinated members. Additionally, \(\mu _i^{1}\) represents the infectious period of unvaccinated infected members in group i while \(\mu _i^2\) represents the infectious period of vaccinated members. Note that the superscript denotes different parameter values, so that \(\beta ^2_{ij}\) is not necessarily the square of \(\beta ^1_{ij}\).

To ensure that vaccination is “locally effective” (that is, a vaccinated individual is no more likely to transmit or be infected by the disease, and is infectious for no longer than an unvaccinated individual in the same subgroup), and that the parameters are epidemiologically feasible, the following constraints are imposed:

$$\begin{aligned} \beta ^1_{ij} \ge \beta _{ij}^2, \beta _{ij}^3 \ge \beta _{ij}^4\ge 0 \quad \text {and} \quad \mu _i^2 \ge \mu _i^1 > 0 \end{aligned}$$

Note that there is no constraint on the ordering of \(\beta _{ij}^2\) and \(\beta _{ij}^3\). It is assumed for convenience that all variables except the \(S_i\) and \(I_i\) are initially zero. Finally, we assume that all initial conditions are non-negative.

Ultimately, the objective of the vaccination program will be to minimize a weighted sum of the total infections in each group—that is

$$\begin{aligned} \sum _{i=1}^np_i(R_i(\infty ) + \kappa _iR_i^V(\infty )). \end{aligned}$$

Here \(p_i\) is the weighting of a member of group i who is infected before being vaccinated, while \(p_i\kappa _i\) is the weighting of a member of group i who is infected after being vaccinated. These parameters could be chosen to capture one of a range of objectives, such as minimizing deaths, minimizing hospitalisations, or minimizing total infections. Again assuming “local effectiveness” of the vaccination, it is imposed that \(\kappa _i \le 1\), as vaccination should not increase the severity of the infection.

The equations (1)–(2) sum to zero on the right-hand side, and so for each i,

$$\begin{aligned} S_i(t) + I_i(t) + R_i(t) + S^V_i(t) + I^V_i(t) + R^V_i(t) = N_i \quad \forall t \ge 0. \end{aligned}$$
(3)

It will be assumed that the populations and parameters have been scaled such that \(N = 1\), Finally, it is assumed that

$$\begin{aligned} W_i(t)\le N_i \quad \forall t \ge 0 \quad \text {and} \quad W_i(t) = N_i \Rightarrow \frac{U_i(t) S_i}{N_i-W_i(t)} = 0. \end{aligned}$$

to ensure feasibility of the vaccination policies.

2.2 Comparison to the Standard Vaccination Model

A more common model of vaccination in the literature is the “standard” vaccination model (Hansen and Day 2011; Zaman et al. 2008; Kar and Batabyal 2011), where Eq. (1) becomes

$$\begin{aligned} \frac{dS_i}{dt} = -\sum _{j=1}^n(\beta ^1_{ij}I_j+ \beta ^2_{ij} I^V_j)S_i - U^*_i(t)S_i, \end{aligned}$$

Here, \(U^*_i(t)\) is the vaccination rate in this model. In general, \(U_i^*(t)\) is constrained such that \(U_i^*(t) \le {\mathcal {U}}_i(t)\) for some function \({\mathcal {U}}_i(t)\)

The \(U^*_i(t)S_i\) term seeks to capture the fact that vaccination uptake will decrease even if the vaccination “effort” (or, equivalently, the doses available) remains constant. However, the rate at which uptake decreases is fixed by the model. For example, if the vaccination effort \(U_i^*(t)\) was equal to a constant \({\mathcal {U}}_i\) and was much quicker than the rate of infection, then the leading order equation is

$$\begin{aligned} \frac{dS_i}{dt} = -{\mathcal {U}}_iS_i \Rightarrow S_i(t) = S_i(0)e^{-{\mathcal {U}}_i t} \end{aligned}$$

and hence

$$\begin{aligned} \frac{dS_i}{dt} = -{\mathcal {U}}_iS_i(0)e^{-{\mathcal {U}}_i t} \end{aligned}$$

which means that vaccination uptake decreases exponentially. Even for some human pandemics, such as COVID-19, where demand remained high until a large proportion of the population had been vaccinated, as shown in Ritchie et al. (2020), such a model may be inappropriate.

The general vaccination model provides substantially more flexibility. For example, it is possible for a group to be completely vaccinated in the general case, whereas this is impossible in the standard case (while one may never be able to fully vaccinate a human population, it would be possible, for example, in a group of animals on a farm). Moreover, by bounding the vaccination rate \(U_i(t)\) above by some function of vaccination demand \(K_i(W(t),t)\), decreasing vaccination uptake can still be modelled.

2.3 Recovery of the Standard Model

The standard model is a special case of the general model, meaning that the results of this paper are applicable to both frameworks. To show this, one can solve the equation

$$\begin{aligned} \frac{U_i(t)}{N_i - W_i(t)} = U_i^*(t) \Rightarrow \frac{d}{dt}\bigg (\log (N_i-W_i(t)) + W_i^*(t)\bigg ) = 0, \end{aligned}$$
(4)

where

$$\begin{aligned} W^*_i(t):= \int _0^tU_i^*(s)ds. \end{aligned}$$

Thus, by integrating (4), and noting that \(W^*_i(0) = W_i(0) = 0\)

$$\begin{aligned} \log (N_i - W_i(t)) + W_i^*(t) = \log (N_i) \end{aligned}$$

and so

$$\begin{aligned} W_i(t) = N_i(1-e^{-W_i^*(t)}). \end{aligned}$$

The constraint \(U^*_i(t) \le {\mathcal {U}}_i\) is equivalent to \(U_i(t) \le (N_i-W_i(t)){\mathcal {U}}_i\) and so this can also be represented in the general model. Thus, given any standard vaccination policy \({\varvec{U}}^*\), it can be replaced by a general policy \({\varvec{U}}\) (although the converse does not hold as \(W_i(t) = N_i\) requires \(W_i^*(t) = \infty \)).

Moreover, note that \(W^*_i(t)\) is increasing in \(W_i(t)\). Thus, if a pair of general policies \({\varvec{U}}\) and \(\tilde{{\varvec{U}}}\) satisfy \(W_i(t) \le {\tilde{W}}_i(t)\) then this inequality is preserved by the corresponding standard policies as \(W^*_i(t) \le {\tilde{W}}^*_i(t)\). This property means that the theorems proved in this paper will hold for both models (as they will be proved using the general model).

3 Optimisation Problem

Now that the model has been formulated, it is possible to set up the optimisation problem that will be considered in the remainder of this paper.

3.1 Constraints on \(U_i(t)\)

In order to assist the proof of the theorems, it is necessary to make some (unrestrictive) assumptions on the vaccination rates, \(U_i(t)\).

Firstly, there are the physical constraints that for each \(i \in \{1,...,n\}\)

$$\begin{aligned} U_i(t)\ge 0\quad \text {and} \quad \int _0^tU_i(s)ds \le N_i \quad \forall t \ge 0. \end{aligned}$$
(5)

It is also necessary that \(U_i(t)\) is within the class of functions such that solutions to the model equations exist and are unique. Discussion of the exact conditions necessary for this to hold is outside the scope of this paper. However, from the Picard-Lindelöf Theorem (Collins 2006), a sufficient condition for this is that \(U_i(t)\) is a piecewise Lipschitz continuous function. While this is not a necessary condition, this illustrates that this assumption will hold for a large class of functions. However, it will be helpful throughout the course of the proof to explicitly assume two conditions on \(U_i(t)\) - namely, that it is bounded and that it is Lebesgue integrable on \(\Re \) for each i.

For the remainder of this paper, define the set of feasible vaccination policies, C, is the set of functions \({\varvec{U}}\) satisfying (5) such that unique solutions to the model equations exist with these functions as the vaccination policy and such that each \(U_i(t)\) is bounded and Lebesgue integrable on \(\Re \).

3.2 Optimisation Problem

The aim is to choose the vaccination policy \({\varvec{U}} \in C\) such that the total number of deaths (or any linear function of the infections in each subgroup) is minimised while meeting additional constraints on vaccine supply and vaccination rate. It is assumed that the maximal rate of vaccination at time t is A(t) and that there is a total (non-decreasing) supply of B(t) vaccinations that has arrived by time t. Thus, for each i, \(U_i(t)\) is constrained to satisfy

$$\begin{aligned} \sum _{i=1}^nU_i(t) \le A(t)\quad \text {and} \quad \sum _{i=1}^n W_i(t) \le B(t). \end{aligned}$$

As previously discussed, it is assumed that each infection of unvaccinated people in group i is weighted by some \(p_i\) and that the infection is no more serious for those that have been vaccinated, so that the weighting of an infection of a vaccinated person in group i is \(p_i\kappa _i\), where \(\kappa _i \le 1\). Thus, the objective function is

$$\begin{aligned} H({\varvec{U}}):= \sum _{i=1}^np_i\bigg (R_i(\infty ) + \kappa _i R^V_i(\infty )\bigg ) \end{aligned}$$

where, for example

$$\begin{aligned} R_i(\infty ) = \lim _{t \rightarrow \infty }(R_i(t)). \end{aligned}$$

Note these limits exist as \(R_i\) is non-decreasing and bounded by Lemma C.3. Hence, the optimisation problem is

$$\begin{aligned} \min \bigg \{\sum _{i=1}^np_i\bigg (R_i(\infty ) + \kappa _i R^V_i(\infty )\bigg )&: \sum _{i=1}^nU_i(t) \le A(t),\quad \sum _{i=1}^n W_i(t) \le B(t) \quad \forall i,t...\nonumber \\&\text {and} \quad {\varvec{U}} \in C\bigg \}. \end{aligned}$$
(6)

4 Main Results

The main results of this paper are as follows. Firstly, it is shown that the objective function is non-increasing in vaccination effort.

Theorem 1

Suppose that \({\varvec{U}},\tilde{{\varvec{U}}} \in C\). Suppose further that for each \(i \in \{1,...,n\}\) and \(t \ge 0\)

$$\begin{aligned} \int _0^t U_i(s)ds \le \int _0^t {\tilde{U}}_i(s)ds \end{aligned}$$

Then

$$\begin{aligned} H({\varvec{U}}) \ge H(\tilde{{\varvec{U}}}). \end{aligned}$$

Then, it is shown that if an optimal solution exists, there is an optimal maximal effort solution.

Theorem 2

Suppose that B is differentiable, and that there is an optimal solution \({\varvec{U}}\) to (6). Then, define the function

$$\begin{aligned} \chi (t):= \left\{ \begin{matrix} A(t) &{} \text {if} \quad \int _0^t\chi (s)ds < B(t) \\ \min (A(t),B'(t)) &{} \text {if} \quad \int _0^t\chi (s)ds \ge B(t) \end{matrix}\right. \end{aligned}$$

and suppose that \(\chi (t)\) exists and is bounded. Then, there exists an optimal solution \(\tilde{{\varvec{U}}}\) to the problem (6) such that

$$\begin{aligned} \sum _{i=1}^n{\tilde{W}}_i(t) =\min \bigg (\int _0^t \chi (s)ds,1\bigg ). \end{aligned}$$
(7)

Moreover, if \(\chi (t)\) is continuous almost everywhere, there exists an optimal solution \(\tilde{{\varvec{U}}}\) such that

$$\begin{aligned} \sum _{i=1}^n {\tilde{U}}_i(t) = \left\{ \begin{matrix} \chi (t) &{} \text {if } \int _0^t \chi (s)ds < 1 \\ 0 &{} \text {otherwise}\end{matrix}\right. \end{aligned}$$

It is perhaps concerning to the reader that the existence of \(\chi \) is left as an assumption in this theorem. However, while the exact conditions on its existence are beyond the scope of this paper, it certainly exists for a wide class of functions A(t) and B(t), as proved in Lemma B.11.

Finally, it is shown that this principle still holds if the cost of vaccination is considered.

Theorem 3

Under the assumptions of Theorem 2, consider a modified objective function \({\mathcal {H}}\) given by

$$\begin{aligned} {\mathcal {H}}({\varvec{U}}) = H({\varvec{U}}) + F({\varvec{W}}(\infty )) \end{aligned}$$

for any function F. Then, with \(\chi \) defined to be the maximal vaccination effort as in Theorem 2, there exists an optimal solution \(\tilde{{\varvec{U}}}\) such that, for some \(\tau \ge 0\)

$$\begin{aligned} \sum _{i=1}^n{\tilde{W}}_i(t) =\left\{ \begin{matrix} \int _0^t \chi (s)ds&{} \text {if } t \le \tau \\ \\ W_i(\tau ) &{} \text {otherwise} \\ \end{matrix}\right. . \end{aligned}$$

Moreover, if \(\chi \) is continuous almost everywhere, then there is an optimal solution \(\tilde{{\varvec{U}}}\) such that

$$\begin{aligned} \sum _{i=1}^nU_i(t) =\left\{ \begin{matrix} \chi (t)&{} \text {if } t \le \tau \\ 0 &{} \text {otherwise} \\ \end{matrix}\right. . \end{aligned}$$

5 Sketch Proof

The full proofs of Theorems 1, 2 and 3 can be found in Appendix A, with supplementary lemmas found in Appendix B and C. However, this section provides a high-level sketch of the main arguments.

5.1 Bounds on the Inter-Group Infectious Forces

Define

$$\begin{aligned} K_{ij}(t) = \frac{\beta ^1_{ij}}{\mu ^1_j}R_j(t) + \frac{\beta ^2_{ij}}{\mu ^2_j}R^V_j(t) \end{aligned}$$

and

$$\begin{aligned} L_{ij}(t):= \frac{\beta _{ij}^3}{\mu _i^1} R_j(t) + \frac{\beta _{ij}^4}{\mu _i^2}R^V_j(t). \end{aligned}$$

\(K_{ij}(t)\) can be interpreted as the total infectious force up to time t from the members of group j on the unvaccinated members of group i as

$$\begin{aligned} K_{ij}(t) = \int _0^t (\beta ^1_{ij}I_j({\tilde{t}}) + \beta ^2_{ij}I_j^V({\tilde{t}}))d{\tilde{t}}. \end{aligned}$$

Similarly, \(L_{ij}(t)\) can be interpreted as the total infectious force up to time t from the members of group j on the vaccinated members of group i.

The first part of the proof shows that increasing the vaccination effort will decrease these infectious forces. To facilitate the proof, some extra assumptions are made on the parameters (which will be removed in subsequent propositions).

Proposition 1

Suppose that \(U_i(t)\) and \({\tilde{U}}_i(t)\) are right-continuous step functions. Moreover, suppose that

$$\begin{aligned}{} & {} \beta ^1_{ij}> \beta _{ij}^3> 0 \quad \forall i,j \in \{1,...,n\}, \\{} & {} S_i(0)I_i(0) > 0 \quad \forall i \in \{1,...n\}. \end{aligned}$$

and that

$$\begin{aligned} W_i(t) < N_i \quad \forall t \ge 0 \quad \text {and} \quad \forall i \in \{1,...,n\} \end{aligned}$$

Then,

$$\begin{aligned} K_{ij}(t) \ge {\tilde{K}}_{ij}(t) \quad \text {and} \quad L_{ij}(t) \ge {\tilde{L}}_{ij}(t) \quad \forall t \ge 0. \end{aligned}$$
(8)

This proposition is proved by contradiction in two parts. Firstly, a time T is introduced, which is the infimum of the times where at least one of \(K_{ij}(t)< {\tilde{K}}_{ij}(t)\) or \(L_{ij}(t) < {\tilde{L}}_{ij}(t)\) for some i and j. As the infectious forces do not satisfy this condition in [0, T], one can show that, necessarily, they must all have been equal in [0, T], which means that one must have \(W_i(t) = {\tilde{W}}_i(t)\) for all \(t \in [0,T]\).

From here, the proof can proceed by a short-time linearisation, considering the small interval \([T,T+\delta ]\). The condition on \(U_i\) and \({\tilde{U}}_i\) being step functions allows for them to be considered constant in this interval. It can then be shown that (8) must hold in \([T,T+\delta ]\), which contradicts the definition of T and completes the proof.

5.2 A Proof for a Restricted Parameter and Policy Set

Proposition 1 can be extended to prove the result of Theorem 1 under the more restrictive set of conditions it introduced.

Proposition 2

Under the conditions of Proposition 1, for any \(t \ge 0\) and \(i \in \{1,...,n\}\)

$$\begin{aligned} I_i(t) + I^V_i(t) + R_i(t) + R^V_i(t) \ge {\tilde{I}}_i(t) + {\tilde{I}}_i^V(t) +{\tilde{R}}_i(t) +{\tilde{R}}_i^V(t) \end{aligned}$$

and

$$\begin{aligned} R_i(t) \ge {\tilde{R}}_i(t). \end{aligned}$$

Moreover, for any \(\lambda \in [0,1]\)

$$\begin{aligned} R_i(\infty ) + \lambda R^V_i(\infty ) \ge {\tilde{R}}_i(\infty ) + \lambda {\tilde{R}}_i^V(\infty ) \end{aligned}$$

and hence, the objective function is lower for \({\tilde{U}}\), provided the conditions of Proposition 1 are met.

This comes from finding \(S_i + S_i^V\) in terms of \(K_{ij}\), \(L_{ij}\) and W, and showing that \(S_i + S_i^V \le {\tilde{S}}_i + {\tilde{S}}_i^V\)—that is, that more people were infected in the \(U_i\) case. Taking limits, and using a similar approach to consider the number of unvaccinated infections then shows the required result.

5.3 Generalisation

This result can be generalised to the original set of parameters and vaccination policies by using the continuous dependence of the number of infections on the parameters and the vaccination policy.

From here, it is simple to weaken the inequalities on the parameters introduced in Proposition 1. The treatment of the vaccination policies requires more care, as it is not necessarily true that a Lebesgue intergrable \({\varvec{U}}\) can be approximated by step functions. However, its integral, \({\varvec{W}}\), can be approximated by the integral of step functions, and this allows the result of Proposition 2 to be generalised to Theorem 1.

5.4 Theorem 2

Theorem 2 is proved as follows. Firstly, one can show that, for any vaccination policy \({\varvec{U}}\) and \(t \ge 0\),

$$\begin{aligned} \min \bigg (\int _0^t\chi (s)ds,1\bigg ) \ge \int _0^t\sum _{i=1}^nU_i(s)ds, \end{aligned}$$

using the definition of \(\chi \) in terms of the constraints on \({\varvec{U}}\). This means that the total rate of vaccination given by \(\tilde{{\varvec{U}}}\) is at least as high as that given by \({\varvec{U}}\).

One can then show that \(\chi (t) \le A(t)\)

$$\begin{aligned} \int _0^t\chi (s)ds\le B(t) \end{aligned}$$

which means that \(\tilde{{\varvec{U}}}\) satisfies the vaccination constraints.

From here, one can transform any optimal vaccination policy \({\varvec{U}}\) into suitable \(\tilde{{\varvec{U}}}\). Initially, the quantities \({\tilde{W}}_i(t)\) are constructed. The details of this are left to the appendix but the general principle is that the policy \({\varvec{U}}\) is compressed in time so that the total number of vaccinations given out matches \(\min \bigg (\int _0^t\chi (s)ds,1\bigg )\). It may also be necessary to add additional vaccinations if the overall total differs—these can be assigned in proportion to the number of unvaccinated people in each group.

This construction ensures that the feasibility constraints \({\tilde{W}}_i \le N_i\) are satisfied. Moreover, one can show that \({\tilde{W}}_i\) is Lipschitz continuous, which allows for the construction of a derivative \({\tilde{U}}_i\) which integrates to \({\tilde{W}}_i\). Finally, one can show that \({\tilde{W}}_i(t) \ge W_i(t)\), meaning that, by Theorem 1, \(\tilde{{\varvec{U}}}\) must also be an optimal vaccination policy.

5.5 Theorem 3

The proof of Theorem 3 then follows from a similar construction to Theorem 2—the only difference is that no additional vaccinations are assigned by \(\tilde{{\varvec{U}}}\) compared to \({\varvec{U}}\).

6 Limitations of Theorem 1

It is helpful to consider the limitations of Theorem 1, as it does not prove that every conceivable cost function is non-increasing in vaccination effort. This will be illustrated through some examples based on theoretical COVID-19 outbreaks in the United Kingdom.

Using the work of Prem et al. (2017), one can split the UK into 16 age-groups (comprising five-year intervals from 0 to 75 and a group for those aged 75+) which mix heterogeneously. The contact matrices estimated in Prem et al. (2017) allow for the construction of a matrix \(\varvec{\beta }^*\), which will be proportional to each of the matrices \(\varvec{\beta }^{\alpha }\) in the model.

As illustrated in Liu et al. (2020), estimation of the basic reproduction number \(R_0\) for COVID-19 is complicated, and a wide range of estimates have been produced. For the examples in this paper, a reproduction number of 4 will be used, meaning that \(\beta ^1\) will be scaled so that the largest eigenvalue of the matrix given by

$$\begin{aligned} M_{ij} = \frac{\beta ^1_{ij}N_i}{\mu _i^1} \end{aligned}$$

is equal to 4. Note that the population of each group \({\varvec{N}}\)—normalised to have total sum 1—is taken from (UN 2019). Moreover, based on the estimates in Ram and Schaposnik (2021), the value of \(\mu _i^1\) and, in the first example, \(\mu _i^2\) will be set equal to \(\frac{1}{14}\).

To model the effectiveness of vaccination, the estimates of Dean and Halloran (2022) will be used so that \(\beta ^2 = 0.77\beta ^1\), (modelling the reduction in infectiousness), \(\beta ^3 = 0.3\beta ^1\) (modelling the reduction in susceptibility) and \(\beta ^4 = 0.77\times 0.3\times \beta ^1\) (assuming these effects are independent). Finally, the initial conditions used are \(S_i(0) = (1-10^{-4})N_i\) and \(I_i(0) = (10^{-4})N_i\) for each i, modelling a case where \(0.01\%\) of the population is initially infected. It should be emphasised however, that this model has purely been made for illustrative purposes and substantially more detailed fitting analysis would be required to use it for forecasting COVID-19 in the UK.

In both the subsequent examples, it will be assumed that \(0.5\%\) of the population is vaccinated homogeneously each day in the vaccination case. This will be compared to a case with no vaccination.

6.1 Infections Are Not Decreasing For All Time

While the overall number of infections will decrease as vaccination effort increases, the infections at a particular point in time will not. Figure 1 shows that the effect of vaccination is both to reduce, but also delay the peak of the infections. This is an important consideration when deciding vaccination policy, as increasing infections at a time in the year when hospitals are under more pressure could have negative consequences, and so it is important not to simply assume that vaccination will reduce all infections at all times.

Fig. 1
figure 1

A comparison of the total infections over time for a simulated COVID-19 epidemic in the UK, depending on whether a uniform vaccination strategy of constant rate is used

6.2 Deaths Are Not Decreasing For All Time

Perhaps most surprisingly, the total deaths in the epidemic may at some finite times (although not at \(t = \infty \)) be higher when vaccination occurs, at least under the assumptions of the SIR model. This is a rarer phenomenon, but is possible if vaccination increases the recovery rate as well as decreasing infectiousness.

For illustrative purposes, suppose that vaccination doubles the recovery rate (so that \(\mu _i^2 = \frac{1}{7}\)) and has no effect on mortality rates. Then, using Bonanad et al. (2020) to get age-dependent mortality rates for COVID-19, Fig. 2 shows that initially, the number of deaths is higher in the case of vaccination. This occurs because the higher value of \(\mu ^2\) means that vaccinated people move more quickly to the \(R^V\) compartment than their unvaccinated counterparts and so, while they will infect fewer people, when the number of infections is comparable in the early epidemic, this means that more people will die. Indeed, this property can still hold if vaccination reduced mortality rates (although this reduces the already small difference between the two further—in this example, one needs \(\kappa _i \gtrsim 0.9999\) for deaths to ever be lower in the non-vaccinated case).

Of course, this is not a realistic reflection of the course of an epidemic—the reason for \(\mu ^2\) being higher is that vaccinated people are likely to get less ill rather than dying more quickly—but it illustrates a potential limitation of the SIR framework. One possible way to avoid this problem would be to split the recovered compartment up into the truly recovered and dead subsections. Then, vaccination could increase the speed at which infected members of the population moved to the recovered compartment, but not the speed at which they moved to the dead compartment. This would remove the possibility of seeing the counter-intuitive behaviour of Fig. 2.

Fig. 2
figure 2

The difference between proportion of the population that has died by each time t in the case of vaccination and non-vaccination. Positive values indicate that the deaths are higher in the non-vaccination case

7 Discussion

It is comforting that the multi-group SIR model does indeed satisfy the condition that the final numbers of infections and deaths are non-increasing in vaccination effort. This shows the importance of ensuring that vaccinations are available as early as possible in a disease outbreak. To achieve this, it is important that good plans for vaccine roll-out and supply chains are available in advance of them being needed to ensure that maximum benefit from the vaccination program is obtained.

For \(n > 1\), there are, of course, many possible maximal-effort vaccination policies. The results of this paper, in effect, reduce the dimension of the space of possible vaccination policies from n to \(n-1\), as one can assume that an optimal policy satisfies the condition (7) in Theorem 2. However, choosing the correct groups to prioritise is still of crucial importance and can have a substantial impact of the effectiveness of the vaccination campaign (Fitzpatrick and Galvani 2021). Applying similarly rigorous techniques to finding the optimal vaccination policy is beyond the scope of this paper, although we extended the results of this paper to apply asymptotic techniques to understand the behaviour of the optimal solution under certain special cases in Penn and Donnelly (2023).

However, there are limitations to these results. Indeed, while the final numbers of infections and deaths are guaranteed to decrease, this is not necessarily true at a given finite time. In particular, vaccination can move the peak of the epidemic, and so it is important to consider the consequences of this, particularly if only a small number of lives are saved by vaccination.

Moreover, while this has not been discussed in this paper, it is also important to emphasise that these results only apply if vaccine efficacy does not decay over time. Indeed, if vaccination efficacy does decay significantly, then vaccinating the most vulnerable groups in a population very early may be worse than vaccinating them later, unless booster jabs are available. If the main epidemic occurs long after the vulnerable have been vaccinated, their immunity may have worn off significantly by the time that the majority of disease exposure occurs. Thus, in this case a more detailed analysis would be needed to determine the optimal vaccination rate.

The authors believe that future models for optimal vaccination should consider using the more general vaccination model introduced in this paper. This allows for greater flexibility in modelling the effect of decreasing demand. Of course, this modified model is slightly more complicated, and care needs to be taken to avoid numerical instabilities arising from the removable singularity in the \(\frac{U_i S_i}{N_i - W_i}\) term when \(W_i \rightarrow N_i\). However, it has been shown that many of the standard properties of SIR models, and indeed the results of this paper, still hold for this model, and so these extra technical difficulties appear to be a small price to pay for the significantly increased accuracy and potentially large difference between the optimal solutions for the two models.

The results of this paper could be extended to cover a wider range of disease models that are currently being used in the literature. In particular, the next step could be to prove the results for SEIR (Susceptible-Exposed-Infected-Recovered) models, and indeed models with multiple exposed compartments for each subgroup. This would help to build a general mathematical theory of maximal-effort vaccination that would provide evidence for the reliability of contemporary epidemiological modelling.

8 Conclusion

The results of this paper are summarised below:

  • Vaccinating at maximal effort is optimal for a multi-group SIR model with non-decaying vaccination efficacy.

  • The general vaccination model introduced in this paper provides greater flexibility in modelling the effect of decreasing vaccination uptake.

  • While vaccinating at maximal effort gives optimality, there can be finite times at which, according to the SIR model, infections or deaths are higher if vaccination has occurred.