1 Introduction

The world population undergoes successive epidemics of viral infections with important health, social and economic consequences. In recent decades, these included the SARS epidemic in 2002–2003 (Anderson et al. 2004; Lam et al. 2003), H5N1 influenza in 2005 (Chen et al. 2006; Kilpatrick et al. 2006), H1N1 influenza in 2009 (Girard et al. 2010; Jain et al. 2009), Ebola in 2014 (Frieden et al. 2014; WHO Ebola Response Team 2014), and, currently, the COVID-19 pandemic, which has already lasted two years and whose further evolution remains unpredictable.

Nowadays, the importance of mathematical modeling in epidemiology is generally accepted, but the outcome of this modeling is controversial. On the one hand, classical compartmental models allow the evaluation of the main tendencies of epidemic progression. These models have been extended to multicompartment models (see, e.g., Brauer 2008; Giordano et al. 2020; Sharma et al. 2020) and to models with time-varying or nonlinear disease transmission rates (d’Onofrio et al. 2020; Sun et al. 2008). Multipatch models (Bichara and Iggidr 2018; Lahodny and Allen 2013; McCormack and Allen 2007), multigroup models (Elbasha and Gumel 2021) and spatiotemporal models (Ahmed et al. 2019; Filipe and Maule 2004) have been formulated and studied to understand various aspects of epidemic growth (see the monographs (Brauer et al. 2019; Martcheva 2015) and review articles (Hethcote 2000; Hurd and Kaneene 1993) for further details). On the other hand, the main questions about the prediction of epidemic outbreaks and their efficient management remain unsolved. This can be partially explained by the unpredictable emergence of new viruses or virus variants, but the lack of understanding of epidemic progression and its economic consequences in a complex, multiconnected modern society leads to an empirical trial-and-error approach, clearly illustrated during the COVID-19 pandemic (Supino et al. 2020).

The ongoing pandemic has stimulated important modeling efforts directed at the application of existing models and at their critical rethinking. Compartmental epidemiological models, like the classical SIR model, are based on the assumptions that newly infected individuals at time t appear at a rate proportional to the product of the numbers of susceptible individuals S(t) and infected individuals I(t), and that the recovery and death rates are proportional to the number of infected individuals. The first assumption is justified for homogeneous populations, but the second assumption has limited applicability. Indeed, assuming that the average disease duration is \(\tau \), we conclude that the recovery and death rates at time t are determined by the number of infected individuals \(I(t-\tau )\) at the time of disease onset \(t-\tau \), which can be very different from I(t) unless the epidemic progression is slow (the basic reproduction number is close to 1). In a more detailed description, we do not consider a fixed disease duration but take into account that the recovery and death rates depend on the disease status of infected individuals, that is, on the time elapsed since disease onset.

The recovery and death rates can vary significantly depending on the individual disease progression (Github 2022). These factors are rarely taken into account in mathematical models (Feng et al. 2007; Hethcote and Tudor 1980), and further studies are needed to clarify the significance of immunological factors in capturing the incubation period (Culshaw et al. 2003; Leclerc et al. 2014; Vargas-De-León 2012), time-dependent immunity (Kyrychko and Blyuss 2005; Taylor and Carr 2009; Yuan and Bélair 2014) and other factors. Continuous dependence of the disease transmission rate on immunological parameters (e.g., instantaneous viral load) has also been incorporated and studied using continuous time delay models (Gilchrist and Sasaki 2002).

In this work, we continue to study the influence of the disease time course on the epidemic progression. We propose a compartmental model based on integro-differential equations in which the recovery and death rates at time t depend on the time interval \(t-\eta \) elapsed since disease onset for the individuals infected at time \(\eta \). We determine the main features of epidemic progression and show that they differ from those of conventional compartmental models. Further, we illustrate the application of this modeling approach to the COVID-19 data.

The contents of the paper are as follows. In Sect. 2, we introduce the model and study the positiveness of solutions. Next, we show how it can be reduced to the conventional SIR model in some particular cases. In order to apply this model to the investigation of COVID-19 epidemic, we determine time-dependent recovery and death rates from the available data (Sect. 3). We compare the characteristics of epidemic progression in the data and in different models in Sects. 4 and 5. Time-dependent disease transmission rate is estimated in Sect. 6.

2 Model with Distributed Recovery and Death Rates

Recovery and death rates of infected individuals depend on time after the disease onset. In this section, we will derive a model based on the number of newly infected individuals and their recovery and death rates depending on time after infection. We will study some properties of this model and will show that conventional SIR model can be obtained from it under some particular assumptions.

2.1 Model Formulation

We propose an integro-differential equation model where the recovery and death rates depend on the time-since-infection of the infected individuals. Let J(t) be the number of newly infected individuals at time t, while S(t), I(t), R(t) and D(t) denote the total numbers of susceptible, infected, recovered and dead individuals at time t. The total number of infected at time t is given by the following expression:

$$\begin{aligned} I(t) = \int _0^t J(\eta ) d \eta - R(t) - D(t). \end{aligned}$$
(1)

We assume that the total population size remains constant, \(S(t)+I(t)+R(t)+D(t)=N\), that is, natural natality and mortality rates are assumed to be equal to each other. Using the equality \(I(t)+R(t)+D(t)=N-S(t)\) and differentiating equality (1), we obtain: \(\frac{d S}{dt} = - J(t)\). On the other hand, the rate of change of the susceptible population is given by the equation

$$\begin{aligned} \frac{d S}{dt} = - \beta \frac{S}{N} \; I \;\; (= - J(t)) , \end{aligned}$$

where \(\beta \) is the disease transmission rate.

Let \(r(t-\eta )\) and \(d(t-\eta )\) be the recovery and death rates depending on the time-since-infection \(t-\eta \). Then the number of infected individuals who will recover at time t is given by the expression:

$$\begin{aligned} \int _0^t r(t-\eta ) J(\eta ) d \eta . \end{aligned}$$

Similarly, we determine the number of infected individuals who will die at time t:

$$\begin{aligned} \int _0^t d(t-\eta ) J(\eta ) d \eta . \end{aligned}$$

Thus, the rate of change of the infected compartment I(t) is given by the following equation:

$$\begin{aligned} \frac{d I}{dt} = \beta \frac{S}{N} \; I - \int _0^t r(t-\eta ) J(\eta ) d \eta - \int _0^t d(t-\eta ) J(\eta ) d \eta . \end{aligned}$$

The rates of change of the recovered compartment R(t) and the dead compartment D(t) are given, respectively, by the equations:

$$\begin{aligned} \frac{d R}{dt} = \int _0^t r(t-\eta ) J(\eta ) d \eta ,\,\,\frac{d D}{dt} = \int _0^t d(t-\eta ) J(\eta ) d \eta . \end{aligned}$$

Hence, we obtain the following model:

$$\begin{aligned} \frac{d S}{dt}= & {} - \beta \frac{S}{N} \; I, \end{aligned}$$
(2a)
$$\begin{aligned} \frac{d I}{dt}= & {} \beta \frac{S}{N} \; I - \int _0^t r(t-\eta ) J(\eta ) d \eta - \int _0^t d(t-\eta ) J(\eta ) d \eta , \end{aligned}$$
(2b)
$$\begin{aligned} \frac{d R}{dt}= & {} \int _0^t r(t-\eta ) J(\eta ) d \eta , \end{aligned}$$
(2c)
$$\begin{aligned} \frac{d D}{dt}= & {} \int _0^t d(t-\eta ) J(\eta ) d \eta , \end{aligned}$$
(2d)

where \(J(t)\,=\,\beta S(t)I(t)/N\). This system of equations should be completed by the initial conditions \(S(0)=N-I_0\), \(I(0)=I_0 > 0\), \(R(0)=0\), \(D(0)=0\) and \(J(t)=0\) for \(t < 0\), so that the four compartments sum to N at the initial time. We will study below some properties of this model and will apply it to assess the epidemic progression.
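A minimal numerical sketch of system (2a)–(2d) is given below, assuming an explicit Euler scheme in which the integrals are replaced by Riemann sums over the stored history of J. The function name `simulate_distributed`, the step size, the midpoint sampling of the kernels and the exponential-shaped demonstration kernels are illustrative assumptions and are not part of the model. The sketch also assumes \(S(0)=N-I_0\), so that the four compartments sum to N.

```python
import math


def simulate_distributed(beta, N, I0, r, d, T, dt):
    """Explicit Euler sketch of system (2a)-(2d).

    r and d are callables returning the recovery and death rates as
    functions of the time since infection. They are sampled at the midpoint
    of each time-since-infection bin, and the integrals over J are replaced
    by Riemann sums (a discretization choice, not part of the model).
    """
    n = int(round(T / dt))
    kern_r = [r((m + 0.5) * dt) for m in range(n)]
    kern_d = [d((m + 0.5) * dt) for m in range(n)]
    # Assumption: S(0) = N - I0, so that S + I + R + D = N at t = 0.
    S, I, R, D = float(N - I0), float(I0), 0.0, 0.0
    J_hist = []
    traj = [(0.0, S, I, R, D)]
    for k in range(n):
        J = beta * S * I / N  # newly infected per unit time
        J_hist.append(J)
        # Discrete convolutions approximating the integrals in (2b)-(2d).
        rec = sum(kern_r[k - j] * J_hist[j] for j in range(k + 1)) * dt
        dead = sum(kern_d[k - j] * J_hist[j] for j in range(k + 1)) * dt
        S -= dt * J
        I += dt * (J - rec - dead)
        R += dt * rec
        D += dt * dead
        traj.append(((k + 1) * dt, S, I, R, D))
    return traj


# Hypothetical demonstration kernels (exponential in the time since infection).
RATE = 1.0 / 16.0


def demo_r(s):
    return 0.97 * RATE * math.exp(-RATE * s)


def demo_d(s):
    return 0.03 * RATE * math.exp(-RATE * s)


trajectory = simulate_distributed(0.3, 1.0e5, 1.0, demo_r, demo_d, T=120.0, dt=0.5)
t_end, S_end, I_end, R_end, D_end = trajectory[-1]
print(f"t={t_end:.0f}: S={S_end:.0f}, I={I_end:.0f}, R={R_end:.0f}, D={D_end:.0f}")
```

Since the increments of the four compartments cancel at every step, this scheme preserves the total population up to rounding error. The cost grows quadratically with the number of time steps because the whole history of J enters each convolution.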

The proposed model implicitly captures the features of multicompartment models with symptomatic and asymptomatic compartments. Considering these compartments explicitly presumes that individuals in the two compartments differ in infectivity and in the time required for recovery. We show below (Sect. 3) that r(t) and d(t) are well described by gamma distributions. We can assume that asymptomatic individuals recover much earlier than symptomatic individuals; the distributed recovery rate then accounts for the difference in recovery times between the two compartments. Multicompartment epidemic models for COVID-19 also include an exposed compartment, whose members are less infectious than the infected individuals. This aspect is taken into account by calculating the rate of infectivity from the time series of daily infected. Available data for COVID-19 infection do not differentiate between exposed, symptomatic and asymptomatic infected individuals; hence, we can consider them as a single compartment (Ghosh et al. 2022).

2.2 Positiveness of Solutions

Since Eq. (2b) contains negative integral terms, we should verify that the solution of system (2a)–(2d) remains positive. From (2a), we observe that if \(S(t_*) =0\) at some time \(t_*\), then \(\left. \frac{dS}{dt}\right| _{t=t_*} = 0\). This shows that \(S(t) \ge 0\) for all \(t >0\). From (2c), (2d) we get that R(t), D(t) are nondecreasing functions. Hence, R(t) and D(t) also remain nonnegative for all t. Next, we prove that \(I(t)>0\). Take some \(t_0 > 0\). Then from (1) we have

$$\begin{aligned} I(t_0) = \int _0^{t_0} J(\eta ) d \eta - R(t_0) - D(t_0). \end{aligned}$$
(3)

Integrating (2c), (2d) from 0 to \(t_0\) with \(R(0)=D(0)=0\) and taking their sum, we get the equality

$$\begin{aligned} R(t_0) + D(t_0) = \int _0^{t_0} \bigg ( \int _0^{t} \big ( r(t-\eta ) + d(t-\eta ) \big ) J(\eta ) d \eta \bigg ) dt. \end{aligned}$$

Changing the order of integration, we obtain

$$\begin{aligned} R(t_0) + D(t_0)= & {} \int _0^{t_0} \bigg ( \int _{\eta }^{t_0} \big ( r(t-\eta ) + d(t-\eta ) \big ) dt \bigg ) J(\eta ) d \eta . \end{aligned}$$
(4)

Since the integral \(\int _{\eta }^{t_0}( r(t-\eta ) + d(t-\eta ) ) dt\) gives the proportion, among those infected at time \(\eta \), of individuals who recover or die between time \(\eta \) and time \(t_0\), it is less than 1. Consequently,

$$\begin{aligned} R(t_0) + D(t_0) < \int _0^{t_0} J(\eta ) d\eta . \end{aligned}$$

Together with (3), this inequality gives that \(I(t_0) > 0\). Therefore, I(t) remains positive for all t. This conclusion completes the proof of positiveness of solution of system (2a)–(2d).
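The change in the order of integration leading to (4) has a direct discrete analogue, sketched below for a unit time step. The sequences `J` and `kernel` are arbitrary illustrative values (assumptions, not data), chosen so that the kernel sums to less than 1, and the function name `removed_two_ways` is hypothetical.

```python
def removed_two_ways(J, kernel, dt=1.0):
    """Return the discrete double sum for R(t0) + D(t0) in both orders."""
    n = len(J)
    # Outer sum over t, inner sum over infection times eta <= t.
    by_time = sum(kernel[i - j] * J[j]
                  for i in range(n) for j in range(i + 1)) * dt * dt
    # Outer sum over eta, inner sum over the elapsed time t - eta
    # (discrete form of Eq. (4)).
    by_cohort = sum(J[j] * sum(kernel[:n - j]) * dt for j in range(n)) * dt
    return by_time, by_cohort


# Illustrative inputs: growing incidence and a geometric kernel of mass < 1.
J = [float(k * k + 1) for k in range(30)]
kernel = [0.1 * 0.9 ** m for m in range(30)]

by_time, by_cohort = removed_two_ways(J, kernel)
total_infected = sum(J)  # discrete analogue of the integral of J (dt = 1)
print(by_time, by_cohort, total_infected)
```

Both orders of summation give the same total, and this total stays below the cumulative number of infected because each cohort contributes at most its kernel mass.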

2.3 Reduction to the SIR Model

In this section, we show that model (2a)–(2d) can be reduced to the conventional SIR model under some assumptions. Consider the recovery and death rates in the form

$$\begin{aligned} r(t-\eta )= \left\{ \begin{array}{ccc} r_0 &{},&{} t-\tau< \eta \le t\\ 0 &{},&{} \eta< t -\tau \end{array} \right. , \;\;\; d(t-\eta )= \left\{ \begin{array}{ccc} d_0 &{},&{} t-\tau< \eta \le t \\ 0 &{},&{} \eta < t -\tau \end{array} \right. , \;\;\; \end{aligned}$$
(5)

where \(\tau >0\) is disease duration and \(r_0\) and \(d_0\) are some constants. Substituting these functions in (2c) and (2d), we get

$$\begin{aligned} \frac{dR}{dt} = r_0 \int _{t-\tau }^t J(\eta )d\eta , \;\; \frac{dD}{dt} = d_0 \int _{t-\tau }^t J(\eta ) d\eta . \end{aligned}$$
(6)

Integrating these equalities from \(t-\tau \) to t, we obtain

$$\begin{aligned} R(t)-R(t-\tau ) = r_0 \int _{t-\tau }^t \bigg (\int _{s-\tau }^s J(\eta ) d\eta \bigg )ds , \\ D(t)-D(t-\tau ) = d_0 \int _{t-\tau }^t \bigg (\int _{s-\tau }^s J(\eta ) d\eta \bigg )ds. \end{aligned}$$

Since the disease duration is assumed to be \(\tau \), equality (1) can be written as follows:

$$\begin{aligned} I(t) = \int _{t-\tau }^t J(\eta ) d\eta -(R(t)-R(t-\tau ))-(D(t)-D(t-\tau )), \end{aligned}$$
(7)

where \((R(t)-R(t-\tau ))\) and \((D(t)-D(t-\tau ))\) represent the number of recovered and dead during the time interval \((t-\tau ,\;t)\), respectively. Hence, from (7), we have

$$\begin{aligned} I(t) = \int _{t-\tau }^t J(\eta ) d\eta -(r_0+d_0)\int _{t-\tau }^t\bigg (\int _{s-\tau }^s J(\eta ) d\eta \bigg )ds. \end{aligned}$$
(8)

Now, from (2b) and (8),

$$\begin{aligned} \frac{d I}{dt}= & {} \beta \frac{S}{N} \; I - (r_0+d_0)\int _{t-\tau }^t J(\eta ) d \eta \\= & {} \beta \frac{S}{N} \; I -(r_0+d_0)\bigg [I(t)+(r_0+d_0)\int _{t-\tau }^t \bigg (\int _{s-\tau }^s J(\eta ) d\eta \bigg ) ds \bigg ]. \end{aligned}$$

Assuming that \((r_0+d_0)\) is small enough, we neglect the term involving \((r_0+d_0)^2\). Hence, we obtain

$$\begin{aligned} \frac{d I}{dt} \approx \beta \frac{S}{N} \; I - (r_0+d_0)I. \end{aligned}$$
(9)

In this case, system (2a)–(2d) is reduced to conventional SIR model

$$\begin{aligned} \frac{d S}{dt}= & {} - \beta \frac{S}{N} \; I , \end{aligned}$$
(10a)
$$\begin{aligned} \frac{d I}{dt}= & {} \beta \frac{S}{N} \; I - ( r_0+d_0) I , \end{aligned}$$
(10b)
$$\begin{aligned} \frac{d R}{dt}= & {} r_0 I , \;\;\;\; \frac{d D}{dt} \; = \; d_0 I. \end{aligned}$$
(10c)

Thus, assuming uniform distribution of recovery and death rates (5) and that they are small enough, we can reduce model (2a)–(2d) to the classical SIR model. However, in general, these assumptions do not hold, and we need to take into account more realistic recovery and death rate distributions.

3 Estimation of Recovery and Death Rate Functions

3.1 Gamma Distribution

In this section, we estimate the recovery r(t) and death d(t) rate functions for the COVID-19 epidemic. The data for 120 recovered patients and 31 dead individuals from (Github 2022; Verity et al. 2019) were used to fit gamma distributions. Note that there are no recoveries or deaths during the first two days after infection. The maxima of these distributions are reached between 13 and 18 days for recovery and between 10 and 15 days for death (Fig. 1). For some individuals, the recovery time is quite long. These distribution functions for recovery and death take into account asymptomatic, symptomatic and hospitalized compartments. Individuals who recover within 7 to 10 days of infection can be considered asymptomatic, due either to a lower viral load or to a strong immune response. On the other hand, deaths occurring long after the day of infection can be attributed to the hospitalized compartment. Further, the choice of a gamma distribution to model a distributed recovery period is well known in the literature on epidemic modeling (Bailey 1954; Chowell et al. 2009; Lloyd 2001).

We estimate the mean time from the disease onset to recovery as 17.85 days and the mean time to death as 13.19 days. The best-fitted gamma distribution corresponding to recovery, shown by the red curve in Fig. 1a, is given by the expression

$$\begin{aligned} f_1(t)=\frac{1}{b_1^{a_1} \Gamma (a_1)} t^{a_1-1} e^{-\frac{t}{b_1}} \end{aligned}$$

with the estimated parameter values \(a_1=8.06275\) and \(b_1=2.21407\). Similarly, the best-fitted gamma distribution corresponding to death shown by the red curve in Fig. 1b is given by:

$$\begin{aligned} f_2(t)=\frac{1}{b_2^{a_2} \Gamma (a_2)} t^{a_2-1} e^{-\frac{t}{b_2}}, \end{aligned}$$

where \(a_2=6.00014\) and \(b_2=2.19887\).

These functions are normalized in such a way that the total probability of recovery and death equals 1. We set \(r(t) = p_0 f_1(t), d(t) = (1-p_0) f_2(t)\), where \(p_0\) is the survival probability. Its value is estimated from the data as \(p_0=0.97\) (Paul and Lorin 2021).
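The fitted densities and the resulting rate functions can be evaluated with a few lines of code. The sketch below uses the parameter values quoted above. The function names and the log-space evaluation of the density are our own choices for illustration, and the mean \(a b\) and mode \((a-1)b\) are the standard formulas for a gamma distribution with shape a and scale b.

```python
import math

A1, B1 = 8.06275, 2.21407  # recovery: shape a_1 and scale b_1 (from the text)
A2, B2 = 6.00014, 2.19887  # death: shape a_2 and scale b_2 (from the text)
P0 = 0.97                  # survival probability p_0 (from the text)


def gamma_pdf(t, a, b):
    """Gamma density with shape a and scale b, evaluated in log space."""
    if t <= 0.0:
        return 0.0
    log_f = (a - 1.0) * math.log(t) - t / b - a * math.log(b) - math.lgamma(a)
    return math.exp(log_f)


def r(t):
    """Recovery rate r(t) = p_0 f_1(t)."""
    return P0 * gamma_pdf(t, A1, B1)


def d(t):
    """Death rate d(t) = (1 - p_0) f_2(t)."""
    return (1.0 - P0) * gamma_pdf(t, A2, B2)


def gamma_mean(a, b):
    return a * b


def gamma_mode(a, b):
    return (a - 1.0) * b


print("mean time to recovery:", round(gamma_mean(A1, B1), 2))
print("mean time to death:   ", round(gamma_mean(A2, B2), 2))
print("modes:", round(gamma_mode(A1, B1), 2), round(gamma_mode(A2, B2), 2))
```

The printed means reproduce the values of 17.85 and 13.19 days given above, and both modes fall inside the intervals quoted for the maxima of the two distributions.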

Fig. 1 Probability distribution of recovery (a) and death (b) as a function of time (in days) after the onset of infection. The red curves show the best fit gamma distributions (the values of parameters are given in the text) (Color figure online)

Fig. 2 Probability distribution of recovery (a) and death (b) as a function of time (in days) after the onset of infection. The red curves show the best fit (the values of parameters are given in the text) (Color figure online)

3.2 Bimodal Gamma Distribution

Instead of the gamma distribution, other distribution functions can be used to describe the recovery and death rates. It is observed that in some cases there are two groups of recovered (dead) individuals, one with a shorter time to recovery (death) and another with a longer one. In such cases, to obtain a better parametrization of the recovery and death rate functions, we can consider a bimodal gamma distribution, that is, a linear combination of two different gamma distributions. In (Paul and Lorin 2021), the distributions of recovery and death as functions of the onset-to-recovery and onset-to-death times are estimated using the COVID-19 data in Canada. The corresponding data are shown in Fig. 2 by the blue bars. We have fitted these data by bimodal gamma distributions \({\mathcal {F}}_1\) and \({\mathcal {F}}_2\) (red curves in Fig. 2) corresponding to recovery and death rate functions, respectively:

$$\begin{aligned} {\mathcal {F}}_1(t)=\frac{0.91}{b_1^{a_1} \Gamma (a_1)} t^{a_1-1} e^{-\frac{t}{b_1}} + \frac{0.09}{d_1^{c_1} \Gamma (c_1)} t^{c_1-1} e^{-\frac{t}{d_1}}, \\ {\mathcal {F}}_2(t)=\frac{0.94}{b_2^{a_2} \Gamma (a_2)} t^{a_2-1} e^{-\frac{t}{b_2}} + \frac{0.06}{d_2^{c_2} \Gamma (c_2)} t^{c_2-1} e^{-\frac{t}{d_2}}, \end{aligned}$$

where the best-fitted parameter values are as follows: \(a_1=32.52447\), \(b_1=0.65547\), \(c_1=150.40545\), \(d_1=0.26171\) and \(a_2=36.02855\), \(b_2=0.57511\), \(c_2=140.11379\), \(d_2=0.27636.\)
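The bimodal densities can be evaluated as weighted sums of two gamma densities. The sketch below encodes each of \({\mathcal {F}}_1\) and \({\mathcal {F}}_2\) as a list of (weight, shape, scale) triples with the values given above. The names `mixture_pdf` and `mixture_mean` are hypothetical helpers, and the mixture mean is computed from the standard identity that a gamma distribution with shape a and scale b has mean \(ab\).

```python
import math


def gamma_pdf(t, shape, scale):
    """Gamma density evaluated in log space to avoid overflow for large shapes."""
    if t <= 0.0:
        return 0.0
    log_f = ((shape - 1.0) * math.log(t) - t / scale
             - shape * math.log(scale) - math.lgamma(shape))
    return math.exp(log_f)


# (weight, shape, scale) for each component, values taken from the text.
RECOVERY_MIX = [(0.91, 32.52447, 0.65547), (0.09, 150.40545, 0.26171)]
DEATH_MIX = [(0.94, 36.02855, 0.57511), (0.06, 140.11379, 0.27636)]


def mixture_pdf(t, components):
    """Linear combination of gamma densities."""
    return sum(w * gamma_pdf(t, a, b) for w, a, b in components)


def mixture_mean(components):
    """Mean of the mixture: weighted sum of the component means a * b."""
    return sum(w * a * b for w, a, b in components)


for name, mix in (("recovery", RECOVERY_MIX), ("death", DEATH_MIX)):
    mass = sum(mixture_pdf(k * 0.05, mix) for k in range(2401)) * 0.05
    print(name, "total mass on [0, 120]:", round(mass, 4),
          "mixture mean:", round(mixture_mean(mix), 2))
```

Working in log space matters here: with shape parameters above 100, the factors \(t^{c-1}\) and \(\Gamma (c)\) are individually very large, although their ratio is moderate.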

3.3 Sensitivity Analysis

Parameters in the recovery and death distributions presented above are estimated from individual data, which can vary depending on the country, the time period, and the virus variant. We will estimate the sensitivity of the model outcomes (maximal number of infected \(I_m\) and time to the maximal number of infected \(t_m\)) to the shape and scale parameters \(a_1\), \(a_2\), \(b_1\), \(b_2\). For this purpose, we use variance-based sensitivity analysis with the Monte Carlo numerical procedure described in (Saltelli et al. 2008) for computing the full set of first-order sensitivity indices \({\mathcal {S}}_j\) for \(j=1,2,3,4\) corresponding to the parameters \(a_1\), \(a_2\), \(b_1\) and \(b_2\), respectively. We have estimated the parameters \(a_1\), \(a_2\), \(b_1\) and \(b_2\) from the individual-level data given in (Github 2022). Then we use a set of sample points obtained by Latin hypercube sampling in the neighborhood of these estimated parameter values and perform numerical simulations as described in (Saltelli et al. 2008).
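To illustrate the kind of computation involved, the sketch below estimates first-order sensitivity indices from two Latin hypercube samples, using a commonly used Monte Carlo estimator in which one column of the first sample is swapped with the corresponding column of the second. It is not a reproduction of the procedure used for Figs. 3 and 4: the function names, the sample size and, in particular, the toy model are assumptions. In the setting of this section, `model` would map \((a_1, a_2, b_1, b_2)\) to \(I_m\) or \(t_m\) by solving system (2a)–(2d).

```python
import random


def latin_hypercube(n, bounds, rng):
    """n points in the box `bounds`, with one point per stratum in every coordinate."""
    columns = []
    for lo, hi in bounds:
        strata = list(range(n))
        rng.shuffle(strata)
        columns.append([lo + (hi - lo) * (s + rng.random()) / n for s in strata])
    return [list(point) for point in zip(*columns)]


def first_order_indices(model, bounds, n, seed=0):
    """Monte Carlo estimate of the first-order indices S_i of `model`."""
    rng = random.Random(seed)
    A = latin_hypercube(n, bounds, rng)
    B = latin_hypercube(n, bounds, rng)
    yA = [model(x) for x in A]
    yB = [model(x) for x in B]
    mean = sum(yA + yB) / (2 * n)
    var = sum((y - mean) ** 2 for y in yA + yB) / (2 * n - 1)
    indices = []
    for i in range(len(bounds)):
        # Sample A with its i-th column replaced by the i-th column of B.
        yABi = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Outputs are centered to reduce the variance of the estimate.
        num = sum((yb - mean) * (yab - ya)
                  for yb, yab, ya in zip(yB, yABi, yA)) / n
        indices.append(num / var)
    return indices


def toy_model(x):
    """Stand-in for the epidemic model: additive in x[0], x[1]; x[2] is inert."""
    return x[0] + 2.0 * x[1]


S = first_order_indices(toy_model, [(0.0, 1.0)] * 3, n=5000, seed=1)
print([round(s, 3) for s in S])
```

For this additive toy model with independent uniform inputs, the exact first-order indices are 0.2, 0.8 and 0, which gives a simple check of the estimator before it is applied to a costly model.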

Fig. 3 The left panel shows the first-order sensitivity indices corresponding to the model outcome \(I_m\) and the right panel to \(t_m\) in the case of gamma distribution (Color figure online)

The first-order sensitivity indices are shown in Fig. 3 and summarized in Table 1. This sensitivity analysis shows that the model outcomes \(I_m\) (maximal number of infected) and \(t_m\) (time to the maximal number of infected) are most sensitive to the scale parameter \(b_1\) in the gamma distribution for the recovery rate.

A similar method is used to perform the sensitivity analysis for the parameters \(a_1\), \(a_2\), \(c_1\), \(c_2\), \(b_1\), \(b_2\), \(d_1\) and \(d_2\) involved in the bimodal gamma distribution. The corresponding first-order sensitivity indices are shown in Fig. 4 and summarized in Table 2. We observe that \(I_m\) and \(t_m\) are more sensitive to \(b_1\) than to the other parameters.

Table 1 First-order sensitivity indices \({\mathcal {S}}_i\) (gamma distribution)

Fig. 4 The left panel shows the first-order sensitivity indices corresponding to the model outcome \(I_m\) and the right panel to \(t_m\) in the case of bimodal gamma distribution (Color figure online)

Table 2 First-order sensitivity indices \({\mathcal {S}}_i\) (bimodal gamma distribution)

4 Comparison with the SIR Model

We showed in Sect. 2 that classical SIR model can be obtained as a particular case of distributed model (2a)–(2d). We will compare dynamics of epidemic progression in the two models using the estimated recovery and death rates.

Since the estimated average time to recovery is 17.85 days and to death 13.19 days, we take the average disease duration to be 16 days. The corresponding value in the SIR model is \(r_0+d_0 \approx 1/16\). We set \(p_0=0.97\), that is, out of 100 infected individuals, 97 will recover. This estimate is consistent with most of the COVID-19 epidemic data from various countries (Worldometer 2022; Paul and Lorin 2021).

Fig. 5 Comparison between the solutions of the system (2a)–(2d) (blue curves) and SIR model (red curves): (a) the number of susceptible individuals S(t), (b) the number of infected individuals I(t). In both models \(N=10^7\). The values of other parameters for the SIR model: \(\beta =0.3\), \(r_0+d_0=1/16\), \(I(0)=1\); and for model (2a)–(2d): \(\beta =0.3\), \(a_1=8.06275\), \(b_1=2.21407\), \(a_2=6.00014\), \(b_2=2.19887\), \(p_0=0.97\), \(S(0)=N-1\), \(I(0)=1\) (Color figure online)

Though the parameters of the two models correspond to each other, system (2a)–(2d) and the SIR model (10a)–(10c) give different dynamics of epidemic progression (Fig. 5). We notice that the maximal number of infected individuals I(t) is much higher for system (2a)–(2d) than for the SIR model (10a)–(10c), while the time to the maximal number of infected \(t_m\) is shorter.

The final size of epidemic \(S_f\), the maximal number of infected \(I_m\) and the time to the maximal number of infected \(t_m\) for system (2a)–(2d) with gamma distribution and for the SIR model are compared in Fig. 6 for different values of parameters. As before, the maximal number of infected individuals \(I_m\) in model (2a)–(2d) is much higher than in the SIR model (10a)–(10c), while the time \(t_m\) and the final size \(S_f\) are smaller for the distributed model. This difference can be explained by the fact that the recovery and death rates are uniformly distributed during the disease duration for the SIR model (Sect. 2.3), contrary to the gamma distribution in (2a)–(2d). Therefore, there is a shift to earlier recovery and death for the SIR model.
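The comparison of \(I_m\) and \(t_m\) can be reproduced qualitatively with the short sketch below. It assumes an explicit Euler scheme with step 0.25 days for both models, Riemann sums for the integrals in (2b), and the initial condition \(S(0)=N-I_0\). The function names and the discretization are illustrative choices, so the printed numbers should not be expected to match Figs. 5 and 6 exactly.

```python
import math


def gamma_pdf(t, a, b):
    """Gamma density with shape a and scale b."""
    if t <= 0.0:
        return 0.0
    return math.exp((a - 1.0) * math.log(t) - t / b
                    - a * math.log(b) - math.lgamma(a))


def peak_distributed(beta, N, I0, T, dt):
    """Return (I_m, t_m) for system (2a)-(2d) with the gamma rates of Sect. 3.1."""
    p0, a1, b1, a2, b2 = 0.97, 8.06275, 2.21407, 6.00014, 2.19887
    n = int(round(T / dt))
    # Combined removal kernel r + d on the time grid.
    kern = [p0 * gamma_pdf(m * dt, a1, b1)
            + (1.0 - p0) * gamma_pdf(m * dt, a2, b2) for m in range(n)]
    S, I = float(N - I0), float(I0)
    J_hist, I_m, t_m = [], float(I0), 0.0
    for k in range(n):
        J = beta * S * I / N
        J_hist.append(J)
        removed = sum(kern[k - j] * J_hist[j] for j in range(k + 1)) * dt
        S -= dt * J
        I += dt * (J - removed)
        if I > I_m:
            I_m, t_m = I, (k + 1) * dt
    return I_m, t_m


def peak_sir(beta, gamma, N, I0, T, dt):
    """Return (I_m, t_m) for the SIR model (10a)-(10c), gamma = r_0 + d_0."""
    S, I = float(N - I0), float(I0)
    I_m, t_m = I, 0.0
    for k in range(int(round(T / dt))):
        J = beta * S * I / N
        S -= dt * J
        I += dt * (J - gamma * I)
        if I > I_m:
            I_m, t_m = I, (k + 1) * dt
    return I_m, t_m


N, I0, BETA = 1.0e7, 1.0, 0.3
dist = peak_distributed(BETA, N, I0, T=200.0, dt=0.25)
sir = peak_sir(BETA, 1.0 / 16.0, N, I0, T=200.0, dt=0.25)
print("distributed model: I_m = %.3g at t_m = %.1f days" % dist)
print("SIR model:         I_m = %.3g at t_m = %.1f days" % sir)
```

In this sketch the distributed model reaches a higher peak at an earlier time than the SIR model, in line with the behavior described above.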

Fig. 6 Comparison of the maximal number of infected individuals \(I_m\) (a), time to reach the maximal number \(t_m\) (b) and the final size of epidemic \(S_f\) (c) between the system (2a)–(2d) (blue curves) and the SIR model (red curves) for different values of \(\beta \). The values of parameters: \(N=10^7\), \(I_0=1\), for the SIR model \(r_0+d_0=1/16\); and for the system (2a)–(2d): \(a_1=8.06275\), \(b_1=2.21407\), \(a_2=6.00014\), \(b_2=2.19887\), \(p_0=0.97\) (Color figure online)

Similarly, we compare system (2a)–(2d) with bimodal gamma distribution with the conventional SIR model. In this case, the estimated mean time from onset to recovery is 22.63 days and the mean time from onset to death is 21.80 days. Hence, the average disease duration is taken to be approximately 22.2 days, and \(r_0+d_0 \approx 1/22.2\). All other parameters are kept the same as above. The properties of the final size of epidemic, the maximal number of infected individuals and the time to maximum are similar to the previous case and are not shown here.

5 Model Validation with Epidemiological Data

In order to validate the model with distributed recovery and death rates, we compare the results of modeling with the epidemiological data. Distributed recovery \(r(t-\eta )\) and death \(d(t-\eta )\) rates are estimated in Sect. 3.1 from the data in China in February 2020 (Github 2022; Verity et al. 2019). Once these functions are determined, we take the number J(t) of daily infected individuals from the epidemiological data and find the sum of daily recoveries and deaths from the expression

$$\begin{aligned} \Sigma (t) = \int _0^t r(t-\eta ) J(\eta ) d \eta + \int _0^t d(t-\eta ) J(\eta ) d \eta . \end{aligned}$$
(11)

These results are compared with the sum of recoveries and deaths in the data. Figure 7 shows the result of such comparison for China from January 23, 2020, to April 15, 2020, with the data from (Worldometer 2022) (7-day moving average).

Fig. 7 In the left panel, the blue curve shows the number \(\Sigma (t)\) of recovered and dead in the distributed model, the magenta curve corresponds to \(\sigma (t)\) in the SIR model, and the black dots correspond to the 7-day moving average of daily recoveries and deaths in China. The right panel shows the corresponding cumulative recovery and death (Color figure online)

Fig. 8 \(\Sigma (t)\), \(\sigma (t)\) are plotted for different countries, using the gamma distributions for recovery and death rates as estimated in Sect. 3.1. In the left panel, the blue curves correspond to \(\Sigma (t)\), the magenta curves correspond to \(\sigma (t)\) and the black dots correspond to the 7-day moving average of daily recovery and death in different countries. The right panel shows the corresponding cumulative recovery and death. a, b: France; c, d: Italy; e, f: Sweden (Color figure online)

Recoveries and deaths can also be determined as a proportion of active cases, \(\sigma (t)= (r_0+d_0) I(t)\), as is done in the SIR model. Here I(t) is taken from the data and \(r_0+d_0 = 1/16\). In agreement with the results of the previous section, the SIR model overestimates the sum of recovered and dead.
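With daily data, expression (11) becomes a discrete convolution of the series of new cases with the kernel \(r+d\), while \(\sigma (t)\) is a rescaling of the series of active cases. The sketch below assumes a one-day time step and the gamma parameters of Sect. 3.1. The input series is synthetic (a single burst of cases, used here only to make the example self-contained), and the function names are hypothetical.

```python
import math


def gamma_pdf(t, a, b):
    """Gamma density with shape a and scale b."""
    if t <= 0.0:
        return 0.0
    return math.exp((a - 1.0) * math.log(t) - t / b
                    - a * math.log(b) - math.lgamma(a))


def moving_average(x, window=7):
    """Trailing moving average; shorter windows are used at the start."""
    out = []
    for k in range(len(x)):
        lo = max(0, k - window + 1)
        out.append(sum(x[lo:k + 1]) / (k + 1 - lo))
    return out


def daily_removed_distributed(J, p0=0.97, a1=8.06275, b1=2.21407,
                              a2=6.00014, b2=2.19887):
    """Sigma(t_k) = sum_j [r(k - j) + d(k - j)] J_j, daily step (Eq. (11))."""
    kern = [p0 * gamma_pdf(m, a1, b1) + (1.0 - p0) * gamma_pdf(m, a2, b2)
            for m in range(len(J))]
    return [sum(kern[k - j] * J[j] for j in range(k + 1))
            for k in range(len(J))]


def daily_removed_sir(active, rate=1.0 / 16.0):
    """sigma(t_k) = (r_0 + d_0) I(t_k)."""
    return [rate * i for i in active]


# Synthetic input (assumption): 1000 new cases on day 0 and none afterwards.
J = [1000.0] + [0.0] * 120
Sigma = daily_removed_distributed(J)

# Active cases consistent with Eq. (1): cumulative infected minus removed.
active, cum_J, cum_removed = [], 0.0, 0.0
for j, s in zip(J, Sigma):
    cum_J += j
    cum_removed += s
    active.append(cum_J - cum_removed)
sigma = daily_removed_sir(active)

peak_day = max(range(len(Sigma)), key=lambda k: Sigma[k])
print("Sigma peaks on day", peak_day, "; sigma on day 0 =", sigma[0])
```

For real data, J and the active cases would first be smoothed, for example with `moving_average`, before the two expressions are evaluated.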

A similar comparison is done for other countries (Fig. 8). It is important to mention here that we used the same gamma distribution as determined before from the data for China (Sect. 3.1).

Next, we consider the bimodal gamma distribution determined above (Sect. 3.2) and calculate \(\Sigma (t)\) for a longer period of time from March 10, 2020, to June 16, 2020 (Fig. 9), than used for the determination of the distribution parameters. As before, we compare the results with the SIR model and observe that it overestimates the total recovery and death.

Fig. 9 In the left panel, the blue curve corresponds to \(\Sigma (t)\), the magenta curve corresponds to \(\sigma (t)\) and the black dots correspond to the 7-day moving average of daily recovery and death in Canada. The right panel shows the corresponding cumulative recovery and death (Color figure online)

Thus, the model with gamma distribution gives a good description of the recovery and death in different countries compared with the epidemiological data, while the SIR model overestimates it.

6 Estimation of the Time-Dependent Disease Transmission Rate \(\beta (t)\)

In this section, we consider a time-dependent transmission rate \(\beta (t)\) and estimate it from the COVID-19 epidemiological data. The dynamics of the transmission rate can help in understanding epidemic progression (Mummert 2013).

Theorem 1

For the model (2a)–(2d), the time-dependent transmission function \(\beta (t)\) is given by the following expression:

$$\begin{aligned} \beta (t) = \frac{N J(t)}{I(t) \bigg ( N-\int _0^t J(\eta ) d\eta \bigg )}. \end{aligned}$$
(12)

Proof

We have

$$\begin{aligned} \beta (t) \frac{S(t)}{N} \; I(t) \, =\, J(t) \,\Rightarrow \, \beta (t) \,=\, \frac{N J(t)}{I(t) S(t)}. \end{aligned}$$
(13)

Now, we also know that

$$\begin{aligned} I(t) = \int _0^t J(\eta ) d \eta - (R(t) + D(t)) . \end{aligned}$$

Using \(S(t)=N-(I(t)+R(t)+D(t))\) in the previous equation, we get

$$\begin{aligned} S(t)= N-\int _0^t J(\eta ) d \eta . \end{aligned}$$

Substituting this expression into (13), we obtain (12). \(\square \)

As an illustration of this theorem, we consider the COVID-19 data taken from (Worldometer 2022) for new daily cases and total active cases. In order to decrease the irregularity of data, we take the 7-day moving average of J(t) and I(t). Note that \(\int _0^t J(\eta ) d\eta \) represents the cumulative number of infected. Consequently, we can determine the function \(\beta (t)\) using formula (12).
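Formula (12) translates directly into a loop over daily data. In the sketch below, the cumulative sum of new cases approximates \(\int _0^t J(\eta ) d\eta \). The function names are hypothetical, and the input series is generated synthetically from a model with a known constant transmission rate, only so that the output can be checked. With real data, J and I would be the 7-day averaged daily new cases and active cases.

```python
def transmission_rate(J, I, N, dt=1.0):
    """beta(t_k) = N J_k / (I_k (N - cumulative infections before t_k)), Eq. (12)."""
    beta, cumulative = [], 0.0
    for new_cases, active in zip(J, I):
        susceptible = N - cumulative  # S(t) = N - int_0^t J
        if active > 0 and susceptible > 0:
            beta.append(N * new_cases / (active * susceptible))
        else:
            beta.append(float("nan"))  # formula undefined without active cases
        cumulative += new_cases * dt
    return beta


def synthetic_series(beta0, gamma, N, I0, days):
    """Daily new and active cases from a discrete SIR-type recursion (test data)."""
    S, I = float(N), float(I0)  # S(0) = N, as in S(t) = N - int_0^t J
    J_series, I_series = [], []
    for _ in range(days):
        J = beta0 * S * I / N
        J_series.append(J)
        I_series.append(I)
        S -= J
        I += J - gamma * I
    return J_series, I_series


J, I = synthetic_series(beta0=0.3, gamma=1.0 / 16.0, N=1.0e6, I0=1.0, days=100)
beta = transmission_rate(J, I, N=1.0e6)
print(min(beta), max(beta))
```

On these synthetic series the reconstruction returns the constant value used to generate them, which confirms that the loop implements (12) consistently with \(J = \beta S I/N\).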

We consider the COVID-19 infection data for a span of approximately 450 days. We use the data for India from March 7, 2020, for France from March 2, 2020, for Italy from February 21, 2020, and for Sweden from March 23, 2020, up to May 20, 2021, for all four countries (Worldometer 2022). Then we compute \(\beta (t)\) for the four countries with the help of (12) and plot it in Fig. 10. The initial date corresponds to the first reported case in a given country (marked with vertical dashed lines). The initial transient observed in the case of India may be related to the inaccuracy of the reporting strategy. Growth and decline of \(\beta (t)\) at different times correspond to various social restrictions as well as to the onset of a new outbreak. It is interesting to note that the declining patterns for two neighboring European countries, France and Italy, are similar in the beginning. However, an increasing peak of \(\beta (t)\) may indicate that the relaxation of lockdown restrictions in France was more rapid than in other countries.

Fig. 10 Time varying \(\beta \) for COVID-19 in France, India, Italy, Sweden calculated by formula (12) for the model with gamma distribution and the parameter values: \(a_1=8.06275\), \(b_1=2.21407\), \(a_2=6.00014\), \(b_2=2.19887\), \(p_0=0.97\) (Color figure online)

We can note from the presented results that \(\beta (t)\) oscillates according to increasing or decreasing epidemic waves. Furthermore, the average value of this function differs between countries. For example, it is about 0.1 in India and about 0.05 in France. In order to interpret the dynamics of the time-dependent transmission rate \(\beta (t)\), we simplify expression (12) assuming that \(\int _0^t J(\eta ) d \eta \ll N\). This assumption is justified since in most countries the total number of infected remains much less than the total population. Then \(\beta (t) \approx J(t)/I(t)\). The same expression for \(\beta (t)\) can be obtained from the SIR model if \(S \approx N\).

In order to give further estimates of \(\beta (t)\), suppose that the disease duration is \(\tau \). Then \(I(t)=\int _{t-\tau }^t J(\eta ) d\eta \), that is, the individuals infected at time \(t-\tau \) recover or die at time t but not before. Hence, we obtain the approximate formula

$$\begin{aligned} \beta (t) = \frac{J(t)}{\int _{t-\tau }^t J(\eta ) d\eta } \; . \end{aligned}$$

Setting \(J(t) = J(\tau ) e^{\lambda (t-\tau )}\) and substituting into the above equation, we find \(\beta (t) = \lambda /(1-\exp (-\lambda \tau ))\). In the limit \(\lambda \rightarrow 0\), we get \(\beta (t) = 1/\tau \), that is, the disease transmission rate is inversely proportional to the disease duration. This estimate is in agreement with the average disease duration of 16 days determined in Sect. 4. For \(\lambda >0\), \(\beta \) characterizes the rate of growth of newly infected individuals, and for negative \(\lambda \), the rate of decay.
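The closed-form expression can be checked against a direct numerical evaluation of the ratio \(J(t)/\int _{t-\tau }^t J(\eta ) d\eta \). In the sketch below, the prefactor \(J(\tau )e^{-\lambda \tau }\) is dropped because it cancels in the ratio. The function names, the midpoint quadrature and the sample values of \(\lambda \) are illustrative assumptions.

```python
import math


def beta_exponential(lam, tau):
    """beta = lam / (1 - exp(-lam * tau)), with the limit 1/tau at lam = 0."""
    if lam == 0.0:
        return 1.0 / tau
    # expm1 keeps the denominator accurate when lam * tau is small.
    return lam / (-math.expm1(-lam * tau))


def beta_numeric(lam, tau, t, n=2000):
    """J(t) / int_{t-tau}^{t} J(eta) d eta for J(s) = exp(lam * s), midpoint rule."""
    h = tau / n
    integral = sum(math.exp(lam * (t - tau + (k + 0.5) * h))
                   for k in range(n)) * h
    return math.exp(lam * t) / integral


TAU = 16.0
for lam in (-0.05, 0.0, 0.05, 0.1):
    print(lam, beta_exponential(lam, TAU), beta_numeric(lam, TAU, t=30.0))
```

The two evaluations agree for growing and decaying incidence, and the value at \(\lambda =0\) equals \(1/\tau \).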

7 Discussion

The ongoing COVID-19 pandemic has stimulated scientific research in various disciplines ranging from economics to education (Volpert et al. 2020). A wide variety of modeling approaches are considered in the recent literature (see, e.g., Rahimi et al. 2021; Sharma et al. 2020 for more detail). However, validation of these models is complicated by the uncertainty of the data, especially for asymptomatic cases.

Another shortcoming of conventional epidemiological models is that they consider the recovery and death rates as a given proportion of active infected cases at the same moment of time. Clearly, this is a strong assumption which can lead to a large error in the evaluation of epidemic progression. In order to overcome this issue, we propose in this work a new type of immunoepidemiological model based on the daily number of infected individuals and their time-distributed recovery and death rates. Distributed recovery and death rates are evaluated from the data from China and Canada. They give a reliable description of data for different countries and time periods. We note that the parameters of the gamma distribution can depend on the virus variant. This question can be addressed in future studies when the time-dependent recovery rate becomes available, in particular for the Omicron variant.

We compare this approach with the SIR model with appropriate recovery and death rates. The SIR model overestimates the daily recoveries and deaths and, in turn, underestimates the daily number of infected as well as the maximal and total numbers of infected individuals. The recovery and death rate functions are estimated with the limited real data from China and Canada on the number of days elapsed between infection and recovery or death (Github 2022; Paul and Lorin 2021). These data sets are used to estimate the parameters involved in two different parametrizations of the recovery and death rate functions. Recovery and death rates based on the time since infection implicitly take into account mild and severe infections, which can be considered as asymptomatic and symptomatic compartments, respectively. Numerical validation of the proposed model with the COVID-19 epidemic data from five different countries indicates that the parametrization of recovery and death rates with the gamma distribution effectively captures the daily and cumulative recoveries and deaths, although the real data show large irregularity.

It is important to mention here that the proposed modeling approach can be used to predict the disease progression accurately if specific data for the first days are available to estimate the parameters involved in the recovery and death rate functions. Availability of such data is a challenging issue at the beginning of an epidemic. However, we should highlight that the estimates of r(t) and d(t) obtained with the data from China during the onset of the COVID-19 epidemic work well for studying the disease progression in France, Italy and Sweden. With this information, the proposed modeling approach can be used to predict the maximal number of infected and the time to the maximal number of infected. These predictions can be used to estimate the required number of hospital beds and the readiness of medical facilities, given appropriate information about the rate of hospitalization and the severity of the viral strain.

We have also described a method to calculate the time-dependent transmission rate from the daily incidence data. It indicates that the transmission of infection from one individual to another depends not only on a fixed transmission rate but also on the time period over which an individual remains infected. The next challenging issue will be to estimate the rate of infection transmission depending on the time-dependent viral load for different virus variants.