1 Introduction

In this study, we develop a mathematical model for estimating and predicting the exponentially decaying daily case fatality rate (CFR) of coronavirus disease 2019 (COVID-19) in various nations. The CFR is defined as the number of deaths over a period of time divided by the total number of people confirmed with the disease [1] and can be used to measure disease severity and prognosis [2,3,4]. Public health policy-makers are interested in estimating the CFR of a pandemic [5]. Therefore, numerous researchers have attempted to estimate the CFR using various approaches [6, 7]. Existing CFR estimation models use statistical models [8, 9] or machine learning techniques [10]. Generally, the CFR depends on numerous factors, such as the quality of patient care [11]. We estimate the termination time of a pandemic disease, such as COVID-19, using epidemiological models for the disease and correlating the decrease in CFR. However, the CFR may be skewed by the rapid growth of infections in a given phase or area during the pandemic. Hence, we should consider the delay between confirmation of the disease and death due to the disease. Previous studies estimated the delay between the diagnosis of the disease and death due to the disease [12,13,14,15,16]. Liang et al. [17] systematically studied the association between COVID-19 vaccine coverage and CFR. Hoffmann and Wolf [18] assessed the correlation between the proportion of elderly individuals among the confirmed cases and the country-specific CFR. Shiferaw [9] estimated the dynamics of CFR using Markov switching autoregressive (MSAR) models. Yang et al. [19] applied simple linear and exponential regression methods to compute the CFR of the COVID-19 outbreak in China.

However, it is challenging to capture the time-dependent CFR of a pandemic outbreak using a single exponential coefficient because the CFR contains multiple exponential decays, that is, fast and slow decays. Consequently, this study aims to devise a mathematical model for forecasting the exponentially decreasing CFR of a pandemic in a country during its declining phase. The proposed methodology is general and can therefore be used in similar pandemic outbreaks in the future to estimate and predict the exponentially decaying CFR of novel coronavirus diseases.

The outline of this paper is as follows: In Sect. 2, the proposed algorithm is described. In Sect. 3, computational simulations are performed and results are presented. Finally, conclusions are given in Sect. 4.

2 Proposed algorithm

The proposed algorithm consists of three major steps as follows:

  1. Step 1: Pre-smoothing epidemiological data

  2. Step 2: Computing daily case fatality rate \(FR_i\)

  3. Step 3: Fitting \(FR_i\) with fatality rate \(FR^N\)

In Step 1, we preprocess the daily new confirmed cases and deaths, \(\Delta C\) and \(\Delta D\), by smoothing them to 7-day averages, \(\Delta _7 C\) and \(\Delta _7 D\), respectively. To minimize the discrepancies that may be contained in the data, Hwang et al. [20] and Saraiva et al. [21] used 7-day moving averages of the numbers of confirmed cases and deaths due to COVID-19, respectively. The daily new confirmed cases \(\Delta C_i\) and deaths \(\Delta D_i\) are defined from the global cumulative numbers of confirmed cases C and deaths D as follows:

$$\begin{aligned} \Delta C_i=C_i-C_{i-1},~~\Delta D_i=D_i-D_{i-1}. \end{aligned}$$
(1)

The global cumulative numbers of the confirmed cases C and death cases D are shown in Fig. 1a. Figure 1b shows the daily new confirmed cases \(\Delta C_i=C_i-C_{i-1}\) and daily new deaths \(\Delta D_i=D_i-D_{i-1}\). The smoothed daily confirmed cases and deaths are defined using a 7-day moving average as follows:

$$\begin{aligned} \Delta _7 C_i = \frac{1}{7} \sum _{k=-3}^3 \Delta C_{i+k},~~ \Delta _7 D_i = \frac{1}{7} \sum _{k=-3}^3 \Delta D_{i+k}, \end{aligned}$$
(2)

which are shown in Fig. 1c.

Fig. 1

Schematic diagram of data smoothing for given case data: a cumulative number of cases, b number of daily cases, and c smoothed number of daily cases
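Step 1 can be sketched in a few lines of Python. This is only an illustration of Eqs. (1) and (2), not the authors' implementation: the function names and the toy numbers are hypothetical, and the treatment of the first and last three days (left undefined here) is an assumption, since the text does not specify how the window is handled at the boundaries.

```python
def daily_increments(cumulative):
    """Eq. (1): daily new counts from a cumulative series.

    The result is one element shorter: entry j is the increment on day j + 1.
    """
    return [cumulative[i] - cumulative[i - 1] for i in range(1, len(cumulative))]


def centered_moving_average(values, half_width=3):
    """Eq. (2): centered (2 * half_width + 1)-day moving average.

    Assumption: days without a full window are returned as None.
    """
    n = len(values)
    window = 2 * half_width + 1
    smoothed = [None] * n
    for i in range(half_width, n - half_width):
        smoothed[i] = sum(values[i - half_width:i + half_width + 1]) / window
    return smoothed


# Toy cumulative confirmed cases (illustrative numbers, not real data).
C = [0, 7, 14, 21, 28, 35, 42, 49, 63]
dC = daily_increments(C)            # daily new cases, Eq. (1)
d7C = centered_moving_average(dC)   # 7-day smoothed daily cases, Eq. (2)
```

The same two functions apply unchanged to the cumulative death series D.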

In Step 2, let us define the time series of the daily case fatality rate \(FR_i\) at time \(t_i\) as follows:

$$\begin{aligned} FR_i=\frac{\Delta _{7} D_i}{\sum _{k=-M}^M w_k \Delta _7 C_{i-d+k}}, \end{aligned}$$
(3)

where \(\Delta _{7} D_i\) and \(\Delta _7 C_{i-d+k}\) are the numbers of deaths and confirmed cases at times \(t_i\) and \(t_{i-d+k}\), respectively. Here, d is the time delay from confirmation of the disease to death due to the disease, and \(w_k\) denotes the weights satisfying \(\sum _{k=-M}^M w_k=1\). To determine the weight values \(w_k\) for \(k=-M, \ldots , 0, \ldots , M\), we first choose the standard deviation \(\sigma \) and M. Subsequently, using the Gaussian distribution

$$\begin{aligned} G(k)=\frac{1}{\sqrt{2 \pi }\sigma } e^{-\frac{k^2}{2\sigma ^2}}, \end{aligned}$$

we compute \(w_k\) for \(k=-M, \ldots , 0, \ldots , M\), i.e.,

$$\begin{aligned} w_k=\frac{G(k)}{\sum _{j=-M}^M G(j)}, \end{aligned}$$

which implies \(\sum _{k=-M}^M w_k=1.\) Figure 2a, b schematically shows G(k) and \(w_k\), respectively. Figure 2c shows schematically the daily case fatality rate \(FR_i\), Eq. (3).
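As a small illustration, the weights can be computed as follows, taking the Gaussian kernel as proportional to \(e^{-k^2/(2\sigma ^2)}\). The function name is hypothetical; note that the prefactor \(1/(\sqrt{2\pi }\sigma )\) cancels in the normalization, so it does not affect \(w_k\).

```python
import math


def gaussian_weights(M, sigma):
    """Discrete weights w_k, k = -M..M, from a Gaussian kernel, normalized to sum to 1."""
    g = [
        math.exp(-(k * k) / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)
        for k in range(-M, M + 1)
    ]
    total = sum(g)
    return [value / total for value in g]


# Parameter values used later in the computational experiments.
w = gaussian_weights(M=3, sigma=1.0)
```

The list `w` has \(2M+1\) entries, is symmetric about its center, and its largest entry is the central weight \(w_0\).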

Fig. 2

a Gaussian filter, b discrete weight function, and c schematic for the algorithm of fatality rate
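The following sketch evaluates Eq. (3) for every day on which the delayed, weighted window of confirmed cases lies inside the data. It is a hedged illustration: the function name and the toy inputs are hypothetical, the inputs are assumed to be fully numeric smoothed series, and skipping days with an incomplete window or a zero denominator is our assumption rather than something stated in the text.

```python
def daily_case_fatality_rate(d7D, d7C, weights, d):
    """Eq. (3): smoothed deaths on day i over weighted smoothed cases around day i - d.

    weights holds w_k for k = -M..M. Days whose window falls outside the data,
    or whose denominator is not positive, are returned as None (an assumption).
    """
    M = (len(weights) - 1) // 2
    fr = [None] * len(d7D)
    for i in range(len(d7D)):
        if i - d - M < 0 or i - d + M >= len(d7C):
            continue
        denom = sum(weights[k + M] * d7C[i - d + k] for k in range(-M, M + 1))
        if denom > 0:
            fr[i] = d7D[i] / denom
    return fr


# Toy data: constant 100 cases and 2 deaths per day, delay d = 10, M = 1.
FR = daily_case_fatality_rate([2.0] * 20, [100.0] * 20, [0.25, 0.5, 0.25], d=10)
```

With constant inputs the rate is 2/100 on every day for which it is defined, which gives a quick sanity check of the indexing.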

Step 3 is a data-fitting step for the daily case fatality rate. For an accurate fit, we assume that the daily case fatality rate is a sum of multiple functions that decrease exponentially over time. For some reference time \(t_0\), we define the time-dependent fitting function for the case fatality rate of COVID-19 as follows:

$$\begin{aligned} FR^N(t)=\sum _{j=1}^N \alpha _j e^{-\beta _j (t-t_0)}, \end{aligned}$$
(4)

where \(\alpha _j\) and \(\beta _j\) are positive constants and N is the number of decay rates. Because it is difficult to fit real epidemic data with a single decay rate over a long period of time, we use a sum of terms with multiple decay rates. To compute \(\alpha _j\) and \(\beta _j\), we apply the fitting function \(\textbf{lsqcurvefit}\) in MATLAB R2022a, which is a nonlinear curve-fitting function in the least-squares sense [22]. Employing \(\textbf{lsqcurvefit}\), we obtain optimal parameters \({\varvec{\alpha }}=(\alpha _1, \ldots , \alpha _N)\) and \({\varvec{\beta }}=(\beta _1, \ldots , \beta _N)\) which minimize the following cost function:

$$\begin{aligned} {\mathcal {E}}({\varvec{\alpha }},{\varvec{\beta }})= \frac{1}{2}\sum _{i=1}^p [FR_i-FR^N(t_i)]^2, \end{aligned}$$
(5)

where p is the number of the given daily case fatality rates.
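To make Eqs. (4) and (5) concrete, the sketch below evaluates the model and the cost function in plain Python. It does not perform the minimization itself, which the text delegates to \(\textbf{lsqcurvefit}\); the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def fr_model(t, alphas, betas, t0=0.0):
    """Eq. (4): sum of N exponentially decaying terms."""
    return sum(a * math.exp(-b * (t - t0)) for a, b in zip(alphas, betas))


def cost(alphas, betas, times, fr_values, t0=0.0):
    """Eq. (5): half the sum of squared residuals between data and model."""
    return 0.5 * sum(
        (fr - fr_model(t, alphas, betas, t0)) ** 2 for t, fr in zip(times, fr_values)
    )


# Synthetic data generated from a fast and a slow decay (illustrative values).
times = list(range(5))
alphas, betas = [0.02, 0.01], [0.5, 0.01]
data = [fr_model(t, alphas, betas) for t in times]

exact = cost(alphas, betas, times, data)               # generating parameters
perturbed = cost([0.02, 0.011], betas, times, data)    # slightly wrong alpha_2
```

Any nonlinear least-squares routine that accepts a residual or cost function could then be used to search for \({\varvec{\alpha }}\) and \({\varvec{\beta }}\); which routine to use outside MATLAB is a choice the text leaves open.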

3 Computational experiments

Because the CFR of COVID-19 may vary significantly in different nations, we consider the multiply exponentially decaying daily CFR of COVID-19 in four different nations, i.e., the Republic of Korea, the USA, Japan, and the UK. For sources of data, we collected the daily numbers of the confirmed cases and deaths from WHO [23] from May 01, 2020, to December 31, 2021.

Let \({\textbf{u}}=(u_1,u_2, \ldots , u_P)\) be a time series. We define the discrete \(l_2\)-norm of \(\textbf{u}\) as follows:

$$\begin{aligned} \Vert {\textbf{u}} \Vert _2 = \sqrt{\frac{1}{P} \sum _{i=1}^{P} u_i^2}. \end{aligned}$$

Let the initial values of \(\alpha _1\) and \(\beta _1\) be \(\alpha _0\) and \(\beta _0\). We set \(\alpha _0\) and \(\beta _0\) as follows to obtain optimal parameters \({\varvec{\alpha }}\) and \({\varvec{\beta }}\): \( \alpha _{0}=(1/q)\sum _{i=1}^q FR_{i}\) and \(\beta _{0}=0.01\). Here, \(q=10\) is used. Previous studies [24,25,26,27,28,29] estimated the time to death after being infected with COVID-19 as 4 to 18 days. Therefore, we use \(d=10\) in this study.
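The initial guess described above amounts to averaging the first q values of the daily case fatality rate. A minimal sketch, with a hypothetical function name and toy values, and assuming the series contains at least one value:

```python
def initial_guess(fr_values, q=10, beta0=0.01):
    """Return (alpha_0, beta_0): mean of the first q values of FR_i and a fixed rate."""
    head = fr_values[:q]
    alpha0 = sum(head) / len(head)
    return alpha0, beta0


# Toy series: only the first q = 10 values influence alpha_0.
alpha0, beta0 = initial_guess([0.02] * 5 + [0.04] * 5 + [0.5] * 3)
```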

3.1 Determination of the number of multiple rates, N

For a given tolerance tol, if \(\Vert {\textbf{FR}}^{N+1}-{\textbf{FR}}^N \Vert _2<tol\), then we choose N as the number of multiple decay rates, where \({\textbf{FR}}^{N+1}=(FR^{N+1}(t_1),\ldots , FR^{N+1}(t_P))\) and \({\textbf{FR}}^N=(FR^N(t_1), \ldots , FR^N(t_P))\). Figure 3a shows the time-dependent case fatality rate \(FR^N(t)\) with four different N values using the data from Republic of Korea. Figure 3b shows \(\Vert {\textbf{FR}}^{N+1}-{\textbf{FR}}^N \Vert _2\) for \(N=1,2,3\). Here, \(M=3\), \(\sigma =1\), and \(d=10\) are used.
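The stopping rule can be sketched as a short loop. Here `fit` is a hypothetical callable that returns the fitted values \((FR^N(t_1),\ldots ,FR^N(t_P))\) for a given N; the stub used below only mimics fits that change less and less as N grows and is not real data. The cap `max_n` is our assumption, added so that the loop always terminates.

```python
import math


def rms_norm(u):
    """Discrete l2-norm with the 1/P scaling used in this section."""
    return math.sqrt(sum(x * x for x in u) / len(u))


def choose_num_rates(fit, tol, max_n=6):
    """Return the smallest N with ||FR^{N+1} - FR^N||_2 < tol, or max_n if none is found."""
    previous = fit(1)
    for n in range(1, max_n):
        current = fit(n + 1)
        if rms_norm([a - b for a, b in zip(current, previous)]) < tol:
            return n
        previous = current
    return max_n


def stub_fit(n):
    """Stand-in for a real fit: values that settle toward 0.01 as n increases."""
    return [0.01 + 10.0 ** (-2 * n)] * 4


chosen = choose_num_rates(stub_fit, tol=1e-6)
```

In a real run, `fit` would wrap the least-squares fit of Step 3 for each N, which is the expensive part of the procedure.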

Fig. 3

a Time-dependent case fatality rate \(FR^N(t)\) with four different N values using the data from Republic of Korea. b Plot of \(\Vert {\textbf{FR}}^{N+1}-{\textbf{FR}}^N \Vert _2\) for \(N=1,2,3\). Here, \(M=3\), \(\sigma =1\), and \(d=10\) are used

Table 1 shows the optimized parameters \({\varvec{\alpha }}\) and \({\varvec{\beta }}\) for each N with \(tol=1.\text{e-}6\). As N increases, the parameters \({\varvec{\alpha }}\) and \({\varvec{\beta }}\) converge.

Table 1 Optimal values of \({\varvec{\alpha }}\) and \({\varvec{\beta }}\) with respect to N

3.2 Computation of \(FR^N\)

In this section, the fatality rate trend is estimated. Given a tolerance, we determine N and minimize the cost function (5) to estimate the multiple decay rates. We use the Republic of Korea data to illustrate the proposed algorithm. Figure 4 shows the results of the algorithm applied to the Republic of Korea data. Figure 4a, b shows the cumulative numbers of cases in the Republic of Korea and the numbers of daily cases smoothed in Step 1 of the proposed algorithm, respectively. Figure 4c shows the daily case fatality rate FR obtained via Step 2 and \(FR^N\) obtained with the given tol in Step 3 of the proposed algorithm.

Fig. 4

a Cumulative number of cases, b smoothed number of daily cases, c daily case fatality rate FR and estimates of \(FR^N\)

Figure 5 shows the fit of \(FR^N\) for each country, with N determined by \(tol=1.\text{e-}6\). From the computational results, we observe that if \(tol=1.\text{e-}6\), then the determined value of N is \(N=3\) for the Republic of Korea, Japan, and the UK, whereas \(N=4\) for the USA. Using N, \({\varvec{\alpha }}\), and \({\varvec{\beta }}\) from Fig. 5, we estimate \(FR^N(t)\) beyond the fitting period.

Fig. 5

Determination of N: a the Republic of Korea, b the USA, c Japan, and d the UK. Here, tol = 1.e-6, \(M=3\), \(\sigma =1\) and \(d=10\) are used

Figure 6 shows the trend of the fatality rate by country. In Fig. 6, the shaded region shows the fit of \(FR^N\) to the given data from Jun 1, 2020, to December 31, 2021, and the unshaded region shows \(FR^N\) extrapolated using the optimized parameters found in the shaded region. The optimized parameters are found using tol = 5.e-5. The \(FR^N\) reaches 0.002 on April 5, 2024, for the Republic of Korea; November 15, 2025, for the USA; April 28, 2039, for Japan; and October 28, 2022, for the UK.

Fig. 6

Estimated trend of \(FR^N\) for each country
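Given fitted parameters, the time at which \(FR^N\) falls to a threshold such as 0.002 can be found numerically. Because all \(\alpha _j\) and \(\beta _j\) are positive, \(FR^N\) is strictly decreasing, so bisection is sufficient. The sketch below is an illustration under that assumption; the function names, the search bound `t_max`, and the single-exponential test parameters are hypothetical and are not the fitted values reported in this paper.

```python
import math


def fr_model(t, alphas, betas, t0=0.0):
    """Eq. (4): sum of N exponentially decaying terms."""
    return sum(a * math.exp(-b * (t - t0)) for a, b in zip(alphas, betas))


def time_to_threshold(alphas, betas, threshold, t_max=1.0e5, tol=1e-6):
    """Days after t0 at which FR^N first drops to `threshold`, found by bisection."""
    if fr_model(0.0, alphas, betas) <= threshold:
        return 0.0
    if fr_model(t_max, alphas, betas) > threshold:
        raise ValueError("threshold not reached before t_max")
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fr_model(mid, alphas, betas) > threshold:
            lo = mid
        else:
            hi = mid
    return hi


# Single decay: 0.02 * exp(-0.01 t) = 0.002 when t = ln(10) / 0.01.
days = time_to_threshold([0.02], [0.01], threshold=0.002)
```

Adding the returned number of days to the calendar date of \(t_0\) gives a date of the kind reported above.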

4 Conclusion

In this study, we developed a mathematical model to estimate and predict the rapidly decreasing daily CFR of COVID-19 in four nations: South Korea, the USA, Japan, and the UK. As we move into the post-COVID-19 world, we must be prepared for another pandemic outbreak in the future. CFR tended to gradually decrease after increasing several times during the COVID-19 pandemic outbreak. It is essential to ascertain the end of the pandemic or when the number of reported cases becomes negligible to plan for the future. The main contributions of this study are presenting the mathematical model for the multiply exponentially decaying daily CFR with multiple decay rates, automatically computing the number of multiple decay rates under a given tolerance, and predicting the fatality rate trend in the future. Furthermore, the proposed method is general and can be applied to other pandemic diseases. We performed numerical experiments to validate the proposed method with COVID-19 data from four different nations.