1 Introduction

Atypical pneumonia cases were first reported in Wuhan, Hubei Province, China, in December 2019 [1,2,3,4]. These cases were later found to be caused by a coronavirus, which became officially known as “SARS-CoV-2”, with the World Health Organization (WHO) naming the disease “COVID-19” in February 2020. This novel coronavirus, which is assumed to have originated from certain bats [5], was subsequently confirmed to be transmitted from human to human in China [2, 6,7,8,9,10] and around the world [11,12,13,14,15,16]. Based on available data on numbers of infected cases, COVID-19 seems to be more contagious than diseases caused by other coronaviruses, such as SARS or MERS [17,18,19,20]. According to the WHO, as of 2 January 2021, there had been a global total of 82,594,195 confirmed cases of COVID-19, of which 1,819,107 resulted in death. The COVID-19 pandemic has had a significant impact on all aspects of people's lives. At the individual level, it has affected people’s scope of activities [21] and their mental health [22]. At the national level, it has affected countries' economies and development [23].

In many provinces of China’s mainland, and in most countries worldwide, multiple waves of the epidemic have been observed [24,25,26,27,28,29,30,31,32,33,34]. The WHO declared COVID-19 a pandemic on 11 March 2020. It has been suggested that, along with the wide spatial distribution of the virus, efficient global transportation has contributed to the recurrent nature of COVID-19. Additionally, government decisions on the stringency of epidemic control policies and on social and economic support have had important consequences, particularly in the face of the recurrent outbreaks [35]. All this considered, it is important to have a better understanding of the mechanisms underlying multiple waves of an epidemic outbreak.

Previous modeling studies have shown that the early phase of the epidemic outbreak inside or outside of China can be depicted by descriptive models (e.g., logistic models) [36], by dynamical models such as SIR (Susceptible Infected Recovered), SEIR (Susceptible Exposed Infected Recovered) or SEIDR (Susceptible Exposed Infected Dead Recovered) [3, 4, 37,38,39,40,41,42], or by deep learning techniques [43,44,45]. However, these previous works depicted only one epidemic process, and none of these models can be reliably used to predict whether the outbreak will recur in the future. The main reason could be that it is not yet clear what exactly causes the recurrent outbreaks of COVID-19. Understanding the mechanisms underlying recurrent outbreaks would allow us to predict whether an outbreak will happen again. The currently available data do include multiple waves, making it possible to develop a model that addresses the limitations of previous studies.

In this study, we propose a new two-fold framework composed of a descriptive model and a dynamic analytical model. We fitted the new descriptive model to public data on COVID-19 dynamics and its recurrent outbreaks within China’s mainland from 1 January 2020 to 2 January 2021 and analyzed the relationship between the extent of recurrent outbreaks and the index for cases imported from overseas. We then applied the descriptive model to available global public data on COVID-19 in 43 other countries and analyzed the relationship between the extent of recurrent outbreaks and governmental control measures in order to provide evidence for policymaking and clinical practice. Based on the assumption that the extent of recurrent outbreaks may be related to the index for cases imported from overseas and to the relaxation of governmental control measures, the dynamic SEIDR model was constructed to simulate the phenomenon of recurrent outbreaks and to trace the development of the epidemic, which further shows its implications for policymaking. Finally, to emphasize the significance of studying recurrent outbreaks of COVID-19, we also analyzed the relationship between the extent of recurrent outbreaks and the consumer confidence index.

2 Method

2.1 Data sources

2.1.1 Epidemic data

The cumulative number of confirmed COVID-19 cases in China’s mainland, from 1 January to 30 July 2020, was obtained from the National Health Commission of China and the provincial health commissions of 30 provincial administrative regions (excluding Tibet, because the only confirmed COVID-19 case in Tibet was declared as having recovered on 12 February). The data for the number of imported infected cases from abroad in each province are available from the Chinese website of Sina. All data are publicly available. All cases were laboratory-confirmed, following the standards published by the National Health Commission of China. The basic testing procedure has been described in detail in previous works [1, 5]. The first wave of the acquired dataset was analyzed in an initial study [36]. The data may involve systematic estimation errors, since asymptomatic infections are not included. Even with this limitation, the available data are taken to be representative of the current state of the epidemic.

We selected 43 countries outside China with severe coronavirus epidemic outbreaks where data were available for the period from 1 January 2020 to 2 January 2021. The population data included in this study amounts to 46.4% of the world population (64.8% with China included) and 83.8% of the infected cases documented globally (84.0% with China included). The data for COVID-19 cases in these countries were obtained from situation reports on the official website of the World Health Organization, which is publicly available. The data used in this study include the cumulative number of reported laboratory-confirmed COVID-19 cases globally. The 43 countries are Australia, Japan, Malaysia, Philippines, Republic of Korea, and Singapore in the western Pacific region; Austria, Belgium, Bulgaria, Czech Republic, Denmark, Finland, France, Hungary, Germany, Italy, Lithuania, Norway, Poland, Portugal, Russian Federation, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, the Netherlands, the United Kingdom, and Ukraine in Europe; India and Thailand in Southeast Asia; the Islamic Republic of Iran and Lebanon in the eastern Mediterranean region; Argentina, Brazil, Canada, Mexico, Panama, Peru, and the United States of America, in the Americas; and Ethiopia and South Africa in Africa. All the laboratory-confirmed cases are determined using the WHO standards.

2.1.2 The indices of government policy

Indicators of government response were taken from the Oxford COVID-19 Government Response Tracker [46]. This dataset tracks individual policy measures across 17 indicators, together with miscellaneous notes, organized into four groups: containment and closure policies, economic policies, health system policies and miscellaneous policies. From these indicators, three indices are produced to provide an overall impression of government activities. Each index is a number between 0 and 100 that reflects the level of government response. In this way, we can quantify which policies a government has implemented, and to what degree. This work mainly uses the Stringency index (all closure indicators, plus health system policies that record public information campaigns), which records the strictness of “lockdown style” policies that primarily regulate people’s behavior. Overall, the higher the score, the more government interventions were implemented.

2.1.3 The consumer confidence index

To include well-being indicators relating to economic impact within the epidemic based on government policies, we included the Consumer Confidence Index (CCI) as a measurable indicator. Positive changes in consumer confidence will lead to economic growth, while negative changes impede the economic development of a country [47].

The CCI used in this study is a composite monthly index, covering consumer expectation and consumer satisfaction and extracted from an open dataset. It measures consumers' degree of optimism or pessimism about the current economic situation and expected future economic trends. The scale ranges between 0 and 200 (or − 100 and 100, depending on the country), where 200 indicates extreme optimism, 0 extreme pessimism, and 100 neutrality. We adopted a transformed index, named the Recurrent Consumer Confidence Index, to represent the change in consumer confidence in the face of recurrent outbreaks of COVID-19. This index is calculated starting from the time of the lowest point of consumer confidence since the outbreak of the epidemic. We assume that the strictness of government policy affects consumer confidence. Additionally, from trends observed in these data, we expect that consumer confidence rebounds when policies relax and declines again when another wave emerges and policies become stricter. Hence, we calculated this index by subtracting the current value from the highest value during this time period. The higher the index value, the greater the policy change, and the less optimistic consumers are. The CCIs from 26 countries (the United States, Mexico, Canada, Argentina, the United Kingdom, Spain, Italy, Germany, France, Belgium, Sweden, the Netherlands, Portugal, Ukraine, Poland, Austria, Denmark, Czech Republic, Finland, Hungary, Lithuania, Slovakia, Slovenia, Thailand, China, Australia) were employed in our analysis.
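The calculation described above can be sketched as follows. This is one possible reading of the text rather than the authors' code: the function name `recurrent_cci` and the sample values are hypothetical, and the sketch assumes that "this time period" means the interval from the lowest CCI value up to the most recent observation.

```python
def recurrent_cci(cci_series):
    """Recurrent Consumer Confidence Index for a monthly CCI series.

    ASSUMPTION: the window starts at the lowest CCI value in the series and
    ends at the last observation; the index is the highest value inside that
    window minus the current (last) value.
    """
    if not cci_series:
        raise ValueError("cci_series must not be empty")
    # Position of the lowest point of consumer confidence.
    trough_pos = min(range(len(cci_series)), key=cci_series.__getitem__)
    window = cci_series[trough_pos:]
    return max(window) - window[-1]


# Hypothetical monthly values: a drop, a rebound, then a renewed decline.
example = [101, 99, 92, 95, 98, 96]
index_value = recurrent_cci(example)
```

Under this reading the index is 0 while confidence is still at its rebound peak and grows as confidence falls back during a later wave.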

2.1.4 The GDP index

We assume that a country’s gross domestic product (GDP) can be a sign of consumer confidence to some extent. To this end, we adopted the quarterly GDP index for the first two quarters of 2020, defined as the percentage change from the previous quarter. We obtained these data from the public website of the Organisation for Economic Co-operation and Development (OECD) and explored their relationship with the Consumer Confidence Index (CCI). The “GDP index” of 22 countries (the United States, Canada, Argentina, the UK, Spain, Italy, Germany, France, Belgium, Sweden, the Netherlands, Portugal, Poland, Austria, Denmark, Czech Republic, Finland, Hungary, Slovakia, Slovenia, China, Australia) was employed in our analysis.

2.2 Models

2.2.1 Descriptive model of epidemics: multiple waves sigmoid model

By 2 January 2021, recurrent outbreaks of COVID-19 had been observed in many countries. Thus, the previous single-wave logistic model can no longer appropriately depict the whole picture of the current dynamics of the epidemic outbreak. For this reason, we developed a revised version of the previous model. A logistic function has been used in previous studies to depict the cumulative daily number of COVID-19 cases in the early phase of the epidemic outbreak [36]. Here, we extended it to a multiple-wave sigmoid model. We reasoned that the entire time series could be divided into multiple waves, each of which is described by a sigmoid function. Using the multiple-wave sigmoid model, the cumulative number of infected cases in any given region, I(t), can be calculated by:

$$ I(t) = \sum\limits_{i = 1}^{N} \frac{{A_{i} }}{{1 + e^{{ - k_{i} \left( {t - t_{i} } \right)}} }} $$
(1)

where t denotes time, \(A_{i}\) denotes the maximum number of infected cases in the ith wave of the epidemic outbreak, \(k_{i}\) denotes the logistic growth rate in the ith wave, \(t_{i}\) denotes the semi-saturation period of the ith wave, and N denotes the total number of waves of COVID-19 outbreaks in one region.
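As an illustration of Eq. (1), the following minimal sketch evaluates a sum of logistic waves. It uses the standard increasing logistic form, \(A_{i} /(1 + e^{-k_{i}(t - t_{i})})\), so that a positive growth rate means growth. The function name `multi_wave_sigmoid` and the two-wave parameter values are hypothetical and are not fitted to any dataset.

```python
import math


def multi_wave_sigmoid(t, waves):
    """Cumulative cases I(t) as a sum of logistic waves.

    waves: iterable of (A_i, k_i, t_i) tuples, i.e. the maximum number of
    cases, the logistic growth rate and the semi-saturation time of each wave.
    """
    total = 0.0
    for amplitude, rate, midpoint in waves:
        total += amplitude / (1.0 + math.exp(-rate * (t - midpoint)))
    return total


# Hypothetical two-wave example (illustrative values, not fitted).
waves = [(1000.0, 0.2, 40.0), (500.0, 0.1, 200.0)]
curve = [multi_wave_sigmoid(day, waves) for day in range(367)]
```

At the semi-saturation time of a wave, that wave has contributed half of its maximum, and long after the last wave the curve approaches the sum of all \(A_{i}\).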

2.2.2 Dynamical model on epidemics: SEIDR model

Based on the analysis of the descriptive model (explained in detail in Sect. 3), we assume that the extent of recurrent outbreaks may be related to the index for cases imported from overseas, the relaxation of governmental control measures and the temperature in a region. We then simulated the COVID-19 outbreak using the SEIDR model, an infectious disease dynamics model developed from the previous SIR and SEIR models [3, 4, 37,38,39,40,41,42]. This model has five classes: susceptible (S), exposed (E), infected (I), dead (D) and recovered (R) populations, whose dynamics are given in Eqs. (2)–(6).

$$ \frac{{{\text{d}}S\left( t \right)}}{{{\text{d}}t}} = - \beta_{1} C\left( t \right) \cdot {\text{Temp}}\left( t \right) \cdot S\left( t \right) \cdot E\left( t \right) - \beta_{2} C\left( t \right) \cdot {\text{Temp}}\left( t \right) \cdot S\left( t \right) \cdot I\left( t \right) $$
(2)
$$ \frac{{{\text{d}}E\left( t \right)}}{{{\text{d}}t}} = \beta_{1} C\left( t \right) \cdot {\text{Temp}}\left( t \right) \cdot S\left( t \right) \cdot E\left( t \right) + \beta_{2} C\left( t \right) \cdot {\text{Temp}}\left( t \right) \cdot S\left( t \right) \cdot I\left( t \right) - \gamma_{1} E\left( t \right) - \alpha E\left( t \right) + T\left( t \right) $$
(3)
$$ \frac{{{\text{d}}I\left( t \right)}}{{{\text{d}}t}} = \gamma_{1} E\left( t \right) - \gamma_{2} I\left( t \right) - mI\left( t \right) $$
(4)
$$ \frac{{{\text{d}}D\left( t \right)}}{{{\text{d}}t}} = mI\left( t \right) $$
(5)
$$ \frac{{{\text{d}}R\left( t \right)}}{{{\text{d}}t}} = \alpha E\left( t \right) + \gamma_{2} I\left( t \right) $$
(6)

In the equations above, S(t), E(t), I(t), D(t) and R(t) are the numbers of susceptible, exposed, infected, dead and recovered individuals at time t. \(\beta_{1}\) denotes the transmission rate between the susceptible and exposed populations, and \(\beta_{2}\) is the transmission rate between the susceptible and infectious populations. Since exposed people are initially asymptomatic and may easily blend in with the uninfected population, \(\beta_{1}\) should be larger than \(\beta_{2}\). \(\gamma_{1}\) is the transition rate from the exposed to the infectious population, \(\gamma_{2}\) is the transition rate from the infectious to the recovered population, and \(\alpha\) is the transition rate from the exposed to the recovered population. \(m\) is the death rate. \(C\left( t \right)\) is a variable that represents the strictness of government policy on prevention and control [48, 49] (Eq. 7), modeled as the sum of an exponentially decaying term and a Gaussian function. If it approaches 0, the government has very strict policies to control the epidemic, and vice versa. \(T\left( t \right)\) is a variable that represents the number of imported exposed cases from other regions [48] (Eq. 8), modeled as a Gaussian function. \({\text{Temp}}\left( t \right)\) is a variable that represents the extent to which the change of temperature affects the infectious ability of the virus [50, 51] (Eq. 9).

$$ C\left( t \right) = e^{{ - t^{2} /\sigma_{0}^{2} }} + W_{C} \cdot e^{{ - \left( {t - t_{C} } \right)^{2} /\sigma_{C}^{2} }} $$
(7)
$$ T\left( t \right) = W_{T} \cdot e^{{ - \left( {t - t_{T} } \right)^{2} /\sigma_{T}^{2} }} $$
(8)
$$ {\text{Temp}}\left( t \right) = [W_{{{\text{Temp}}}} \cdot \cos (2\pi f_{0} t) + 1]/2 $$
(9)

In these equations, \(\sigma_{0}\) denotes the rate at which a government policy reaches peak strictness. \(t_{C}\) denotes the time point at which government policies were most relaxed. \(\sigma_{C}\) denotes how long the relaxation of the government policy lasts. \(W_{C}\) denotes the extent of the relaxation of the government policy. \(t_{T}\) denotes the time point of the peak of imported cases. \(\sigma_{T}\) denotes the duration of the peak of imported cases. \(W_{T}\) denotes the weight of imported cases. \(W_{{{\text{Temp}}}}\) denotes the weight of the temperature change that affects the infectious ability of the virus. \(f_{0}\) denotes the frequency of the temperature change.

All simulations and data analyses were implemented with custom scripts written in MATLAB (version R2020a). We solved the equations numerically using the Euler method with a time resolution of 1 day. We set the sum of S, E, I, D and R to 10,000, \(\beta_{1}\) to 0.25, \(\beta_{2}\) to 0.2, \(\gamma_{1}\) to 0.1, \(\alpha\) to 0.01, \(\gamma_{2}\) to 0.05, \(m\) to 0.0001, \(\sigma_{0}\) to 45, \(\sigma_{C}\) to 45, \(\sigma_{T}\) to 20, \(t_{C}\) to 160 and \(t_{T}\) to 140. These parameters were fixed throughout all the simulations. We varied \(W_{C}\) in the range of 0 to 0.3, \(W_{T}\) in the range of 0 to 20 and \(W_{{{\text{Temp}}}}\) in the range of 0 to 1 to explore the effects of government policy, imported infected cases and temperature, respectively.
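For readers who wish to experiment with Eqs. (2)–(9), a minimal forward-Euler sketch is given below in Python rather than MATLAB. It is not the original simulation code and is not calibrated to reproduce the figures. Several choices are assumptions that the text does not specify: the incidence terms are divided by the total population so that the one-day Euler step stays stable, the temperature frequency `f0` is set to one cycle per 365 days, the simulation runs for 367 days, and cumulative infected cases are counted as the running total of individuals who move from E to I. The function name `simulate_seidr` is hypothetical.

```python
import math


def simulate_seidr(w_c=0.0, w_t=0.0, w_temp=0.0, days=367, f0=1.0 / 365.0):
    """Forward-Euler integration (dt = 1 day) of Eqs. (2)-(9).

    Returns (states, cumulative): states[n] is (S, E, I, D, R) after step n,
    cumulative[n] is the running count of individuals who became infectious.
    """
    # Fixed parameters as listed in the text.
    n_pop = 10_000.0
    beta1, beta2 = 0.25, 0.2
    gamma1, gamma2 = 0.1, 0.05
    alpha, m = 0.01, 0.0001
    sigma0, sigma_c, sigma_t = 45.0, 45.0, 20.0
    t_c, t_t = 160.0, 140.0

    # Initial state: one infectious individual, everyone else susceptible.
    s, e, i, d, r = n_pop - 1.0, 0.0, 1.0, 0.0, 0.0
    cum = i
    states, cumulative = [], []
    for t in range(days):
        # Eq. (7): policy term (small values mean strict control).
        c = math.exp(-t**2 / sigma0**2) + w_c * math.exp(-(t - t_c)**2 / sigma_c**2)
        # Eq. (8): imported exposed cases per day.
        imported = w_t * math.exp(-(t - t_t)**2 / sigma_t**2)
        # Eq. (9): temperature modulation.
        temp = (w_temp * math.cos(2.0 * math.pi * f0 * t) + 1.0) / 2.0

        # ASSUMPTION: incidence is normalised by the population size.
        new_exposed = c * temp * s * (beta1 * e + beta2 * i) / n_pop
        new_infectious = gamma1 * e

        ds = -new_exposed                                        # Eq. (2)
        de = new_exposed - new_infectious - alpha * e + imported  # Eq. (3)
        di = new_infectious - gamma2 * i - m * i                  # Eq. (4)
        dd = m * i                                                # Eq. (5)
        dr = alpha * e + gamma2 * i                               # Eq. (6)

        s, e, i, d, r = s + ds, e + de, i + di, d + dd, r + dr
        cum += new_infectious  # ASSUMPTION: definition of cumulative cases
        states.append((s, e, i, d, r))
        cumulative.append(cum)
    return states, cumulative


# Baseline run and a run with imported cases (w_t is W_T in Eq. 8).
baseline_states, baseline_cum = simulate_seidr()
_, imported_cum = simulate_seidr(w_t=20.0)
```

Without imported cases the five compartments sum to the initial population at every step, which gives a quick check that the update rules match the signs in Eqs. (2)–(6).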

3 Results

The cumulative numbers of confirmed cases in 30 provinces of China’s mainland, as well as in 43 countries worldwide, were all fitted with our descriptive model (Figs. 1a, 2, 3a, 4).

Fig. 1

Relationship between the recurrent outbreaks of COVID-19 and imported infected cases. The top panel of a shows the epidemic in Anhui province, with a low ratio of imported cases, which could be explained by a one-wave model. The bottom panel of a shows the epidemic in Fujian province, with a high ratio of imported cases, which could be explained by a three-wave model. The horizontal axis in this panel denotes the xth day after 1 January 2020. The vertical axis in this panel is the cumulative number of infected cases (black dots). The blue line is the fitting curve by the one-wave sigmoid model; the green line is the fitting curve by the two-wave sigmoid model; the red line is the fitting curve by the three-wave sigmoid model. b Illustrates the distribution of the goodness of fit in 30 provinces in China's mainland. c Shows the scatter plot of the index for multiple waves and the ratio of the cumulative number of imported infected cases from abroad to all infected cases in each province

Fig. 2

Time series of COVID-19 infected cases in 28 selected provinces of China. Each panel represents one provincial administrative unit for the multiple waves of the epidemic. The horizontal axis in each panel denotes the xth day after 1 January 2020. The content in each panel is the same as Fig. 1a

Fig. 3

Relationship between the recurrent outbreaks of COVID-19 and relaxation of government policies. The top panel of a shows the epidemic in Argentina, with a low extent of relaxation of the stringency index and weak recurrent outbreak strength. The bottom panel of a shows the epidemic in South Africa, with a high extent of relaxation of the stringency index and strong recurrent outbreak strength. The horizontal axis in this panel denotes the xth day after 1 January 2020. The left vertical axis in this panel is the cumulative number of infected cases (black dots). The blue line is the fitting curve by the one-wave sigmoid model; the green line is the fitting curve by the two-wave sigmoid model; the red line is the fitting curve by the three-wave sigmoid model. The gray curve shows the time series of the stringency index, corresponding to the right vertical axis. b Illustrates the distribution of the goodness of fit by the model in 44 countries around the world. c Shows the scatter plot of the recurrent outbreak index and the relaxation of government measures (stringency index) for 44 countries

Fig. 4

Time series of COVID-19 infected cases with policy indices in 44 countries. Each panel represents one country for the multiple waves of the epidemic and indices of government response. The content in each panel is the same as Fig. 3a

3.1 Recurrent outbreaks of COVID-19 strongly correlated to the imported infected cases from overseas in China’s mainland

As of 2 January 2021, laboratory-confirmed COVID-19 infections had been reported in each of the Chinese provincial administrative regions, with a total of 87,485 confirmed cases in China’s mainland, including 4634 deaths and 82,085 recovered cases. We first took Anhui and Fujian provinces as examples. In Anhui province, the number of imported cases is small, and there has been only one wave of the epidemic (black dots in the top panel of Fig. 1a), which could be explained by the one-wave model (blue curve in the top panel of Fig. 1a). In Fujian province, the number of imported cases is relatively large. The raw data show that there have been three waves in the course of the COVID-19 outbreak, which our model could capture precisely (Fig. 1a bottom). The blue curve shows the first wave of the epidemic (N = 1), the green curve shows the summation of the first two waves (N = 2), and the red curve shows the summation of the three waves (N = 3). We then defined a “Goodness of fit” index to indicate how much variance the model could account for when including different numbers of waves. The model explains the data well in all provinces of China’s mainland (Goodness of fit > 0.95) (Fig. 1b). The number of data points (367) is much larger than the number of model parameters (e.g., nine parameters for a three-wave logistic model), and each model parameter has a practical meaning, so over-fitting is unlikely to be a factor in our model as the number of waves increases.
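The text does not give a formula for the “Goodness of fit” index. A common way to express the share of variance a model accounts for is the coefficient of determination, and the sketch below assumes that definition. The function name `goodness_of_fit` and the sample numbers are hypothetical.

```python
def goodness_of_fit(observed, predicted):
    """Share of variance in `observed` explained by `predicted`.

    ASSUMPTION: the index is the coefficient of determination,
    R^2 = 1 - SS_res / SS_tot.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical cumulative counts and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = goodness_of_fit(observed, predicted)
```

A value of 1 means the fitted curve passes through every data point, and a value of 0 means it does no better than the mean of the data.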

We applied this model to 30 provinces in China’s mainland (Fig. 2). Around half of the provinces experienced recurrent outbreaks (e.g., Shanghai, Gansu and Inner Mongolia). It has been reported [52] that imported infected cases could dominate a new wave of the epidemic outbreak. To investigate this assumption and test whether our model can reasonably explain the recurrence phenomenon, we collected data on the number of imported infected cases. We analyzed the correlation between the number of imported infected cases and the extent of recurrent outbreaks of COVID-19. The extent of recurrent outbreaks of COVID-19 is indexed with Eq. 10.

$$ {\text{Recurrent Outbreak Index}} = \frac{{A_{1} }}{{\mathop \sum \nolimits_{i = 1}^{N} A_{i} }} $$
(10)

where \(A_{i}\) denotes the maximum cumulative number of infected cases in the ith wave, and N denotes the number of epidemic waves. The Recurrent Outbreak Index ranges from 0 to 1 and is the proportion of all infected cases that the first wave accounts for. We considered two extreme cases. The first is a region with only one wave of the epidemic, in which case the index equals 1. The other is a region with multiple waves of the epidemic, each causing a similar number of confirmed cases, in which case the index is approximately 1/N and moves closer to 0 as the number of waves grows.
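Eq. (10) can be checked against these two extreme cases with a short sketch. The function name `recurrent_outbreak_index` and the wave sizes are hypothetical.

```python
def recurrent_outbreak_index(amplitudes):
    """Eq. (10): share of all cases attributed to the first wave.

    amplitudes: list [A_1, ..., A_N] of fitted maximum cases per wave.
    """
    if not amplitudes or sum(amplitudes) <= 0:
        raise ValueError("need at least one wave with positive size")
    return amplitudes[0] / sum(amplitudes)


# Hypothetical wave sizes for the two extreme cases and one mixed case.
single_wave = recurrent_outbreak_index([1000.0])
equal_waves = recurrent_outbreak_index([100.0, 100.0, 100.0, 100.0])
mixed_waves = recurrent_outbreak_index([300.0, 100.0])
```

Lower values therefore indicate stronger recurrence, which is why a larger share of imported cases shows up as a negative correlation with this index.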

We found a significant negative correlation between the recurrent outbreak index and imported infected cases (r = −0.67, p < 0.0001) (Fig. 1c). Specifically, the more cases imported from abroad, the stronger the extent of recurrent outbreaks. This relationship indicates the practical importance of controlling positive cases from abroad, since prevention and control policies are not similar across countries. Hence, the increase in imported infected cases could be one important contributing factor to new COVID-19 outbreaks.

Government policy may also be important in the recurrence of epidemic outbreaks. Because stringency policies were strict and coherent across provinces, and because local and imported cases are clearly defined in China, China is a good example for studying the influence of imported infected cases on the extent of recurrent outbreaks of COVID-19. Globally, countries have different policies towards COVID-19, which provides an opportunity to explore their impact on the pandemic. To explore the relationship between government policy and the extent of recurrent outbreaks, we generalized this model using global data.

3.2 Recurrent outbreaks of COVID-19 strongly correlated to the relaxation of the government policies in 44 countries around the world

Having verified that the proposed model works with the data of China’s mainland, we further expanded this model to include publicly available data at a global level. Figure 3a shows two typical examples of instances with multiple waves (Argentina and South Africa), which aligned with our model as well. Beyond this, the model could also explain the data in 44 countries very well (Goodness of fit > 0.95). Specifically, we plotted the temporal dynamics of stringency policy indices (gray curve in Fig. 3a). For these indices, the larger the stringency index values, the stricter the government’s prevention and control measures are. The results clearly indicate that when governments’ response relaxed (Fig. 3a, the declining gray curves), the epidemic reoccurred. To explore whether this relationship holds on a larger scale, we tested this model on data from 44 countries (including China) globally (Fig. 4) and found that this model fits the public data from each country consistently (Goodness of fit > 0.95) (Fig. 3b).

Our previous results indicated a strong positive relationship between recurrence and relaxation of government policies (as shown in Figs. 3a and 4). To quantify the extent of this relationship, we performed a correlation analysis between the extent of the multiple waves of the epidemic outbreak and the policies, investigating whether policy relaxation influences recurrent outbreaks. We defined a policy relaxation index as the ratio of the minimum value after reaching the maximum to the maximum. This could be considered the relative policy index and ranges from 0 (consistent policy) to 1 (inconsistent policy). We found a strong negative correlation between the relaxation of stringency and the recurrent outbreak index (r = −0.47, p = 0.0012, Fig. 3c). These results indicate that inconsistencies in government prevention and control policies could be another important factor in recurrences of the epidemic.
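One possible reading of the policy relaxation index is sketched below. Taken literally, the ratio of the post-peak minimum to the peak equals 1 when the policy never relaxes, so the sketch assumes the index is one minus that ratio, which matches the stated range of 0 (consistent policy) to 1 (inconsistent policy). The function name `policy_relaxation_index` and the stringency values are hypothetical.

```python
def policy_relaxation_index(stringency):
    """Relative drop of the stringency index after its peak.

    ASSUMPTION: index = 1 - (minimum after the peak) / (peak value), so that
    0 means the policy stayed at its peak and 1 means it was fully lifted.
    """
    if not stringency:
        raise ValueError("stringency series must not be empty")
    # First position at which the series reaches its maximum.
    peak_pos = max(range(len(stringency)), key=stringency.__getitem__)
    peak = stringency[peak_pos]
    if peak <= 0:
        raise ValueError("peak stringency must be positive")
    trough_after_peak = min(stringency[peak_pos:])
    return 1.0 - trough_after_peak / peak


# Hypothetical stringency series on the 0-100 scale.
consistent = policy_relaxation_index([0, 40, 80, 80, 80])
relaxed = policy_relaxation_index([0, 50, 100, 60, 25, 40])
```

With this convention, a larger index means stronger relaxation, so a negative correlation with the recurrent outbreak index corresponds to more relaxation accompanying stronger recurrence.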

So far, based on the public datasets of China and 43 other countries around the world, we have evidence suggesting that imported infected cases from overseas and the relaxation of governments' prevention and control policies are strongly correlated with the recurrence of COVID-19. However, correlation does not establish causation. Therefore, we developed the SEIDR model to simulate the phenomenon of recurrent outbreaks, and we further manipulated the model parameters on imported cases and government policies to see how the epidemic develops.

3.3 SEIDR model can reproduce recurrent epidemic outbreaks

Based on the assumption that the extent of recurrent outbreaks is related to the index for cases imported from overseas and to the relaxation of the governmental stringency policy in a region, we simulated the outbreak of COVID-19 with the SEIDR model (Fig. 5a). There are five classes in the model: susceptible (S), exposed (E), infectious (I), dead (D) and recovered (R) populations. The initial values of S and I are set to 9999 and 1, respectively (assuming that the population is 10,000). The values of E, R and D are initially set to 0. We mainly considered two cases in this simulation: one is the influence of imported cases, which depends on the parameter \(W_{T}\) (see Eq. 8), and the other is the influence of the relaxation of government policies, which depends on the parameter \(W_{C}\) (see Eq. 7). We first kept \(W_{C}\) and \(W_{{{\text{Temp}}}}\) at 0 and ran the simulation with different values of \(W_{T}\) (ranging from 0 to 20). We found that as \(W_{T}\) increases, a new wave of the epidemic is more likely to emerge (Fig. 5b), since more infected cases are imported (Fig. 5c). We also calculated the recurrent outbreak index in the same way as for the real data (Eq. 10). We found that the recurrent outbreak index is negatively correlated with the overseas imported cases (Fig. 5d), which agrees with the results for the real data (Fig. 1c). We then kept \(W_{T}\) and \(W_{{{\text{Temp}}}}\) at 0 and ran the simulation with different values of \(W_{C}\) (ranging from 0 to 1). In the model, the smaller C(t) is (Eqs. 2 and 3), the stricter the government prevention and control measures are. To unify the index in the simulation and the real data, we defined the government control index (Fig. 5f) as 1 − C(t) from the simulation. The relaxation of government policy is defined as the value of \(W_{C}\): the larger \(W_{C}\) is, the more relaxed the government policy is. We found that as the policy becomes more relaxed, the emergence of a new wave becomes more apparent (Fig. 5e). The recurrent outbreak index is also negatively correlated with the relaxation of government policy (Fig. 5g), which agrees with the results using the real data (Fig. 3c).

Fig. 5

Simulation of the recurrent outbreak of COVID-19. a Is the structure of the SEIDR model. Individuals are divided into five classes: susceptible (S), exposed (E), infectious (I), dead (D) and recovered (R) populations. b Shows the simulated time series of cumulative infected cases for different weights of overseas imported cases (different colors), with \(W_{C}\) and \(W_{{{\text{Temp}}}}\) set to 0. c Shows T(t) for different weights of overseas imported cases (different colors). d Is the scatter plot of the weight of overseas imported cases (\(W_{T}\)) and the recurrent outbreak index. e Shows the simulated time series of cumulative infected cases for different extents of inconsistent policies (different colors), with \(W_{T}\) and \(W_{{{\text{Temp}}}}\) set to 0. f Shows C(t) for different extents of inconsistent policies (different colors). g Is the scatter plot of the relaxation of government policy and the recurrent outbreak index. h Shows the simulated time series of cumulative infected cases for different extents of temperature variation (different colors), with \(W_{T}\) and \(W_{C}\) set to 0. i Shows Temp(t) for different extents of variation (different colors). j Is the scatter plot of the temperature index and the recurrent outbreak index. (Color figure online)

After checking the behavior of this dynamic model against the results from real data, we further looked at the impact of temperature on recurrent outbreaks. We assumed that lower temperatures possibly increase the infectivity of the virus [50, 51]. Keeping \(W_{T}\) and \(W_{C}\) at 0, we ran the simulation with different values of \(W_{{{\text{Temp}}}}\) (ranging from 0 to 1) and found that as the variation in temperature increases, the emergence of a new wave becomes more apparent (Fig. 5h, i). The recurrent outbreak index is also negatively correlated with the variation of the temperature (Fig. 5j).

4 Discussion

We modeled the multiple waves of COVID-19 for 30 provinces in China’s mainland based on the data from 1 January 2020 to 2 January 2021, a period that covers the recurrent outbreaks. Further, we applied this model to explain the data from 44 countries (including China). Across the provinces of China, a strong positive correlation was found between the extent of recurrent outbreaks and the extent of imported infected cases. Across countries around the world, strong correlations were found between the recurrent outbreak index and different types of government policy. Additionally, these phenomena were replicated by manipulating the weights of the imported-case and government-policy parameters of our adapted SEIDR model.

4.1 Comparison with previous work

In this study, mathematical modeling from both descriptive [2, 53,54,55] and dynamical [29] aspects was used to study the mechanism of epidemic recurrence systematically. We proposed this two-fold framework for the first time, which will help us to gain a more comprehensive understanding of the mechanism of recurrent outbreaks. We studied the epidemic situation not only for one wave [2, 36, 38, 56] but also for multiple waves. To test the robustness of this approach, we analyzed epidemics in dozens of regions and countries on many continents instead of testing only in specific countries or regions [27,28,29, 57, 58].

The proposed descriptive logistic model could depict the multiple waves of infected cases of COVID-19 around the world. The summation of multiple logistic functions can accurately depict the available public data. Each wave of the epidemic can be described by three parameters: the maximum number of cases (\(A_{i}\)), the logistic growth rate (\(k_{i}\)), and the semi-saturation period (\(t_{i}\)). The logistic growth rate is deemed to be related to the spread of the virus, and the semi-saturation period is deemed to be related to government response [36]. Using these model parameters, we can capture the underlying mechanisms possibly contributing to the increase of the cumulative infected cases during different waves. Another important advantage of this model is its adaptive capacity. The data used fit the proposed model globally. If new outbreaks occur, this model can still be used to capture them by increasing the value of N (Eq. 1), as long as we have the correct assumption for each wave of the epidemic. This model can be a useful tool to explore the relationship between recurrent outbreaks and governmental policies as well as economic activities, showing a high capacity to measure the macroscopic behavior of an epidemic.

In addition to the proposed descriptive model, we also integrated the dynamic SEIDR model (developed from the SEIR model) [4, 37,38,39]. We first discussed the mechanisms of recurrent outbreaks using this model and calibrated the relevant time-varying model parameters, which allowed the model to replicate our results from the descriptive model. Moreover, we found that the simulated data showing the phenomenon of recurrent outbreak (Fig. 5b, e, h) could be fitted by a summation of sigmoid functions, which is a relatively simple and effective approximation to the numerical solutions. Since there is no analytical solution for such complex nonlinear dynamics (Eqs. 29), this also provides a way to approximate the true solutions.
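The qualitative behavior described above can be illustrated with a small compartmental simulation. The sketch below is not the adapted SEIDR model of this study: it assumes a generic textbook-style SEIDR system with a time-varying transmission rate (standing in for policy stringency) and an additive inflow of imported infected cases, integrated with a simple forward-Euler scheme. The function names, the rate values, and the three scenarios are all hypothetical.

```python
def simulate_seidr(days, beta_of_t, imported_of_t=lambda t: 0.0,
                   n_pop=1_000_000.0, sigma=0.2, gamma=0.1, mu=0.002,
                   i0=10.0, dt=0.1):
    """Forward-Euler integration of an assumed generic SEIDR system.

    beta_of_t(t)     -- transmission rate at time t (lower = stricter policy)
    imported_of_t(t) -- imported infected cases per day arriving at time t
    Returns the cumulative number of infected cases at the end of each day.
    """
    s, e, i, d, r = n_pop - i0, 0.0, i0, 0.0, 0.0
    cumulative = i0
    daily_cumulative = []
    steps_per_day = round(1.0 / dt)
    for day in range(days):
        for step in range(steps_per_day):
            t = day + step * dt
            # All flows are computed from the current state before updating it.
            infection = beta_of_t(t) * s * i / n_pop  # S -> E
            onset = sigma * e                         # E -> I
            imported = imported_of_t(t)               # arrivals into I
            recovery = gamma * i                      # I -> R
            death = mu * i                            # I -> D
            s -= infection * dt
            e += (infection - onset) * dt
            i += (onset + imported - recovery - death) * dt
            r += recovery * dt
            d += death * dt
            cumulative += (onset + imported) * dt
        daily_cumulative.append(cumulative)
    return daily_cumulative


def strict_policy(t):
    # Hypothetical: free spread for 20 days, then sustained strict control.
    return 0.6 if t < 20 else 0.05


def relaxed_policy(t):
    # Hypothetical: same start, but control is relaxed from day 150 onward.
    if t < 20:
        return 0.6
    return 0.05 if t < 150 else 0.25


def import_pulse(t):
    # Hypothetical: 20 imported infected cases per day during days 150-159.
    return 20.0 if 150 <= t < 160 else 0.0


strict = simulate_seidr(300, strict_policy)
relaxed = simulate_seidr(300, relaxed_policy)
with_imports = simulate_seidr(300, strict_policy, imported_of_t=import_pulse)

print("strict:", round(strict[-1]))
print("relaxed:", round(relaxed[-1]))
print("strict + imports:", round(with_imports[-1]))
```

Under these assumptions, relaxing the transmission rate or adding an import pulse after the first wave has been suppressed produces a renewed rise in the cumulative curve, which is the kind of trajectory that a sum of sigmoid functions can then approximate.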

4.2 The multi-dimension impact of recurrent outbreaks of COVID-19

The outbreak of the COVID-19 epidemic has had a significant impact on all aspects of society, including the economy [22, 59], tourism [60,61,62], education [63,64,65], diet [21, 66], and mental health [22, 67,68,69,70], among many others. When further waves emerge, they are bound to affect all aspects of people's lives.

There is a clear relationship between the extent of recurrent outbreaks and the relaxation of government policy or imported infected cases from overseas. Generally, the stricter the policy of prevention and control, the less frequent the outbreaks of COVID-19 will be. However, regulations on different aspects of socioeconomic life have yielded contradictory results. On the one hand, policies such as implementing lockdowns, active contact tracing after a positive diagnosis, and restrictions on gatherings are constructive, and can significantly decrease the number of new waves of the epidemic. On the other hand, policies such as monetary stimulus to the economy and income support for citizens who lost their jobs can be destructive due to long-term strain on the economy, and may therefore exacerbate recurrent outbreaks of the epidemic. We assume that fiscal stimulus during the pandemic might have encouraged more face-to-face interactions, boosting transmission rates. These destructive contributors are what we should aim to avoid in practice.

Among natural factors, temperature may be another important index. A previous study has shown that temperature-driven changes can cause recurrent insect outbreaks [71]. From our simulation, we also found that the temperature index had a significant impact on the recurrence of an outbreak. Moreover, we noticed that in the scatter diagrams of Figs. 1c and 3c, some individual points lie in the lower-left corner and do not follow the black fitted curve. These points look like outliers, which could indicate that the recurrent outbreak of the epidemic in some regions or countries may be due to mechanisms not explored in our study (e.g., temperature). In China, despite the consistently strict control of the epidemic by the government, the epidemic recurred in some northern provinces by the end of 2020. In most countries around the world (Fig. 4), a new wave occurred around September, which is the start of autumn. The temperature then starts to fall, which may increase the transmissibility of the virus. Hence, we consider that imported infected cases, inconsistent government policies, and variation in temperature could be the main contributors that trigger a new epidemic wave.

Furthermore, we collected data on the Consumer Confidence Index (CCI) from 26 countries globally (see Methods) to explore the influence of the recurrent outbreaks of COVID-19 on the economy. Consumer confidence covaries with recurrent outbreaks and changes in GDP (Fig. 6). In most countries, the consumer confidence index continued to decline during the early wave of the first phase of the epidemic. When the situation began to improve, the consumer confidence index rose, but it fell again when the epidemic broke out again. Therefore, we calculated the change in consumer confidence as the recurrent consumer confidence index. We found that recurrent CCI is significantly correlated with the extent of recurrent outbreaks (r = −0.47, p = 0.015) (Fig. 6a). Moreover, the relaxation of the stringency index shows a clear positive correlation with the recurrent CCI (r = 0.42, p = 0.033) (Fig. 6b), which indicates that strict and consistent prevention and control measures may carry more value in maintaining consumers’ expectations of their economic status. Furthermore, changes in CCI show a clear positive correlation with the change in GDP (r = 0.69, p < 0.0001). This indicates that strict and consistent prevention and control measures may be meaningful in improving the national economy. Our results show a strong relationship between recurrent outbreaks and recurrent consumer confidence (Fig. 6). Meanwhile, the change in recurrent consumer confidence has so far severely affected the change in the seasonal GDP of a country, which could have profound impacts on the long-term outlook for economies.
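For readers who want to see how correlation statistics of this kind are obtained, the following sketch computes a Pearson correlation coefficient and the associated t statistic with n − 2 degrees of freedom. It assumes that the reported r values are Pearson coefficients, which is not confirmed in the text above; the function names and the small data set are hypothetical and are not the data from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical illustration: an outbreak index against a confidence index.
outbreak_index = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5]
confidence_index = [2.0, 1.1, 1.6, 0.2, -0.3, -0.1]
r = pearson_r(outbreak_index, confidence_index)
print("r =", round(r, 3), "t =", round(t_statistic(r, len(outbreak_index)), 3))

# t statistic implied by r = -0.47 with the 26 countries mentioned above.
print(round(t_statistic(-0.47, 26), 2))
```

The two-sided p value then follows from the Student t distribution with n − 2 degrees of freedom, which a statistics library can evaluate.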

Fig. 6

Recurrent outbreak index and stringency index are strongly correlated with the recurrent consumer confidence index. a Shows the scatter plot between recurrent outbreaks of COVID-19 and recurrent consumer confidence. b Shows the scatter plot between relaxation of stringency index and recurrent consumer confidence. c Shows the scatter plot between change of consumer confidence index and change of GDP index in the first three quarters

4.3 Implication of our study on controlling recurrent outbreaks of COVID-19

Applying control theory and mathematical models can provide specific scientific reference for governments. The above results indicate that although early emergence from strict control measures will help restore the order of daily life and boost economic activity, the consequential recurrent outbreaks will damage consumer confidence, and thus harm the economy. While extreme isolation of the population will suppress the virus, this comes at a profound cost. More factors need to be taken into consideration when balancing policy stringency and epidemic control measures.

The long-term impact of a recurrent pandemic outbreak has been demonstrated by previous diseases [72, 73], and is a phenomenon that can also be observed in diseases caused by bacteria [74,75,76]. For COVID-19, it is crucial to forecast recurrent outbreaks because of its global reach and its impact on individual livelihoods [27, 77], as well as on the economy [23, 78, 79] and mental health [67, 69, 80]. Before a vaccine becomes available, how we can co-exist with this virus in the meantime is a new and pertinent question. How to strike a balance between economic losses and casualties resulting from the resurgence of the epidemic is an open question that should be addressed by future research.

5 Conclusion

Practically, we found two main mechanisms that strongly correlate with the extent of a recurrent outbreak of the epidemic: the sudden increase of imported cases from overseas and the relaxation of local governments’ epidemic prevention policies. We quantified this using a novel descriptive multi-wave model. These recurrent outbreaks affect consumer confidence and significantly influence changes in GDP.

Theoretically, we simulated the main results with a compartmental dynamical model (SEIDR model) and tested the causal effect of three factors on the emergence of a new wave of the outbreak: (1) the impact of imported infected cases, (2) the relaxation of government policy, and (3) the variation of the temperature index.

In the future, further investigation of the mechanisms underlying epidemic recurrence, including more factors (e.g., virus mutation), is needed to help predict the trajectories of epidemics and increase our knowledge.