1 Introduction

The COVID-19 outbreak has hit the world hard: thousands of lives have been lost and the world economy has been hammered. The pandemic might be eased if the government of a country is prepared and takes appropriate health safety measures early. One of the most effective safety measures [1], proven successful to a certain extent by some countries, is lockdown, which mainly comprises social distancing, staying at home and closing borders to domestic and international travellers. Lockdown protects susceptible individuals, that is, those who are vulnerable and can be infected, by reducing social activities and interactions between people so as to stop or slow down the spread of the disease. In theory, if everybody’s mobility is restricted and everyone is totally isolated, there is no chance for the virus to pass from one person to another. However, this means a huge economic loss, as productivity and activities are brought to a standstill. It is hence a challenging problem to find a compromise between thwarting the economy and keeping the number of infected patients manageable, with the aim of preventing the medical infrastructure from collapsing, so that new patients can be admitted to hospital for treatment while existing patients are discharged, freeing up limited resources such as beds and respiratory aid devices. To decide on the intensity and duration of lockdown, multiple factors are considered; these include the predicted future number of infected people in a population, which usually comes from a disease spreading model called SEIR (susceptible, exposed, infected, and recovered) [2]. SEIR is the most basic model that depicts how a contagious disease spreads from a handful of carriers in a population at the beginning to a growing percentage of people who become cross-infected later. Almost every country is adopting the SEIR model or a modified version of it as a reference for deciding when to impose and lift lockdown.

On the other hand, there is another proposal called “herd immunity” [3]. It advocates that natural resistance to the virus can be achieved after a significant portion of the population has become immune through previous infection. The virus will then either be eradicated or its spread will be slowed down, because the chain of transmission can be cut once the proportion of immune individuals has reached 60% [4]. By then, susceptible individuals, especially those who are elderly or have pre-existing conditions, can be spared. The UK and Swedish governments had been considering the herd immunity strategy [5], while the rest of the world is practicing lockdown and social distancing to protect citizens from the disease. In either case, the SEIR model is used as a simulator for estimating the numbers of susceptible, exposed, infected and recovered individuals, and their inter-dependent changes as the virus spreads over time.

At the time of writing, COVID-19 is still a prevalent pandemic that is taking many lives. In the United States alone, the number of infected people has surpassed 1.5 million, with more than 100,000 deaths (Footnote 1). Had appropriate protective actions been taken in time by a capable president, the course of infection and the fatality rate would have been very different. Since the first outbreak was confirmed in Wuhan, China, the prognosis and characteristics of COVID-19 have been observed to change [6]. Two of the most noticeable changes are (1) asymptomatic patients [7,8,9,10] and (2) relapse of recovered patients [11,12,13].

Asymptomatic patients are those infected with the coronavirus who do not show visible symptoms such as spiking fever and coughing. A certain number of people have been infected without knowing it. In epidemiology, it is not uncommon for an infection to produce no obvious symptoms while the carrier is still able to spread the disease. Up to 25% of common flu infections occur with no symptoms [14]. This is not a medical anomaly: the immune system often responds to an infection with some latency, and symptoms are a side effect of the fight. In this case the virus can be spread by normal exhalation, as breathing produces tiny droplets. Some studies found that respiratory viruses can be present during the pre-symptomatic stage for longer than a week. Without the coughing and fever that usually characterize COVID-19, carriers can go undetected and transmit the virus to people wherever they go, contributing to the pandemic.

In a serosurveillance effort, blood samples from 10,000 volunteers were collected and analyzed in the United States [15]. The aim of a serosurvey is to measure the number of undetected cases of COVID-19 infection, including pre-symptomatic, symptomatic and asymptomatic patients, in order to understand the characteristics of SARS-CoV-2. Surprisingly, the blood samples indicated that up to 20% of residents of New York City might already have been infected with COVID-19. They did not even know that they had been infected and that their immune systems had fought off the virus.

On the other hand, thousands of people have recovered from COVID-19, out of more than three million cases of infection reported worldwide. It is commonly believed, with some hype, that those who recover from the disease develop antibody immunity [16]. Although the recovery rate is encouraging, at approximately 95% in developed countries, in reality those who were infected and recovered can get sick again through a second infection [17]. In other words, there is no absolute immunity even after a patient has recovered from COVID-19. So far, there is no evidence that recovered COVID-19 patients are guaranteed to have antibodies that protect them from a second infection. It is also currently unknown how much more or less contagious a second infection is to the susceptible group. The existence of asymptomatic carriers and the potential threat of contagion from second infections in the local community bluntly refute the assumptions of the SEIR family of models, making them look overly idealistic and outdated. The shortcomings of the current models, the uncertainty about the new level of contagiousness, and the associated risks give new impetus to the need for a new disease spreading model.

Furthermore, some classical epidemic models (SIR, SEIR) are not suited to the evolution of SARS-CoV-2. Therefore, we propose to extend the SEIR model [18] to a new adaptive SEIARD model. The advantage of this design is that a neural network predictor can be used to fit the model, so as to obtain a high degree of fit and keep good track of epidemic trends [19]. In our modelling, isolation of the susceptible population and infection during the incubation period are taken into consideration. Having multiple groups makes the model more adaptable to the real situation. In view of the spreading characteristics of this outbreak, we consider the mutual transitions among different groups, which makes the model circular and interconnected. For example, the model allows a recovered individual to return to the susceptible population, to account for the insufficient evidence that antibodies are indeed developed after contracting the virus once. Since our model takes the dead population into account, we introduce births and natural deaths in order to keep the population stable and the model running. To make the model more convincing, we also take infection during quarantine and the contagiousness of asymptomatic patients into consideration, forming a complex network flow.

The paper is organized as follows. In section 2, the factors that make a population susceptible, namely gender, age and smoking, are analyzed. In section 3, the clinical characteristics of SARS-CoV-2 are divided into three main parts: treatments and outcomes of patients infected with COVID-19, baseline characteristics of patients, and laboratory findings of patients infected with SARS-CoV-2 on admission to hospital. The significance of differences in parameters and symptoms is studied to find their correlation with severe cases. In section 4, the transmission path of SARS-CoV-2 is described, to show how the virus spreads and how transmission between people can be prevented. In section 5, the incubation period of the infected population is analyzed by gender and age using the infected population dataset. In section 6, the adaptive SEIARD model is proposed to simulate the evolution of the SARS-CoV-2 outbreak in the United States. Because the fit is poor with the default settings, a neural network is used to improve the fit of the curve. The neural network predicts the trends of multiple groups: the susceptible, exposed, infected, asymptomatic, cured and dead populations. In section 7, the simulated data are compared with the actual data on the training set, and the goodness of fit and error values are used as indicators to evaluate the experimental results. In section 8, the same comparison between simulated and actual data is made on the testing set to further test the predictive performance of the improved model. In section 9, to test the applicability of the model, we compare the epidemic data of two countries with different cultural backgrounds (the United States and Singapore) to further verify that the model is highly self-adaptable. In section 10, risk assessment indices derived from the adaptive SEAIRD model of section 6 help us judge how the epidemic develops. The trends of these indices can help the epidemic prevention department take measures to control the epidemic in advance and prevent disastrous situations. In section 11, potentially lethal features are discussed; identifying them is crucial for releasing ICU beds effectively to those in need and for reducing mortality. Section 12 concludes the paper.

2 Population at risk (susceptible population)

In the current global outbreak of the new coronavirus, the question of “who is susceptible” has become a key research issue. In real life, people differ in their susceptibility to SARS-CoV-2, and we need to focus on protecting particularly susceptible people so as to avoid the emergence of “super spreaders”. In this paper, we assess whether gender and age are susceptibility factors by comparing the gender and age composition of different U.S. states with their numbers of infections and fatalities, and we study the significance (P value) of differences between the smoking population, patients with chronic diseases, and the population infected with COVID-19.

Using the epidemic dataset collected for the different U.S. states, the gender ratio, infected population and fatalities are visualized as a heatmap. From Fig. 1, we found that the proportion of men and women in each state is related to the infection rate. In several states with a high infection rate, such as South Dakota, Utah, Vermont, New York, New Jersey, Louisiana, Massachusetts, and Connecticut, the ratio of men to women is nearly equal. In addition, the proportion of women is highest in the District of Columbia, whereas the proportion of men is highest in Alaska, and the District of Columbia has a higher infection rate than Alaska. These observations suggest that states where the male-to-female ratio is nearly equal, or where the proportion of women is slightly higher than that of men, tend to have higher infection rates. It is also found that the death rates in all states are generally high.

Fig. 1 Number of deaths and infected population by gender

The collected data cover three age groups (0-25, 26-54 and 55+) in the different states. Utah has many young people aged 0-25, and its infection rate is relatively high. The District of Columbia has the highest proportion of middle-aged women (26 to 54 years old), and its infection rate is the highest. The proportion of people over 55 years old is generally large in all states, which suggests that older people are more susceptible to the virus because of weaker immunity; overall, the data show that all age groups are susceptible to SARS-CoV-2. As Fig. 2 indicates, the influence of age on the infection rate differs between ages, so it needs to be analyzed by age group.

Fig. 2 Age distribution in the U.S. states

The analysis indicates that middle-aged women and the elderly are relatively vulnerable and face a greater risk of infection than other groups. They should pay more attention to disease prevention.

The smoking rate and mortality rate are normalized and the error values are then calculated; the results are shown in Fig. 3. The errors between the smoking rate and the infection rate are similar to those between the smoking rate and the mortality rate. As shown in Fig. 4(a), the residuals between the smoking rate and the mortality rate lie roughly between −1 and +1. Because the infection rate and the mortality rate behave so similarly with respect to the smoking rate, we only study the residuals between the smoking rate and the infection rate in detail. As shown in Fig. 4(b), most of these residuals fluctuate between −1 and +1. The fact that most of the residuals between the smoking rate and the infection or mortality rate of each state have no obvious statistical significance indicates that the smoking rate is not obviously correlated with the infection rate or the mortality rate.
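As a minimal sketch of this kind of comparison, the following Python snippet normalizes two per-state rate series and takes their element-wise difference as the residual. The paper does not state which normalization was used, so the z-score normalization, the function names and the numbers below are illustrative assumptions rather than the authors' procedure.

```python
from statistics import mean, pstdev

def zscore(values):
    """Standardize a series to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot normalize a constant series")
    return [(v - mu) / sigma for v in values]

def residuals(rate_a, rate_b):
    """Element-wise difference between two normalized rate series."""
    za, zb = zscore(rate_a), zscore(rate_b)
    return [a - b for a, b in zip(za, zb)]

# Hypothetical per-state values, for illustration only (not from the paper).
smoking_rate = [14.0, 17.5, 9.0, 20.1, 12.3]
infection_rate = [0.40, 0.35, 0.90, 0.30, 1.10]

res = residuals(smoking_rate, infection_rate)
# Share of states whose residual lies within [-1, +1].
share_within_one = sum(abs(r) <= 1 for r in res) / len(res)
print([round(r, 2) for r in res], share_within_one)
```

Working on normalized series puts both rates on the same scale, so the band from −1 to +1 has the same meaning for every state.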

Fig. 3 Error between normalized smoking rate and fatalities rate

Fig. 4 Residual case order plot (smoking rate vs infected rate & smoking rate vs fatalities rate)

3 The clinical characteristics of SARS-CoV-2

SARS-CoV-2 has posed a serious threat to global health. Through the study of more than 5000 cases of infection in the early stages of the epidemic, we found that the most common clinical features of SARS-CoV-2 patients at the time of onset were fever, dry cough, tiredness, sore throat and difficulty in breathing; some people may also experience other symptoms, including diarrhea, runny nose, nasal congestion and pains [20]. In the heat map, yellow indicates the presence of a symptom and dark blue indicates its absence. As shown in Fig. 5, the symptoms that patients have most often after being infected with COVID-19 are difficulty in breathing, dry cough and tiredness.

Fig. 5 Five Major Symptoms of SARS-CoV-2

At initial presentation, when clinical characteristics are used to judge whether a patient has SARS-CoV-2, most confirmed patients experienced breathing difficulty, dry cough and tiredness. As Fig. 6 shows, some patients infected with COVID-19 also had other symptoms, such as runny nose and nasal congestion.

Fig. 6 Other symptoms

Table 1 shows the differences in treatments and outcomes between ICU patients and non-ICU patients infected with COVID-19. In terms of complications, acute respiratory distress syndrome, acute cardiac injury, acute kidney injury, secondary infection and shock show statistically significant differences (P value < 0.05). For example, acute respiratory distress syndrome (ARDS) is the main complication of critical patients: 85% of critical patients have ARDS, compared with 4% of ordinary patients. In terms of treatment, 46% of critical patients received corticosteroids, and 23% of critical patients received continuous renal replacement therapy. In terms of oxygen support, only 8% of critical patients use a nasal cannula to inhale oxygen; other support equipment (non-invasive ventilation or high-flow nasal cannula, invasive mechanical ventilation, and invasive mechanical ventilation with ECMO) accounts for a large proportion.

Table 1 Treatments and outcomes of patients infected with COVID-19

As Tables 2, 3 and 4 show, almost all patients have common symptoms such as fever, cough, myalgia or fatigue, sputum production, headache, hemoptysis, and diarrhea. The study finds that critical patients develop difficulty breathing around 11 days (median time) after illness onset, while ordinary patients do so after 6 days. The highest temperature of severe patients falls in the ranges 37.3-38.0 °C (23%), 38.1-39.0 °C (54%) and above 39.0 °C (23%). Dyspnea is far more common among severe patients, affecting 92% of them. Patients in ICU are transferred there almost 8 days after first admission. The systolic pressure of patients in ICU reaches 145.0 mmHg, which indicates that severely ill patients have high blood pressure. The respiratory rate is higher than 24 breaths/min in 62% of patients in ICU.

Table 2 Selected baseline characteristics of patients infected with COVID-19
Table 3 Laboratory findings of patients infected with SARS-CoV-2 on admission to hospital
Table 4 Parameter Value Setting

SARS-CoV-2 patients have different white blood cell counts. Severe patients have more white blood cells than ordinary patients: 54% of them have a count above 10 × 10⁹/L. The neutrophil count (10.6 × 10⁹/L) and lymphocyte count (0.4 × 10⁹/L) reported for severe patients also differ from those of ordinary patients. The prothrombin time (PT) of patients in ICU is 12.2 s, which suggests that SARS-CoV-2 may damage coagulation factors. Compared with ordinary patients, severe patients have an extraordinarily high D-dimer level (2.4 mg/L); an extremely elevated D-dimer is uniquely associated with severe disease, mainly including VTE, sepsis and/or cancer [21, 22]. Severe patients have a lower albumin level (27.9 g/L), which indicates that they may be malnourished. Alanine aminotransferase (ALT) is an enzyme found primarily in the liver and kidney [23]. Normally, a low level of ALT exists in the serum. ALT increases with liver damage and is used to screen for and/or monitor liver disease; severe patients have a high ALT level (49 U/L) [24]. The total bilirubin of severe patients infected with SARS-CoV-2 is high (14.0 mmol/L). A bilirubin level higher than normal is a sign that either the patient’s red blood cells are breaking down at an unusual rate or the liver is not breaking down waste properly and clearing bilirubin from the blood [25]. Severe patients have a high level of lactate dehydrogenase (LDH), around 400 U/L; LDH increases when cells are abnormally damaged [26]. Procalcitonin plays an important role in evaluating and managing lower respiratory tract infections (LRTIs) in adults (including pneumonia, acute bronchitis, and acute episodes of chronic obstructive pulmonary disease). The procalcitonin level is ≥0.5 in 25% of severe patients with SARS-CoV-2. Such a high procalcitonin level suggests that these patients are likely to have a serious bacterial infection such as sepsis or meningitis. The higher the level, the more severe the infection may be [27].

4 SARS-CoV-2 transmission path

SARS-CoV-2 spreads mainly through droplets and contact, and it quickly spread across the whole country and to other countries [28].

Respiratory tract infections can be spread by droplets of different sizes: droplets with a diameter greater than 5-10 μm are called infectious droplets, and those with a diameter less than 5 μm are called infectious droplet nuclei [29]. The virus is released together with the droplets of an infected person (from sneezing, coughing, saliva, etc.). After the droplets settle, the virus can adhere to the surface of objects; alternatively, the water in the droplets evaporates, leaving droplet nuclei that can carry the virus over long distances. Nearby people are infected by inhaling the virus through the mouth and nose [30].

Understanding transmission by droplet nuclei shows how infection with SARS-CoV-2 can be prevented: wearing a mask blocks the spread of the virus by establishing a physical barrier between the virus and people. In the prevention and control of respiratory infectious diseases, a mask not only prevents the virus carrier from spraying droplets outwards, reducing the volume and spreading speed of droplets in the air; it also blocks virus-laden droplet nuclei and reduces the risk of inhalation by others.

5 Incubation period

The incubation period is the period from when the pathogen invades the body until the earliest clinical symptoms or signs appear (Figs. 7, 8 and 9). Immune responses differ between populations, and symptoms may vary too. For example, children may have a weaker immune response than adults and are less likely to show symptoms. In addition, the absence of symptoms after viral infection may be related to the individual’s constitution. Such people are asymptomatic carriers, which is why some patients with SARS-CoV-2 have no clinical symptoms but are infectious.

Fig. 7 Droplet nucleus transmission

Fig. 8 Timeline from exposure to coronavirus onset by gender

Fig. 9 SARS-CoV-2 natural history

By studying the relationship between incubation period and gender, we found that women have a longer incubation period than men. Although women may not be diagnosed at an early stage, they might already have been infected, with mild or no symptoms. In response, the epidemic prevention department should take differentiated disease control measures for women as early as possible. For example, during screening, a nucleic acid test should be performed directly on women with a clear history of exposure. Regardless of whether they have symptoms, the isolation period of women under medical observation should also exceed 14 days.

As Fig. 10 shows, the data set under study contains no epidemic data for ages between 30 and 40, so this age range is not considered. The World Health Organization (WHO) clearly stated that “from the age of 40, the higher the age, the higher the risk of severe illness.” Therefore, the age of 40 is used as a demarcation point, and the data set is roughly divided into young people between 15 and 30 years old and middle-aged and elderly people aged 40+. The proportion of patients in need of hospitalization also increases dramatically with age: 4.3% of patients aged 40-49 are hospitalized, compared with 11.8% of patients aged 60-69. WHO considers the “population over 60 years old” a “high-risk group” [31]. We therefore subdivide the 40+ population further. The first subgroup is people between 40 and 50 years old. Since people in their fifties are close to 60, we treat the 50-70 age group as a second subject of study. In France, more than two-thirds (71%) of hospital patients who died of COVID-19, and half (51%) of hospitalized patients, were at least 70 years old. In some countries, such as the United Kingdom, the government advises people over 70 to comply with strict quarantine measures [32]. Therefore, people over 70 years old are considered a special population to be studied. The incubation period of people aged 15-30 is about 2-4 days. Young people have strong immunity. When the virus enters the body, the immune molecules respond quickly and attack the virus-infected host cells excessively, causing hypersensitivity; lung cells are killed by the body’s own immune response, which leads to pneumonia, severe symptoms and possible secondary infection with other microorganisms [33]. People aged 40-50 have a longer incubation period, ranging from 1 to 22 days. The incubation period of people aged 50-70 ranges from 2 to 52 days, which suggests that people in this age group can carry the virus for a long time before symptoms appear. The incubation period of people aged 70-90 is up to 38 days. For people above 50 years old, the incubation period can thus exceed one month, which may be related to the weakened immunity of the elderly. Their defensive ability is reduced, so the body does not react strongly against the virus and the symptoms appear very slowly. In this case, the patient does not stop normal activities or go to hospital, and the longest observed incubation period becomes longer. Since the incubation period is long and there are many mild cases, stricter isolation, prevention and control measures are needed to control the spread of the epidemic, which will have a significant impact on the short-term economy.

Fig. 10 Timeline from exposure to coronavirus onset by age

The incubation period is of great significance in epidemiological research. First, the incubation period of novel coronavirus pneumonia (NCP) determines the length of the isolation period, that is, how long contacts should be kept under quarantine and medical observation. Second, the length of the incubation period is used to trace the source of infection, determine the route of transmission, and evaluate the effectiveness of prevention and treatment measures. Third, when a vaccine for the new coronavirus pneumonia is developed in the future, the timing of immunization can also be determined according to the length of the incubation period.

6 Adaptive SEAIRD model

The main objective of this paper is to analyze the epidemic using the data feed of confirmed cases, cures, and deaths in the United States over the 100 days from January 22nd to April 30th. Considering that the data before February 29th are seriously distorted, we selected the time series data after February 29th to build the model. All the data come from a website published by an official media outlet. They may deviate somewhat from the actual situation, but this paper focuses on building an epidemic model from the current dataset and on judging future epidemic trends, and more precise values would probably have only a slight impact on the trend.

Some of the traditional infectious disease models that have been proposed are similar to the SEIR model, with some additional solving functions. Viewed from another angle, the current state completely determines the future trend of the curve in these classical models; that is, they describe a Markov process. The problem with these classical models is that they capture only statistical features (as in machine learning), so the complexity of communication in modern society cannot be reproduced. By improving the original SEIR model, we map the infectious disease model onto a large social network. The modified model is used to simulate and analyze the propagation in more detail. The model can roughly show the course of SARS-CoV-2 from infection and onset to the end; its core lies in differential equations. For this task, Matlab software is used to solve the differential equations.
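To make the role of the differential equations concrete, here is a minimal sketch of the classic SEIR model (not the adaptive model of this paper) integrated with forward Euler steps. It is written in Python rather than Matlab purely for illustration; the function name, the step size and all parameter values are assumptions chosen for the example, not values fitted in this study. Each update uses only the current state, which is the Markov property mentioned above.

```python
def simulate_seir(s0, e0, i0, r0, beta, sigma, gamma, days, dt=0.1):
    """Integrate the classic SEIR equations with forward Euler steps.

    dS/dt = -beta*S*I/N,  dE/dt = beta*S*I/N - sigma*E,
    dI/dt = sigma*E - gamma*I,  dR/dt = gamma*I
    """
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    n = s + e + i + r                      # closed population, no births or deaths
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]               # one record per simulated day
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history

# Illustrative placeholder parameters (assumptions, not fitted values).
history = simulate_seir(s0=1e5, e0=100, i0=20, r0=0,
                        beta=0.6, sigma=1 / 6.5, gamma=1 / 3.5, days=100)
peak_day = max(range(len(history)), key=lambda d: history[d][2])
print(peak_day, round(history[peak_day][2]))
```

A dedicated ODE solver would normally replace the hand-written Euler loop; the explicit loop is kept here only to show how each compartment feeds the next.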

As shown in Fig. 11, the classic SEIR infectious disease model is developed into a hierarchical infectious disease model. The deficiencies of the model that need improvement are analyzed from the perspective of epidemic prevention measures and simulation results. A neural network is then applied to search for the optimal values of several model parameters, and the finite difference method is used for numerical simulation. Finally, the model is applied to the data of the United States as well as to that of a country where the number of people infected with COVID-19 is relatively small. By comparing the results for the two countries, we check whether the simulations are close to the actual data, so as to verify whether the model is widely applicable. From this, we can draw conclusions about the importance of early prevention and treatment of people infected with infectious diseases.

Fig. 11 Flow chart of overall analysis

Based on the classic Susceptible-Exposed-Infectious-Recovered (SEIR) epidemiological model, this article introduces the concept of the “Silent Skill”, that is, asymptomatic patients, denoted A(t) in the model. In addition, the model splits the recovered compartment of SEIR into recovered patients and the dead, denoted R(t) and D(t) respectively. The other three core populations in the model are the susceptible, S(t); the exposed, E(t); and the infected, I(t).

The core parameters are set as follows. The United States is chosen as the outbreak area for the SARS-CoV-2 scenario simulation. In the initial stage of the outbreak, we assume a susceptible population of S(0) = 1 × 10⁵, an initial number of infected cases (Patient Zero) of I(0) = 20, and an initial exposed population of E(0) = 100. The initial values of the other groups are all zero: A(0) = R(0) = D(0) = 0. The total population satisfies N = S + E + I + A + R. In reality, it is impossible for the dead (D) to return to any other population, so the dead are withdrawn from the calculation of the model. However, because of the large population base and the high mobility of people in the United States, we introduce a low birth rate, a natural mortality rate, and temporary immunity of the recovered population, in order to keep the susceptible population stable, keep a sufficient number of people in the simulation, and maintain the balance and flow of the model. This is also one of the innovations of our model. We then recalculate the total population of the model: except for a few out-of-control epidemics, in most cases the infected (including asymptomatic infected) and recovered populations account for a very small proportion of the total, so they can be ignored and the total population becomes N = S + E.
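The initial conditions and the two population totals can be written down as a small sketch. The class and method names below are hypothetical and chosen only for illustration; the numbers are the initial values stated in the text, with the susceptible population read as 1 × 10⁵.

```python
from dataclasses import dataclass

@dataclass
class Compartments:
    s: float  # susceptible
    e: float  # exposed
    i: float  # infected
    a: float  # asymptomatic
    r: float  # recovered
    d: float  # dead (excluded from the totals)

    def living_total(self):
        """N = S + E + I + A + R; the dead are withdrawn from the calculation."""
        return self.s + self.e + self.i + self.a + self.r

    def approximate_total(self):
        """Simplified total N = S + E, used when I, A and R are negligible."""
        return self.s + self.e

# Initial state from the text: S(0) = 1e5, E(0) = 100, I(0) = 20, A = R = D = 0.
initial = Compartments(s=1e5, e=100, i=20, a=0, r=0, d=0)
gap = initial.living_total() - initial.approximate_total()
print(initial.living_total(), initial.approximate_total(), gap)
```

At the start of the simulation the two totals differ only by the 20 initial infected cases, which illustrates why the simplified total is a reasonable approximation as long as the infected and recovered groups stay small.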

The SEAIR framework diagram shows how each group moves among the nodes of the model. The solid red line in Fig. 12 shows how the SEIR model is transformed into the SEIAR(D)S (susceptible-exposed-infected-asymptomatic-recovered (death)-susceptible) model. In this model, recovered people may become susceptible again (that is, recovered people do not obtain lifelong immunity).

Fig. 12 Adaptive SEIARD dynamic model

The diagram of the Adaptive SEIARD model is shown in Fig. 12.

It can be seen from Fig. 12 that the adaptive SEIARD model is a two-way closed model. Susceptible people continuously flow into the infected population, and infected people in turn flow into the recovered population. Finally, recovered people return to the susceptible population with a low probability. When the birth rate and the natural death rate are equal and the rate of reverting to the susceptible state after recovery is set very low, the numbers of susceptible and infected people eventually drop to 0, and the simulated values then become negative. At the same time, the whole population first becomes recovered with a high probability. However, because of the flow between the recovered and dead populations, everyone eventually ends up in the dead population as time goes by. This is a limitation of the model. By adjusting the rate of reverting to the susceptible state after recovery, we can make the model loop continuously.

At the initial stage of the development of the epidemic, the rate of exposure to infection (φ) represents the average number of daily contacts between infectious patients and susceptible people.

The exposure rate of susceptible people (β) controls the probability that a susceptible group contacts an infected person or any infectious group during the incubation period, and represents the possibility of disease transmission between a susceptible group and an infectious group. This parameter is adjusted continually during the simulation to adapt it to the actual development of the epidemic.

The probability of isolating susceptible people (ρ) is the probability that a susceptible group is isolated.

The infection rate of people in the incubation period (ε) is the infectious rate of infected people during the incubation period. It can be calculated as 1 / TI. The exposed population (E) in the SEIARD model refers specifically to people who are infected but not yet infectious. The average duration of this period is TE. The incubation period is generally understood as the period of infection without symptoms (from Section 4, the average incubation period of patients with SARS-CoV-2 is about 10 days). Patients are generally not infectious in the early stage of incubation, but may become infectious a few days before onset. Therefore, within the assumed asymptomatic incubation period of 10 days on average, the non-infectious part lasts 6.5 days on average (with reference to the relevant data for SARS), and the last 3.5 days are contagious, which is denoted as the infectious period (TI). The average duration of the infectious period, TI, is not the time from a patient’s onset to recovery; it refers to the time during which a patient is contagious and has not yet been removed from contact with susceptible people, for example by hospitalization or by staying in bed at home. Under this understanding, the heavier the symptoms, the shorter the average infectious period TI, because the patient will be hospitalized or lose the ability to move sooner, whereas patients with milder symptoms may have a longer infectious period. When SEIR models are established for various epidemics, the duration of the infectious period is usually 3 to 5 days, or even shorter.
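
As a quick arithmetic check of the timing assumptions above, the following sketch splits the assumed 10-day incubation period into its early and contagious parts and evaluates ε = 1 / TI. The variable names are illustrative only and are not taken from any original model code.

```python
# Timing assumptions quoted in the text (in days); names are illustrative.
INCUBATION_DAYS = 10.0   # average asymptomatic incubation period
EARLY_DAYS = 6.5         # early part of incubation, before the contagious tail

# Infectious period T_I: the contagious last days of the incubation period.
T_I = INCUBATION_DAYS - EARLY_DAYS

# Infection rate of people in the incubation period, epsilon = 1 / T_I.
epsilon = 1.0 / T_I

print(f"T_I = {T_I} days, epsilon = {epsilon:.4f} per day")
```

With these numbers TI is 3.5 days, which sits inside the 3 to 5 day range mentioned for SEIR models.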

The probability that people in the incubation period turn negative is β1. After contact with an infected person (A or I), pharyngeal swabs sampled from suspected cases may still show the exposed population as negative, and we do not exclude the existence of many false negatives. Such an unstable negative result may arise because the patient sheds very little virus in the early stage, or it may be related to the quality of the reagent kit, to the characteristics of SARS-CoV-2 itself, or to the sampling site, sampling volume, transportation and storage. Laboratory testing conditions and the operation of personnel may also contribute. The causes are numerous and complicated.

The probability of exposure to infection (α) relates the number of people in the infectious period (including asymptomatic infected persons) (I + A) per unit time to the number of people who pass from the incubation period to the infectious period. In other words, it governs how many people move from the incubation period to the infectious period. The value used in this paper is similar to most current research results.

Isolation rate of the susceptible population (ω). The number of susceptible people is an important subject of our research. Since there is currently no vaccine, there is no way to artificially reduce the proportion of susceptible people, so it is difficult to change this part at present. If no measures are taken, the virus will infect more people and the proportion of susceptible people will drop, much as a forest fire burns through the trees. Without susceptible people, the disease will soon disappear. Realistically, however, susceptible people are those we need to protect, so preventive measures must be applied. Isolation of infected and latent persons can be achieved through symptom tracking (according to the clinical characteristics of SARS-CoV-2 in Section 2), diagnosis, and isolation of patients. We call this measure surveillance of contagious cases. We can therefore isolate only infected or latent people, only susceptible people (increasing the value of ω), or both.

Ratio of symptomatic infections to all infections (η). This ratio indirectly reflects the proportion of asymptomatic infections. Some researchers have shown that transmission by asymptomatic infected persons should not be underestimated; they are a dangerous source of infection that must be guarded against. The current model assumes that the proportion of asymptomatic persons in the total infected population is 0.1, but the true proportion of asymptomatic patients is likely to be higher than expected. By studying the rate of asymptomatic infections and tracking asymptomatic infections, we can prevent a flattened epidemic curve from rising again.

The proportion of untreated patients with SARS-CoV-2 (p1) helps us study how many of the infected people have mild symptoms. Most mild patients do not seek medical assistance because their symptoms are mild, and they may also escape screening methods such as body temperature testing. Some studies indicate that 30%-60% of infections with SARS-CoV-2 have mild symptoms, yet their ability to spread the virus is quite strong. These mildly infected people may trigger a new round of outbreaks.

The percentage of infected people being treated (p2) is an important ratio that can effectively improve the utilization rate of beds for critically ill patients and help reduce mortality.

The removal rate of infected people by death (mortality rate) is γ, and γR is the rate of removal by recovery. Both rates depend on the average duration of infection.

Probability of a quarantined infected person receiving treatment (γq). This ratio indirectly reflects the hospital bed capacity and the removal rate of infected patients.

Rate of cured patients turning positive (μ). For most patients who turn positive after recovery, the chance of repeated infection is very small; it only shows that a few of them have not fully recovered, that their immune function is very poor, and that the risk of infection cannot be ruled out. We therefore set μ = 0.1.

In the adaptive SEIARD model, ξ is the ratio of recovered people returning to a susceptible state due to loss of immunity.

Birth rate (ν1) and natural mortality rate (ν2). Neither is related to the disease. Together they can simulate a constant population.

Special emphasis is placed here on some newly introduced parameters:

Because the epidemic grows rapidly, SARS-CoV-2 uses up susceptible people, and the virus needs to find a new batch of susceptible people. The immunity gained by people infected with SARS-CoV-2 weakens over time, allowing the recovered population to return to the susceptible state. The ratio coefficient of reverting to the susceptible state after recovery (ξ) is introduced for this purpose.

Studies have found that newborns belong to the susceptible group. To better maintain the sustainability and stability of the model, we introduce the birth rate ν1 and the mortality rate ν2, with ν1 = ν2 (only the susceptible and exposed groups, which have a large base, are considered here).

Some assumptions are made for the model to work, as follows:
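
The effect of setting the two rates equal can be checked from the demographic terms of Eq. (1) alone, where the rates are written υ1 and υ2. As a sketch, ignoring every disease-related flow and taking N = S + E as stated for the model, the births and natural deaths acting on S and E sum to

$$\upsilon_1 N-\upsilon_2 S-\upsilon_2 E=\left(\upsilon_1-\upsilon_2\right)\left(S+E\right)=0\quad \text{when}\quad \upsilon_1=\upsilon_2,$$

so births exactly replace natural deaths in the population base and only the epidemic flows change its size.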

  • The infection only transmits between people.

  • There are no specific drugs or vaccines for treatment at this stage.

  • Fully mixed: Equal opportunity for each individual to contact others.

  • Medical equipment has certain limits.

  • After being cured, the patient may still become infected. There are even cases where patients die suddenly after being cured.

Some parameters are defined before the problem is described.

A similar stochastic model is used to set the initial values that start the simulation: the initial number of susceptible people is S0 = 1 × 10^5; the initial number of exposed people is E0 = 100; and the initial number of infected people with symptoms is I0 = 20 (it is assumed that half are mildly infected people I1(0) who have not received treatment and half are severe patients I2(0) who have received treatment).

After the initial parameters are set for the model, the initial values are used for curve fitting and continuous iterative optimization.

The adaptive SEIARD dynamic model belongs to the family of dynamic models of infectious diseases. It can also be classified as a physical model (from another perspective, it is also an empirical summary). The construction process of the model is analyzed as follows.

The process of virus transmission can then be characterized by differential equations for the changes in the populations of these 9 main groups. These groups are all functions of time, where t is measured in units of time, and the following formulas are derived:

$$\left\{\begin{array}{c}\frac{dS}{dt}={\upsilon}_1N-{\upsilon}_2S-\left(1-\rho \right)\beta \varphi \left(\varepsilon E+I+A\right)S-\rho \left(1-\varphi \right)\beta \left(\varepsilon E+I+A\right)S-\rho \varphi \beta \left(I+\varepsilon E+A\right)S+\omega {S}_q+{\beta}_1E+\xi R\\ {}\frac{dE}{dt}=\left(1-\rho \right)\beta \varphi \left(\varepsilon E+I+A\right)S-\alpha E-{\beta}_1E-{\upsilon}_2E\\ {}\frac{dA}{dt}=\alpha \left(1-\upeta \right)E-\left({\gamma}_R+\gamma \right)A-{\upsilon}_2A\\ {}\frac{d{I}_1}{dt}=\alpha \upeta {p}_1E-{\theta}_1{\gamma}_q{I}_1\\ {}\frac{d{I}_2}{dt}=\alpha \upeta {p}_2E-{\theta}_2{\gamma}_q{I}_2+{\gamma}_I{I}_q-{\gamma}_R{I}_2\\ {}\frac{dI}{dt}=\alpha \upeta E+\mu R-{\gamma}_q{I}_2-{\gamma}_R{I}_2-\gamma I\\ {}\frac{dR}{dt}={\gamma}_RA+{\gamma}_R{I}_2-\left(\mu +\gamma \right)R-{\upsilon}_2R-\xi R\\ {}\frac{dD}{dt}=\left(R+I+A\right)\gamma \\ {}\frac{d{S}_q}{dt}=\rho \left(1-\varphi \right)\beta \left(\varepsilon E+I+A\right)S-\omega {S}_q\\ {}\frac{d{E}_q}{dt}=\rho \varphi \beta \left(I+\varepsilon E+A\right)S-\alpha {E}_q\\ {}\frac{d{I}_q}{dt}=\alpha {E}_q+{\theta}_1\gamma {I}_1+{\theta}_2\gamma {I}_2-{\gamma}_I{I}_q\\ {}I={I}_1+{I}_2\kern1.75em {p}_1+{p}_2=1\\ {}N=S+E\end{array}\right.\kern1em$$
(1)

However, writing down such equations requires some idealized conditions, the most important of which are given below.

First of all, in this paper we assume that the total number of people in the model remains balanced, that is, S(t) + E(t) + I(t) + R(t) + A(t) + υ1N(t) = K, where K is a constant per unit time at time t.

For susceptible people without isolation, the number of people infected through contact with each infectious group (E(t), A(t), I(t)) is proportional to the total number of susceptible people S(t). We set the respective coefficients to (1-ρ)βφε, (1-ρ)βφ and (1-ρ)βφ. Therefore, the number of people infected by all patients (groups with some infectivity) per unit time at time t is (1-ρ)βφ(ε × E(t) + I(t) + A(t)) × S(t), which is a fixed value at time t.

In isolation, the proportional coefficients of the three groups are ρ(1-φ)βε, ρ(1-φ)β and ρ(1-φ)β respectively. Therefore, the number of susceptible people isolated (Sq) per unit time at time t is ρ(1-φ)β × (ε × E(t) + I(t) + A(t)) × S(t).

Considering the situation where susceptible people contacted infectious patients and were isolated, the proportional coefficients for each infected group (E(t), A(t), I(t)) are ρφβε, ρφβ and ρφβ. Therefore, the number of people entering Eq per unit time at time t is ρφβ × (I(t) + ε × E(t) + A(t)) × S(t).
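
The three contact-driven outflows from S share the common factor β(εE + I + A)S in Eq. (1) and differ only in how ρ and φ weight it. A minimal Python sketch of that split follows. The function name and all numeric values are hypothetical, and reading the leftover weight (1-ρ)(1-φ) as contacts that are neither infected nor isolated is an interpretation added here, not something stated in the text.

```python
def split_contacts(S, E, I, A, beta, eps, rho, phi):
    """Return the three outflows from S per unit time, weighted as in Eq. (1)."""
    pressure = beta * (eps * E + I + A) * S      # common contact term
    return {
        "S_to_E": (1 - rho) * phi * pressure,    # infected, not isolated
        "S_to_Sq": rho * (1 - phi) * pressure,   # isolated, not infected
        "S_to_Eq": rho * phi * pressure,         # infected and isolated
    }

# Hypothetical values, chosen only to illustrate the bookkeeping.
rho, phi = 0.3, 0.5
flows = split_contacts(S=1e5, E=100, I=20, A=0,
                       beta=2e-6, eps=0.3, rho=rho, phi=phi)

# The three weights plus the leftover (1 - rho)(1 - phi) add up to one.
weights = [(1 - rho) * phi, rho * (1 - phi), rho * phi, (1 - rho) * (1 - phi)]
print(flows)
print(sum(weights))
```

Writing the common factor once makes it easy to see that the three flows always stay in the fixed proportion set by ρ and φ.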

After the quarantine period expires, the isolated susceptible population (Sq) is released from quarantine and returns to the susceptible population (S) with coefficient ω; the number of people Sq → S per unit time at time t is ω × Sq. The coefficient for exposed people who are not infected with SARS-CoV-2 and return to the susceptible population is β1, so the number of people E(t) → S(t) per unit time at time t is β1 × E(t). The coefficient for recovered people returning to the susceptible population is ξ, and the number of people R(t) → S(t) per unit time at time t is ξ × R(t).

For exposed people (E), the number of people moving from S(t) to E(t) per unit time at time t is (1-ρ)βφ(ε × E(t) + I(t) + A(t)) × S(t). The outflow from the exposed population has two main branches. The number of people who flow to the asymptomatic group (A) at time t is α × (1-η) × E(t). The other branch goes to symptomatic patients, who are divided into mild infections (no treatment but strong infectivity) and severe patients (receiving hospital treatment). The proportional coefficients are αηp1 and αηp2 respectively, where p1 + p2 = 1.

For asymptomatic infected people (A), since an asymptomatic infected person shows no symptoms for a long time, it is assumed that they receive no treatment and are not isolated. As shown in Fig. 13, at time t asymptomatic patients may heal themselves through strong immunity without ever showing symptoms, as in scenario 2. This is the process A(t) → R(t); the conversion coefficient is γR, and the number of people per unit time at t is γR × A(t). Otherwise, the asymptomatic infected person fails to defeat SARS-CoV-2 and dies of sudden onset, that is, A(t) → D(t), as in scenario 1; the conversion coefficient is γ, and the number of people per unit time at t is γ × A(t).

Fig. 13 Structure diagram of asymptomatic patients in two scenarios

For all symptomatic infections I (I1 + I2), our new model takes the limited number of beds into account and isolates mild patients I1 (mild, not receiving treatment) and severe patients I2 (severe, receiving treatment) with different probabilities, which can be described as I1(t) → Iq(t) and I2(t) → Iq(t). Their conversion coefficients are θ1γq and θ2γq respectively, and the change in the number of I1 people per unit time at t is αηp1E(t) - θ1γqI1(t). Some mild patients with SARS-CoV-2 may not seek medical treatment, or may recuperate at home for a long time, and they remain in an active transmission state after onset. In fact, this longer infectious period is the main reason why the basic infection rate of SARS-CoV-2 is significantly higher than that of SARS. When a hospital bed is released, the coefficient from Iq to I2 is γI, and the change in the number of severe patients I2 per unit time at time t is αηp2E(t) + γIIq(t) - θ2γqI2(t). For symptomatic patients overall (I), the conversion coefficient for R(t) → I(t) is μ, so the number of people entering I in this way per unit time at time t is μ × R(t). The conversion coefficient for I(t) → D(t) is γ, so the number of people per unit time at t is γ × I(t). Therefore, taking all sources together, the change in the number of symptomatic patients per unit time at time t is αηE(t) + μR(t) + γIIq - γRI2 - γI.

For the recovered population (R), for R(t) → D(t) the removal rate per unit time is γ, and the number of recovered people moving out at a given time is γ × R(t). The change in the number of recovered people R(t) is then γRA(t) + γRI2(t) - (μ + γ + ξ)R - υ2R.

For the dead population, the three main inflows are asymptomatic patients without treatment, A(t) → D(t); symptomatic severe patients after treatment, I2(t) → D(t); and sudden death among the recovered population, R(t) → D(t). Their removal rate is γ, and the number of people per unit time at t is (R + I + A)γ.

The other three compartments are the people in isolation: isolated susceptible people (Sq), isolated people in the incubation period (Eq) and isolated infected people (Iq). The conversion coefficient from Eq to Iq by accidental infection is α, so at a given time the number converted between the two is α × Eq(t). According to the analysis, the net change of Sq (fed by S → Sq) at time t is ρ(1-φ)β(εE + I + A)S - ωSq; the net change of Eq (fed by S → Eq) at time t is ρφβ(I + εE + A)S - αEq; similarly, the net change of Iq (fed by Eq → Iq) at time t is αEq + θ1γI1 + θ2γI2 - γIIq.

Therefore, the mutual mechanism of action (MOA) among these 10 populations is as follows. The probability of implementing isolation measures (ρ), the probability of exposure to infection (φ) between susceptible and infected people (including the exposed in the incubation period (E), symptomatic patients (I) and asymptomatic patients (A)), and the infection rate between the susceptible population and each infected population act simultaneously on the total susceptible population; the isolated susceptible people (Sq) and isolated people in the incubation period (Eq) are also involved. The infection rate (α) of the E population and the probability (η) of dominant symptoms (I) act on the exposed population (E) at the same time. The removal rates of asymptomatic (recessive) patients are the recovery rate (γR) and the mortality (γ); the coefficient of transfer from the exposed population (E) to recessive patients, together with the recovery rate (γR) and mortality (γ), acts on asymptomatic patients (A). The coefficients (αηp1, αηp2) of patients with dominant symptoms (I) transferred from the exposed population (E), the probability of the recovered population becoming positive (μ), the recovery rate of severe patients (γR), the removal rate (γ), their respective isolation rates (θ1γq, θ2γq) and the hospital admission rate (γI) act on patients with dominant symptoms at the same time. The outflow probability of the recovered population becoming positive, the susceptibility rate of the recovered population (ξ) and the recovery coefficients (γR) of the different populations (A and I) also affect the recovered population. At the same time, this is a two-way cyclic mechanism: the groups influence and interact with each other.

To simplify the differential equations and establish a numerical simulation model, the above differential equations are expressed in discrete form as follows:

$$\left\{\begin{array}{c}S\left(t+1\right)=S(t)+{\upsilon}_1N(t)-{\upsilon}_2S(t)-\left(1-\rho \right)\beta \varphi \left(\varepsilon E(t)+I(t)+A(t)\right)S(t)-\rho \left(1-\varphi \right)\beta \left(\varepsilon E(t)+I(t)+A(t)\right)S(t)\\ {}-\rho \varphi \beta \left(I(t)+\varepsilon E(t)+A(t)\right)S(t)+\omega {S}_q(t)+{\beta}_1E(t)+\xi R(t)\\ {}E\left(t+1\right)=E(t)+\left(1-\rho \right)\beta \varphi \left(\varepsilon E(t)+I(t)+A(t)\right)S(t)-\alpha E(t)-{\beta}_1E(t)-{\upsilon}_2E(t)\\ {}A\left(t+1\right)=A(t)+\alpha \left(1-\upeta \right)E(t)-\left({\gamma}_R+\gamma \right)A(t)-{\upsilon}_2A(t)\\ {}{I}_1\left(t+1\right)={I}_1(t)+\alpha \upeta {p}_1E(t)-{\theta}_1{\gamma}_q{I}_1(t)\\ {}{I}_2\left(t+1\right)={I}_2(t)+\alpha \upeta {p}_2E(t)-{\theta}_2{\gamma}_q{I}_2(t)+{\gamma}_I{I}_q(t)-{\gamma}_R{I}_2(t)\\ {}I\left(t+1\right)=I(t)+\alpha \upeta E(t)+\mu R(t)-{\gamma}_q{I}_2(t)-{\gamma}_R{I}_2(t)-\gamma I(t)\\ {}R\left(t+1\right)=R(t)+{\gamma}_RA(t)+{\gamma}_R{I}_2(t)-\left(\mu +\gamma \right)R(t)-{\upsilon}_2R(t)-\xi R(t)\\ {}D\left(t+1\right)=D(t)+\left(R(t)+I(t)+A(t)\right)\gamma \\ {}{S}_q\left(t+1\right)={S}_q(t)+\rho \left(1-\varphi \right)\beta \left(\varepsilon E(t)+I(t)+A(t)\right)S(t)-\omega {S}_q(t)\\ {}{E}_q\left(t+1\right)={E}_q(t)+\rho \varphi \beta \left(I(t)+\varepsilon E(t)+A(t)\right)S(t)-\alpha {E}_q(t)\\ {}{I}_q\left(t+1\right)={I}_q(t)+\alpha {E}_q(t)+{\theta}_1\gamma {I}_1(t)+{\theta}_2\gamma {I}_2(t)-{\gamma}_I{I}_q(t)\end{array}\right.$$
(2)
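
One way to iterate such a system numerically is sketched below in Python: a forward-difference step of Eq. (1) with a unit time step, applied term by term. The class and function names are hypothetical, and every parameter value is a placeholder chosen for illustration rather than a fitted value from this study. Only the initial conditions (S = 1 × 10^5, E = 100, I = 20 split equally between I1 and I2) follow the text.

```python
from dataclasses import dataclass

@dataclass
class Params:
    # Placeholder values for illustration only, NOT the fitted parameters.
    nu1: float = 1e-4     # birth rate (upsilon_1)
    nu2: float = 1e-4     # natural mortality rate (upsilon_2)
    rho: float = 0.3      # probability of isolation
    beta: float = 2e-6    # exposure rate of susceptible people
    phi: float = 0.5      # rate of exposure to infection
    eps: float = 0.3      # infection rate of people in incubation
    omega: float = 0.07   # release rate from S_q
    beta1: float = 0.01   # E -> S (turned negative)
    xi: float = 0.01      # R -> S (loss of immunity)
    alpha: float = 0.1    # E -> infectious
    eta: float = 0.9      # symptomatic share of infections
    p1: float = 0.5       # untreated (mild) share
    p2: float = 0.5       # treated (severe) share
    theta1: float = 0.2
    theta2: float = 0.8
    gamma_q: float = 0.1
    gamma_I: float = 0.05
    gamma_R: float = 0.1  # recovery removal rate
    gamma: float = 0.01   # death removal rate
    mu: float = 0.1       # recovered turning positive again

def initial_state():
    """Initial conditions quoted in the text; I is split equally into I1 and I2."""
    return {"S": 1e5, "E": 100.0, "A": 0.0, "I1": 10.0, "I2": 10.0, "I": 20.0,
            "R": 0.0, "D": 0.0, "Sq": 0.0, "Eq": 0.0, "Iq": 0.0}

def step(s, p):
    """Advance every compartment by one unit of time, following Eq. (1)."""
    pressure = p.beta * (p.eps * s["E"] + s["I"] + s["A"]) * s["S"]
    to_E = (1 - p.rho) * p.phi * pressure
    to_Sq = p.rho * (1 - p.phi) * pressure
    to_Eq = p.rho * p.phi * pressure
    N = s["S"] + s["E"]                      # N = S + E, as in the model
    return {
        "S": s["S"] + p.nu1 * N - p.nu2 * s["S"] - to_E - to_Sq - to_Eq
             + p.omega * s["Sq"] + p.beta1 * s["E"] + p.xi * s["R"],
        "E": s["E"] + to_E - (p.alpha + p.beta1 + p.nu2) * s["E"],
        "A": s["A"] + p.alpha * (1 - p.eta) * s["E"]
             - (p.gamma_R + p.gamma + p.nu2) * s["A"],
        "I1": s["I1"] + p.alpha * p.eta * p.p1 * s["E"]
              - p.theta1 * p.gamma_q * s["I1"],
        "I2": s["I2"] + p.alpha * p.eta * p.p2 * s["E"]
              - p.theta2 * p.gamma_q * s["I2"] + p.gamma_I * s["Iq"]
              - p.gamma_R * s["I2"],
        # I is tracked by its own equation, exactly as printed in Eq. (1).
        "I": s["I"] + p.alpha * p.eta * s["E"] + p.mu * s["R"]
             - p.gamma_q * s["I2"] - p.gamma_R * s["I2"] - p.gamma * s["I"],
        "R": s["R"] + p.gamma_R * s["A"] + p.gamma_R * s["I2"]
             - (p.mu + p.gamma + p.nu2 + p.xi) * s["R"],
        "D": s["D"] + (s["R"] + s["I"] + s["A"]) * p.gamma,
        "Sq": s["Sq"] + to_Sq - p.omega * s["Sq"],
        "Eq": s["Eq"] + to_Eq - p.alpha * s["Eq"],
        "Iq": s["Iq"] + p.alpha * s["Eq"] + p.theta1 * p.gamma * s["I1"]
              + p.theta2 * p.gamma * s["I2"] - p.gamma_I * s["Iq"],
    }

def simulate(state, p, days):
    """Return the list of states for day 0 .. days."""
    trajectory = [state]
    for _ in range(days):
        trajectory.append(step(trajectory[-1], p))
    return trajectory

trajectory = simulate(initial_state(), Params(), days=29)
print(round(trajectory[-1]["D"], 3))
```

Because the update is explicit, nothing prevents a compartment from dropping below zero when rates are large, which matches the limitation noted for the model; clamping or a smaller time step would be separate design choices not described in the text.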

Here, S(t), E(t), I(t), A(t), R(t) and D(t) respectively denote the number of people in the susceptible state, the incubation period, the infectious period, asymptomatic infection, the recovered state and the dead state per unit time at time t. N(t) represents the total population at time t, given by N(t) = S(t) + E(t). We therefore obtain 12 discrete equations that reflect the iterative relationship between 6 main state quantities and 5 derived states. If all the parameter values are known, the value at time t can be derived from the value at time t-1, so the entire sequence can be obtained from the starting point.

For a mature model, the reliability of the output is determined entirely by the accuracy of the input parameters. In this paper, we therefore start from the specific situation of SARS-CoV-2 and focus on evaluating, verifying and correcting the input values of the model from multiple angles, to improve the model output and obtain reliable results.

The following process is the basis for adjusting the parameters in accordance with the actual situation.

6.1 Optimization result of fitting curve

The data used for curve fitting start on February 29th, 2020. The number of susceptible people is set to 100,000. To predict the development of the epidemic, the SEIARD model is first applied to perform a simple fit to the actual data stream. A deep learning method is then trained on the epidemic data of the United States to simulate the epidemic trend, and the trend is refined iteratively towards the optimum.

As shown in Fig. 14, in the Quantile-Quantile plot (Q-Q plot) the two axes are the quantile functions of the first set of data (simulation of the adaptive SEAIRD model) and the second set of data (actual data). Comparing the quantiles of the two sets of data shows whether the two are “similar”: if the scatter points fall almost on a line close to y = x, the two distributions are similar. The plot shows that the predicted values of the model follow the overall trend of the actual data. However, the curve fit itself is not good. Because no deep learning has yet been used to optimize the fit, the model fits the obvious fluctuations in the epidemic data of the United States poorly.
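
The quantile pairing behind a Q-Q plot can be sketched in a few lines of Python using only the standard library. The two series below are made-up numbers for illustration, not the epidemic data used in this study, and the helper name `qq_points` is hypothetical.

```python
import statistics

def qq_points(simulated, actual, n=10):
    """Pair up the n-1 cut points (quantiles) of two samples."""
    q_sim = statistics.quantiles(simulated, n=n)
    q_act = statistics.quantiles(actual, n=n)
    return list(zip(q_sim, q_act))

# Made-up series for illustration only.
actual = [12, 18, 25, 40, 61, 90, 130, 170, 220, 300, 390, 480]
simulated = [a * 1.1 for a in actual]   # a model that overshoots by 10%

points = qq_points(simulated, actual)

# Mean absolute distance from the line y = x; 0 means identical quantiles.
gap = sum(abs(x - y) for x, y in points) / len(points)
print(points[:3])
print(gap)
```

Plotting the pairs as a scatter against the line y = x gives the Q-Q plot; the scalar `gap` is just one convenient summary of how far the points sit from that line.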

Fig. 14 Simulation of infected cases

Therefore, a neural network method is introduced to improve the fit of our model. The result is shown in Fig. 15.

Fig. 15 Simulation of infected cases after neural fitting

The final optimized performance results are shown below:

The backpropagation (BP) neural network is trained iteratively. Fig. 16(a) shows that training stops after 1000 epochs. The test results are shown in Fig. 15. As the fit is optimized, the simulation curve (red solid line) approaches the real data curve (blue line). When training stops, the best training performance value is 1287.267, which is the best performance that could be obtained in training. As shown in Fig. 16(b), most errors in the error histogram lie in the interval (−26.69, 181.5). Compared with the scale of the epidemic data stream, this error is very small and can be ignored. As Fig. 16(c) shows, the training dataset leads to over-fitting, and Fig. 15 shows that the two curves almost overlap. The excessively high degree of fit may also arise because the cumulative infection curve is relatively smooth, without many glitches. From Fig. 16(d), we find that the error of the fitted curve is small overall; the largest error appears in the testing dataset, as shown directly in Fig. 16(c).
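
The network architecture and training settings are not specified in this section, so the following is only a generic sketch of backpropagation on a smooth cumulative curve: one hidden tanh layer trained by full-batch gradient descent on the mean squared error. The layer size, learning rate, seed and the synthetic logistic data are all assumptions made for illustration.

```python
import math
import random

def train_bp(xs, ys, hidden=5, epochs=1000, lr=0.05, seed=0):
    """Fit y ~ f(x) with a one-hidden-layer tanh network via backpropagation."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    n = len(xs)
    mse_history = []
    for _ in range(epochs):
        g_w1 = [0.0] * hidden
        g_b1 = [0.0] * hidden
        g_w2 = [0.0] * hidden
        g_b2 = 0.0
        sq_err = 0.0
        for x, y in zip(xs, ys):
            # Forward pass.
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            out = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = out - y
            sq_err += err * err
            # Backward pass: gradients of the squared error.
            g_b2 += 2 * err
            for j in range(hidden):
                g_w2[j] += 2 * err * h[j]
                d_pre = 2 * err * w2[j] * (1 - h[j] ** 2)
                g_w1[j] += d_pre * x
                g_b1[j] += d_pre
        mse_history.append(sq_err / n)   # MSE before this epoch's update
        for j in range(hidden):
            w1[j] -= lr * g_w1[j] / n
            b1[j] -= lr * g_b1[j] / n
            w2[j] -= lr * g_w2[j] / n
        b2 -= lr * g_b2 / n

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j])
                   for j in range(hidden)) + b2

    return predict, mse_history

# Synthetic, normalised cumulative curve (logistic shape), illustration only.
xs = [t / 29 for t in range(30)]
ys = [1 / (1 + math.exp(-10 * (x - 0.5))) for x in xs]

predict, history = train_bp(xs, ys)
print(history[0], history[-1])
```

The list `history` plays the role of the training-performance curve, and the residuals `predict(x) - y` are what an error histogram would summarise.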

Fig. 16 The performance of neural fitting and error analysis

As shown in Fig. 17, comparing the quantiles of the two sets of data, the scatter points fall almost on the line y = x, indicating that the two distributions are similar. The predicted values of the model follow the trend of the actual fatality data, but the degree of fit is mediocre.

Fig. 17 Simulation of Fatalities

After using the neural network, as shown in Fig. 18, the degree of curve fitting is greatly improved.

Fig. 18 Simulation of Fatalities after Neural Fitting

In Fig. 19(a), fitting stops at epoch 600, at which point the training performance (MSE) is 51.8461. As shown in Fig. 19(b), most errors in the error histogram lie in the interval (−1.179, 1.57). The network fits the target data well, as shown in Fig. 19(c); the error between the target dataset and the output dataset is small.

Fig. 19 The training performance of daily fatality cases

7 Comparative simulation experiment of models

To test the performance of the adaptive SEAIRD model proposed in this article, the accumulative confirmed and new confirmed data from March 1st to March 29th are selected as the control group of the simulation experiment. Each model performs 29 iterations, and its simulated data are evaluated in terms of the Root Mean Squared Error (RMSE) and R-squared (R2) between the actual data and the simulated data.

The SEIR model is a relatively mature and commonly used epidemiological prediction model for infectious diseases that have a certain incubation period: healthy people who have been in contact with patients do not become ill immediately but become carriers of the pathogen. Compared with other traditional infectious disease models such as the SIR model, the SEIR model also takes latent persons into consideration and is closer to the transmission pattern of the new coronavirus. Therefore, in this paper the classical SEIR model is chosen to stand in for the traditional infectious disease models (such as the SI, SIR and SIRS models) in the comparison with the improved model. The results from the different models are tabulated in Table 5:

Table 5 Time Series Statistics Population of Confirmed Cases in Different Model

To analyze the modeling effects of the three models, R-squared (R2) and Root Mean Squared Error (RMSE) are used for comparison. R-squared (R2) is a statistical measure of how close the data are to the fitted curve. It is also known as the coefficient of determination [34], or the coefficient of multiple determination for multiple regression. In general, the higher the R-squared (R2), the better the model fits the actual data.

The formula of R2 is as follows:
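
In the notation of Eq. (4), and assuming that X_t denotes the actual value, Y_t the fitted value and X̄ the mean of the actual values (an assumption, since the symbols are not defined explicitly here), the standard definition of the coefficient of determination reads

$${R}^2=1-\frac{\sum_{t=1}^n{\left({X}_t-{Y}_t\right)}^2}{\sum_{t=1}^n{\left({X}_t-\overline{X}\right)}^2}$$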

(3)

The RMSE value is used to measure the error between the fitted value and the real data [35]. The closer the RMSE value is to 0, the better the fit.

The formula of RMSE is as follows:

$$RMSE=\sqrt{\frac{1}{n}{\sum}_{t=1}^n{\left({X}_t-{Y}_t\right)}^2}$$
(4)
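
Both indicators are straightforward to compute. The sketch below implements Eq. (4) directly and uses the standard definition of the coefficient of determination, 1 minus the residual sum of squares over the total sum of squares; treating the first argument as the actual data is an assumption, and the sample numbers are made up.

```python
import math

def rmse(actual, fitted):
    """Root mean squared error between two equally long series, as in Eq. (4)."""
    n = len(actual)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(actual, fitted)) / n)

def r_squared(actual, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((x - y) ** 2 for x, y in zip(actual, fitted))
    ss_tot = sum((x - mean_actual) ** 2 for x in actual)
    return 1 - ss_res / ss_tot

# Made-up numbers, for illustration only.
actual = [10.0, 20.0, 40.0, 80.0]
fitted = [12.0, 18.0, 43.0, 78.0]
print(rmse(actual, fitted), r_squared(actual, fitted))
```

A perfect fit gives RMSE = 0 and R2 = 1. R2 can become negative when the fitted series is worse than simply predicting the mean of the actual data.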

The R-squared (R2) and Root Mean Squared Error (RMSE) of the Adaptive SEAIRD model, the classical SEIR model and the SEAIRD model proposed in this paper are shown in Table 6. Ranked from worst to best by goodness of fit, the three models are Classical SEIR Model < SEAIRD Model < Adaptive SEAIRD Model (larger R2 values are better). Ranked from worst to best by error, the order is likewise Classical SEIR Model < SEAIRD Model < Adaptive SEAIRD Model (smaller RMSE values are better). The comparison shows that the Adaptive SEAIRD model proposed in this paper has a better goodness of fit (R2) and a smaller error value (RMSE). The adaptive SEAIRD model therefore fits the trend of the accumulative diagnosed cases in the United States better and is more in line with the propagation pattern of the epidemic.

Table 6 Accuracy of Different Model (Accumulative Confirmed)

Similarly, the development trends of newly confirmed cases in the United States under each model are compared. The numerical simulation results are shown in Table 7:

Table 7 Time Series Statistics Population of New-confirmed Cases in Different Model

The R2 and RMSE of the new confirmed cases for the Classical SEIR Model, SEAIRD Model and Adaptive SEAIRD Model are shown in Tables 8 and 9. The R2 values of the SEAIRD Model and the Adaptive SEAIRD Model are far superior to that of the Classical SEIR Model; the Adaptive SEAIRD Model is slightly higher than the SEAIRD Model, and the two can be regarded as approximately equal. In terms of the error value RMSE, however, the ranking of the three models from worst to best is Classical SEIR Model < SEAIRD Model < Adaptive SEAIRD Model, and the Adaptive SEAIRD Model is far superior to the other two. The difference between the three models can be seen more clearly in Fig. 20(b). This essentially demonstrates that the network complexity and adaptability of the new model are higher than those of the Classical SEIR Model, and that it is more in line with the current trend of the epidemic in the United States.

Table 8 Accuracy of Different Model (New Confirmed)

Table 9 Verify Performance of Different Model

Fig. 20 Fitting Curve of Different Model

8 Model prediction comparison experiment

To measure the prediction performance of the three models objectively, the testing dataset is used to verify each model. The optimal parameters obtained in the previous training are substituted into the differential equations to predict how the daily confirmed cases and the accumulative confirmed cases change over time. The prediction results are shown in Fig. 21. The x-axis (abscissa) is the number of diagnoses, and the y-axis (ordinate) is the time variable T. The changes of each model over the period 30 ≤ T ≤ 57 are predicted, which further verifies the superiority of the model proposed in this paper.
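
Read together with the 29-day fitting window used in the comparative experiment, this protocol amounts to a simple split of the daily series into a fitting window and a testing window. The sketch below assumes that the first element of the series corresponds to day T = 1; the function name and the placeholder data are hypothetical.

```python
def split_windows(series, fit_days=29, test_first=30, test_last=57):
    """Split a daily series (series[0] is day 1) into fitting and testing windows."""
    fit = series[:fit_days]                     # days 1 .. fit_days
    test = series[test_first - 1:test_last]     # days test_first .. test_last
    return fit, test

# Placeholder series in which each value equals its own day index.
daily = list(range(1, 61))
fit, test = split_windows(daily)
print(fit[0], fit[-1], test[0], test[-1], len(test))
```

Keeping the two windows disjoint is what allows the indicators computed on the testing window to be read as prediction performance rather than goodness of fit.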

Fig. 21 Experiment of Validation Model

It can be seen from the graphs above:

The yellow area is the testing area for verifying the performance of the models, and the scattered points are the actual data from the testing dataset. The solid green line in Fig. 21(a) is the prediction curve of the Classical SEIR Model, whose prediction performance is the worst; the solid blue line is the prediction curve of the SEAIRD Model, whose prediction performance is second only to that of the Adaptive SEAIRD Model; the solid purple line is the prediction curve of the Adaptive SEAIRD Model, which predicts the change of the actual SARS-CoV-2 data well.

In Fig. 21(b), the solid orange line is the prediction curve of the Classical SEIR Model. Compared with the prediction curves of the other two models, its prediction performance is the worst, while the SEAIRD Model (blue curve) and the Adaptive SEAIRD Model (pink solid line) predict the change of the actual cumulative diagnosis data well.

To describe the prediction performance of the three models on the testing dataset more accurately, their performance on Daily Confirmed Cases and Accumulative Confirmed Cases is analyzed quantitatively with the indicators R2 and RMSE. For Daily Confirmed Cases, the prediction of the Adaptive SEAIRD Model is clearly more accurate than those of the Classical SEIR Model and the SEAIRD Model. For Accumulative Confirmed Cases, the prediction performance of the Adaptive SEAIRD Model is almost the same as that of the SEAIRD Model, and that of the Classical SEIR Model is the worst.

Considering the epidemic development process of different populations, the prediction curves of other populations are shown below:

In Figs. 22 and 23, the fitted line roughly reflects the development trend of the epidemic over the next two months or so. Based on the mathematical model, the results predict the spread of SARS-CoV-2 and the time at which the turning point of the epidemic appears.

Fig. 22 Simulation of other parameters in SEIARD model

Fig. 23 Simulation of other parameters in SEIARD model after neural fitting

Because the birth rate and mortality rate we set in the model are low (0.1), which is not enough to offset the number of deaths withdrawn from the model, the total population of the model rises first and then gradually stabilizes. This trend causes the initial daily births and daily natural deaths to decrease gradually. After the model stabilizes, the daily numbers of births and deaths run steadily at a lower level. The curve of the number of susceptible people (S) is also depicted. As the numbers of infected people and of cured people with antibodies increase, the number of susceptible people becomes smaller and smaller, and the speed of transmission decreases, since people in the infectious period have less chance of contacting susceptible people. The number of susceptible people therefore needs to be described in the model; its curve is shown above. The extra-long infectious period of the new coronavirus may be mainly due to its large number of mild patients (I1) and asymptomatic patients (A). These patients do not seek medical treatment or take measures in time, and remain in an active infection state for a long time. Asymptomatic patients may be the main reason for the rapid spread of SARS-CoV-2, and they are also the core of early prevention and control of the epidemic. Studying the change in the number of asymptomatic patients is therefore an important aspect of this study. From Figs. 22 and 23, we can see that the number of asymptomatic infections increased rapidly for about 1.5 months (worst prediction result) or 1 month (optimistic forecast result) and then slowly declined. This shows that effective isolation measures are of great benefit in controlling asymptomatic infections.
The optimized adaptive SEIARD model can effectively predict the epidemic trend of SARS-CoV-2, confirming that the public health interventions implemented from February 29th, 2020 (isolation of susceptible people, isolation of people who have been in contact with infected people, and quarantine of patients subject to medical capacity) effectively controlled the further development of the epidemic. Some asymptomatic patients are still in the incubation period when they are tested, and their symptoms lag behind. The median incubation period is about 4.5 days and can be up to three weeks; other asymptomatic patients may not show symptoms at all before the virus disappears. Neither kind of asymptomatic infection can be ruled out as infectious. Finding an asymptomatic infected person is undoubtedly harder than finding a confirmed case. Monitoring, tracking, isolating and treating asymptomatic infected people well tests the capacity for precise prevention and control and the fineness of social management and governance.

9 Model applicability

By comparing the epidemic simulation results of two countries with different cultural backgrounds, the correctness and effectiveness of the adaptive SEAIRD model are verified, and its self-adaptability and portability are further explored [36]. In this study, Singapore epidemic data are used to conduct an adaptability verification experiment. The Singapore epidemiological data on COVID-19 come from the WHO and other official websites related to the epidemic. Singapore adopted the American plan in the early stage but, compared with the United States, did a good job of rapidly screening and tracking patients. Unlike some countries, Singapore did not implement blockade measures, yet it controlled the spread of the epidemic well. Table 10 compares the actual and simulated epidemic data for Singapore.

Table 10 Simulation Results of Singapore Epidemic Data

As shown in Fig. 24, this study builds on the method established with simulation data; using actual data reflects the characteristics of the incidence and outbreak of COVID-19 more objectively, including the incidence level, the trend of change, and the turning point of the epidemic. The study uses the same time period as the US epidemic data studied in this article, so the evaluation results may be more reliable. By fitting the adaptive SEAIRD model to the actual epidemic data in Singapore, the optimal main parameters of the epidemic are obtained, and the development trend of the epidemic is then predicted accurately. This reconfirms that the adaptive SEAIRD model is reliable in predicting the spread of infectious diseases. The model and method can yield more reliable data for determining the infection characteristics of the virus.
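The idea of obtaining parameters by fitting a model to observed case counts can be illustrated with a minimal sketch. It does not reproduce the adaptive SEAIRD model or the neural fitting used in this study: it swaps in a plain discrete-time SIR model and a grid search over a single parameter, and it fits synthetic data rather than the Singapore data. All names (`simulate_infected`, `fit_beta`) and numerical values are hypothetical.

```python
def simulate_infected(beta, gamma=0.1, population=100_000,
                      initial_infected=10, days=60):
    """Daily infected counts from a discrete-time SIR model (illustrative only)."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    series = [i]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        series.append(i)
    return series


def fit_beta(observed, candidates):
    """Return the candidate beta with the smallest sum of squared errors."""
    best_beta, best_error = None, float("inf")
    for beta in candidates:
        simulated = simulate_infected(beta, days=len(observed) - 1)
        error = sum((o - m) ** 2 for o, m in zip(observed, simulated))
        if error < best_error:
            best_beta, best_error = beta, error
    return best_beta, best_error


# Synthetic "observations" generated with a known beta (an assumption, not real data).
TRUE_BETA = 0.27
observed_cases = simulate_infected(TRUE_BETA)
beta_grid = [k / 100 for k in range(5, 61)]  # candidate values 0.05 .. 0.60
estimated_beta, fit_error = fit_beta(observed_cases, beta_grid)
```

Because the synthetic data come from the same model, the grid search recovers the generating value. With real data the minimum error would be nonzero, and a finer grid or a proper optimizer would normally replace the coarse search.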

Fig. 24

Singapore outbreak simulation data

10 Risk assessment

The indicators used to measure the severity of the epidemic, including Contact Rate, Infection Rate, Confirmed Rate, Cured Rate and Death Rate, are listed in Table 11. With these indicators, we can quantitatively measure the trend of the epidemic and take the necessary anti-epidemic measures when the epidemic develops to a critical point. These five risk assessment indicators are used as target parameters to objectively assess the infectious strength of SARS-CoV-2 and to predict the scale and peak time of patients. This has important practical significance for decision makers implementing prevention and control measures, for assessing the impact on the economy, and for how investors respond. Using the iterative equations and estimated parameters established above, we can predict the future development of the epidemic.

Table 11 The indices of risk assessment

The worst prediction result is discussed here. As shown in Fig. 25, the infection rate and confirmed rate in the U.S. keep rising and tend to flatten in the end. This is a risk signal, indicating that the number of infected people in the United States will continue to grow at a high rate. The government should take measures to reduce the infection rate. At the same time, the mortality rate is increasing exponentially, so the government needs to take measures to strengthen medical conditions. Because the model introduces isolation measures, the contact rate drops after reaching a certain height. This shows that isolation measures can control the epidemic to a certain extent.

Fig. 25

Simulation of the indices of risk assessment

Prevention and control measures aim to reduce these 5 parameter values (infection rate, confirmed rate, cured rate, death rate and contact rate), thereby reducing the number of infection cases. Although the number of newly confirmed cases is still rising, in the epidemic transmission model the epidemic will not get out of control as long as the basic reproduction number R0 is lower than 1 [37]. From the infection rate and confirmed rate curves in Fig. 26, the future maximum infection rate will reach about 8 (worst situation) or 7 (optimistic situation), and the turning point of newly confirmed cases will appear after 5 months (worst situation) or 4 and a half months (optimistic situation). The rapid increase in the infection rate and confirmed rate in the early stage may be due to the increase in diagnostic efficiency and the backlog of a large number of patients waiting to be tested; both rates will drop soon after a period of stability.
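The threshold role of R0 can be shown with a short worked example. The sketch below assumes the simplest SIR-type relation, R0 = transmission rate / removal rate; the adaptive SEAIRD model has more compartments, so its own R0 expression will differ. The function names and the numerical values are hypothetical.

```python
def basic_reproduction_number(transmission_rate, removal_rate):
    """R0 for a basic SIR-type model: new infections caused per infectious
    person over the average infectious period (1 / removal_rate)."""
    if removal_rate <= 0:
        raise ValueError("removal_rate must be positive")
    return transmission_rate / removal_rate


def epidemic_will_decline(r0):
    """An outbreak shrinks when each case causes fewer than one new case."""
    return r0 < 1.0


if __name__ == "__main__":
    # Hypothetical scenarios: without and with strong contact reduction.
    for label, beta, gamma in [("no intervention", 0.30, 0.10),
                               ("with isolation", 0.08, 0.10)]:
        r0 = basic_reproduction_number(beta, gamma)
        print(label, round(r0, 2), "declining" if epidemic_will_decline(r0) else "growing")
```

In this simplified setting, lowering the contact or infection rate lowers the numerator, which is why measures that reduce these rates push R0 toward the threshold of 1.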

Fig. 26

Simulation of the index of risk assessment after neural fitting

Fig. 27

Lymphocytes counts of different severity

Fig. 28

Neutrophil counts of different severity

Fig. 29

Alanine Aminotransferase Content of Different Severity

Fig. 30

Aspartate aminotransferase content of different severity

Fig. 31

Total bilirubin content of different severity

11 Potentially lethal features

The following features may be potentially lethal in SARS-CoV-2 infection: high lymphocyte counts, high neutrophil counts, high alanine aminotransferase (ALT), high aspartate aminotransferase (AST), and high total bilirubin.

In Fig. 27, high levels of lymphocytes indicate that the body has an infection or another inflammatory disease [38]. This is a danger signal from the body's functions. Patients in the intensive care unit have the most SARS-CoV-2 in the body at this time. A high lymphocyte content reflects the normal immune function of the body, which releases lymphocytes in large numbers to resist virus invasion [39].

In Fig. 28, neutrophils are often associated with inflammatory infection and tissue damage [40], and they represent the deterioration of tumors induced by cancer-related inflammation. This also explains why patients with malignant tumors are more likely to be infected [41].

In Fig. 29, high levels of alanine aminotransferase (ALT) can indicate liver damage from hepatitis, infection, cirrhosis, liver cancer, or other liver diseases [42,43,44]. SARS-CoV-2 can cause damage to the liver and may even cause liver failure and cancer.

In Fig. 30, when the liver is damaged, more aspartate aminotransferase (AST) is released into the blood, so a high AST level is a sign of liver damage [43, 44].

In Fig. 31, an elevated level of total bilirubin may indicate liver damage or disease. By detecting total bilirubin, the occurrence of acute liver failure can be identified in time [45].

For infectious diseases, control and prevention strategies should intervene in three aspects to stop or slow down the spread: the source of infection, the transmission route, and susceptible people. The corresponding measures are to:

  • improve epidemic information monitoring

  • isolate, diagnose and treat sources of infection

  • speed up the diagnosis of suspected cases

  • standardize the management of close contacts

  • pay attention to the prevention and control of cluster infections

  • pay attention to the prevention and control of returnees

  • strengthen community prevention and control

12 Discussion

In this paper, MATLAB is used to establish an adaptive SEAIRD infectious disease model with an incubation period. The model is used to inversely calculate the parameters of the propagation process [46, 47] of SARS-CoV-2 over time, yielding important parameters such as the infection rate, exposure rate, and removal coefficient of SARS-CoV-2 in the United States. The calculation results of the model are basically consistent with the actual data. The inverse deduction of parameters explains the SARS-CoV-2 propagation process well, indicating that the adaptive SEAIRD model can be used for data fitting, trend prediction and process simulation of SARS-CoV-2 propagation [48, 49]. The forecast data obtained for this epidemic [50, 51] are compared with the actual data. The conclusions of this research are as follows:

Factors such as isolation measures are added to the model, which continually improves the model construction, and the parameter estimates are revised as more epidemic data become available. As a result, the improved model is more adaptable.

In the selection of the model, this study selected the classic SEIR model [52, 53] to simulate the development of the epidemic. Comparing the goodness of fit (R2) and RMSE of the SEIR model and the improved adaptive SEAIRD model demonstrates the superiority of our model.

This approach uses two criteria to evaluate the effectiveness of the model [54, 55]: the model's fitting effect and its prediction effect. R2 and RMSE are used for quantitative evaluation, and the training and testing datasets of the epidemic in the United States are used to cross-validate the model. The model fits the actual data well in the early stage of the epidemic, and its prediction effect is better than that of the traditional infectious disease model.
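The two evaluation indexes can be computed as in the following minimal sketch. It uses the standard definitions of RMSE and the coefficient of determination (R2); the function names and the sample numbers are hypothetical and are not taken from the epidemic data used in this study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the actual values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical case counts, for illustration only.
reported = [100, 150, 230, 340, 500]
fitted = [110, 140, 240, 330, 490]
print(rmse(reported, fitted), r_squared(reported, fitted))
```

A lower RMSE and an R2 closer to 1 both indicate a closer fit, which is the sense in which the models are ranked in this paper.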

13 Conclusion and future directions

Basic models such as SIR and SEIR, which were popular during the SARS period, may fall short in COVID-19 prediction. SARS-CoV-2, the virus behind COVID-19, is one of the trickiest pathogens; its characteristics are found to be different from those of its predecessors. Asymptomatic patients and relapse after recovery are two factors that cause much uncertainty in traditional prediction models. Therefore, a new model called SEIARD, which takes these two factors into account, is proposed in this paper.

This paper proposes a complex propagation model, analyzes the pros and cons of the improved models and methods, and points to future research directions and potential applications. SEIARD has four advantages: (1) Its parameters are detailed, so it can simulate a variety of interacting sources of infection. (2) Existing models consider only that one person in a single network is affected by other people and that the process of spreading infection is independent; improving on them, we propose constructing a virus spreading model in multiple networks. (3) The adaptive SEAIRD model gathers various groups into an equation-based model, so it does not need to treat everyone as an individual. When high resolution is not required, the calculation in the model is simpler and faster. (4) In the "equation-based" model, individuals are divided into different groups. The more detailed the division, the more representative and persuasive the model becomes, although it also becomes more complicated.

We further subdivide the compartment module into 10 categories. Based on the classical SEIR model, a new module (asymptomatic patients) is added to the original compartments of the infectious disease model with incubation period. We also subdivide the patients with symptoms into a mild infected population (I1) and a severe infected population (I2), and divide those who are removed into a recovered population (R) and a dead population (D) [56]. Because infectious diseases spread over a relatively long time, our model takes the birth rate and natural mortality into account and allows for temporary immunity that can be lost, so that the model better reflects reality. Recovered people who have lost immunity may flow back to the susceptible population (S) instead of exiting the system. We also consider the impact of isolation measures on the epidemic, which increases the overall complexity of the model and allows the development of COVID-19 to be studied in more detail. In our new model, fixed parameters are not used to solve the dynamics equations. Instead, neural networks are used to analyze the trend of existing real data in order to obtain the optimal parameters for predicting the development of the epidemic. Using dynamic parameters can reduce the uncertainty of the dynamics equations to a certain extent. The nonlinear ordinary differential equations (ODEs) of the different models are simulated to predict the development of the epidemic. Comparing two indexes (R2 and RMSE) shows that the classic SEIR model and the SEAIRD model do not fit as well as the adaptive SEAIRD model, especially in the later stage of prediction. The goodness of fit of the former two is significantly lower (0.60624 for the classical SEIR model < 0.94651 for the SEAIRD model < 0.98404 for the adaptive SEAIRD model), and their RMSE values are far worse than that of the improved model (4132.234828 for the classical SEIR model > 1523.0608 for the SEAIRD model > 831.9189 for the adaptive SEAIRD model). This research also studies the trend of various risk assessment indicators, which helps the relevant epidemic prevention and control departments to take corresponding intervention measures. The results offer theoretical insight and have practical significance for preventing outbreaks of infectious diseases.
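The compartment structure described above can be sketched in a much reduced form. The following is an illustration under stated assumptions, not the 10-compartment model of this paper: it keeps only six compartments (S, E, A, I, R, D), merges mild and severe cases into one class, omits births, natural deaths and isolation, and uses fixed rather than adaptive parameters. All parameter names and values are hypothetical, and simple Euler steps stand in for a proper ODE solver.

```python
def simulate_seaird(days, s0, e0, a0, i0, beta=0.4, sigma=0.2, p_asym=0.3,
                    gamma=0.1, mu=0.01, omega=0.005, dt=0.1):
    """Integrate a simplified S-E-A-I-R-D model and return one state per day.

    beta: transmission rate, sigma: rate of leaving the exposed class,
    p_asym: fraction of exposed who become asymptomatic, gamma: recovery rate,
    mu: disease death rate, omega: rate of losing immunity (R flows back to S).
    """
    s, e, a, i, r, d = float(s0), float(e0), float(a0), float(i0), 0.0, 0.0
    history = [(s, e, a, i, r, d)]
    steps_per_day = round(1 / dt)
    for _day in range(days):
        for _step in range(steps_per_day):
            living = s + e + a + i + r
            new_exposed = beta * s * (a + i) / living
            leaving_exposed = sigma * e
            ds = -new_exposed + omega * r
            de = new_exposed - leaving_exposed
            da = p_asym * leaving_exposed - gamma * a
            di = (1 - p_asym) * leaving_exposed - (gamma + mu) * i
            dr = gamma * (a + i) - omega * r
            dd = mu * i
            s, e, a = s + ds * dt, e + de * dt, a + da * dt
            i, r, d = i + di * dt, r + dr * dt, d + dd * dt
        history.append((s, e, a, i, r, d))
    return history


if __name__ == "__main__":
    trajectory = simulate_seaird(days=120, s0=999_990, e0=0, a0=0, i0=10)
    peak_day = max(range(len(trajectory)), key=lambda k: trajectory[k][3])
    print("peak of symptomatic infections on day", peak_day)
```

Because every outflow from one compartment is an inflow to another, the six compartments always sum to the initial population, which is a convenient sanity check when extending the sketch with further compartments or time-varying parameters.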

Limitations and future directions: Current research still lacks synchronization in the simulation of infectious disease transmission models. In particular, the temporal dynamics of an infectious disease model are not synchronized with its spatial dynamics. For example, the SEAIRD model focuses on how the epidemic changes over time but ignores how it changes over space. In theory, the forecasting model may accurately predict the development of the epidemic one to two weeks in advance. However, inherent uncertainty and a lack of accurate information reduce the accuracy of the forecast. To minimize the impact of incomplete information and incorrect assumptions, we need to perform hundreds of independent runs on each parameter, making subtle adjustments to the input parameters each time. This "sensitivity analysis" reduces the error of the model results when a single input changes, but it lengthens the calculation time. In theory, we compare the actual cases with the predicted results, but the actual reported data are not accurate. Judging from actual epidemic reports, few researchers have been able to assess the accuracy of predictions made during or after the outbreak period.
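The repeated-run procedure described above can be sketched as a one-at-a-time sensitivity analysis. This is a minimal illustration under assumptions: a plain discrete-time SIR model stands in for the adaptive SEAIRD model, each parameter is perturbed by up to 5% in each run, and the spread of the predicted peak is reported. All names and values are hypothetical.

```python
import random
import statistics


def peak_infected(beta, gamma, population=1_000_000, initial_infected=10, days=300):
    """Peak number of infected people in a discrete-time SIR model."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    peak = i
    for _ in range(days):
        new_infections = beta * s * i / population
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        peak = max(peak, i)
    return peak


def sensitivity(baseline, runs=200, rel_change=0.05, seed=0):
    """Perturb one parameter at a time and return the standard deviation
    of the predicted peak for each parameter."""
    rng = random.Random(seed)  # fixed seed keeps the runs reproducible
    spread = {}
    for name in baseline:
        peaks = []
        for _ in range(runs):
            params = dict(baseline)
            params[name] *= 1 + rng.uniform(-rel_change, rel_change)
            peaks.append(peak_infected(**params))
        spread[name] = statistics.pstdev(peaks)
    return spread


if __name__ == "__main__":
    result = sensitivity({"beta": 0.3, "gamma": 0.1})
    for name, value in result.items():
        print(name, round(value))
```

A parameter with a large spread is one whose estimation error matters most for the forecast. The cost grows with the number of parameters times the number of runs, which is the longer calculation time noted above.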

The adaptive SEAIRD model can be further improved in the following aspects:

  1. Some factors are not considered in the model, such as regional blockade and case tracking measures during the epidemic, cases imported from abroad, people who have not been in contact with SARS-CoV-2 patients but carry antibodies against SARS-CoV-2 and are immune to it, recessive infections that carry SARS-CoV-2 but are not infectious, and the population density of different regions. Other unknown quantities in the model (such as the number of people who moved out) cannot be accurately predicted. The model needs to be further refined, and improving it into a segmented form should be considered.

  2. Through model calculations, we also found that detailed data that truly reflect the actual situation of SARS-CoV-2 are lacking, so the specific values of some parameters cannot be reasonably determined; they can only be obtained by inverse deduction from the model. The analysis of the model requires more accurate data.

  3. The adaptive SEAIRD model can only reflect the spread of infectious diseases over time, not over space. We therefore plan to improve this model in the future and to propose a spatio-temporal spread model of infectious diseases.