Background

To date, tuberculosis (TB) is regarded as one of the world’s deadliest diseases caused by a single infectious agent, second only to acquired immunodeficiency syndrome (AIDS) caused by human immunodeficiency virus (HIV) [1, 2]. TB is caused by Mycobacterium tuberculosis (MTB), which can be expelled into the air when the infected people cough, talk, sneeze or sing [3]. In most situations, MTB generally affects the lungs of infected individuals. Tuberculosis is so highly contagious that the susceptible are likely to develop TB even when they inhaled tiny particles containing MTB. The MTB is very small and can stay in the air for a long time and keep moving. The immune system is an essential defense mechanism that limits the growth and spread of MTB. If the immune system cannot suppress their growth, they will most likely spread throughout the body [4]. In fact, not all people infected with TB will be sick at once. Some people’s incubation period may last from 1 year to several years. Nowadays, reliable TB tests do not exist [5], which causes many undiagnosed infectious and increases the difficulty in controlling TB.

It was reported in the Global Tuberculosis Report 2018 that there were 1.3 million deaths caused by TB [2]. People may be infected with TB in all countries and age groups, but overall, 90% were adults (aged ≥15 years), 9% were people living with HIV (72% in Africa). Most cases happened in following eight countries: India (27%), China (9%), Indonesia (8%), the Philippines (6%), Pakistan (5%), Bangladesh (4%), Nigeria (4%), South Africa (3%). WHO listed all of these countries and 22 other countries as the top 30 countries with the high burden of TB, which account for 87% of the world’s cases. On the contrary, the global cases of the WHO European Region (3%) and the WHO American Region (3%) account for only 6% [2].

Although the number of people infected with TB has been declining, there are about 10 to 20 thousands cases of TB every year in the last 20 years (see Table 1) [6], and the death rate is between 0.05 and 0.07 in America. Therefore, it is essential for researchers to explore the factors related to the infection, outbreak, and epidemic of TB so that we could take more measures to protect people from TB.

Table 1 Population, tuberculosis Cases, Case Rates per 100,000 Population, Deaths, and Death Rates per 100,000 Population, and Percent Change: The United States, 1984-2017. Where, \(Rate^{i1}_{j}=\frac {Number^{i1}_{j}}{Number_{j}}\), \(Number^{i2}_{j}=\frac {Number^{i1}_{j}-Number^{i1}_{j-1}}{Number^{i1}_{j-1}}\), \(Rate^{i2}_{j}=\frac {Rate^{i1}_{j}-Rate^{i1}_{j-1}}{Rate^{i1}_{j-1}}=(\frac {Number^{i1}_{j}} {Number_{j}}-\frac {Number^{i1}_{j-1}}{Number_{j-1}})/\frac {Number^{i1}_{j-1}}{Number_{j-1}}\), (i=1,2) and j is the year. For example, 2016, 2.9=9253/3.2307×103; −3.1=(9253−9547)/9547; −3.8=(9253/3230.7−9547/3207.4)/(9547/3207.4). The data of population from [7] and others from [6].

From 1945 to 1955, the widespread usage of antibiotics reduced TB mortality in the United States by 70%, but the United States still has a severe TB epidemic [4, 8]. Based on decades of technology and experience, most active and latent TB can be effectively treated, and latent TB can be treated with Isoniazid. However, treatment can be effective only if the course of cures lasts no less than six months [4]. Active TB can be eliminated with a complex treatment regimen, coupled with the treatment of multiple drugs (Isoniazid, Rifampicin, Pyrazinamide) for nine months [4, 8].

In the United States, people who contract TB and receive treatment fall into two categories [9, 10]. The first type of people are those who complete the treatment and eventually recover. The other type of the infected people do not complete treatment, due in part to lost data, patients’ adverse reactions to the drug, or their refusal of treatment, etc. Another potential reasons that causes incomplete treatment is the high cost of TB treatment, which means a heavy burden to the general public. When the treatment is incomplete, the drug-resistant strains will reproduce, which may severely increase the difficulty of treatment [4, 11, 12]. According to the latest treatment outcome data for new cases in 2016, 82% of the people can successfully recover from TB. Apart from the reduction from 2015 to 2016, there is another reduction from 86% in 2013 to 83% in 2015 [2].

Bacillus Calmette-Guerin (BCG) is a vaccine that has been used to prevent TB for a long time. The protection period of BCG vaccine usually varies from 10 to 20 years [13]. BCG precludes nearly 20 percent of children from getting infected, while the vaccine prevents approximately half of those already infected from getting worse [14]. The study of Fjallbrant et al. [15] suggested that the primary vaccination and revaccination of negative tuberculin skin test (TST) of young adults with BCG caused a significant increase in the T-helper 1 (Th1) response against mycobacterial antigens, suggesting a protective effect against TB. This gives support for the policy of primary vaccination as well as the revaccination under this setting and age group. In some areas where TB is prevalent, the primary BCG vaccination is essential, though should not be the primary measure for TB control. BCG is widely used in countries with a high burden of TB, but not those with a small burden. In the United States, BCG is often used by special people [16] but not widely used by the general public. As an alternative, Chemoprophylaxis [17, 18] is more affordable, easier to take, and can prevent latent infectious from turning into active TB. Thus, for some countries less inflicted by TB, Chemoprophylaxis is a better alternative to control TB [19].

On the dynamic model study, many researchers have devoted big efforts for the research of the epidemic law and transmission dynamics of TB. In 1962, Waaler et al. established the first dynamic model of TB based on a susceptible-infected-recovered (SIR) model [20]. From then on, many models that consider multiple influencing factors have been established, such as reinfection [2123], vaccination [2426], interactions with HIV [27, 28], reactivated [29, 30], Chemoprophylaxis [19] and so on [3134]. Revelle et al. considered prophylaxis, cure and BCG vaccination to research the optimal strategy to fight against TB, which was then extensively used to study the epidemic model of transmission for infectious disease in 1967 [35]. Buonomo et al. studied the global behavior of a non-linear susceptible-infectious-removed (SIR)-like epidemic model with a non-bilinear feedback mechanism [36]. The SEI model proposed by Bowong et al. exhibits the traditional threshold behavior [34]. Whang et al. used an SEIR model with the time-dependent parameters to develop a dynamic model for TB transmission in South Korea [37]. A mathematical model was proposed to understand the spread of TB disease in the human population for both pulmonary and drug-resistant subjects by Mishra et al. [38]. Three control factors must be considered simultaneously to decrease the threat of TB by Gao et al.: a preventive measure in the form of vaccination and two treatment measures aiming at the susceptible and individuals infected TB in the active stage and latent stage [39].

In developing countries, the increase of TB cases by a high level of undiagnosed infectious population and incompletely treated population is one of the greatest challenges to control TB. These people are more likely to develop multi-drug resistance relative to the diagnosed infectious population [21, 40]. According to the actual situation of TB in the United States, we considered several factors: slow-fast process [4143], vaccination [2426], reinfection [2123], reactivated [29, 30] and undiagnosed infection [21, 44]. Then we referenced the modeling thought from D.P. Moualeu et al. [21, 45] and Liu et al. [24], and finally established our model. The biggest differences between our model and their model are the vaccination and recovery to the susceptible population [21, 24, 45].

Our study aims to analyze the factors affecting TB based on the dynamics model, give several measures to control and prevent TB, and predict the epidemic trend in America. The structure of this paper is as follows. In Section 2, we introduce our TB model expressed by ordinary differential equations (ODE) and define parameters. Then we describe the model assumptions and modeling ideas in detail and give the disease-free equilibrium and basic reproduction number. In Section 3, the model is simulated by global differential evolution and local sequential quadratic programming (DESQP) [46, 47] optimization algorithm based on the US cases. We analyze the fitting effect by the root mean square percentage error (RMSPE) and the mean absolute percentage error (MAPE). In Section 4, we make the uncertainty and sensitivity analysis of the parameters for our model by Latin hypercube sampling (LHS) and partial rank correlation coefficients (PRCC). We also analyze the sensitivity of each parameter on the basic reproduction number and the total infectious, respectively. In Section 5 and 6, we analyze the results, make some suggestions to prevent TB, carry out simulation experiments, and then discuss the deficiency of our study. Finally, we summarize our research.

Methods

In this section, we introduce our new mathematical model, briefly explain the structure of our model, and then analyze the basic reproduction number.

2.1 Model instruction

The total population is denoted by N(t), which is sub-divided into the following seven sub-populations:

V(t) vaccinated: healthy people vaccinated with TB,

S(t) susceptible: healthy people not exposed to TB,

E(t) exposed: exposed to TB but not infectious,

I(t) diagnosed infectious: infected with TB and diagnosed in hospital,

J(t) undiagnosed infectious: infected with TB but undiagnosed in hospital,

L(t) incompletely treated: have been diagnosed with active TB and begun their treatment in hospital or home, but quitted before the end,

R(t) recovered: recovered from TB after treatment.

Considering slow-fast process [4143], vaccination [2426], reinfection [2123], re-activated [29, 30] and undiagnosed infection [21, 44] and referencing the modeling thought of the D.P. Moualeu et al. [21, 45] and Liu et al. [24], we display our dynamic model with a flow diagram shown in Fig. 1 and introduce the model in detail as the following:

Fig. 1
figure 1

The flow diagram for the compartment model of the transmission dynamics system of TB

For the dynamic system, there will be a recruitment to the dynamic system with an average scale Λ. Considering the vaccination, a proportion χ will be vaccinated (i.e., primary vaccination) and become the vaccinated class, and the remainder (1−χ) will not be vaccinated and become the susceptible class.

In fact, BCG will be ineffective for some newborns, and those newborns will be infected by contacting with the infectious, we assume the fraction as ε. The vaccine protection period takes 10 to 20 years [13], during which there will be a rate ψ of losing the immune system and converging into the susceptible.

During the fast-slow process, the susceptible will be the exposed by contacting at a slow process (1−p) and turning into the active TB at a fast p. Because the test of TB is not sensitive, there will be a fraction f of diagnosed and (1−f) undiagnosed. Here, we set p1=pf and p2=p(1−f). We do not rule out that there will be a rate ϕ, at which the susceptible will be revaccinated and become the vaccinated.

For the exposed, a proportion r will use Chemoprophylaxis to prevent them from becoming active TB, and the remainder (1−r) will be the active TB at a rate k. For the active TB, there will be a fraction h of individuals who will be diagnosed and healed, and thus (1−h) undiagnosed.

In the diagnosed infectious, some will be recovered at a rate g. Some will be incompletely treated at a rate δ. For the undiagnosed, some will become diagnosed and go to the hospital at a rate θ; thanks to their immune system, some individuals will be the exposed at a rate ρ.

For the incomplete treatment, some will detect their illness and go to the hospital at a rate α, while others will become recovered with the aid from self-immunity at a rate ω.

Considering the re-infected and reactivated, the recovered will lose the immune system at a rate γ. A fraction q1 of recovered people will be reactivated and become the diagnosed infectious; a fraction q2 of the recovered will be re-infected but not be the active; and the remaining recovered individuals with a fraction 1−q1q2 will become the susceptible.

Lastly, the death of a certain number of people in each part of the system is resulted from natural causes. In our article, the main feature of our model is to consider the transmission route and mechanism of TB in the context of reality in a comprehensive way, which is of a great practical significance. Therefore, the conclusion will be closer to reality.

2.2 Model introducing

We give the definition and range of the parameters in Table 2, and proposed a mathematical model to state the transmission dynamics and epidemic of TB which is represented by the following system of ordinary differential equations:

$$\begin{array}{@{}rcl@{}} {\begin{aligned} \left\{ \begin{array} {l} \frac{dV}{dt} = \Lambda \chi + \phi S - (\psi + \mu)V - \varepsilon vV,\\ \frac{dS}{dt} = \Lambda (1 - \chi) + \psi V + (1 - {q_{1}} - {q_{2}})\gamma R - \phi S - vS - \mu S,\\ \frac{dE}{dt} = \varepsilon vV + (1 - {p_{1}} - {p_{2}})vS + {q_{2}}\gamma R + \rho J - (1 - r)kE - \mu E,\\ \frac{dI}{dt} = {p_{1}}vS + h(1 - r)kE + \theta J + {q_{1}}\gamma R + \alpha L - (\delta + g)I - (\mu + {d_{1}})I,\\ \frac{dJ}{dt} = (1 - h)k(1 - r)E + {p_{2}}vS - (\theta + \rho)J - (\mu + {d_{2}})J,\\ \frac{dL}{dt} = \delta I - (\alpha + \omega)L - (\mu + {d_{3}})L,\\ \frac{dR}{dt} = gI + \omega L - (\gamma + \mu)R,\\ \end{array}\right. \end{aligned}} \end{array} $$
(1)
Table 2 The definition and value range of parameters for the model (1)

where

$$ N(t) = V(t) + S(t) + E(t) + I(t) + J(t) + L(t) + R(t). $$
(2)

The susceptible and the vaccinated are infected with TB through contacting individuals with active TB at a rate v(I,J,L) given by:

$$ v = \frac{{{\beta_{1}}I}}{N} + \frac{{{\beta_{2}}J}}{N} + \frac{{{\beta_{3}}L}}{N}, $$
(3)

where βi, i=1,2,3 are the rates that the diagnosed, undiagnosed infectious, and incompletely treated people sufficiently and effectively transmit TB to the susceptible or the vaccinated [1].

2.3 Basic reproduction number

The basic reproduction number \({\mathcal R_{0}}\) represents the number of infected during the initial patient’s infectious (not sick) period [48]. Our model is a biological system model, so it must meet the biological conditions. Therefore, we only study the dynamic state of the solution of system (1) in the following feasible region:

$$ {\begin{aligned} \Omega = \left\{ (V,S,E,I,J,L,R) \in \mathbb{R}_{+}^{7}:V + S + E + I + J + L + R \leq \frac{\Lambda}{\mu}\right\}, \end{aligned}} $$
(4)

which can be confirmed as positively invariant. Therefore, we restrict our attention to the dynamics of the model (1) in Ω.

For the threshold system, when \({\mathcal R_{0}}<1\), the model will stabilize to the disease-free equilibrium, and the disease will be controlled and eventually become extinct. When \({\mathcal R_{0}}>1\), the model will stabilize to the endemic equilibrium, and the disease will develop into an endemic disease. For other complex systems, this conclusion may not be valid, for example, backward bifurcation, multistable system and other complex dynamic behaviors may occur. Therefore, the smaller \({\mathcal R_{0}}\) is, the easier to control TB [49, 50]. Here, we use the next-generation matrix approach to calculate the basic reproduction number \({\mathcal {R}_{0}}\), which was proposed by Van den Driessche, etc [49]. The detail calculation is given in the Appendix.

Simulation

Based on the reported data from 1984 to 2018 by WHO [6] and the model (1), a global differential evolution and local sequential quadratic programming (DESQP) optimization algorithm was conducted to estimate the undetermined parameters [47, 51]. DESQP, which combines differential evolution (DE) [46] and local sequential quadratic programming (SQP) [52, 53], is a method used to search for the optimal solution of DE. In the method, DE is used as a base level search and SQP is used as a local search. DE is first applied to the short term of the problem to find the best solution. This optimal solution is given to SQP as an initial condition to fine tune the solution to reach the global optimum or near global optimum.

We get the estimated value, standard deviation, confidence interval, P-value and t-statistic of the parameters, as listed in Table 3. Based on the estimated results, the basic reproduction number \({\mathcal R_{0}}\) can be calculated \({\mathcal R_{0}}=2.3597\). Aandahl et al.[54] in 2014 specified an informative in [55] and improved the convergence performance of the Markov chain Monte Carlo (MCMC) sampler in [56], then estimated the reproduction number as: 2.1 (95% CI: 1.54-2.66) for the approximate method in [55] and 2.05 (95% CI: 1.55-2.63) for the exact method in [56].

Table 3 The t-statistic, P-value, CI Bound, Standard deviation, the estimated value of parameters, and the initial condition of each compartment of the model (1)

The real data and the model results are shown in the following Fig. 2. We evaluate the fitting effect of our established model through the root mean square percentage error (RMSPE) and the mean absolute percentage error (MAPE) which are significant evaluation indicators. The RMSPE and the MAPE are defined as:

Fig. 2
figure 2

The comparison of real data and fitted data and the projection for the future status of TB

$$ \text{MAPE} = \left({\frac{1}{n}\sum\limits_{t = 1}^{n} {\left| {\frac{{I{{(t)}^{*}} - I(t)}}{{I{{(t)}^{*}}}}} \right|}} \right) \times 100\%, $$
(5)
$$ \text{RMSPE} = \sqrt {\frac{{\sum\limits_{t = 1}^{n} {{{\left[ {(I{{(t)}^{*}} - I(t))/I{{(t)}^{*}}} \right]}^{2}}} }}{{n - 1}}} \times 100\%, $$
(6)

where I(t) is the real value at time t, I(t) is its fitting value, and n is the number of data used for prediction. The criteria for MAPE and RMSPE are shown in Table 4 [57, 58]. We use model (1) to simulate the number of the infected, where MAPE=4.7245% and RMSPE=5.7676%, which means the fitting effect is very desirable and our system has strong prediction ability and high prediction accuracy.

Table 4 The criteria for MAPE and RMSPE

Sensitivity analysis

In model (1), the precise estimation of parameter values is one of the greatest challenges. Direct measurements of specific biological parameters are rare, and many parameters are estimated within a broad range of values identified by fitting the model with limited experimental data [59]. Therefore, the estimation of the parameters is always associated with uncertainty analysis (UA) and sensitivity analysis (SA). It is vital to study the influence of the uncertainty of these parameters on the model, as this will not only help us successfully use the mathematical and computational models of biological systems as prediction tools, but also comprehend the functions of biological systems.

Some factors are which we cannot control in the model (1). Generally, researchers would choose the factors that we can control to analyze the sensitivity so that we can find effective measures to eliminate TB. Since we cannot control the death rate d1, d2, and d3 in our analysis, we only did sensitivity analysis for 22 parameters that we can control in an attempt to eliminate TB. To find how the parameters impact on the outcome, a general and better way is to do the sensitivity analysis for each parameter. The ideal method is to use Latin Hypercube Sampling (LHS) and Partial Rank Correlation Coefficient (PRCC) to study the dependence of model parameters on the basic reproduction number and total infectious [60, 61].

4.1 Latin hypercube sampling (LHS)

Generally, the input factors of the most mathematical and computational model consist of initial conditions and parameters, which are independent and dependent model variables. Thanks to natural variation, lack of current techniques, measurement error, etc., the parameters are not always known with adequate certainty [61]. The purpose of uncertainty analysis (UA) is to solve these problems. UA can quantify the degree of confidence in the experimental data and the estimated values of the parameters [61].

In the article, introduced by Mckay et al., the most popular and efficient Latin hypercube sampling-LHS that belongs to Monte Carlo (MC) class of sampling methods was used to perform UA [61]. MC method, a common algorithm to solve various computational problems, can evaluate multiple models, the results of which can not only be used to perform SA, but also to determine the uncertainty of model inputs. LHS can unbiasedly estimate the average output of the model, and fewer samples are required to achieve the same accuracy as simple random sampling [61].

The remaining 22 parameters have been chosen to do uncertainty analysis. We assume each parameter to be a random variable with normal distribution to analyze the uncertainty in the value of these parameters. Normal distribution for all parameters with the mean (i.e., estimated value) and variance value (i.e., square of standard deviation) are given in Table 3. Latin hypercube sampling has been used to sample for these parameters considered for the sensitivity analysis. Here, we set the sample size N=2000. Using Latin hypercube sampling method and probability density function for each parameter is stratified into 2000 equiprobable (1/2000) serial intervals. Then a single value is randomly chosen from each interval. This produces 2000 sets of values for each parameter, and we can compute 2000 sets of values for \(\mathcal R_{0}\) from 2000 sets of different parameter values and get the distribution hist of \(\mathcal R_{0}\), as is shown in Fig. 3.

Fig. 3
figure 3

The distribution of the basic reproduction number \({\mathcal {R}_{0}}\)

4.2 Partial rank correlation coefficient (PRCC)

Sensitivity analysis (SA) is a quantitative way to analyze the effects of the parameter uncertainty on the model’s outputs. Based on the parameters, we are able to raise presumptions about the biological system that actuates the system behavior, which can be measured by conducting experiments [62]. Local SA techniques, one class of SA, investigate on the effects of small variations in individual parameters around some nominal point and have been applied to a number of signal transduction and metabolic pathway models [63, 64]. Since the most influential parameters are determined, the predictability of the model can be significantly enhanced.

In this section, we compute PRCC to analyze the sensitivity of the parameters to the \({\mathcal {R}_{0}}\) and the total infectious so as to identify the parameters that have great effect on the variability in the outcome and how those parameters affect both \({\mathcal {R}_{0}}\) and the total infectious. Here we compute the PRCC of \(\mathcal R_{0}\) and the total infectious based on the LHS matrix, the result of which can be seen from Fig. 4, Table 5. In our experiment, we assume that the parameters have a significant effect when P-value <0.01.

Fig. 4
figure 4

(a) show the PRCC of parameters with \(\mathcal R_{0}\); (b) show the PRCC of parameters with the total infected. Here,we assume that when P-value <0.01, the parameters have significant effect \(\mathcal R_{0}\) and the total infectious. To better control TB, we emphasize on analyzing the parameters whose PRCC >0.2

Table 5 The value of PRCC between each parameter and \({\mathcal {R}_{0}}\) and the total infectious

From Fig. 4, we can easily see the different parameters have different extent of the effect on both \({\mathcal R_{0}}\) and the total infectious, so it is complex to take proper measures to control TB. To better control TB, we emphasize on analyzing the parameters whose PRCC >0.2. Apart from that, we assume that these parameters have a high degree and a significant effect on \({\mathcal R_{0}}\) and the total infectious. Expect for the uncontrollable factor (which we cannot take relative measure to control TB) from Fig. 4, we can easily see that r, ω, g and ϕ have significantly positive affect on both the \({\mathcal R_{0}}\) and the total infectious, while ε, k, ψ, δ and θ have significantly negative effect on both the \({\mathcal R_{0}}\) and the total infectious.

Results

In this section, we present the results of the simulation for the model. In general, parameter estimation is an iterative process, in which we use the current parameter values as the initial values of the next iteration [1]. All the parameter values of the first iterative process are set to be their initial guess values, which are estimated with the lowest sub-condition. Then parameters estimation is carried out with a limited list of previously non-identifiable parameters. Finally, we repeat the estimation process and check all the estimated parameters to see whether the new values of the previously unrecognized parameters affect the values of the identifiable parameters. We use the data of TB cases (i.e., diagnosed infectious) in America from 1984 to 2018 (see Table 1) published by the Centers for Disease Control and Prevention (CDC) to estimate the parameters of the model (1).

In our model, some parameters have been estimated by WHO, some evaluated by the TB researchers, and the others remain uncertain. We specify some parameter values as listed below.

(1) The natural mortality μ: It is assumed to be equal to the inverse of the life expectancy at birth and 1/μ=79.30 is the average human lifespan. Accordingly μ=0.0126 [65].

(2) Progress rate of the exposed to infectious individuals (including diagnosed and undiagnosed infectious) k: Based on the parameter estimation k=0.0421, the incubation period of TB is 1/k=23.7529 years. The latent TB infection lasts from 1 year to several years in general [1, 19, 66].

(3) Recovery rate of the diagnosed infectious g: In our simulation, g=0.5022, and the course of recovery for the diagnosed infection is estimated 1/g=1.9912 years. By 2010, the course of treatment of the first-time TB patient is normally treated in 6 months; and the course of treatment of the reinfected tuberculosis patient is usually 18-24 months [67, 68].

(4) Diagnosis rate θ: θ=0.6082 per year, which means the undiagnosed individual will be diagnosed with active TB after 1/θ=1.6442 years. Generally, some people with active TB are difficult to be diagnosed.

(5) Progression rate at which diagnosed infectious people become incompletely treated δ: It has been estimated as δ=0.5698 per years, which means the diagnosed people may give up treatment after 1/δ=1.7550 years. Generally, the average convalescence period of TB is around 1 year, which means after treated for 1 year, people will consider themselves to be recovered, but this is not the case [49].

(6) Reactivated ratio q1 and reinfected ratio q2: It has been estimated as q1=2.40%, q2=54.25%. This shows that relapse for most people is a slow progress, but re-infected is fast. (1−q1q2)=43.35%, which means the people will lose the immune system and become the susceptible.

(7) Progress rate at which incompletely treated people become diagnosed infectious α: It has been estimated as α=0.1002, which means that the patients that are incompletely treated may be retreated after 1/α=9.9800 years.

(8) The natural vaccination rate of the newborn babies χ: It has been estimated as χ=13.88%. In America, only a few people get BCG vaccination [1618]. The United States and other western countries with low TB burden do not necessarily require people to be vaccinated against BCG for newborns. Some Americans will be vaccinated per doctor’s advice [1618].

(9) Progress rate ψ: the rate at which the vaccinated become the susceptible. It has been estimated as ψ=0.0510 which means the vaccination may be invalid after 1/ψ=19.6078 years. BCG vaccine duration varies widely, ranging from 10 to 20 years [13].

(10) Progress rate γ: the rate at which the recovered individuals lose the immune system. It is estimated as γ=0.1444 per year which means the recovered individuals may lose the immune system after 1/γ=6.9252 years.

(11) Progression rate at which the undiagnosed become the exposed ρ. It has been estimated as ρ=0.1993 which means it takes 1/ρ=5.0176 years or so for the undiagnosed to become non-infectious individuals (exposed).

(12) Detection rate of active TB h: It is estimated as h=39.98%. This shows that 60.02% of tuberculosis patients will not be diagnosed or will not be diagnosed for a short time.

(13) Vaccination coverage ϕ: It has been estimated as ϕ=0.0500, which means after an average of 1/ϕ=20.0000 years, people will lose antibodies to TB and be vaccinated again. Generally, BCG is not widely used in America, and the adults also will not choose to be vaccinated if it’s unnecessary [16, 17].

(14) Chemoprophylaxis rate r: It has been estimated as r=0.9219. Dye et al. have estimated r=0.7 [19, 69, 70].

(15) Recovery rate of the incompletely treated ω: It has been estimated as ω=0.1986, which means some incompletely treated individuals will naturally recover after 1/ω=5.0352 years. Bacaër et al. have estimated that the natural recovery from HIV-negative TB and HIV-positive TB takes 0.1390 and 0.2400 per year, respectively. HIV-negative TB and HIV-positive TB will be recovered after 7.1900 years and 4.1700 years without treatment [71].

(16) The rate of the susceptible become the diagnosed and undiagnosed infectious p1,p2: It is estimated that p1=2.5% and p2=34.14%, which means 2.5%+34.14%=36.54% of the people will be sick at once after infected by TB, while 2.5% have severe symptoms and 34.14% have mild symptoms and do not get a diagnosis. The remaining (1−p1p2)=63.46% of people are those who proceed to a slow progression of TB infection become the exposed.

Discussion

In the section, we discuss the sensitivity of parameters to the \({\mathcal R_{0}}\) and the total infectious. Furthermore, we suggest several measures to prevent TB. TB is a prevalent infectious disease in the world and the infectious are spread worldwide. It is vital to seize the leading causes and find the best measures to prevent and control the disease. In the article, we construct a TB model to study the transmission dynamic and provide some measures to control and prevent TB in the US. To find more ways to prevent TB, we analyze many factors that may have affect on \({\mathcal R_{0}}\) and the total infectious. Besides, we complete the sensitivity analysis of the parameters with \({\mathcal R_{0}}\) and the total infectious (see Fig. 4, Table 5). When we take measures to control TB, the result is shown in Fig. 5. In general, we find we can control the factors with α, g, ψ, r, βi,i=1,2,3, ϕ and δ. From the result in Fig. 5, we can find that r has the greatest effect on the total infected, then g has the second greatest effect, and others have a similar effect.

Fig. 5
figure 5

Simulation of the total infectious with parameters 1.03×r=0.9496, 1.5×α=0.1503, 1.5×g=0.7533, 1.5×ϕ=0.0750, 0.9×δ=0.5128, 0.99×ψ=0.0505, 0.7×β1=3.1067, 0.7×β2=0.2096 and 0.7×β3=4.1063, when one parameter takes a specific value, others take the value of the first column in Table 3. Differently, we synthesize the effects of three contact rates βi,i=1,2,3 into one contact rate β effect. ‘With all control’ means that we let all parameters specific values simultaneously. ‘Without control’ is the situation which we take no measures. We can find ψ has a mild effect on the total infected, with its line overlapping with ‘without control’ approximately

Strategy 1: We can find r is strongly negatively correlated with \({\mathcal R_{0}}\), which means it is wise to use Chemoprophylaxis to control TB. With our control, the TB can be significantly controlled from Fig. 5. In order to prevent the TB outbreak, we can encourage people to have Chemoprophylaxis by the media, research on new and more efficient chemoprophylaxis to improve the effect, reduce the harm to the human body, and improve the therapeutic effect [72].

Strategy 2: The parameter g has a negative effect on the \({\mathcal R_{0}}\), and with our control, it evidently reduced the total infectious. In doing so, we should take more money and energy to research on new medicine and therapy to reduce the period of treatment [73]. Nowadays, despite improvement in the treatment of TB, it remains the second leading cause of death in the world. Therefore, we still have a long way to eradicate TB.

Strategy 3: From Fig. 4, we can see βi,i=1,2,3 have a positive effect on \({\mathcal R_{0}}\). With our control, we can greatly reduce the total infectious by reducing the transmission rate. Although we have placed emphasis on treating TB patients in isolation the majority were infected TB through person-to-person contact each year [74]. We must strictly monitor, take protective measures to avoid TB, and examine outsiders to prevent contact with TB patients [75]. If we decrease half of βi,i=1,2,3, we can find the total infectious will drop (see Fig. 5).

Strategy 4: It is also clearly shown that ψ is positively correlated with \({\mathcal R_{0}}\) and the total infectious, which means the longer the vaccine lasts, the easier it is to control TB. We make 0.8×ψ, and find the total number of the infectious being reduced greatly, which means we can lower it to prevent TB. In addition to that, we can delay the duration of the BCG by researching on new and better vaccination to prevent infecting TB. Currently, one of the primary health interventions that can be used to prevent TB is to vaccinate children.

Strategy 5: The parameter δ exerts a positive effect on the \({\mathcal R_{0}}\), and α has a negative effect on the \({\mathcal R_{0}}\). Firstly, we can encourage people to complete the treatment to be fully cured by reducing the cost [76]. The state can make more medical insurance to relieve the financial burden and help people heal. Secondly, we can educate people to learn more about TB treatment, to follow the doctor’s plan, and to realize that health is more important than everything.

Strategy 6: It shows that ϕ is negatively correlated with both \({\mathcal R_{0}}\) and the total infectious, which means increasing the number of vaccinated people per year can protect people from contracting TB, and thus control TB. Even for the state with little burden of TB, BCG can help prevent TB.

In this paper, the data used for fitting is annual data. Due to the extensive time scale, the accuracy of the model will be reduced. In later research, we will choose monthly data. Although our model take into account factors such as slow-fast process [4143], vaccination [2426], reinfection [2123], reactivated [29, 30] and undiagnosed infection [21, 44], these factors are not enough. We did not take into account factors such as interactions with HIV [27, 28], immigration [77, 78] and drug-resistant TB bacilli [69, 79, 80]. In addition to the aforementioned factors, we also did not consider residents’ medical expenditure and awareness of disease control to discuss the prevention and control measures in this study. As for the deficiencies in the study, we will analyze and discuss them in the follow-up study.

Conclusion

In general, from the analysis, it is evident that, similar to TB studies elsewhere, the prevalence of TB in the United States is heavily influenced by exposure, vaccination, and treatment effectiveness. In our study, we also found that chemoprophylaxis affects the prevalence of TB more than other factors. However, each coin has two sides. Chemoprophylaxis has certain harm to the human body [72], so we should research new and better measures to prevent and control TB. Based on the analysis, we propose some strategies to control and eliminate TB in two ways: prevention and treatment. Chemoprophylaxis stands out as the one that can greatly control TB with some side effects. Accordingly, we should do much research to find better Chemoprophylaxis.

In fact, because the recovered infectious may relapse, the difficulty of diagnosing and treating TB is hard to control. When all the control measures are implemented together (see Fig. 5, solid black line), the basic reproduction number of the model (1) is 0.6915<1. However, the case does not disappear. Instead, the system stabilizes to the endemic equilibrium. The measures we propose may not eliminate TB, but they are critically useful for controlling the epidemic of TB. According to the latest report, in the announcement came at the first WHO Global Ministerial Conference on Ending Tuberculosis in the Sustainable Development Era, 75 ministers agreed to take urgent measures to end TB by 2030 [81]. From our analysis, it is difficult to end TB until 2030 under the existing conditions (Fig. 5). Thus, we should find more and better methods to eradicate TB. Although it will be difficult to eliminate TB in a short period of time, we believe that in the future, with advanced technologies, TB can be eliminated.

\thelikesection Appendix

It is easy to see that the model (1) always has a disease-free equilibrium P0, and the disease-free equilibrium, which are the solutions of the algebraic equations:

$$\begin{array}{@{}rcl@{}} {\begin{aligned} \left\{ {\begin{array}{*{20}{l}} {\Lambda \chi + \varphi S - (\phi + \mu)V - \varepsilon vV{{ = }}0},\\ {\Lambda (1 - \chi) + \phi V + (1 - {q_{1}} - {q_{2}})\gamma R - \varphi S - vS - \mu S{{ = }}0},\\ {\varepsilon vV + (1 - {p_{1}} - {p_{2}})vS + {q_{2}}\gamma R + \rho J - (1 - r)kE - \mu E{{ = }}0},\\ {{p_{1}}vS + h(1 - r)kE + \theta J + {q_{1}}\gamma R + \alpha L - (\delta + g)I - (\mu + {d_{1}})I{{ = }}0},\\ {(1 - h)k(1 - r)E + {p_{2}}vS - (\theta + \rho)J - (\mu + {d_{2}})J{{ = }}0},\\ {\delta I - (\alpha + \omega)L - (\mu + {d_{3}})L{{ = }}0},\\ {gI + \omega L - (\gamma + \mu)R{{ = }}0},\\ {I = 0}. \end{array}} \right. \end{aligned}} \end{array} $$
(7)

The disease-free equilibrium P0:

$$ \begin{aligned} {P_{0}} &= ((\varphi \Lambda + \mu \chi \Lambda)/(\mu \varphi + \mu \phi + {\mu^{2}}),\\&(\mu \Lambda + \phi \Lambda - \mu \chi \Lambda)/(\mu \varphi + \mu \phi + {\mu^{2}}),0,0,0,0,0). \end{aligned} $$
(8)

The next-generation matrix approach is applied to calculate the basic reproduction number \({\mathcal {R}_{0}}\) [49]. For this purpose, we set X=(V,S,E,I,J,L,R), there are \(\dot {X}=F-Q\), where:

$$ F = {\left({\begin{array}{*{20}{c}} {0,}&{0,}&{(1 - {p_{1}} - {p_{2}})vS + \varepsilon vV,}&{{p_{1}}vS,}&{{p_{2}}vS,}&{0,}&0 \end{array}} \right)^{T}}, $$
(9)

and

$$ {\begin{aligned} Q = \left({\begin{array}{*{20}{c}} { - (\Lambda \chi + \varphi S - (\phi + \mu)V - \varepsilon vV)}\\ { - (\Lambda (1 - \chi) + \phi V + (1 - {q_{1}} - {q_{2}})\gamma R - \varphi S - vS - \mu S)}\\ { - ({q_{2}}\gamma R + \rho J - (1 - r)kE - \mu E)}\\ { - (h(1 - r)kE + \theta J + {q_{1}}\gamma R + \alpha L - (\delta + g)I - (\mu + {d_{1}})I)}\\ { - ((1 - h)k(1 - r)E - (\theta + \rho)J - (\mu + {d_{2}})J)}\\ { - (\delta I - (\alpha + \omega)L - (\mu + {d_{3}})L)}\\ { - (gI + \omega L - (\gamma + \mu)R)} \end{array}} \right). \end{aligned}} $$
(10)

Jacobian matrices \(\mathcal {F}\) and \(\mathcal {Q}\) at the disease-free equilibrium of F and Q are calculated respectively:

$${\begin{aligned} &\mathcal{F}=\left.\frac{\partial F(V, S, E, I, J, L, R)}{\partial(V, S, E, I, J, L, R)}\right|_{P_{0}}\\ &=\left[\begin{array}{ccccccc} 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ F_{1} & F_{2} & F_{3} & F_{4} & F_{5} & F_{6} & F_{3} \\ W_{1} & p_{1} A_{1}+W_{1} & W_{1} & \frac{S \beta_{1} p_{1}}{N}+W_{1} & \frac{S \beta_{2} p_{1}}{N}+W_{1} & \frac{S \beta_{3} p_{1}}{N}+W_{1} & W_{1} \\ W_{2} & p_{2} A_{1}+W_{2} & W_{2} & \frac{S \beta_{1} p_{2}}{N}+W_{2} & \frac{S \beta_{2} p_{2}}{N}+W_{2} & \frac{S \beta_{3} p_{2}}{N}+W_{2} & W_{2} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right] \end{aligned}} $$

where:

$$ \begin{array}{l} F_{1}=\varepsilon A_{1}+S\left({p}_{1}+{p}_{2}-1\right) {A}_{2}-V \varepsilon A_{2} \\ F_{2}=S\left({p}_{1}+{p}_{2}-1\right) {A}_{2}-V \varepsilon A_{2}-\left({p}_{1}+{p}_{2}-1\right) {A}_{1} \\ F_{3}=S\left({p}_{1}+{p}_{2}-1\right) {A}_{2}-V \varepsilon A_{2} \\ F_{4}=S\left({p}_{1}+{p}_{2}-1\right) {A}_{2}-V \varepsilon A_{2}-\frac{S \beta_{1}\left({p}_{1}+{p}_{2}-1\right)+{V} \beta_{1} \varepsilon}{N} \\ F_{5}=S\left({p}_{1}+{p}_{2}-1\right) {A}_{2}-V \varepsilon A_{2}-\frac{S \beta_{2}\left({p}_{1}+{p}_{2}-1\right)+{V} \beta_{2} \varepsilon}{N} \\ F_{6}=S\left({p}_{1}+{p}_{2}-1\right) {A}_{2}-V \varepsilon A_{2}-\frac{S \beta_{3}\left({p}_{1}+{p}_{2}-1\right)+{V} \beta_{3} \varepsilon}{N} \end{array}\notag $$

and

$$ \begin{array}{l} W_{1}=-S p_{1} A_{2} \\ W_{2}=-S p_{2} A_{2} \\ A_{1}=\frac{I \beta_{1}+J \beta_{2}+L \beta_{3}}{N} \\ A_{2}=\frac{I \beta_{1}+J \beta_{2}+L \beta_{3}}{N^{2}} \\ N=V+S+E+I+L+J+R \end{array}\notag $$
$${\begin{aligned} \begin{array}{l} \mathcal{Q}=\left.\frac{\partial Q(V, S, E, I, J, L, R)}{\partial(V, S, E, I, J, L, R)}\right|_{P_{0}} \\ \,=\,\left[\!\!\!\!\begin{array}{ccccccc} Q_{1} & -\phi-M & -M & \frac{V \beta_{1} \varepsilon}{N}-M & \frac{V \beta_{2} \varepsilon}{N}-M & \frac{V \beta_{3} \varepsilon}{N}-M & -M \\ -\psi-S A_{2} & Q_{2} & -S A_{2} & \frac{S \beta_{1}}{N}-S A_{2} & \frac{S \beta_{2}}{N}-S A_{2} & \frac{S \beta_{3}}{N}-S A_{2} & Q_{8} \\ 0 & 0 & Q_{3} & 0 & \-\rho & 0 & -\gamma q_{2} \\ 0 & 0 & h k(r-1) & Q_{5} & -\theta & -\alpha & -\gamma q_{1} \\ 0 & 0 & Q_{4} & 0 & Q_{6} & 0 & 0 \\ 0 & 0 & 0 & -\delta & 0 & Q_{7} & 0 \\ 0 & 0 & 0 & -g-\omega & 0 & 0 & \gamma+\mu \end{array}\right] \end{array} \end{aligned}} $$

where

$$ \begin{array}{l} M=V \varepsilon A_{2} \\ Q_{1}=\mu+\psi+\varepsilon A_{1}-M \\ Q_{2}=\mu+\phi+A_{1}-S A_{2} \\ Q_{3}=\mu-k(r-1) \\ Q_{4}=-k(h-1)(r-1) \\ Q_{5}=d_{1}+\delta+g+\mu \\ Q_{6}=d_{2}+\mu+\rho+\theta \\ Q_{7}=\alpha+d_{3}+\mu+\omega \\ Q_{8}=\gamma\left(q_{1}+q_{2}-1\right)-S A_{2} \end{array}\notag $$

then, one can obtain \(G=\mathcal {F}\mathcal {Q}^{-1}\). The basic reproduction number \(R_{0}=\max \left \{\lambda _{\mathcal {R}_{0}}\right \}\), where \(\lambda _{\mathcal {R}_{0}}\) are the eigenvalues of G. Because \({\mathcal {R}_{0}}\) is complex, we don’t give the detailed expression, and compute it via Matlab (the Mathworks, Inc.).