Introduction

Tuberculosis (TB), a chronic infectious disease caused by Mycobacterium tuberculosis (M.tb), is a serious threat to human health and a major public health problem in developing countries (Zumla et al. 2013). The World Health Organization (WHO) reported 10 million new cases of TB in 2020; China has the second highest TB burden in the world, accounting for 9% of the world’s TB cases (Chakaya et al. 2021). The Coronavirus Disease 2019 (COVID-19) pandemic has placed extraordinary demands on health systems and has diminished efforts to diagnose and treat people with tuberculosis (Kuehn 2020). Although an estimated one-third of the world’s population is infected with M.tb, only about 10% of infected individuals eventually develop active TB; the other 90% may never develop symptomatic disease (Tang et al. 2011; Moonan et al. 2020). This suggests that a large proportion of infected individuals may be naturally resistant to TB disease and that host genetic susceptibility plays a non-negligible role in the development of TB. However, the contribution of specific genetic factors to the development of tuberculosis remains unclear. Further research into host genetic factors associated with M.tb infection could therefore uncover new molecular markers and provide potential targets for tuberculosis prevention and treatment.

Several lines of evidence support the influence of host genetics on TB susceptibility, including case observations, twin studies, candidate gene association studies and genome-wide association studies (Thye et al. 2010; Vannberg et al. 2011). Moreover, in recent years many studies have reported associations between host genetic polymorphisms and TB susceptibility (Selvaraj et al. 2015; Yuan et al. 2016). The genes involved mainly participate in the body’s immune response to M.tb, and their polymorphisms may increase genetic susceptibility to TB. Therefore, determining the relationships between genetic variation and susceptibility to TB is essential for a comprehensive understanding of the pathogenesis of mycobacterial infection (Zhang et al. 2020).

TB is considered an immune-related disease. Its incidence and development are associated with host immune status, which is largely determined by host immune regulation-related genes (Verway et al. 2013). The transporter associated with antigen processing (TAP) transfers endogenous antigenic peptides from the cytoplasm to the endoplasmic reticulum (ER) for processing and presentation to immune cells (Harriff et al. 2013). The TAP complex is a heterodimer consisting of two subunits encoded by the TAP1 and TAP2 genes, which are located within the HLA class II region between the HLA-DP and -DQ loci on the short arm of human chromosome 6 (6p21.3) (Ramos et al. 2009). Single-nucleotide polymorphisms (SNPs) in the TAP1 and TAP2 genes may result in structural changes that prevent TAP heterodimer formation and alter antigen recognition and presentation, as evidenced in several experimental models, human case studies and malignancies (Zhang et al. 2002; Tang et al. 2001). These polymorphisms can serve as markers for human disease. Several studies have supported the association of TAP genes with various multifactorial diseases, such as immune-mediated, viral and bacterial diseases and various types of cancer (Ozbas-Gerceker et al. 2013; Abitew et al. 2020; Qian et al. 2017; Thu et al. 2016). TAP polymorphisms and TB have also been studied in various populations, including Iranian, Korean and Chinese populations (Naderi et al. 2016; Roh et al. 2015; Zhang et al. 2021). However, the conclusions are not always consistent. Notably, research on the relationship between TAP gene polymorphisms and TB in the Han Chinese population is limited.

The risk of M.tb infection is much higher in household contacts of tuberculosis patients, ranging from 30 to 80% (Marks et al. 2000). In the present study, we used a family-based case–control design to avoid the false-positive results that population stratification can produce in traditional case–control studies. The aim of this study was to investigate the associations between polymorphisms of the TAP genes and susceptibility to TB in the Han population of Guangdong province. We also examined whether haplotypes covering these SNPs were linked to the development of TB, with the goal of helping to elucidate the immunopathological mechanisms of tuberculosis and providing a scientific basis for its prevention and treatment.

Materials and methods

Ethics statement

All participants took part voluntarily and signed informed consent before the epidemiological investigation and the collection of peripheral blood samples. All experiments were performed in accordance with relevant guidelines and regulations and were approved by the Ethics Committee of the First Affiliated Hospital of Guangdong Pharmaceutical University.

Subjects

Subjects were recruited from January 2017 to January 2020 among TB-prone families and healthy control families in the Han population in Guangdong province. A total of 413 people were enrolled in the study, including 133 TB patients, 107 genetically related healthy household contacts (HHCs), and 173 healthy controls (HCs).

Familial clustering (families with two or more patients) from 2017 to 2020 was documented and verified using one question in the Guangdong Province TB Drug Resistance Surveillance Questionnaire: ‘Does any resident of your household suffer from tuberculosis?’ We then identified the probands (the first patient in each family) and their blood relatives and recruited them into the TB-prone family group.

The TB diagnosis was based on the Guidelines for The Implementation of Tuberculosis Control Program in China: (1) clinical symptoms suggestive of TB; (2) evidence of active TB in the upper lobes of the chest radiograph; (3) sputum smears positive with or without chest radiographic evidence of active TB; (4) excluding the individuals with diseases such as lung carcinoma, pneumonia, diabetes, HIV/AIDS and those under immunosuppressive treatment.

The healthy control families were drawn from the general population, had the same socioeconomic status, family structure and ethnic background as the TB-prone families, and had no blood ties to them. Definitions: (1) HHC (healthy household contacts): individuals who have a kinship with the proband and reside in the same house. (2) HC (healthy controls): individuals without exposure to M.tb. Because age is a potential confounding variable, the proband of each control family was matched to the proband of a TB-prone family, with an age difference of no more than five years as the matching criterion. Other demographic characteristics were not matched, to avoid information loss due to over-matching. All members of the control families had no kinship with members of the TB-prone families and did not suffer from tuberculosis, HIV infection or diabetes.

Measurement of risk factors

There are numerous potential environmental risk factors for tuberculosis. The most studied include smoking history, alcohol consumption, Bacillus Calmette-Guérin (BCG) vaccination, crowded housing and exposure to TB patients. BMI, fitness activities, socioeconomic conditions, indoor humidity, lighting and hygiene may also influence the development of TB, but studies of these factors are relatively limited. Drawing on the literature, this study comprehensively examined the environmental factors that may influence the development of tuberculosis and then analyzed the association between the TAP genes and tuberculosis while controlling for these factors. Survey data were collected using epidemiological questionnaires and face-to-face field interviews. The questionnaire covered general demographic characteristics, such as gender, age, occupation, education level and BCG vaccination history; personal behavior, such as smoking status, alcohol consumption and fitness activities; and living environment factors, such as lighting, humidity, ventilation and bedroom crowding. The indicators were defined as follows. Smoking was defined as smoking at least one cigarette a day for 3 months, and drinking as consuming alcohol at least three times a week for 1 year. Fitness activity was defined as exercising at least three times a week, for more than 30 min each time, for more than half a year. As a measure of nutritional status, a BMI of less than 18.5 was defined as underweight (malnutrition) according to the WHO.

DNA extraction and genotyping

Approximately 5.0 mL of peripheral blood was collected from each participant into a tube containing EDTA (ethylenediaminetetraacetic acid) and then stored in a low-temperature freezer at −80 ℃. Genomic DNA was isolated from peripheral blood cells using the E.Z.N.A. SE Blood DNA Kit D3471-02 (Omega Bio-Tek) according to the manufacturer’s protocol. DNA concentrations were determined using the NanoDrop-2000 spectrophotometer (Thermo Scientific).

The SNPs rs1057141 and rs1135216 of the TAP1 gene and rs241447 and rs3819721 of the TAP2 gene were genotyped by polymerase chain reaction-restriction fragment length polymorphism (PCR–RFLP). Each 5 µL PCR sample contained 0.25 µL TaqMan probe (Thermo Fisher Scientific), 2.5 µL Mix (Takara Biotechnology), 0.5 µL DNA, and 1.75 µL ultra-pure water. The mixture was added to a 384-well plate and placed in a PCR apparatus (ABI Prism® 7900HT) for automatic genotyping. The PCR cycling parameters were as follows: pre-denaturation at 95 ℃ for 30 s, followed by 40 cycles of denaturation at 95 ℃ for 5 s and annealing/extension at 60 ℃ for 30 s.

A blank control and 5% duplicate samples were included on each reaction plate, and the genotype concordance of the duplicate samples was 100%.

Statistical analysis

SPSS 26.0 software was used for statistical analysis of the data. Continuous data are reported as mean ± standard deviation, and categorical variables as number and proportion. Independent-samples t tests were used for continuous data and Chi-square tests for categorical data. Hardy–Weinberg equilibrium (HWE) was tested for each SNP using Chi-square tests. The associations between genotypes and TB incidence were assessed by computing odds ratios (ORs) and 95% confidence intervals (95% CIs) from non-conditional logistic regression analyses. Linkage disequilibrium (LD) and haplotype analyses for the four SNPs were performed with online software (http://analysis.bio-x.cn/myAnalysis.php) (Li et al. 2009), and haplotypes with frequencies < 0.03 were disregarded during analysis. The association of genetic risk scores (GRS) for the four loci of the TAP genes with the development of TB was analyzed by constructing simple and weighted genetic risk scores. A P value of less than 0.05 was considered statistically significant.
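The HWE check described above can be illustrated with a minimal sketch. It assumes a biallelic SNP, a Pearson Chi-square statistic with one degree of freedom, and genotype counts supplied by the user; the function name `hwe_chi_square` and the counts in the usage note are hypothetical and are not taken from the study data or from SPSS.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Return (chi-square statistic, P value) for a biallelic SNP.

    n_aa, n_ab, n_bb are observed genotype counts (hypothetical inputs).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    # Expected counts under Hardy-Weinberg proportions p^2, 2pq, q^2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `hwe_chi_square(25, 50, 25)` describes a sample that sits exactly at Hardy–Weinberg proportions, so the statistic is zero and the P value is 1; a sample with no heterozygotes, such as `hwe_chi_square(50, 0, 50)`, deviates strongly.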

Results

Demographic characteristics

The demographic characteristics of study participants are shown in Table 1. A total of 170 families were enrolled in this study, 85 tuberculosis-prone families and 85 healthy control families, which included 133 TB patients, and their 107 household contacts and 173 healthy controls, with an average age of 43.65 ± 16.06, 34.11 ± 19.71, and 41.90 ± 14.75 years, respectively.

Table 1 Demographic characteristics of TB (patients), HHC (healthy household contacts) and HC (healthy controls)

The distributions of age, gender and occupation differed significantly between TB and HHC (P < 0.05). The distributions of gender, level of education and occupation differed significantly between TB and HC (P < 0.05), whereas the distribution of level of education between TB and HHC and the distribution of age between TB and HC showed no significant difference.

Environmental risk factors

Environmental risk factors’ distribution between TB and HHC

As shown in Table 2, the proportions of smokers (50.4%), non-exercisers (68.4%) and individuals with a history of TB exposure (90.2%) were higher among TB patients than among HHCs (15.0, 52.3 and 79.4%, respectively), and the differences were statistically significant (P < 0.05). However, no significant difference was detected in BMI, medical insurance, monthly household income or BCG vaccination history between the two groups. Living environment factors were not compared, because TB patients and HHCs share the same household and living environment.

Table 2 Comparison of environmental factors between TB (patients) and HHC (healthy household contacts)

Environmental risk factors’ distribution between TB and HC

The comparison of environmental factors between TB and HC is summarized in Table 3. The HC group had higher proportions of individuals with a BMI > 18.5 (91.9%), high monthly household income (43.9%), fitness activities (68.2%), urban residence (72.8%), good indoor sanitation (85.0%), good indoor lighting (79.2%) and a history of BCG vaccination (64.2%) than the TB group (P < 0.05). Conversely, the proportions of smokers (50.4%), individuals in crowded housing (80.5%), individuals living in damp indoor conditions (74.4%) and individuals with a history of TB exposure (90.2%) were higher among TB patients than among HCs, and the differences were statistically significant (P < 0.05).

Table 3 Comparison of environmental factors between TB (patients) and HC (healthy controls)

Multivariate logistic regression analysis

Table 4 shows the results of multivariate logistic regression analysis for TB vs HHC and TB vs HC. In the comparison of TB with HHC, smokers had 3.61 times the odds of tuberculosis of non-smokers (P = 0.003, OR = 3.61, 95% CI 1.54–8.45). A history of TB exposure was also a significant risk factor (P = 0.014, OR = 2.99, 95% CI 1.25–7.16), whereas fitness activities were associated with a reduced risk of tuberculosis (P = 0.038, OR = 0.58, 95% CI 0.28–0.96).

Table 4 Multivariate logistic regression analysis of risk factors regarding TB vs HHC and TB vs HC

Multivariate analysis of TB vs HC showed that a BMI higher than 18.5 (P = 0.032, OR = 0.30, 95% CI 0.10–0.90), fitness activities (P = 0.019, OR = 0.41, 95% CI 0.19–0.86) and living in towns (P = 0.016, OR = 0.38, 95% CI 0.18–0.84) were protective factors for tuberculosis, whereas a history of smoking (P = 0.043, OR = 2.72, 95% CI 1.03–7.15), bedroom crowding (P = 0.002, OR = 3.56, 95% CI 1.61–7.89), indoor dampness (P = 0.026, OR = 2.75, 95% CI 1.13–6.70) and a history of TB exposure (P < 0.001, OR = 12.93, 95% CI 5.29–31.61) increased the risk of tuberculosis.

Association between TAP gene polymorphisms and susceptibility to tuberculosis

Hardy–Weinberg equilibrium test

The genotype distributions of TAP1 rs1135216 and rs1057141, as well as TAP2 rs241447 and rs3819721 were found to be in Hardy–Weinberg equilibrium in TB, HHC and HC (P > 0.05). The results of the HWE check and MAF are shown in Supplementary Table 5.

Four genetic patterns of TAP1 and TAP2 genes

As shown in Table 5, in TB vs HHC a significant association was observed between TAP1 rs1135216 and susceptibility to tuberculosis under four genetic models: co-dominant (CT vs TT: OR = 2.56, 95% CI 1.31–4.99; CC vs TT: OR = 6.73, 95% CI 1.33–34.02), dominant (CT–CC vs TT: OR = 2.92, 95% CI 1.55–5.53), recessive (CC vs TT–CT: OR = 5.14, 95% CI 1.04–25.53) and over-dominant (CT vs TT–CC: OR = 2.28, 95% CI 1.18–4.39). However, no significant association was found for this locus when comparing TB and HC (P > 0.05). For TAP2 rs3819721, the AG genotype in the co-dominant model (AG vs GG: OR = 2.28, 95% CI 1.22–4.27), the GG genotype in the dominant model (GG vs AG–AA: OR = 2.32, 95% CI 1.27–4.24) and the AG genotype in the over-dominant model (AG vs GG–AA: OR = 2.04, 95% CI 1.11–3.76) were found to be associated with TB when compared with HHC.

Table 5 Genotype and allele distribution of TAP1 rs1135216 and rs1057141 in TB, HHC and HC

Different gene models of TAP1 rs1057141 and TAP2 rs241447 were not associated with tuberculosis susceptibility either in TB vs HHC (P > 0.05) or in TB vs HC (P > 0.05).
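The genetic models compared above differ only in how a genotype is coded before it enters the regression. The following sketch shows one common coding scheme, assuming two-letter genotype strings and a designated risk allele; the function name `model_codings` is hypothetical and the coding is illustrative, not a description of the software settings used in this study.

```python
def model_codings(genotype: str, risk_allele: str) -> dict[str, int]:
    """Code one genotype (e.g. "CT") under four inheritance models.

    The number of risk-allele copies k is 0, 1 or 2.
    """
    k = genotype.count(risk_allele)
    return {
        # Co-dominant: categorical level, reference = 0 copies
        # (gives separate heterozygote and homozygote comparisons)
        "codominant": k,
        # Dominant: carriers of 1 or 2 copies vs non-carriers
        "dominant": int(k >= 1),
        # Recessive: 2 copies vs 0 or 1 copy
        "recessive": int(k == 2),
        # Over-dominant: heterozygotes vs both homozygotes
        "overdominant": int(k == 1),
    }
```

With C as the risk allele of rs1135216, a CT individual is coded 1 under the dominant and over-dominant models and 0 under the recessive model, matching the CT–CC vs TT, CT vs TT–CC and CC vs TT–CT contrasts.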

The C allele of TAP1 rs1135216 was positively associated with the risk of tuberculosis in TB vs HHC (P < 0.001, OR = 2.28, 95% CI 1.41–3.70) as well as in TB vs HC (P = 0.001, OR = 2.03, 95% CI 1.35–3.06). The A allele of TAP2 rs3819721 also increased susceptibility to tuberculosis. No significant difference was found in the allele distributions of TAP1 rs1057141 and TAP2 rs241447 in TB vs HHC (P > 0.05) or in TB vs HC (P > 0.05).
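An allele-level odds ratio of this kind can be sketched from a 2 × 2 table of allele counts. The example below computes an unadjusted OR with a Wald (Woolf) 95% confidence interval; this is an assumption made for illustration, since the study itself reports ORs from logistic regression, and the function name `odds_ratio_ci` and any counts passed to it are hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald confidence interval.

    a, b: risk-allele and other-allele counts in cases
    c, d: risk-allele and other-allele counts in controls
    All four counts must be non-zero. z = 1.96 gives a 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) by Woolf's method
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that excludes 1 corresponds to P < 0.05 for the two-sided Wald test. Because the interval is symmetric on the log scale, the product of its limits equals the square of the OR, which is a quick sanity check on reported values.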

Haplotype analysis of TAP1 and TAP2 genes

Haplotype refers to the linear arrangement of a group of related multiple SNP sites on a chromosome or in a certain region (Jakobsson et al. 2008). Studies have shown that haplotypes as genetic markers have higher polymorphism and statistical efficiency than single SNPs (Davidson 2000).

As shown in Table 6, linkage disequilibrium (LD) analysis indicated that the polymorphisms rs1135216 and rs1057141 were not independent of one another in TB vs HHC (D′ = 0.55, r² = 0.19) or in TB vs HC (D′ = 0.61, r² = 0.27), whereas the LD between rs241447 and rs3819721 was low.
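For reference, D′ and r² for a pair of biallelic loci can be computed from the four haplotype frequencies. The sketch below assumes that phased haplotype frequencies are already available; the online software used in this study estimates them from unphased genotypes, which is not reproduced here. The function name `ld_measures` and its argument names are hypothetical.

```python
def ld_measures(f11: float, f12: float, f21: float, f22: float) -> tuple[float, float]:
    """Return (D', r^2) from the four haplotype frequencies of two loci.

    f11 is the frequency of the haplotype carrying allele 1 at both loci,
    f12 allele 1 at locus 1 and allele 2 at locus 2, and so on.
    """
    if abs(f11 + f12 + f21 + f22 - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must sum to 1")
    p1 = f11 + f12           # allele 1 frequency at locus 1
    q1 = f11 + f21           # allele 1 frequency at locus 2
    d = f11 - p1 * q1        # raw disequilibrium coefficient D
    # Largest |D| attainable given the allele frequencies
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p1 * (1 - p1) * q1 * (1 - q1)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2
```

With illustrative frequencies of 0.4, 0.1, 0.1 and 0.4, the sketch gives D′ = 0.6 and r² = 0.36, which shows how D′ can be moderately high while r² stays much lower.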

Table 6 Analysis of association between haplotype and pulmonary tuberculosis susceptibility

Four haplotypes were formed by TAP1 rs1135216 and rs1057141: TT, TC, CC and CT. With the TT haplotype as reference, the CT haplotype was associated with an increased risk of tuberculosis in TB vs HC (P = 0.020, OR = 11.34, 95% CI 1.49–86.56) and in TB vs HHC (P = 0.018, OR = 7.45, 95% CI 1.43–38.76). No other TAP1 haplotype was significantly associated with susceptibility to tuberculosis in either TB vs HC or TB vs HHC.

The two TAP2 loci, rs241447 and rs3819721, form four haplotypes: TG, CG, TA and CA. Compared with the TG haplotype, the TA haplotype was associated with 2.20 times the odds of tuberculosis in TB vs HHC (P = 0.034, OR = 2.20, 95% CI 1.07–4.56).

Genetic risk score of the TAP gene

Individual genetic risk scores ranged from 0 to 6 and were grouped as 0–2, 3–4 and 5–6 (Table 7). In TB vs HHC, individuals with a GRS of 5–6 had 4.40 times the odds of TB of those with a GRS of 0–2 (P = 0.038, OR = 4.40, 95% CI 1.09–17.76), after adjusting for smoking history, fitness activities and history of TB exposure. Similarly, in TB vs HC, individuals with a GRS of 3–4 had a higher risk of TB than those with a GRS of 0–2 (P = 0.047, OR = 2.13, 95% CI 1.01–4.47). The trend test showed that the risk of tuberculosis increased with increasing GRS in TB vs HHC (Ptrend = 0.010) and in TB vs HC (Ptrend = 0.001).
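A simple GRS counts risk alleles across loci, and a weighted GRS multiplies each count by a per-allele weight, commonly ln(OR). The sketch below illustrates both, together with the 0–2, 3–4 and 5–6 grouping. The risk alleles for rs1135216 (C) and rs3819721 (A) follow the text; those for rs1057141 and rs241447 are placeholders assumed for illustration, as are all function names and any weights supplied by the caller.

```python
RISK_ALLELES = {
    "rs1135216": "C",  # risk allele reported in the text
    "rs3819721": "A",  # risk allele reported in the text
    "rs1057141": "T",  # placeholder: assumed for illustration only
    "rs241447": "T",   # placeholder: assumed for illustration only
}


def simple_grs(genotypes: dict[str, str],
               risk_alleles: dict[str, str] = RISK_ALLELES) -> int:
    """Count risk alleles over all loci (0, 1 or 2 per locus)."""
    return sum(genotypes[snp].count(allele)
               for snp, allele in risk_alleles.items())


def weighted_grs(genotypes: dict[str, str], weights: dict[str, float],
                 risk_alleles: dict[str, str] = RISK_ALLELES) -> float:
    """Sum of risk-allele counts times per-allele weights, e.g. ln(OR)."""
    return sum(weights[snp] * genotypes[snp].count(allele)
               for snp, allele in risk_alleles.items())


def grs_group(score: int) -> str:
    """Assign a simple GRS to the three score groups used in Table 7."""
    if 0 <= score <= 2:
        return "0-2"
    if 3 <= score <= 4:
        return "3-4"
    if 5 <= score <= 6:
        return "5-6"
    # Scores above 6 were not observed in the study sample
    raise ValueError("score outside the observed range 0-6")
```

Under these assumed risk alleles, a hypothetical individual with genotypes CT, TT, TC and AG at rs1135216, rs1057141, rs241447 and rs3819721 carries 1 + 2 + 1 + 1 = 5 risk alleles and falls in the 5–6 group.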

Table 7 The relationship between the GRS of TAP gene and tuberculosis susceptibility

Discussion

In addition to socioeconomic factors such as living conditions and poverty, host genetic susceptibility may be another important risk factor for the development of tuberculosis. In this family-based case–control study, SNPs at rs1135216 of the TAP1 gene and rs3819721 of the TAP2 gene were significantly associated with an increased risk of TB in the Chinese Han population. The haplotypes rs1135216*C/rs1057141*T and rs241447*T/rs3819721*A were associated with high susceptibility to TB. Moreover, the risk of developing tuberculosis increased with the number of risk alleles. These findings are important for the timely identification of people at high risk of developing TB and for elucidating the molecular mechanisms of TB development.

In the current study, the smoking rate among TB patients was relatively high (50.4%), and smokers had roughly three times the risk of developing TB compared with non-smokers. Smoking was thus a significant risk factor for developing TB, consistent with other studies (Velen et al. 2020; Htet et al. 2018). Regular fitness activity and a BMI greater than 18.5 were protective factors against tuberculosis; regular exercise may improve the body’s immunity. Previous work (Lönnroth et al. 2010) has shown that BMI can affect the development of tuberculosis. Moreover, multivariate analysis indicated that living in rural areas, crowded bedrooms and damp indoor conditions may be risk factors for tuberculosis. This may be because rural areas are less hygienic than urban areas and M.tb survives more easily in moist environments. Individuals with a history of TB exposure were also at increased risk of developing TB. Published literature (Horsburgh and Rubin 2011; Acuña-Villaorduña et al. 2018) has demonstrated that close contact with infectious tuberculosis cases is the strongest predictor of M.tb infection, because closer contact leads to more constant exposure to infectious M.tb and, subsequently, to a greater risk of progression to tuberculosis, which is in accordance with the findings of this research.

The present study examined four polymorphisms in the coding regions of the TAP1 and TAP2 genes, all of which were observed in the enrolled subjects. A significant association of TAP1 rs1135216 and TAP2 rs3819721 with tuberculosis was observed when TB patients were compared with healthy household contacts. The TAP1 rs1135216 CC homozygous and CT heterozygous genotypes were strongly associated with tuberculosis (OR = 6.73 and 2.56, respectively). Similarly, Wang et al. (2012) showed that rs1135216 was significantly related to tuberculosis susceptibility. However, a study carried out in a Northwestern Colombian population failed to detect any association of the TAP1 gene with TB disease (Gomez et al. 2006). In addition, the present study showed that the AG genotype of TAP2 rs3819721 was more common in tuberculosis patients than in healthy household contacts, suggesting that the AG genotype significantly increases the risk of tuberculosis. In contrast, no association was found between the risk of tuberculosis and the polymorphisms at TAP1 rs1057141 and TAP2 rs241447. This is not fully concordant with the studies by Naderi et al. (2016) and Zhang et al. (2021), which found that polymorphisms at the TAP2 rs241447 locus can reduce the risk of tuberculosis. The discrepancy may be due to differences in the ethnicity and region of the subjects.

Haplotype analysis combines multiple loci to discover genetic predictors of disease risk (Gandhi et al. 2020). In this study, the rs1057141 and rs241447 polymorphisms were not individually associated with TB susceptibility. However, when rs1135216 and rs1057141, as well as rs241447 and rs3819721, were combined into haplotypes, rs1135216*C/rs1057141*T and rs241447*T/rs3819721*A were significantly associated with increased tuberculosis risk, with ORs of 11.34 (95% CI 1.49–86.56) and 2.20 (95% CI 1.07–4.56), respectively. The CT and TA haplotypes are therefore genetic markers that may predict tuberculosis risk. Consequently, it may be more informative to assess the association between polymorphism and phenotype at the level of haplotypes than at the level of alleles or genotypes alone. These results suggest that combining multiple genetic variants is a useful tool that provides deeper insights and additional clues for assessing disease risk.

Complex diseases are influenced by a large number of genetic variants, and genetic risk scores combine the weak effects of multiple variants to improve the prediction of disease (Li et al. 2020). Compared with those carrying 0–2 risk alleles, individuals carrying 3–4 risk alleles (OR = 2.13, TB vs HC) and those carrying 5–6 risk alleles (OR = 4.40, TB vs HHC) had a higher risk of tuberculosis, and disease risk increased with higher scores. The four genetic polymorphisms of the TAP1 and TAP2 genes thus appear to have a cumulative effect in the pathogenesis of tuberculosis.

The current study has some limitations that should be addressed. First, the number of families was relatively small. Second, the age matching in the study may have introduced a degree of selection bias. Third, this study focused on Han Chinese subjects in Guangdong, China, and the results may not be applicable to other populations. In the future, studies with larger samples from multiple populations are needed to accurately evaluate the role of TAP gene polymorphisms in the pathogenesis of tuberculosis.

In summary, our research identified that the SNPs of rs1135216 in the TAP1 gene and rs3819721 in the TAP2 gene were associated with TB susceptibility among the Chinese Han tuberculosis-prone families, and the risk of developing tuberculosis increases with the number of risk alleles. These findings add to our understanding of the possible impact of TAP gene variants on the development of tuberculosis.