Background

After two decades of neglect, tuberculosis (TB) has re-emerged as a major public health problem, especially in low- and middle-income countries [1]. Nearly one third of the world's population is latently infected with Mycobacterium tuberculosis (MTB) [2]. However, only 10% of those infected will develop active TB during their lifetimes [3-5]. Although previous studies have indicated that susceptibility has a substantial genetic component [6-8], progress in identifying the genetic variants that contribute to TB has been slow. With the completion of the Human Genome Project and advances in genotyping technology, the genome-wide association (GWA) study has become a powerful tool for studying genetic susceptibility to complex human diseases [9]. Despite the widely held view that exposure to pathogens during human evolution has exerted selective pressure on host susceptibility, progress in identifying susceptibility genes for infectious diseases has been slow compared with other common disorders [10]. Recently, a GWA study in Ghana and Gambia identified a susceptibility locus, rs4331426 on chromosome 18q11.2, associated with the risk of TB [odds ratio (OR) = 1.19, 95% confidence interval (CI): 1.13-1.27, P = 6.8 × 10^-9] [11]. However, this finding has not yet been replicated in other populations.

China, which has the world's second largest TB epidemic, differs from Africa in genetic background, lifestyle, and disease prevalence [12]. An estimated 1.3 million people across the country developed active TB in 2009, of whom 600 000 had the highly infectious form. To validate the findings of the GWA study in Ghana and Gambia and to search for additional TB susceptibility loci, we performed a case-control study with fine mapping of the 18q11.2 region in the Chinese population.

Methods

Study population

This case-control study was conducted in Jiangsu, a developed province in eastern China with a total population of 77 million in 2009. We recruited 578 patients with pulmonary TB, including 368 (63.7%) new cases and 210 (36.3%) previously treated cases. New and previously treated cases were defined according to the WHO guidelines. In brief, a "new case" was defined as a newly registered episode of TB in a patient who, in response to direct questioning, denied having had any prior antituberculosis treatment (for up to one month) and for whom, in study sites where adequate documentation was available, there was no evidence of such a history. A "previously treated case" was defined as a newly registered episode of TB in a patient who, in response to direct questioning, admitted having been treated for TB for one month or more, or for whom, in study sites where adequate documentation was available, there was evidence of such a history. All diagnoses were confirmed by sputum culture. Sputum samples were cultured on Löwenstein-Jensen (LJ) medium. MTB was identified by using the p-nitrobenzoic acid (PNB) and thiophene-2-carboxylic acid hydrazide (TCH) resistance tests. Growth on LJ medium containing PNB indicated that the bacilli did not belong to the MTB complex. We also recruited 756 controls from a pool of individuals who participated in the local community-based health examination program. Controls were frequency-matched to the cases by sex and age. Control subjects had no self-reported history of TB, diabetes, or malignancy. None of the cases or controls had a history of HIV positivity. Each subject was interviewed individually at a local health facility by using a structured questionnaire and donated a blood sample for genotyping.

SNP selection and genotyping

The SNP rs4331426, which reached significance in the GWA study in Ghana and Gambia, is located on chromosome 18q11.2 [11]. Because the minor allele frequency (MAF) of this SNP is low (< 5%) among Han Chinese, we searched for tag SNPs around it. First, we downloaded all SNPs within 100 kbp upstream and downstream of rs4331426 on chromosome 18q11.2 from the Han Chinese in Beijing (CHB) panel of the HapMap database (http://www.hapmap.org/). SNPs in this region were retained if they met the following criteria: (1) MAF ≥ 0.05; (2) Hardy-Weinberg equilibrium test P value ≥ 0.05. Tag SNPs were then selected with Haploview 4.2 software on the basis of their ability to tag surrounding variants [13]. As a result, 7 SNPs (rs4330012, rs8087945, rs12456774, rs12457731, rs12958098, rs4800136 and rs4800417) were chosen for genotyping, which was performed with TaqMan allelic discrimination technology on the ABI 7900 Real-Time PCR System (Applied Biosystems, Foster City, CA) [14]. The primers and probes for each SNP (Table 1) were designed by Nanjing Steed BioTechnologies Co., Ltd. Owing to a technical limitation, we were unable to design probes for rs4330012, because another SNP, rs9954441, lies adjacent to it. Preliminary experiments were carried out for each SNP, and blank controls were included in each batch of samples. Both the laboratory personnel and the readers of the genotyping results were blinded to case-control status. The overall genotyping call rate was > 95%.
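The pre-filtering step that precedes tag SNP selection can be sketched in a few lines of Python. This is only an illustration of the window, MAF, and Hardy-Weinberg criteria stated above; the record layout, the SNP identifiers, the coordinates, and the function name `eligible_snps` are all hypothetical, and the tag SNP selection itself was done in Haploview, which this sketch does not reproduce.

```python
from dataclasses import dataclass

WINDOW_BP = 100_000  # 100 kbp on either side of the index SNP
MIN_MAF = 0.05       # minimum minor allele frequency
MIN_HWE_P = 0.05     # minimum Hardy-Weinberg equilibrium test P value


@dataclass(frozen=True)
class SnpRecord:
    snp_id: str
    position: int  # base-pair coordinate (hypothetical values below)
    maf: float     # minor allele frequency in the reference panel
    hwe_p: float   # Hardy-Weinberg equilibrium test P value


def eligible_snps(records, index_position):
    """Keep SNPs inside the window that pass the MAF and HWE thresholds."""
    return [
        r for r in records
        if abs(r.position - index_position) <= WINDOW_BP
        and r.maf >= MIN_MAF
        and r.hwe_p >= MIN_HWE_P
    ]


# Hypothetical records for illustration only; not HapMap data.
index_position = 1_000_000
records = [
    SnpRecord("snp_a", 950_000, 0.22, 0.57),    # passes all criteria
    SnpRecord("snp_b", 1_080_000, 0.03, 0.80),  # MAF below 0.05
    SnpRecord("snp_c", 1_020_000, 0.30, 0.01),  # fails the HWE criterion
    SnpRecord("snp_d", 1_250_000, 0.40, 0.90),  # outside the 100 kbp window
]
kept = eligible_snps(records, index_position)
print([r.snp_id for r in kept])
```

Only the SNPs that survive this filter would be passed on to a tagging algorithm.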

Table 1 Information of primers and probes

Statistical analysis

Data were double-entered with EpiData 3.1 (Denmark), and discrepancies were checked against the raw data. Continuous variables were described as mean ± SD, and differences between groups were analyzed with Student's t-test. Categorical variables were described as percentages and analyzed with the chi-square test. An unconditional logistic regression model was used to calculate odds ratios (ORs) and 95% confidence intervals (CIs), as well as the corresponding P-values. Hardy-Weinberg equilibrium was tested among controls with the χ² goodness-of-fit test. Haplotype blocks were selected with Haploview software on the basis of linkage disequilibrium (LD). Haplotype frequencies were estimated using PHASE 2.1 software. All analyses were performed using SPSS software (SPSS Inc., USA). All reported P-values were two-sided, and values less than 0.05 were considered statistically significant.
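Two of the calculations named above can be written out compactly. The following is a minimal sketch using only the Python standard library, not the SPSS procedures used in the study: a χ² goodness-of-fit test for Hardy-Weinberg equilibrium from genotype counts, and an unadjusted odds ratio with a Wald 95% CI from a 2×2 table (which is what a logistic regression with a single binary predictor would give; the covariate-adjusted ORs require a full regression fit and are not reproduced here). The function names and the example counts are hypothetical.

```python
import math


def chi2_p_value_1df(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes at a biallelic SNP
    and returns (chi2, p_value), using 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n <= 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if not 0.0 < p < 1.0:
        raise ValueError("SNP is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_p_value_1df(chi2)


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted OR with a Wald 95% CI from a 2x2 table.

    a, b: cases with / without the exposure (or haplotype)
    c, d: controls with / without the exposure (or haplotype)
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not data from this study.
chi2, p_value = hwe_chi2(30, 40, 30)
print(f"HWE: chi2 = {chi2:.3f}, P = {p_value:.3f}")
or_, lo, hi = odds_ratio_wald(20, 80, 10, 90)
print(f"OR = {or_:.2f} (95% CI: {lo:.2f}-{hi:.2f})")
```

As a consistency check on the 1-degree-of-freedom convention, `chi2_p_value_1df(0.330)` evaluates to approximately 0.566, which agrees with the P value reported for rs8087945 in the Results.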

Ethical consideration

This project was approved by the Institutional Review Board of Nanjing Medical University. Written informed consent was obtained from all participants. Ethical standards were observed throughout the study period.

Results

Overall, this study included 578 cases (72.1% male, 27.9% female) and 756 controls (75.3% male, 24.7% female). The age (mean ± SD) was 52.07 ± 18.01 years for cases and 52.85 ± 18.42 years for controls. As a result of frequency-matching, there were no significant differences in the distribution of sex and age between cases and controls. However, education level and the history of smoking and drinking differed between the two groups. As shown in Table 2, the proportion of ever-smokers was 52.5% among patients, significantly higher than that among controls (44.4%) (χ² = 8.543, P = 0.003). In contrast, the proportion of alcohol drinkers was 17.9% among TB patients, significantly lower than that among controls (28.4%) (χ² = 19.675, P < 0.001).

Table 2 Basic characteristics of the cases and controls

As expected, the variant allele of rs4331426 was rare in the study population. The frequencies of the AA, AG, and GG genotypes were 93.70%, 6.14%, and 0.15%, respectively. No significant difference was observed in the distribution of either genotypes or alleles of this SNP between cases and controls. The six tag SNPs we genotyped were all in Hardy-Weinberg equilibrium [rs8087945 (χ² = 0.330, P = 0.566), rs12456774 (χ² = 0.438, P = 0.508), rs12457731 (χ² = 0.236, P = 0.627), rs12958098 (χ² = 0.757, P = 0.384), rs4800136 (χ² = 0.047, P = 0.829) and rs4800417 (χ² = 1.320, P = 0.251)]. The minor allele frequencies of rs8087945, rs12456774, rs12457731, rs12958098, rs4800136 and rs4800417 were 22.41%, 29.96%, 5.10%, 47.14%, 3.91% and 19.34% in the cases and 22.00%, 29.93%, 4.37%, 48.99%, 4.03% and 18.66% in the controls, respectively. No significant difference was observed in the genotype or allele frequencies of the six tag SNPs between cases and controls, either before or after adjusting for age, sex, education, smoking, and drinking history (Table 3).

A side-by-side r²/D' plot for the six tag SNPs is shown in Additional file 1. Considering both D' and r², we analyzed two haplotypes, as presented in Table 4. For example, strong LD was observed between rs8087945 and rs12456774 (D' = 1, r² = 0.12). The common haplotype within this block was AA (rs8087945-rs12456774). Comparison of haplotype frequencies between cases and controls showed that both the AG (rs8087945-rs12456774) and GA (rs8087945-rs12456774) haplotypes were associated with a decreased risk of TB, with adjusted ORs (95% CI) of 0.34 (0.27-0.42) and 0.22 (0.16-0.29), respectively (Table 4). Strong LD was also observed between rs4800417 and rs8087945 (D' = 0.99, r² = 0.78). Compared with the common haplotype CA (rs4800417-rs8087945), the haplotype TG (rs4800417-rs8087945) was associated with a decreased risk (aOR = 0.64, 95% CI: 0.50-0.81), whereas the haplotype CG (rs4800417-rs8087945) was associated with an increased risk of TB (OR = 3.19, 95% CI: 2.26-4.51) (Table 4).
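A D' of 1 alongside an r² of only 0.12, as reported for rs8087945 and rs12456774, is what one would expect when only three of the four possible two-SNP haplotypes occur and the two minor allele frequencies differ. The sketch below computes D, D', and r² from the standard definitions. The inputs are rounded minor allele frequencies (about 0.22 and 0.30, as reported above) together with an assumption made purely for illustration, namely that the haplotype carrying both minor alleles is absent; the function name `ld_stats` is hypothetical and this is not the Haploview computation itself.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele a at locus 1
          and allele b at locus 2
    p_a, p_b: frequencies of allele a (locus 1) and allele b (locus 2)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    # Largest magnitude D can take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Assumption for illustration: the minor-minor haplotype never occurs.
d, d_prime, r2 = ld_stats(p_ab=0.0, p_a=0.22, p_b=0.30)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

Under these assumptions the sketch prints D' = 1.00 and r2 = 0.12: D' reaches its maximum because one haplotype is missing, while r² stays low because neither SNP predicts the other's genotype well.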

Table 3 Distribution of genotypes in cases and controls and their risks with pulmonary tuberculosis
Table 4 Haplotype frequencies and the risks of pulmonary tuberculosis

Discussion

A puzzling feature of TB is that only a small proportion of infected persons develop active disease during their lifetimes [15], although nearly one third of the global population is latently infected with the pathogen [2]. Host genetic factors can explain, at least in part, why some people resist infection more successfully than others [16, 17]. Recently, a GWA study from Ghana and Gambia identified rs4331426 on chromosome 18q11.2 as a susceptibility locus associated with the risk of TB [11]. To date, this is the only GWA study of susceptibility to TB, and it suggests that a new non-MHC locus can be identified in an infectious disease caused by a highly polymorphic pathogen, even in African populations [11]. The identified variant of rs4331426 is common in African populations but much rarer in other populations. No data on the association of this SNP with TB have yet been published from other areas of the world. Considering the limitations posed by the extensive genetic diversity and shorter LD ranges in African populations, we performed a study to validate this finding in China by searching for tag SNPs on chromosome 18q11.2 within 100 kbp upstream and downstream of rs4331426. To our knowledge, since the publication of the GWA study by Thye et al. [11], our work is the first to explore the role of genetic polymorphisms in this region in susceptibility to TB. However, we observed no significant association between TB risk and any of the selected SNPs individually. One possible explanation is population heterogeneity, which is supported by the disparity in genotype frequencies [18]. Another explanation is that we genotyped only tag SNPs on chromosome 18q11.2 within 100 kbp upstream and downstream of rs4331426, which represent a relatively narrow range of genetic loci. 
Even though none of the polymorphisms we investigated was associated with TB in the single-locus analysis, we found that haplotypes within this block might be associated with altered risks of TB. For example, compared with individuals carrying the common haplotype AA (rs8087945-rs12456774), those carrying AG (rs8087945-rs12456774) or GA (rs8087945-rs12456774) had a significantly decreased risk. It should be noted that we analyzed only two haplotypes in this study. Other haplotypes covering more SNPs might also contribute to the risk of TB.

Interestingly, chromosome 18q11.2 is a gene-desert region that is punctuated by evolutionarily conserved domains with regulatory potential [11]. Neither rs8087945 nor rs12456774 is located inside a gene or in a known regulatory sequence. The nearest genes to these SNPs are GATA6, CTAGE1, RBBP8 and CABLES1, along with a number of as yet unannotated open reading frames. Additional studies are required to ascertain their functional significance and any possible counterbalancing selective pressures. In addition, it must be noted that the association found in China could be population-specific; however, it could also be a false-positive result. For this reason, it is important that these findings be replicated in other areas of China to confirm the association. Future work is also needed to explore the genes nearest to this region and the unannotated open reading frames around it.

Conclusions

The susceptibility locus rs4331426 identified in the African population could not be validated in the Chinese population. Even though none of the genetic polymorphisms we investigated was associated with TB in the single-locus analysis, the haplotypes might contribute to susceptibility to TB in the Chinese Han population. Additional studies are required to ascertain the causative variant, its functional significance and any possible counterbalancing selective pressures.