Background

Tuberculosis (TB), an infectious disease caused by the Mycobacterium tuberculosis (MTB) complex, usually affects the lungs but can also affect other parts of the body [1]. The typical symptoms of active pulmonary TB (PTB) are chronic cough, hemoptysis, fever, night sweats, and weight loss. Currently, about a quarter of the world’s population carries MTB [2]. In 2018, there were an estimated 10.0 million new TB cases globally, of which China accounted for 9% [2]. There were 1.3 million deaths from TB in 2017, making TB the leading cause of death from a single infectious agent worldwide [2, 3]. Notably, extrapulmonary TB (EPTB) is also emerging as a serious clinical problem and has comprised an increasing proportion of total TB cases over the past few decades [4,5,6].

Traditionally, the diagnosis of active TB has been based mainly on chest X-ray, microscopy, and body fluid culture, whereas the diagnosis of latent TB depends on the tuberculin skin test or blood tests. Histology and X-ray rely on highly trained operators, and their characteristic morphology is shared with other diseases. Acid-fast bacilli (AFB) smear microscopy remains the most used and most widely available TB diagnostic method in low-income and middle-income countries; however, as many as 40–50% of active TB cases are smear-negative [7]. TB culture requires 2–6 weeks for interpretation [2] and has imperfect sensitivity [8, 9]; consequently, culture is not performed for all presumptive TB patients in non-TB specialized hospitals in China. The interferon-gamma (IFN-γ) release assay (IGRA) is a newer immunoassay for TB diagnosis that has been widely applied throughout China in recent years; however, its reported performance in active TB is heterogeneous, with sensitivity ranging from 50 to 100% and specificity from 83 to 98% [10, 11]. Thus, the diagnosis of TB remains challenging.

The rapid test Xpert MTB/RIF (Cepheid, Sunnyvale, CA, USA), an automated real-time PCR platform that can detect both MTB complex and rifampicin resistance within two hours, has been recommended by the World Health Organization (WHO) as the initial diagnostic test in all persons with signs and symptoms of TB [12]. Xpert MTB/RIF has been well documented in the literature in many countries worldwide; however, although China has a large number of TB patients, relatively few studies of this method have been conducted there. Xpert MTB/RIF has been routinely used in Peking University People’s Hospital (PKUPH), a comprehensive teaching hospital in Beijing, China, since November 2016. Thus, the aim of this study was to evaluate the performance of Xpert MTB/RIF and to provide reference and guidance for the detection and diagnosis of TB in non-TB specialized hospitals.

Methods

Study design

This is a retrospective survey and analysis of data collected for clinical purposes. Inpatients simultaneously tested with Xpert MTB/RIF, AFB smear microscopy, and IGRA at PKUPH from November 2016 to October 2018 were included. PKUPH is a non-TB specialized, comprehensive teaching hospital, which serves patients in Beijing, as well as surrounding cities in northern China, and even throughout China. Equipped with over 2000 beds, the hospital admits more than 78,000 inpatients and handles nearly 2.6 million outpatient visits each year.

Acid-fast bacilli (AFB) smear microscopy

Smear microscopy was performed according to the Clinical and Laboratory Standards Institute (CLSI) M48-A guideline [13]. Specimens were stained for acid-fast microscopic examination using the Ziehl-Neelsen stain (BaSO Diagnostics Inc., Zhuhai, China). Smear-positive specimens were graded from 1+ to 4+ according to the American Thoracic Society scale [14].

Interferon-gamma release assay (IGRA)

Heparinized blood was collected from patients and used for the T-SPOT®.TB test (Oxford Immunotec, Oxford, UK) in accordance with the manufacturer’s instructions. When the negative and positive quality controls were valid, a result was considered positive if either panel A or panel B had ≥6 spots.

Xpert MTB/RIF assay

The Xpert MTB/RIF assay was performed on the GeneXpert Dx instrument system according to the manufacturer’s recommendations (Cepheid, Sunnyvale, CA, USA) (detailed in Supplementary Material 1). The Xpert software was used to interpret the results, and semiquantitative results were reported based on the cycle threshold (CT) categories defined by the manufacturer: high (CT ≤ 16), medium (16 < CT ≤ 22), low (22 < CT ≤ 28), and very low (CT > 28).
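The CT cut-offs listed above can be sketched as a small lookup. This is only an illustration of the stated thresholds; the function name is hypothetical and is not part of the GeneXpert software.

```python
def xpert_semiquant_category(ct: float) -> str:
    """Map a cycle threshold (CT) to the semiquantitative category
    using the cut-offs quoted in the text (illustrative helper only)."""
    if ct <= 16:
        return "high"
    if ct <= 22:
        return "medium"
    if ct <= 28:
        return "low"
    return "very low"


# Boundary values fall into the lower-CT (higher-load) category,
# because each upper bound is inclusive.
for ct in (16, 22, 28, 28.1):
    print(ct, xpert_semiquant_category(ct))
```

Because the upper bound of each interval is inclusive, a CT of exactly 16, 22, or 28 is assigned to the category with the higher bacterial load.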

Classification and diagnosis

Active TB was diagnosed according to a composite reference standard (CRS) that combined the WHO guidelines [15] and WS 288-2017 (released on 2017-11-09 and implemented on 2018-05-01), issued by the National Health and Family Planning Commission of the People’s Republic of China (NHFPC, now the National Health Commission of the People’s Republic of China) [16]. Briefly, the criteria were as follows: (1) microbiologically confirmed PTB: patients with a positive MTB culture, or pulmonary cases with one or more positive initial sputum smears and chest imaging; (2) molecular biologically confirmed PTB: patients with a positive MTB nucleic acid test and chest imaging; (3) histopathologically confirmed PTB: patients with a positive histopathological examination; (4) EPTB: patients with definite TB involving organs other than the lungs, with MTB isolated from a non-pulmonary source or with histological or strong clinical evidence consistent with active EPTB, as well as improvement observed under specific anti-TB therapy. Thus, the CRS combined many pertinent aspects, which allowed culture-negative TB to be diagnosed and provided higher accuracy. Cases were categorized as confirmed TB (PTB, EPTB, or tuberculous pleurisy, i.e., TP) or non-TB; concurrent PTB and EPTB/TP cases were classified as PTB [15]. The different methods were then evaluated and compared.

Results

Patient characteristics

From November 2016 to October 2018, 1423 patients were tested with Xpert at PKUPH. Patients who did not undergo AFB smear microscopy and/or IGRA (n = 602), patients whose Xpert specimen was inconsistent with that used for AFB smear microscopy (n = 20), and patients who had non-corresponding diagnostic results (n = 14) were excluded. In all, 787 patients were enrolled in this survey (Fig. 1). The mean age of the patients was 55.4 ± 18.5 years, and 55.1% of them were male (Table S1). Based on presumptive symptoms, specimens, and clinical diagnosis, the patients were divided into three categories: presumptive PTB (533 cases), presumptive EPTB (172 cases), and presumptive TP (82 cases) (Table 2). When the CRS was used as the gold standard, 89 patients (11.3%) were confirmed as having active TB: 52 PTB, 17 EPTB, and 20 TP (Fig. 1).

Fig. 1

Flowchart explaining the overall patient flow and diagnostic classifications. AFB, acid-fast bacilli; IGRA, interferon-gamma release assay; TB, tuberculosis; PTB, pulmonary tuberculosis; EPTB, extrapulmonary tuberculosis; TP, tuberculous pleurisy

Comparison of Xpert MTB/RIF, AFB smear microscopy, and IGRA

The positive rates of AFB, Xpert, and IGRA were 3.4% (27/787), 8.3% (65/787), and 48.3% (380/787), respectively. Figure 2 shows the receiver operating characteristic (ROC) curves with the area under the curve (AUC) for Xpert, AFB smear microscopy, and IGRA in comparison to the gold standard. The diagnostic performance of IGRA (AUC = 0.754, 95% confidence interval [CI] 0.722–0.783; sensitivity = 93.3%, 95% CI 86.1–96.9%; specificity = 53.5%, 95% CI 53.8–61.1%) was significantly higher than that of AFB (AUC = 0.614, 95% CI 0.579–0.648; sensitivity = 23.6%, 95% CI 16.0–33.4%; specificity = 99.1%, 95% CI 98.1–99.6%) (P < 0.001, Z test). Xpert showed the best diagnostic performance, with an AUC of 0.846 (95% CI 0.819–0.871), a sensitivity of 69.7% (95% CI 59.5–78.2%), and a specificity of 99.6% (95% CI 98.7–99.9%), which was significantly better than that of the other two methods (P < 0.001) (Fig. 2 and Table 1).
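As a worked check of the Xpert figures, the sketch below recomputes sensitivity, specificity, and AUC from the counts given in the text (65 Xpert-positive patients, of whom 62 were confirmed TB, among 89 TB and 698 non-TB patients). The interval method is an assumption: the text does not state how the 95% CIs were derived, and a Wilson score interval is used here because it happens to reproduce the reported Xpert bounds. The function name is illustrative.

```python
import math


def wilson_ci(successes: int, total: int, z: float = 1.959964) -> tuple:
    """Two-sided Wilson score interval for a binomial proportion.
    (Assumed method; the source does not name its CI procedure.)"""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# 2x2 counts for Xpert against the CRS, taken from the text.
tp, fp = 62, 3             # 65 Xpert positives, 62 of them confirmed TB
fn, tn = 89 - tp, 698 - fp  # 89 TB and 787 - 89 = 698 non-TB patients

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

# A single binary test gives an ROC curve with one interior point,
# so the area under it is the mean of sensitivity and specificity.
auc = (sensitivity + specificity) / 2

sens_lo, sens_hi = wilson_ci(tp, tp + fn)
spec_lo, spec_hi = wilson_ci(tn, tn + fp)

print(f"sensitivity {sensitivity:.1%} (95% CI {sens_lo:.1%}-{sens_hi:.1%})")
print(f"specificity {specificity:.1%} (95% CI {spec_lo:.1%}-{spec_hi:.1%})")
print(f"AUC {auc:.3f}")
```

The same reduction of AUC to the mean of sensitivity and specificity applies to any of the three tests when each is treated as a single positive/negative result.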

Fig. 2

Receiver operating characteristic (ROC) curves for Xpert MTB/RIF, acid-fast bacilli (AFB) smear microscopy, and interferon-gamma release assay (IGRA) to differentiate tuberculosis (TB) and non-TB cases. AUROC, area under the ROC curve

Table 1 Performance of Xpert MTB/RIF, AFB smear microscopy and IGRA in detecting PTB, EPTB and TP

In presumptive PTB, EPTB, and TP cases, the positive rates of Xpert were 9.0, 8.1, and 3.7%, respectively, which were higher than those of AFB (4.5, 1.7, and 0%) but lower than those of IGRA (49.2, 37.8, and 64.6%). Among the 89 confirmed TB cases, 19 (21.3%) tested positive by all three methods, and the simultaneous use of the three methods could identify 96.6% (95% CI 90.5–98.9%) of TB patients (Fig. 3a); 62 (69.7%; 46 PTB, 13 EPTB, and 3 TP), 21 (23.6%; 19 PTB, 2 EPTB, and 0 TP), and 83 (93.3%; 50 PTB, 14 EPTB, and 19 TP) cases tested positive by Xpert, AFB, and IGRA, respectively (Fig. 3b), whereas 3 (3.4%) confirmed TB cases (1 TP and 2 EPTB, i.e., 1 lumbar spine TB and 1 arthritis TB) tested negative by all three methods. As shown in Table 1, the sensitivity of Xpert in detecting PTB, EPTB, and TP was 88.5% (95% CI 77.0–94.6%), 76.5% (95% CI 52.7–90.4%), and 15.0% (95% CI 5.2–36.0%), respectively, lower than that of IGRA (96.2, 82.4, and 95.0%) but higher than that of AFB (36.5, 11.8, and 0%). IGRA had the highest sensitivity, but its specificity (55.9, 67.1, and 45.2%) was significantly lower than that of Xpert (99.6, 99.4, and 100%) and AFB (99.0, 99.4, and 100%) (P < 0.001); notably, the positive predictive value (PPV) was poor (7.3%, 95% CI 4.9–10.7%) when only IGRA was positive.
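The per-category sensitivities can be recomputed from case counts as a quick consistency check. In the sketch below, the confirmed-case totals (52 PTB, 17 EPTB, 20 TP) come from the text, while the Xpert-positive counts of 46, 13, and 3 are those implied by the reported sensitivities of 88.5, 76.5, and 15.0%; they sum to the 62 Xpert true positives. Variable names are illustrative.

```python
# Confirmed TB cases per category under the CRS (from the text).
confirmed = {"PTB": 52, "EPTB": 17, "TP": 20}

# Xpert-positive cases among them, as implied by the reported
# per-category sensitivities (88.5%, 76.5%, 15.0%).
xpert_positive = {"PTB": 46, "EPTB": 13, "TP": 3}

sensitivity_pct = {
    site: round(100 * xpert_positive[site] / confirmed[site], 1)
    for site in confirmed
}
overall_pct = round(
    100 * sum(xpert_positive.values()) / sum(confirmed.values()), 1
)

# Parallel testing: a case counts as detected if any of the three
# methods is positive; 3 of the 89 cases were negative on all three.
any_positive_pct = round(100 * (89 - 3) / 89, 1)

print(sensitivity_pct, overall_pct, any_positive_pct)
```

Treating the three tests in parallel in this way raises sensitivity at the cost of specificity, which is why the low PPV of an isolated IGRA-positive result matters for interpretation.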

Fig. 3

a Venn diagram of the overlap in TB detection using Xpert MTB/RIF, acid-fast bacilli (AFB) smear microscopy, and interferon-gamma release assay (IGRA); b number of true-positive tuberculosis cases detected by the three methods. PTB, pulmonary tuberculosis; EPTB, extrapulmonary tuberculosis; TP, tuberculous pleurisy

Performance of Xpert MTB/RIF assay in different specimens

As shown in Table 2, among pulmonary specimens the sensitivity of Xpert was highest in lung tissue (100%), followed by sputum (88.5%), BALF (85.7%), and FOB (81.2%); the specificity for all pulmonary specimens was higher than 99%. Among extrapulmonary specimens, the sensitivity and specificity for CSF, joint cavity fluid, and lymph node specimens were 100%, whereas the sensitivity for urine was relatively low (33%, 95% CI 6.2–79.2%). The specificity for intestinal tissue was 98.1%; the other extrapulmonary specimens (except pleural fluid) showed a specificity of 100%. Notably, the sensitivity for pleural fluid was the lowest, only 15.0% (95% CI 5.2–36.0%), and was significantly lower than that for lung tissue, CSF, joint cavity fluid, extrapulmonary lymph node, sputum, BALF, and FOB (P < 0.05).

Table 2 Performance of Xpert MTB/RIF in different specimens

Correlation between Xpert semiquantitative results and AFB smear microscopy results

The correlation between Xpert semiquantitative category and AFB smear grade is presented as cross-tabulated data (Table 3). Of the 65 Xpert-positive cases, 12 were positive high and 13 were positive medium, with a PPV of 100% in both categories; two false-positive cases and one false-positive case occurred in the positive low and positive very low categories, respectively. The sensitivity of AFB decreased gradually with the Xpert category, from 75.0% in Xpert positive high cases to 3.7% in Xpert-negative cases. Notably, the PPV of AFB dropped sharply from 100% in Xpert-positive cases to 14.3% in Xpert-negative cases; that is, six of the seven (85.7%) AFB-positive results among Xpert-negative cases were false positives. Overall, the PPV of Xpert (95.4%, 95% CI 87.3–98.4%) was significantly higher than that of AFB (77.8%, 95% CI 59.2–89.3%) (P < 0.05). In addition, among the 62 Xpert true-positive cases, smear grades increased as Xpert CT values decreased (Figure S1), and there was a strong, statistically significant negative correlation between smear grades and Xpert CT values (R = -0.632, P < 0.001).

Table 3 Correlation between Xpert MTB/RIF semi-quantitative results and AFB smear microscopy results

Comparison of Xpert MTB/RIF and IGRA in AFB smear-negative patients

Among the 27 AFB-positive cases, 21 confirmed TB cases (23.6%, 21/89; 19 PTB and 2 EPTB) were detected, giving a PPV of 77.8%; in this subgroup, the pooled sensitivity, specificity, PPV, and negative predictive value (NPV) of Xpert were 95.2, 100, 100, and 85.7%, respectively, all higher than those of IGRA (90.5, 50.0, 86.4, and 60.0%, respectively). The remaining 68 confirmed TB cases (76.4%, 68/89; 33 PTB, 15 EPTB, and 20 TP) occurred in AFB smear-negative patients (n = 760, Table S2), in whom Xpert and IGRA were compared; the sensitivity of Xpert (61.8%, 95% CI 49.9–72.4%) was significantly lower than that of IGRA (94.1%, 95% CI 85.8–97.7%) (P < 0.001), while the specificity of the former (99.6%, 95% CI 98.7–99.9%) was significantly higher than that of the latter (57.5%, 95% CI 53.8–61.2%) (P < 0.001). In detail, the sensitivity of Xpert was 84.9, 73.3, and 15.0% for presumptive PTB, EPTB, and TP cases, respectively, which was lower than that of IGRA (100, 80.0, and 95.0%), while the specificity of Xpert (99.6, 99.4, and 100%) was significantly higher than that of IGRA (56.1, 66.9, and 45.2%) (P < 0.001).
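To make the smear-positive subgroup figures concrete, the sketch below derives all four accuracy measures from a 2x2 table. The cell counts are reconstructions, not values printed in the text: they are the integer counts that are consistent with 27 smear-positive patients (21 TB, 6 non-TB) and the reported percentages. The class and variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of a binary test cross-tabulated against the reference."""
    tp: int
    fp: int
    fn: int
    tn: int

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def summary_pct(self) -> tuple:
        """Sensitivity, specificity, PPV, NPV as percentages (1 decimal)."""
        values = (self.sensitivity(), self.specificity(), self.ppv(), self.npv())
        return tuple(round(100 * v, 1) for v in values)


# Reconstructed counts for the 27 AFB smear-positive patients
# (21 confirmed TB, 6 non-TB); assumed, not quoted from the source.
xpert = TwoByTwo(tp=20, fp=0, fn=1, tn=6)
igra = TwoByTwo(tp=19, fp=3, fn=2, tn=3)

print("Xpert:", xpert.summary_pct())
print("IGRA: ", igra.summary_pct())
```

With only six non-TB patients in this subgroup, the specificity and NPV estimates rest on very small denominators and should be read with corresponding caution.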

Discussion

With the progress of molecular biology technology, the use of Xpert for TB detection in China is increasing. In total, 787 patients simultaneously tested with Xpert, AFB, and IGRA at PKUPH from November 2016 to October 2018 were enrolled; according to the ROC curves (Fig. 2), Xpert showed the best diagnostic performance (AUC = 0.846, sensitivity = 69.7%, and specificity = 99.6%), which was significantly better than that of AFB and IGRA (P < 0.001). The high specificity of Xpert across all specimens highlights its utility as a rule-in test for TB diagnosis; a positive result can reliably inform the start of TB treatment [17].

This study confirms that Xpert has high sensitivity (88.5%) in the diagnosis of PTB, as reported by previous studies (82–88%) [18]. It is interesting to note that lung tissue (n = 52), a specimen that was less evaluated in previous studies, showed the highest sensitivity (100%, 7/7) among pulmonary specimens, probably because bacterial loads are higher in diseased tissue than in sputum, BALF, and FOB. Consistent with the literature (68–94%) [17], this study found that Xpert also had high sensitivity (76.5%) in the diagnosis of EPTB. When pleural fluid was used to diagnose TP, the high specificity (100%) of Xpert suggested its high value in confirming a TP diagnosis; however, the low sensitivity (15.0%), which was similar to that in previous studies (14–34%) [17, 19, 20], indicated that Xpert is of limited value for TP screening. The low sensitivity may be attributed to the presence of PCR inhibitors in pleural fluid [20], or to an MTB load so low that the detection limit cannot be reached even after centrifugation. Thus, this study, together with the previous literature, does not support the use of pleural fluid as a specimen for Xpert in the diagnosis of TP until the processing of pleural fluid is optimized for improved sensitivity. Pleural biopsy may be a better alternative, but it requires an invasive procedure, and the improvement in sensitivity (45%) is still not ideal [19].

There were three Xpert false-positive cases in this study: one sputum specimen (CT = 26.8, AFB negative and IGRA positive), one BALF specimen (CT = 29.3, AFB negative and IGRA positive), and one intestinal tissue specimen (CT = 27.0, AFB negative and IGRA negative). According to previous studies, false-positive Xpert results may occur in patients with prior TB, and Xpert may detect cell-free DNA rather than DNA in cells [21].

False-positive IGRA results were common in this survey (false-positive rate [FPR] 46.5%), which is consistent with previous research in China (FPR 43.6%) [22]. This is mainly because China is a country with a high burden of TB, where the prevalence of latent TB is as high as 44.5% [23]; in addition, in patients with previous TB, antigen-specific memory T cells may persist in the body for a long time, resulting in false-positive results [24]. Thus, IGRA is not suggested for use alone in the diagnosis of active PTB in high-burden TB settings. There were few false-negative IGRA results in this study (six cases). False-negative IGRA results may arise from a reduced immune response or suppressed T-lymphocyte function against MTB-specific antigens due to increasing age, long-term hospitalization (> 6 months), overweight, obesity, concomitant immune system instability, HIV infection and other immunosuppressive diseases, and the use of steroid drugs [24]. However, because sampling for the detection of EPTB and TP is still challenging in peripheral-level laboratories in China, the highly sensitive IGRA is recommended as a supplementary test for TP and EPTB diagnosis.

The specificity of AFB was high (99.1%), and the false-positive cases (six cases) may mainly be due to infections caused by nontuberculous mycobacteria. The low sensitivity of AFB (23.6%; 68 AFB false-negative cases) is attributable to its high detection limit (5000–10,000 CFU/mL, whereas the detection limit of Xpert is 131 CFU/mL in sputum) [25, 26]. Thus, its utility as a rule-in test for TB diagnosis is limited, and it is not recommended for use alone for early TB diagnosis. Notably, in smear-positive cases, Xpert (95.2%) had higher sensitivity than IGRA (90.5%).

The positive rate of rifampicin-resistant TB in this study was 12.9% (8/62) (for reference only, as there was no gold standard for confirming drug-resistant TB in this study). This was higher than the rate in the fifth Chinese TB survey in 2010 (6.8%) [27] but lower than that in previous research at a tertiary TB referral hospital in Beijing, China (16.9% in 2006 and 30.5% in 2012) [28]; the latter difference may be due to the higher proportion of previously treated TB cases in that study, as previous treatment is a well-known risk factor for drug-resistant TB. Given the importance of early and effective treatment, it is necessary to monitor the occurrence of TB and the rifampicin resistance rate, and to maintain good practice in TB prevention, care, and treatment in non-TB specialized hospitals.

This study has some limitations. First, this was a single-center retrospective study, so the generalizability of our findings is limited. Second, PKUPH is a non-TB specialized comprehensive teaching hospital, and TB culture was not carried out in many patients; thus, the culture method was not analyzed and could not be compared with the CRS (data are shown in Table S3). In addition, culture-based MTB antimicrobial susceptibility testing was not conducted, so rifampicin-resistant TB was not verified. Third, the a priori power was low (< 0.8; power analysis was performed with the NCSS PASS 11 program) for some of the results of Xpert, AFB smear microscopy, and IGRA in detecting presumptive TB, as well as for the different specimens tested using Xpert; because of the small sample size, this may cause type II errors, i.e., significant differences may be overlooked. However, the sensitivity, specificity, PPV, NPV, and corresponding 95% CIs of the different methods used in our study will be helpful in designing future multicenter prospective studies covering different regions of China.

Conclusions

The traditional AFB smear microscopy method showed low sensitivity and high specificity; thus, it is not recommended for use alone for early TB diagnosis. Although the specificity of IGRA is relatively low, it still has diagnostic value, especially in assisting the diagnosis of EPTB, where specimens are difficult to obtain, and of TP, where the performance of Xpert is relatively poor. Xpert showed both high sensitivity and high specificity, even in AFB smear-negative patients; however, pleural fluid is not recommended as a specimen for Xpert in the diagnosis of TP. The simultaneous use of these three methods could help identify 96.6% of TB patients, but it should be noted that the PPV (7.3%) is poor when only IGRA is positive. In future studies, the processing of pleural fluid should be optimized, and pleural biopsy specimens may be sent together with, or as a substitute for, pleural fluid to improve Xpert detection efficiency.