Background

Tuberculosis (TB), a highly contagious disease, is still a major health and economic burden [1]. Globally, approximately 10 million individuals developed TB and more than 1.3 million died of the disease in 2017, according to a WHO report [2]. Pulmonary tuberculosis (PTB), accounting for 75% of all TB cases, contributes substantially to TB mortality, especially with HIV co-infection [3, 4]. Correctly discriminating PTB is an important step to eliminate TB by 2030, a goal established by the WHO [2].

In clinical practice, sputum smear microscopy is ineffective for detecting PTB [5]. Specimen culture for Mycobacterium tuberculosis (Mtb) provides the most accurate diagnosis [6]. However, the results of microbiological examination and acid-fast bacillus stains depend on the quality of the sputum sample. Immunological tests, such as the tuberculin skin test (TST) and interferon-gamma release assay (IGRA), are auxiliary diagnostic tools for PTB [7]. The TST has a low specificity in Bacille Calmette-Guérin (BCG)-vaccinated individuals [7]. In children, IGRAs can yield many indeterminate results [8, 9]. Considering these limitations, additional valid tools are required to improve the diagnosis of PTB.

Interferon gamma-induced protein 10 (IP-10), an IFN-gamma-inducible chemokine, could be expressed at 100-fold higher levels than those of IFN-gamma after TB infection [10, 11]. Age and gender do not affect the level of IP-10 [11, 12]. Since 2007, IP-10 has been reported as a potential parameter for PTB detection [7, 13–29].

Many studies have evaluated the diagnostic potential of IP-10 for PTB, but the results are variable. Therefore, the aim of this study was to synthesize and analyze the diagnostic value of IP-10 for PTB.

Methods

Literature search

This study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Diagnostic Test Accuracy criteria 2018 (PRISMA-DTA 2018) [30]. The Web of Science, PubMed, the Cochrane Library and Embase databases were used to search for relevant English language citations published up to February 2019. Our search terms were “tuberculosis,” “pulmonary tuberculosis,” “Chemokine CXCL10,” and “interferon gamma-induced protein 10.” Comprehensive literature search strategies were used based on the following combination of MeSH terms, title/abstracts and all fields for these databases (Additional file 1: Table S1). Additionally, the reference lists of the applicable studies, relevant research letters, and reviews were manually searched to find other potentially relevant studies.

Literature selection

Two investigators independently determined literature eligibility. Studies reporting IP-10 levels for the detection of PTB were included according to the following criteria: (1) reporting on individuals with PTB and non-TB (population); (2) provision of IP-10 in whole blood or plasma as the index test; (3) Mtb culture as the gold standard, with other reference standards including pathological examination, microscopy and the GeneXpert MTB/RIF test (WHO recommended) [2]; (4) primary outcomes including the diagnostic performance of IP-10 (sensitivity and specificity); (5) randomized controlled trials, prospective studies or retrospective studies (study design); and (6) more than 10 individuals meeting the inclusion criteria. Studies not published in English, letters other than research letters, conference abstracts, veterinary experiments, reviews and case reports were excluded.

Data extraction

The following data were extracted: the first author, year of publication, country, TB high-burden, study design, age, number of participants (patients with PTB and non-TB subjects), TB site, non-TB status, cut-off for index test (IP-10), diagnostic reference standard, method and condition for the IP-10 assay, HIV-infection status, sensitivity, specificity, true positive (TP), false positive (FP), false negative (FN), and true negative (TN) for IP-10. Two investigators independently extracted data from eligible articles, and disagreements were resolved by discussing and reaching a consensus.

Quality assessment

Following the recommendations of the Cochrane Collaboration, two investigators independently reviewed the methodological quality of eligible articles using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool [31, 32]. Disagreements were resolved by consensus. RevMan (version 5.3) was used to perform the quality assessment.

Data analysis

Excel was used to construct a two-by-two table, including TP, FP, FN, and TN for patients with PTB. Stata (version 14.0) was used to perform the data analysis. The included studies used different optimal cut-offs for the index test. As recommended by the Cochrane Collaboration, the hierarchical summary receiver operating characteristic (HSROC) model of Rutter et al. was used because the index test was assessed at various thresholds [32, 33]. The HSROC curve was computed with the “metandi” command [34]. The prediction region represented the likely range of sensitivity and specificity for an individual study on the HSROC plot. The summary point showed the pooled sensitivity and specificity under the optimal threshold value. The confidence region reflected the uncertainty around the summary point.

The main outcomes were the diagnostic performance of IP-10 for detecting PTB under a random-effects model, as evaluated by the summary estimates of sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and the area under the curve (AUC). Sensitivity, which reflects the ability of the index test to detect patients with the disease, was calculated as “Sensitivity = TP/(TP + FN)”. Specificity, which reflects the ability of the index test to rule out disease-free individuals, was calculated as “Specificity = TN/(FP + TN)”. The PLR, a measure of how well a positive index test result indicates disease, was calculated as “PLR = Sensitivity/(1-Specificity)”. The NLR, a measure of how well a negative index test result indicates the absence of disease, was calculated as “NLR = (1-Sensitivity)/Specificity”. The DOR, a measure of the overall accuracy of the index test, was calculated as “DOR = (TP/FN)/(FP/TN)”. The AUC indicated the overall accuracy of the index test, with values exceeding 0.90 indicating high accuracy. The 95% confidence intervals (CIs) were calculated by the Wilson method, and no correction factor was applied.
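The per-study formulas above can be illustrated with a minimal Python sketch. It is not the Stata workflow used in this study: the function names and the two-by-two counts are hypothetical, and the Wilson score interval is written without a continuity correction on the assumption that this matches the "no correction factor" statement. The sketch also assumes that no cell of the table is zero.

```python
from math import sqrt


def wilson_ci(successes, n, z=1.959964):
    """Wilson score interval for a proportion (no continuity correction)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


def accuracy_measures(tp, fp, fn, tn):
    """Per-study accuracy measures from a two-by-two table (no zero cells)."""
    sens = tp / (tp + fn)
    spec = tn / (fp + tn)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "PLR": sens / (1 - spec),
        "NLR": (1 - sens) / spec,
        "DOR": (tp / fn) / (fp / tn),
        "sensitivity_ci": wilson_ci(tp, tp + fn),
        "specificity_ci": wilson_ci(tn, fp + tn),
    }


# Hypothetical counts, not taken from any included study.
m = accuracy_measures(tp=86, fp=12, fn=14, tn=88)
print(round(m["PLR"], 2), round(m["NLR"], 2), round(m["DOR"], 2))  # 7.17 0.16 45.05
```

These are single-study quantities. The pooled estimates reported in this study come from the hierarchical model, so they need not equal the values obtained by plugging pooled sensitivity and specificity into the formulas.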

The I² value was not suitable for the quantification of heterogeneity in accuracy studies [35]. Thus, to explore potential sources of heterogeneity, we used a meta-regression analysis with the “midas” command. The intercept was zero. Six subgroups were created: TB high-burden country (yes or no), study design type (cohort or not), age (adults or not), IP-10 method (multiplex cytokine assay or ELISA), IP-10 condition (unstimulated or stimulated), and HIV-infection status (yes/some or no).

The Deeks test was used to assess publication bias using the “midas” command [36]. Publication bias was considered absent when studies were distributed evenly on both sides of the regression line or when the P value exceeded 0.05 in Deeks’ funnel plot.
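As an illustration of the regression behind Deeks’ funnel plot, the sketch below regresses the log diagnostic odds ratio on the inverse square root of the effective sample size (ESS), weighted by ESS. It is a simplified pure-Python version, not the “midas” implementation; the function name and the study counts are hypothetical, and zero cells are assumed to be absent.

```python
from math import log, sqrt


def deeks_regression(tables):
    """Weighted regression of ln(DOR) on 1/sqrt(ESS), weights = ESS.

    tables: list of (tp, fp, fn, tn) tuples, at least three studies,
    with no zero cells. Returns (slope, intercept, t_statistic).
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in tables:
        n_dis, n_non = tp + fn, fp + tn
        ess = 4 * n_dis * n_non / (n_dis + n_non)  # effective sample size
        xs.append(1 / sqrt(ess))
        ys.append(log((tp * tn) / (fp * fn)))
        ws.append(ess)
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum(w * (y - intercept - slope * x) ** 2
              for w, x, y in zip(ws, xs, ys))
    se_slope = sqrt(rss / (len(tables) - 2) / sxx)
    return slope, intercept, slope / se_slope


# Hypothetical studies in which smaller studies report larger DORs.
studies = [(45, 5, 5, 45), (80, 20, 20, 80), (150, 40, 50, 160), (300, 100, 100, 300)]
slope, intercept, t_stat = deeks_regression(studies)
```

The P value for asymmetry would then be read from a t distribution with k - 2 degrees of freedom for the slope, where k is the number of studies. A slope far from zero suggests that study size is associated with the reported accuracy.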

The whole process of data analysis is described in Additional file 2.

Results

Search results

In total, 1349 records were identified from our literature searches (Fig. 1). After removing 623 duplicates, we read titles and abstracts and excluded 682 records. An additional 447 records were ineligible for various reasons (e.g., studies involving leprosy, Crohn’s disease, pneumonia, monocyte chemotactic protein-1, interleukin-12, and interleukin-18), 73 records were animal experiments (mice, calves, warthogs, etc.), 69 records were reviews, abstracts, and letters, 58 records focused on extra-PTB (pleural TB, TB meningitis, osteoarticular TB, etc.), and 5 records were non-English (Chinese, Russian, Polish, etc.). Then, we reviewed the full texts of 44 articles. Ultimately, 18 articles were included in this study.

Fig. 1 Flow chart of the process of the search strategy for study selection

Characteristics of included studies

The main characteristics of the 18 articles, comprising 24 trials, are listed in Tables 1 and 2 [7, 13–29]. In total, 2836 participants were involved. The year of publication ranged from 2012 to 2018. Nine (50%) studies were from TB high-burden countries (China, South Africa, India, Thailand, and Uganda), and nine (50%) studies were from TB low-burden countries, according to WHO [2]. Study design, TB site, non-TB status, IP-10 cut-off, and reference standards are summarized in Table 1. IP-10 method, condition, HIV-infection status, cut-off values, sensitivity, specificity, TP, FP, FN, and TN of IP-10 for each trial are shown in Table 2.

Table 1 The main characteristics of included studies
Table 2 Baseline data of included studies

Quality of included studies

The QUADAS-2 assessment of the methodological quality of the included articles is shown in Additional file 3: Figure S1. The risk of patient selection bias was unclear for five studies: one study used a case-control design [19] and four studies did not report the time and consecutiveness of patient enrolment [17, 21, 23, 27]. Additionally, 50% of studies had an unclear risk of bias in the index test; in particular, we could not determine whether the results were interpreted under blinded conditions [7, 18, 20, 23–25, 27–29]. One study had a high risk of bias in the reference standard, which was clinical PTB defined by clinical presentation and radiological confirmation [16]. The risk of flow and timing bias was unclear in three studies, in which patients were lost from the analysis [19–21]. The applicability concerns were generally low.

Summary statistics

A total of 2836 participants, contributing 3219 blood samples, were included. The pooled sensitivity of IP-10 was 0.86 (95% CI: 0.80–0.90) and the pooled specificity was 0.88 (95% CI: 0.82–0.92). The pooled PLR was 7.00 (95% CI: 4.76–10.30), and the pooled NLR was 0.16 (95% CI: 0.12–0.23). The pooled DOR was 43.01 (95% CI: 25.80–71.69), indicating that the discriminatory effect of IP-10 was good. The AUC was 0.93 (95% CI: 0.91–0.95), showing that the accuracy of IP-10 was good. Figure 2 shows the HSROC curve for IP-10. Under the optimal threshold value, the pooled sensitivity and specificity were 0.86 and 0.88, respectively.

Fig. 2 The HSROC curve for assessment of IP-10 for PTB

Heterogeneity

As shown in Table 3, potential sources of heterogeneity were assessed by a meta-regression analysis. None of the covariates examined was a significant source of heterogeneity: TB high-burden versus TB low-burden countries (P = 0.83), cohort versus other study design types (P = 0.55), adults versus children (with or without adults) (P = 0.59), multiplex cytokine assay versus ELISA to detect IP-10 (P = 0.73), stimulated versus unstimulated IP-10 (P = 0.72), and HIV infection versus no HIV infection (P = 0.53).

Table 3 Heterogeneity assessment

Publication bias

The asymmetry test for Deeks’ funnel plot was not statistically significant (P = 0.20), indicating no striking publication bias in this study (Additional file 4: Figure S2).

Discussion

PTB is still a major cause of death worldwide, especially in immunocompromised individuals and children younger than 5 years [37, 38]. The accurate detection and timely treatment of PTB are important components of the “End TB Strategy” globally [39]. Currently, methods for detecting PTB depend on the region, BCG-vaccinated status, HIV status, etc. The search for new markers for the auxiliary diagnosis of PTB is ongoing. Several studies have shown that IP-10 is a promising marker for PTB detection [7, 13–29].

In 2014, Guo et al. published a meta-analysis of studies of IP-10 for diagnosing TB [40], in which the diagnostic performance of IP-10 was moderate. That meta-analysis included both PTB and extra-PTB individuals, and both plasma and pleural effusion samples. However, the diagnostic standards for PTB and extra-PTB are different, and obtaining pleural effusion is more invasive than drawing peripheral venous blood.

Considering these limitations, we performed a meta-analysis to evaluate the overall diagnostic performance of blood IP-10 as a potential biomarker for detecting PTB. We found that IP-10 could be a valuable detection tool (sensitivity: 86%, specificity: 88%). The PLR (7.00>1.00) suggested that IP-10 had good detection potential for PTB. The NLR (0.16<1.00) indicated that IP-10 distinguished non-TB individuals well. The DOR (43.01) indicated a good overall performance of IP-10 in discriminating between PTB and non-TB.
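To make the interpretation of the likelihood ratios concrete, the following Python sketch converts a pre-test probability into a post-test probability using the pooled PLR and NLR reported above. The 20% pre-test probability and the function name are hypothetical assumptions chosen only for illustration; they are not estimates from this study.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a disease probability with a likelihood ratio via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


PLR, NLR = 7.00, 0.16  # pooled values reported in this meta-analysis
pre = 0.20             # hypothetical pre-test probability of PTB

after_positive = post_test_probability(pre, PLR)
after_negative = post_test_probability(pre, NLR)
print(round(after_positive, 3), round(after_negative, 3))  # 0.636 0.038
```

Under this assumed pre-test probability, a positive IP-10 result would raise the probability of PTB to about 64%, and a negative result would lower it to about 4%.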

The TST and IGRA, as immunodiagnostic tests, are recommended for the auxiliary diagnosis of PTB by the WHO [2]. The TST can show cross-reactivity in BCG-vaccinated individuals, whereas IP-10 is less influenced by BCG vaccination [7]. Ruhwald et al. reported that IP-10 has a much higher sensitivity (92.5%) than the TST (73.9%), and suggested that IP-10 is an alternative to the TST [41]. The more recently developed IGRA can overcome some limitations of the TST. However, it lacks power when applied to children and individuals coinfected with HIV [9, 14]. IP-10 could be produced at a high level in these populations [42, 43]. Vanini et al. showed that the sensitivity was 66.7% for an IP-10-based test and 52.4% for the IGRA in HIV-infected individuals [44].

In bivariate analyses, TB-burden country, study design, age, IP-10 detection method, assay conditions, and HIV infection status were not significant sources of heterogeneity. We also found that the diagnostic performance of IP-10 was similar in multiplex cytokine assays and ELISA (sensitivity: 84% vs. 87%, specificity: 89% vs. 87%). These two methods were comparable with respect to reliability and reproducibility [20]. Considering the cost, ELISA is preferred over multiplex cytokine assays. Stimulated and unstimulated IP-10 had similar diagnostic accuracies for PTB, suggesting that IP-10 could be detected in both conditions. IP-10 had a higher diagnostic potential in HIV-infected individuals, consistent with previous findings [45].

Our meta-analysis had several limitations. First, the included studies used various cut-offs for IP-10 assays, which investigators may have chosen according to their own aims. Second, IP-10 assays are usually performed in combination with conventional tests, but we did not address the reliability and incremental benefit of adding IP-10 to other tests. Third, some studies included patients with PTB after treatment while others did not. Furthermore, the severity and extent of PTB might vary. These factors might influence the diagnostic potential of IP-10. Fourth, heterogeneity could not be ignored. Although the TB-burden country, design type, age, IP-10 method, IP-10 condition and HIV-infection status were not significant sources of heterogeneity in this meta-regression analysis (P > 0.05), they could still contribute to heterogeneity and reduce the generalizability of the overall performance of IP-10. Furthermore, intercurrent diseases (diabetes mellitus and malignancy) in the included studies might influence heterogeneity.

Despite the low probability of publication bias, it was a concern. Based on the linguistic abilities of our team, only studies written in English were included. The true potential of IP-10 for discriminating PTB from non-TB might be lower than we reported.

Conclusions

In conclusion, this meta-analysis shows that IP-10 is a promising and reliable marker for differentiating PTB from non-TB. Updated global TB reports should consider IP-10 as an auxiliary diagnostic method for PTB. Furthermore, large, multi-center, prospective studies are warranted to support our findings.