Background

Tuberculosis (TB), an infectious disease caused by Mycobacterium tuberculosis (M.tb), remains the world’s deadliest infectious killer; close to 30,000 new cases and over 4000 deaths from this disease are recorded worldwide each day [1]. Pulmonary TB (PTB) is the most prevalent clinical manifestation of TB, so its early diagnosis is of paramount importance in efforts to treat the disease and control its spread. Imaging examination, sputum smear microscopy, and culture are among the most commonly used methods for diagnosing PTB [2]. However, the sensitivity of these techniques is usually low and variable [3]. Moreover, sputum testing is not possible in PTB patients who are unable to expectorate. Although some rapid molecular testing methods with higher sensitivity are used in regions with advanced laboratory facilities, these methods are costly and difficult to implement in resource-limited settings [2].

Rapid serological diagnostic approaches are useful in the diagnosis of PTB. Although a number of serological tests, such as interferon gamma release assays (IGRAs) and M.tb antibody (TB-Ab) detection, have been developed for the accurate and rapid diagnosis of M.tb infection, the test results are not correlated with disease activity or treatment responses [4, 5]. Besides IGRAs and TB-Ab detection, CA-125 has also been used for PTB diagnosis [6,7,8]. CA-125 is a high-molecular-weight glycoprotein expressed on mesothelial cells lining the pleura, peritoneum, and pericardium and on epithelial cells of the endometrium and fallopian tubes [9]. Elevated serum CA-125 levels can be detected in some patients with malignant diseases involving the breasts, lungs, ovaries, colon, and pancreas, as well as in others with non-malignant diseases, such as endometriosis, uterine myomas, ovarian cysts, hepatic cirrhosis, pleural effusions, peritonitis, pancreatitis, and heart failure [8,9,10,11,12]. Although studies on the relation between CA-125 and PTB have been reported, some scholars believe that the results of these studies are biased toward patients with pleural effusions [13, 14]. In addition, the reported specificity and sensitivity of serum CA-125 in diagnosing PTB vary widely among studies, and no multicenter study with a large sample size has yet been published to confirm the value of serum CA-125 in diagnosing PTB.

The aim of the present study is to evaluate the diagnostic performance of serum CA-125 in PTB patients via a systematic review and meta-analysis of data collected from previous studies.

Methods

Literature search and selection

The present study was conducted in accordance with the recommendations of the Cochrane Diagnostic Test Accuracy Working Group [15]. The PubMed, Web of Science, Embase, Chinese BioMedical, China National Knowledge Infrastructure (CNKI), Wanfang Data, and Cochrane databases were searched by two researchers to screen for eligible studies published up to February 2020 by using the keywords “Tuberculosis”, “CA-125”, “Carbohydrate antigen 125”, “Cancer antigen 125”, “Tumor marker”, and “Cancer marker” as a combination of free text and thesaurus terms. The reference lists of the obtained articles were also searched for possible candidate studies. No language limits were applied. The inclusion criteria were as follows: studies that assessed the diagnostic accuracy of serum CA-125 for PTB diagnosis, included PTB and non-PTB participants, and reported true-positive, true-negative, false-positive, and false-negative rates; if these rates were not reported directly, the data in the original studies had to be sufficient to enable their calculation. Studies that failed to meet the inclusion criteria or lacked the essential information were excluded from the analyses.

Data extraction and quality assessment

Data extraction and quality assessment of the original studies were independently performed by two reviewers. The following information was recorded: author, year of publication, language, country, age of participants, number of cases, CA-125 detection method, reference standard, study design, and CA-125 cut-off, sensitivity, specificity, true-positive, true-negative, false-positive, and false-negative rates for the diagnosis of PTB.

A modified version of the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool was used to assess the methodological quality of the recruited articles [16]. Four domains, namely, patient selection, index test, reference standard, and flow and timing, were considered to assess the risk of bias, and three domains, namely, patient selection, index test, and reference standard, were assessed for applicability. In this study, some items for patient selection in the risk-of-bias assessment were revised; the items used to assess the methodological quality of the studies are listed in Table 1.

Table 1 The assessment items of methodological quality of studies

Discrepancies in article screening, data extraction, and quality assessment were resolved by a third reviewer through discussion or arbitration.

Statistical analysis

RevMan 5.2 software was used to assess the methodological quality of the included studies. STATA 15.1 software was used for the statistical analyses and for pooling sensitivity, specificity, positive likelihood ratio (LR+), negative likelihood ratio (LR−), and diagnostic odds ratio (DOR). The summary receiver operating characteristic (SROC) curve was used to assess the diagnostic performance of serum CA-125 in PTB patients. Heterogeneity among the recruited studies was tested using the Q statistic, and its extent was quantified using the I² index. Here, I² values of 25%, 50%, and 75% were considered to indicate low, moderate, and high heterogeneity, respectively. Meta-regression analysis by country, language, age of participants, number of cases, CA-125 detection method, and QUADAS-2 items was performed to identify potential sources of heterogeneity. Sensitivity analysis was performed to test the robustness of the results by using two methods: Cook’s distance was used to identify strongly influential studies, and a scatter plot of the standardized predicted random effects was used to detect outliers [17]. The meta-analysis was then repeated after the exclusion of strongly influential studies and outliers. Deeks’ funnel plot asymmetry test was used to evaluate publication bias [18]. A two-tailed P-value < 0.05 was considered statistically significant.
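The accuracy measures named above all derive from a study's 2×2 table. The following is a minimal Python sketch of those definitions, not the STATA procedure used in this study; the class and function names and the example counts are hypothetical, and no continuity correction is applied, so the sketch assumes that no cell is zero.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one diagnostic accuracy study (hypothetical container)."""
    tp: int  # PTB patients with a positive CA-125 result
    fp: int  # non-PTB participants with a positive result
    fn: int  # PTB patients with a negative result
    tn: int  # non-PTB participants with a negative result


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return sensitivity, specificity, LR+, LR- and DOR for one study.

    Assumes every cell is non-zero (no continuity correction).
    """
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    dor = lr_pos / lr_neg  # equivalent to (tp * tn) / (fp * fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Hypothetical counts, chosen only for illustration.
example = TwoByTwo(tp=85, fp=13, fn=15, tn=87)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Pooled estimates in a meta-analysis are obtained from a statistical model fitted across studies rather than by summing the cells of the individual tables, so this sketch describes single-study measures only.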

Results

Literature search

A total of 589 articles (English articles: 334; French articles: 2; Japanese articles: 2; Chinese articles: 251) were retrieved from the above-mentioned databases, and 243 duplicate articles were removed. The titles and abstracts of the remaining 346 articles were reviewed, and 312 articles on non-pulmonary TB were removed. The remaining 34 articles on PTB were further evaluated, and 18 articles that did not meet the inclusion criteria (they lacked some indicators, such as true-positive, false-positive, true-negative, and false-negative, and these indicators could not be calculated from the data in the original studies) were excluded. Finally, 16 articles (English articles: 10; Chinese articles: 6) meeting the inclusion criteria were recruited in the study [6, 7, 19,20,21,22,23,24,25,26,27,28,29,30,31,32] (Fig. 1).

Fig. 1 Flow chart of the process of the search strategy for study selection
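The counts reported for the selection process can be checked with simple arithmetic. The short Python sketch below only re-traces the numbers stated in the text; the variable names are illustrative.

```python
# Records retrieved, by language, as reported in the text.
retrieved = {"English": 334, "French": 2, "Japanese": 2, "Chinese": 251}
total_retrieved = sum(retrieved.values())

# Successive exclusions reported in the text.
after_duplicates = total_retrieved - 243   # duplicates removed
after_screening = after_duplicates - 312   # non-pulmonary TB removed
included = after_screening - 18            # inclusion criteria not met

# Included studies by language, as reported in the text.
included_by_language = {"English": 10, "Chinese": 6}

print(total_retrieved, after_duplicates, after_screening, included)
```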

Characteristics of the eligible studies

The characteristics of the 16 recruited studies are listed in Table 2. Six articles were published in Chinese from 2009 to 2019, and 10 articles were published in English from 2001 to 2018.

Table 2 Main characteristics of studies included in the meta-analysis

Quality assessment

Two domains were identified as major risks for bias: patient selection and index test. Twelve studies indicated a high risk of bias in patient selection [6, 7, 19, 21,22,23,24, 26, 28,29,30, 32]. For example, whether random or consecutive samples of patients were enrolled was not clear in two studies [26, 32], the control groups did not include other respiratory diseases in eight studies [6, 7, 21,22,23, 28, 29, 32], and clear descriptions of whether the case groups excluded other combined diseases were not provided in seven studies [6, 7, 19, 22, 24, 26, 30]. Because none of the studies discussed whether the detection results of CA-125 were interpreted without knowledge of the results of the reference standard [6, 7, 19,20,21,22,23,24,25,26,27,28,29,30,31,32], these studies were likely to have unclear risks of bias in the index test. Thresholds were used in all eligible studies, but the corresponding values were not specified in 10 studies [6, 7, 20, 23, 25, 27,28,29,30,31]; thus, these studies were also likely to have a high risk of bias in the index test. Three studies had high applicability concerns in the patient selection domain [6, 7, 22] (Fig. 2).

Fig. 2 Summary of methodological quality of studies according to the QUADAS-2 tool

Summary estimates

When all 16 studies were evaluated together, the pooled sensitivity, specificity, LR+, LR−, and DOR of CA-125 were 0.85 [95% confidence interval (CI) 0.75–0.91], 0.87 (95% CI 0.78–0.93), 6.65 (95% CI 3.62–12.20), 0.18 (95% CI 0.10–0.31), and 37.82 (95% CI 13.17–108.60), respectively (Table 3, Fig. 3). The area under the SROC curve (AUC) was 0.93. The SROC curve is displayed in Fig. 3.
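As a rough plausibility check, the likelihood ratios and the DOR can be recomputed from the rounded pooled sensitivity and specificity using the standard definitions. The worked equations below are a sketch only: they use the two-decimal values reported above, so small differences from the model-based estimates are to be expected.

```latex
\begin{align*}
\mathrm{LR}^{+} &= \frac{\text{sensitivity}}{1-\text{specificity}}
                 \approx \frac{0.85}{1-0.87} \approx 6.5
                 && \text{(reported: } 6.65\text{)} \\
\mathrm{LR}^{-} &= \frac{1-\text{sensitivity}}{\text{specificity}}
                 \approx \frac{1-0.85}{0.87} \approx 0.17
                 && \text{(reported: } 0.18\text{)} \\
\mathrm{DOR}    &= \frac{\mathrm{LR}^{+}}{\mathrm{LR}^{-}}
                 \approx \frac{0.85 \times 0.87}{0.13 \times 0.15} \approx 37.9
                 && \text{(reported: } 37.82\text{)}
\end{align*}
```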

Table 3 Pooled summary estimates of all studies

Fig. 3 Summary receiver operating characteristics plot of sensitivity and specificity. Each circle represents an individual study; solid diamond in middle is summary sensitivity and specificity; inner ellipse represents 95% confidence region, and outer ellipse represents 95% prediction region

Exploration of heterogeneity

The threshold effect is one of the main causes of heterogeneity in test accuracy studies. In the present meta-analysis, the “non-shoulder-arm” plot in the SROC space indicated no threshold effect (Fig. 3). In addition, the proportion of heterogeneity likely due to the threshold effect was 0.22; such a low value is indicative of the absence of a threshold effect.

The results of heterogeneity analysis for sensitivity, specificity, LR+, LR−, and DOR showed that all P values were less than 0.05 (Table 3), which indicates the presence of significant heterogeneity in the meta-analysis. All I² values were greater than 75% (Table 3), which indicates strong heterogeneity. Meta-regression analysis revealed that “pre-specified threshold (Yes/No)” and “mean or median age ≥ 45 years (Yes/No)” were significant sources of heterogeneity in sensitivity (P = 0.01 and P < 0.001) and specificity (P = 0.04 and P = 0.03); specifically, the pooled sensitivity and specificity of studies in which these factors were answered “Yes” were lower than those of studies in which these factors were answered “No” (Table 4). “Control group includes other respiratory diseases (Yes/No)” was also a significant source of heterogeneity in specificity (P < 0.001), and the pooled specificity of studies in which this factor was answered “Yes” was lower than that of studies in which this factor was answered “No” (Table 4).
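For readers unfamiliar with these heterogeneity statistics, the relation between Cochran's Q and the I² index can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the STATA routine used in this study: the function names and the example effect sizes are hypothetical, inverse-variance weights are assumed, and mapping the 25%, 50%, and 75% reference values onto bands is an assumption of the sketch.

```python
def cochran_q(estimates, variances):
    """Cochran's Q and its degrees of freedom, using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    return q, len(estimates) - 1


def i_squared(q, df):
    """I² (%) = max(0, (Q - df) / Q) * 100."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_label(i2):
    """Band I² using the 25/50/75% reference values (banding is an assumption)."""
    if i2 >= 75:
        return "high"
    if i2 >= 50:
        return "moderate"
    if i2 >= 25:
        return "low"
    return "negligible"


# Hypothetical log diagnostic odds ratios and their variances.
log_dors = [2.1, 3.8, 1.2, 4.5, 2.9]
variances = [0.20, 0.35, 0.15, 0.50, 0.25]

q, df = cochran_q(log_dors, variances)
i2 = i_squared(q, df)
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}% ({heterogeneity_label(i2)})")
```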

Table 4 Meta-regression analyses of sensitivity and specificity

Sensitivity analysis

The sensitivity analysis revealed that the studies of Liang et al. [21], Şahin et al. [28], and Yilmaz et al. [6] were the most influential in the present work (Fig. 4a). Among these studies, those of Şahin et al. [28] and Yilmaz et al. [6] were also identified as outliers with high standardized residuals (Fig. 4b). When these three studies were excluded from the meta-analysis, the pooled sensitivity and specificity of CA-125 decreased from 0.85 to 0.83 and from 0.87 to 0.80, respectively.

Fig. 4 Sensitivity analysis

Publication bias

In the present study, Deeks’ funnel plot asymmetry test did not reveal significant publication bias (P = 0.60), and the funnel plot did not exhibit asymmetry (Fig. 5).

Fig. 5 The potential publication bias assessment. The plot shows the symmetric distribution of the log of diagnostic odds ratios against the inverse root of effective sample sizes (ESS), indicating the absence of any publication bias
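The quantities plotted in a Deeks funnel plot can be sketched in a few lines of Python. The example below is illustrative only and is not the STATA procedure used in this study: the 2×2 counts are hypothetical, the 0.5 continuity correction is an assumption of the sketch, and only the slope and intercept of the ESS-weighted regression are returned. The P value of the asymmetry test, which comes from testing that slope against zero, is not computed here.

```python
import math


def effective_sample_size(n_diseased, n_non_diseased):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def log_dor(tp, fp, fn, tn, correction=0.5):
    """Log diagnostic odds ratio; the 0.5 correction is an assumption."""
    a, b, c, d = (x + correction for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c))


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, ybar - slope * xbar


# Hypothetical studies as (tp, fp, fn, tn); not data from the included articles.
studies = [(45, 5, 5, 45), (80, 20, 20, 80), (30, 10, 6, 54), (120, 15, 30, 135)]

ess = [effective_sample_size(tp + fn, fp + tn) for tp, fp, fn, tn in studies]
x = [1 / math.sqrt(e) for e in ess]   # inverse root of ESS (funnel plot axis)
y = [log_dor(*s) for s in studies]    # log DOR (funnel plot axis)

slope, intercept = weighted_line(x, y, ess)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A slope close to zero corresponds to a symmetric funnel plot; a formal conclusion requires the significance test on the slope.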

Discussion

PTB can be confirmed by the presence of M.tb or its DNA in respiratory specimens. However, M.tb in respiratory specimens may not be detected in some cases of PTB, and respiratory specimens may not always be available; in this case, other detection methods, such as serological tests, must be used to assist in the diagnosis of PTB. CA-125 detection is a potential alternative detection method for PTB. CA-125 levels can increase in various malignant or non-malignant diseases. Previous studies suggested that serum CA-125 levels increase in patients with extrapulmonary TB [4, 33,34,35] and PTB [6, 7, 19,20,21,22,23,24,25,26,27,28,29,30,31,32]. However, to the best of our knowledge, the results of these studies have not been evaluated systematically.

A systematic review and meta-analysis of 16 studies were performed in the present work to assess the diagnostic performance of serum CA-125 in PTB patients. The results indicate that CA-125 detection could be beneficial in the diagnosis of PTB.

In addition to recruited studies, the studies that were excluded from the present meta-analysis due to lack of relevant information on sensitivity and specificity also suggested that serum CA-125 levels may assist in the diagnosis of active PTB and monitoring of therapeutic responses. A study in South Korea demonstrated that the mean serum CA-125 level (38.9 ± 41.4 U/ml) of patients with PTB is higher than the reference value (35 U/ml); however, only 38% of the patients in this study had serum CA-125 levels higher than the reference value [8]. The same study also suggested that elevated CA-125 levels are independently related to women, positive acid-fast staining of sputum, cavitary lung lesions, and involvement of more than one lung on chest X-ray, and CA-125 levels decreased after anti-TB treatment [8]. A study in Taiwan, China, also indicated that 45% of PTB patients had elevated serum CA-125 levels prior to treatment and that serum CA-125 levels decrease with improvements in anti-TB treatment outcomes [4]. Ichiki et al. found elevated serum CA-125 levels in 44.4% of patients with active PTB; after treatment with antituberculosis drugs, mean serum CA-125 levels significantly decreased [36]. Tascı et al. from Turkey observed that the serum CA-125 levels of PTB patients are significantly higher than those of healthy controls [37]. A significant decrease in serum CA-125 levels was observed after anti-TB treatment; however, if the serum CA-125 level was lower than 35 U/ml prior to treatment, the reduction achieved following anti-TB treatment was not significant [37]. The group also found that patients with a higher degree of sputum smear positivity have higher serum CA-125 levels [37].

Although previous studies suggested that serum CA-125 levels could assist in diagnosing active PTB and monitoring therapeutic responses [4, 6,7,8, 20, 21, 27,28,29, 32, 37, 38] and may be related to the severity of PTB [7, 8, 37], caution must be exercised when applying this finding to clinical practice. First, several studies have also reported that the proportion of PTB patients with elevated serum CA-125 levels is not especially high (38–45%) [4, 8, 36], which suggests that the serum CA-125 levels of most PTB patients may not be elevated. One study indicated that differences in CA-125 levels between the PTB and healthy control groups are not statistically significant [39]. Second, the recruited studies may present some risks for bias. For instance, the control groups in 8 studies did not include other respiratory diseases, 7 studies did not clearly describe whether the case group excluded other combined diseases, 10 studies did not provide specific thresholds, and all studies failed to illustrate whether the index test results were interpreted without knowledge of the results of the reference standard. Third, the recruited studies demonstrated significant heterogeneity. Meta-regression analysis indicated that the variables “control group includes other respiratory diseases,” “pre-specified threshold,” and “mean or median age ≥ 45 years” were associated with lower diagnostic specificity and/or sensitivity of CA-125 in the diagnosis of PTB.

Because CA-125 is present in mesothelial cells of the pleura, pericardium, or peritoneum, especially in areas of inflammation, increases in CA-125 are consistently observed in diseases involving these structures [9]. Serous effusions derived from these structures have been associated with increased serum concentrations of CA-125 [40,41,42,43,44]. Huang et al. suggested that the serum CA-125 levels of patients with tuberculous serositis (234.82 ± 279.25 U/ml) are significantly higher than those of PTB patients (48.26 ± 53.30 U/ml) [4]. Diabetes mellitus and adenocarcinoma may also increase serum CA-125 levels. Du et al. demonstrated that the serum CA-125 levels of PTB patients with type 2 DM (82.04 ± 82.96 U/ml) are significantly higher than those of PTB patients without DM (46.56 ± 42.47 U/ml) in initial treatment; moreover, the serum CA-125 levels of pulmonary adenocarcinoma patients (287.95 ± 341.64 U/ml) were also significantly higher than those of PTB patients with type 2 DM in initial treatment and retreatment, PTB patients without type 2 DM in initial treatment and retreatment, inactive PTB, bacterial pneumonia patients, patients with type 2 DM without PTB, and normal controls [7].

Given this background, if the control group does not include other respiratory diseases or the case group does not exclude other comorbid conditions, the apparent specificity or sensitivity of CA-125 may be expected to increase. Knowledge of the reference standard is likely to influence the interpretation of the index test results [45], and the potential for bias could be associated with the subjectivity of interpreting the index test and the order of testing [16]. In addition, if the test threshold used in the original study on the accuracy of the diagnostic test was selected to optimize sensitivity and/or specificity, then the test performance is also likely to be overestimated [16]. In the present study, a mean or median age of ≥ 45 years was associated with lower diagnostic sensitivity and specificity of CA-125, which suggests that serum CA-125 may have better diagnostic value in younger patients than in older ones. Although three studies were identified as the most influential in this meta-analysis, and two of them were identified as outliers with high standardized residuals in the sensitivity analysis, exclusion of these three studies did not remarkably affect the pooled sensitivity and specificity of CA-125. No striking publication bias was detected. These findings support the validity of the results of our meta-analysis.

The present study presents several limitations. First, the meta-analysis did not recruit cohort studies, which may lead to overestimation of the test performance of serum CA-125. Second, the quality of the recruited studies was not high, and significant performance heterogeneity was noted; these limitations may affect the accuracy of serum CA-125 in the diagnosis of PTB. Although the results of the meta-regression analysis could explain part of the heterogeneity detected in the accuracy estimates, a considerable proportion of the heterogeneity observed remained unexplained. Third, although no major publication bias was found in this study, the possibility of publication bias cannot be completely excluded because positive results are generally more likely to be published than negative ones. Finally, although an extensive search for eligible sources was conducted, some qualified studies may still have been missed.

Conclusions

In conclusion, CA-125 presents potential practical value for diagnosing PTB, but its clinical applicability must be further examined. The combination of CA-125 with other clinical information, such as clinical symptoms, chest radiography, microscopy screening, cultivation or molecular testing of M.tb, IGRA, and the tuberculin skin test (TST), is recommended in clinical practice. Large, multicenter, high-quality studies should also be conducted to strengthen the case for its use.