Background

Over the past decade, digestive system tumors (DST) have become major causes of cancer-related mortality worldwide [1]. According to global cancer statistics compiled in 2016, death rates have increased for patients with DST, including those with liver cancer and pancreatic cancer [2]. Because sensitive diagnostic tests are lacking, many patients with DST are diagnosed at advanced stages, resulting in poor 5-year survival rates [2]. It is therefore necessary to identify novel, reliable biomarkers that can support early diagnosis and/or predict the prognosis of patients with DST.

Human growth differentiation factor 15 (GDF-15), also known as macrophage inhibitory cytokine-1 (MIC-1), is a divergent member of the transforming growth factor-β (TGF-β) superfamily of proteins [3, 4]. The human GDF-15 gene maps to chromosome 19 in the p13.1–13.2 region, and encodes a 25-kDa secreted growth factor that is highly expressed in cardiomyocytes, adipocytes, endothelial cells, and macrophages in both normal and diseased tissues [3,4,5]. GDF-15 levels are substantially increased in various pathological conditions, including inflammation and injury [5,6,7]. Notably, experimental and epidemiological evidence has demonstrated that GDF-15 levels are up-regulated in many types of DST, such as colorectal cancer (CRC) [8,9,10,11,12,13], gastric cancer (GC) [14, 15], pancreatic cancer (PC) [16,17,18,19,20], esophageal carcinoma (EC) [21, 22], and liver cancer [23, 24]. Recently, GDF-15 has received much attention as a diagnostic and prognostic biomarker in DST. However, studies assessing the clinical relevance of GDF-15 have reported inconsistent results, and their statistical power has been insufficient. In this study, we collected published studies regarding the expression of GDF-15 in DST and performed a meta-analysis to determine whether high GDF-15 expression levels can be used as a diagnostic or prognostic biomarker in DST.

Methods

Literature search

We searched the PubMed, EMBASE, EBSCO, Wiley Online Library, and Ovid databases for eligible studies from their inception to June 20, 2018. We used the following search terms or Medical Subject Headings (MeSH) words to identify eligible studies: “macrophage inhibitory cytokine-1/MIC-1/growth differentiation factor 15/GDF-15” AND “oesophageal cancer/oesophageal neoplasm/colorectal cancer/colorectal carcinoma/colon cancer/colon carcinoma/CRC/gastrointestinal cancer/gastric carcinoma/gastric cancer/stomach cancer/hepatocellular carcinoma/liver cancer/pancreatic carcinoma/pancreatic neoplasms/pancreatic ductal adenocarcinoma/pancreatic mass/digestive system tumor/digestive system neoplasm” AND (“survival/prognosis/outcome/hazard ratio/HR” OR “diagnosis/sensitivity/specificity/ROC/AUC/area under the curve”). Reference lists of the included articles and relevant reviews were also screened for studies that the database search might have missed.

Inclusion and exclusion criteria

Studies meeting the following criteria were included: (1) clinical studies reporting the diagnostic and/or prognostic features of GDF-15 in DST; (2) studies where the diagnostic parameters or survival outcomes included sensitivity, specificity, area under the curve (AUC), overall survival (OS), disease-free survival (DFS), progression-free survival (PFS), recurrence-free survival (RFS), tumor-specific survival (TSS), or cancer-specific survival (CSS); and (3) studies where the estimated hazard ratios (HRs) or odds ratios (ORs) with corresponding 95% confidence intervals (CIs) were available or could be calculated from published data. Accordingly, the exclusion criteria were: (1) reviews, basic studies, animal studies, letters, or conference abstracts; (2) studies for which the data needed for statistical analysis were unavailable and the authors could not be contacted; (3) studies with a high risk of bias in the quality assessment; and (4) articles written in a language other than English.

Data extraction and quality assessment

Data were extracted on sensitivity, specificity, and sample numbers, as well as HRs and their corresponding 95% CIs. Where such data were not reported directly, the values were estimated indirectly from published figures using Engauge Digitizer 4.1 software. Other information included the first author’s name, publication date, patient ethnicity, specimen type, test method, cut-off value settings, survival endpoints, follow-up time, quantiles of GDF-15, and other relevant clinicopathological characteristics.

The quality of diagnostic studies was judged according to the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) criteria, which are based on a 14-item list [25]. The quality of all retrospective cohort studies was assessed using the Newcastle-Ottawa Scale (NOS) checklist, wherein potential bias due to cohort selection, comparability, and outcome ascertainment is judged on a score ranging from 0 to 9 [26]. Studies were excluded if they were scored as low quality (i.e. a final score of less than 5 for NOS or less than 8 for QUADAS).

Statistical analysis

Statistical analyses were conducted using STATA 12.0 software (Stata Corporation, College Station, TX, USA). In the diagnostic meta-analysis, the primary outcomes were the pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and AUC, with corresponding 95% CIs. In the prognostic meta-analysis, a pooled HR with 95% CI was calculated to measure the association between GDF-15 expression (high vs. low) and the clinical outcomes of patients with DST. A combined HR > 1 implied that high GDF-15 expression was associated with worse survival. Heterogeneity was assessed using Cochran’s Q and I² statistics, with significant heterogeneity defined as P < 0.05 or I² > 50%. A fixed-effect or random-effects model was selected depending on the degree of heterogeneity. Influence analysis was undertaken to ascertain the effects of outlier studies on the overall results. Publication bias was examined using Deeks’ funnel plot asymmetry test, as well as Egger’s and Begg’s tests, with statistical significance defined as P < 0.05.
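
The heterogeneity statistics and the choice between models can be illustrated with a minimal sketch. It assumes inverse-variance weights and the DerSimonian-Laird estimator of between-study variance, neither of which is named in this paper; the function name `pool_effects` and the input values are hypothetical and are not data from the included studies.

```python
import math


def pool_effects(log_effects, std_errors):
    """Pool log-scale effects; return Q, I^2 (%), tau^2 and both pooled estimates.

    Assumption: inverse-variance weighting with the DerSimonian-Laird
    tau^2 estimator (the paper does not state which estimator was used).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1

    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    random = sum(w * y for w, y in zip(re_weights, log_effects)) / sum(re_weights)
    return {"Q": q, "I2": i2, "tau2": tau2, "fixed": fixed, "random": random}


# Hypothetical log hazard ratios and standard errors, for illustration only.
result = pool_effects([0.70, 0.85, 0.95], [0.15, 0.10, 0.20])
model = "random" if result["I2"] > 50.0 else "fixed"
print(model, round(math.exp(result[model]), 2))
```

When the estimated between-study variance is zero, the two models give the same pooled estimate, which is why the model choice matters mainly for heterogeneous data.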

Results

Search results and study quality

Figure 1 schematically displays the selection procedure for eligible articles. According to our search criteria, a total of 3281 studies were identified after the elimination of duplicates among databases. Among them, 3217 records were excluded after reading the titles and abstracts because of irrelevant content or non-original data. The remaining 64 studies were assessed by full-text evaluation, and a further 47 were excluded. Finally, 17 articles (12 relating to diagnosis and 9 relating to prognosis, with some articles contributing to both) were included in the final meta-analysis.

Fig. 1 Flow diagram of the study selection and exclusion criteria

Assessment of study bias using the 14-item QUADAS list or the NOS checklist showed that all diagnostic studies had QUADAS scores of ≥10 and all prognostic studies had NOS scores of ≥6 (Table 1, Additional file 1 and Additional file 2), indicating that these studies were suitable for our final statistical analysis.

Table 1 Main features of all included studies used in the diagnostic meta-analysis

Study characteristics

The main features of the included studies relating to the diagnostic role of GDF-15 are displayed in Table 1. Twelve diagnostic studies [12, 13, 15,16,17,18,19,20,21,22,23,24], comprising 2380 patients and 4630 paired controls, were included in the diagnostic meta-analysis. Study participants included Chinese [12, 13, 16, 20, 21, 23, 24], Australian [17, 19, 22], American [18], Polish [20], and Spanish [15] patients, with sample sizes ranging from 42 to 807. The types of DST covered in these studies included colorectal cancer (CRC) [8,9,10,11,12,13], gastric cancer (GC) [14, 15], pancreatic cancer (PC) [16,17,18,19], esophageal carcinoma (EC) [21, 22], and liver cancer [23, 24], and all final diagnoses were confirmed histologically after surgery. The samples collected included plasma [18, 22], serum [12, 13, 15,16,17, 19, 21, 23, 24], and tissue [22] samples obtained prior to treatment. Enzyme-linked immunosorbent assay (ELISA) was the main method used to measure GDF-15 levels [12, 13, 15,16,17,18,19, 21,22,23,24], and only one study used multiplexed tandem PCR (MT-PCR) [22].

We also included 9 cohort studies [8,9,10,11, 13,14,15, 21, 22], with a total sample size of 2200, to assess the association between GDF-15 expression levels (high vs. low) and the clinical outcomes of patients with DST (Table 2). All 9 studies were retrospective, and study populations included Chinese [8, 13, 21], Australian [10, 14, 22], American [11], and Spanish [15] patients. Survival outcomes included OS [8,9,10,11, 14, 15, 22], CSS or TSS [11, 13, 21], RFS [21], and PFS [15], with average follow-up times ranging from 30 months to 9.2 years. In one study [11], HRs calculated for different quartile points were treated as independent data sets. Survival outcomes such as RFS and PFS were not pooled because too few studies reported them.

Table 2 Main features of all included studies used in the prognostic meta-analysis

Heterogeneity

In the diagnostic meta-analysis, heterogeneity was observed in the overall pooled data, with an estimated I² value of 99.38% (P < 0.001). Heterogeneity was also detected among 6 groups in our collected diagnostic data (Table 3), with I² values ranging from 78.4 to 93.7% (P < 0.0001). Thus, random-effects models were used for these analyses. In our pooled data for prognosis, no significant heterogeneity was detected.

Table 3 Subgroup analyses of GDF-15 testing for CRC based on different covariates

Diagnostic meta-analyses

The overall pooled sensitivity, specificity, diagnostic odds ratio (DOR), and area under the curve (AUC) for GDF-15, used to distinguish DST from non-cancerous tumors, were 0.74 (95% CI: 0.68–0.80), 0.83 (95% CI: 0.75–0.89), 14.07 (95% CI: 9.12–21.71), and 0.84, respectively (Fig. 2 and Table 3), corresponding to a positive likelihood ratio (PLR) of 4.38 (95% CI: 3.00–6.39) and a negative likelihood ratio (NLR) of 0.31 (95% CI: 0.25–0.38). These results suggest that GDF-15 level is a useful alternative biomarker to differentiate patients with DST from those with non-cancerous tumors.
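
The relationship between these summary measures can be checked with the standard definitions PLR = sensitivity / (1 − specificity), NLR = (1 − sensitivity) / specificity, and DOR = PLR / NLR. The sketch below applies them to the pooled sensitivity and specificity reported above; the function name `likelihood_ratios` is hypothetical. Because the published PLR, NLR, and DOR were pooled across studies rather than derived from the summary point, small differences from these back-of-the-envelope values are expected.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (PLR, NLR, DOR) from a single sensitivity/specificity pair."""
    plr = sensitivity / (1.0 - specificity)   # true-positive rate / false-positive rate
    nlr = (1.0 - sensitivity) / specificity   # false-negative rate / true-negative rate
    dor = plr / nlr                           # diagnostic odds ratio
    return plr, nlr, dor


# Pooled sensitivity (0.74) and specificity (0.83) as reported in the text.
plr, nlr, dor = likelihood_ratios(0.74, 0.83)
print(round(plr, 2), round(nlr, 2), round(dor, 1))
```

These values come out at roughly 4.35, 0.31, and 13.9, close to the pooled estimates of 4.38, 0.31, and 14.07.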

Fig. 2 Forest plots of the overall pooled data for GDF-15 (a) sensitivity, (b) specificity, (c) DOR, and (d) AUC used to diagnose DST

Stratified analyses were performed in the diagnostic meta-analysis based on cancer type, sample type, cut-off setting, and ethnicity. As summarized in Table 3, the pooled AUCs of GDF-15 for detecting PC, EC, GC, and liver cancer were estimated to be 0.82, 0.84, 0.90, and 0.85, respectively. Moreover, GDF-15 had an AUC of 0.82 for distinguishing PC from pancreatitis, which was higher than its AUC for distinguishing PC from healthy individuals (AUC = 0.73). When stratified by sample type, serum-based GDF-15 testing achieved a specificity of 0.80 (95% CI: 0.78–0.81) and an AUC of 0.87, which were superior to those of plasma-based testing. Diagnostic efficacy also differed by cut-off value: a cut-off setting <2000 pg/mL showed an AUC of 0.85 for PC (PC vs. non-cancerous tumors), and 0.87 for all cancers (all cancers vs. non-cancerous tumors). When stratified by ethnicity, GDF-15 testing yielded the same AUC of 0.83 in Caucasian and Asian patients, whereas testing in Asian patients conferred a higher specificity of 0.81 (95% CI: 0.79–0.83). The raw data used for the diagnostic meta-analysis are provided in Additional file 3.

Prognostic significance

Data from 2200 patients were used to define the association between GDF-15 levels and patient prognosis. Patients with DST who had increased GDF-15 levels had worse overall survival (OS) (HR = 2.34, 95% CI: 2.03–2.70, P < 0.001; I² = 0.0%) than patients with low GDF-15 levels (Fig. 3a). Moreover, elevated levels of GDF-15 were associated with a significantly shorter OS time in patients with CRC (HR = 2.27, 95% CI: 1.96–2.63, P < 0.001; I² = 0.0%) (Fig. 3b). We also included 6 individual data sets for CSS and TSS, and the results showed that GDF-15 levels were correlated with worse CSS in CRC (HR = 2.33, 95% CI: 1.95–2.78, P < 0.001; I² = 0.0%) (Fig. 3c). The raw data used for the prognostic meta-analysis are provided in Additional file 4.
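
As a consistency check on a reported HR and its interval, the standard error of the log HR can be recovered from the 95% CI limits, assuming the interval is symmetric on the log scale (the usual construction, though not stated explicitly in this paper). The sketch below applies this to the pooled OS estimate reported above; the function name `log_hr_and_se` is hypothetical.

```python
import math


def log_hr_and_se(hr, ci_low, ci_high, z=1.959964):
    """Return (log HR, standard error) from an HR and its 95% CI.

    Assumption: the CI was built as exp(log HR +/- z * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return log_hr, se


# Pooled OS estimate for DST as reported in the text: HR = 2.34 (2.03-2.70).
log_hr, se = log_hr_and_se(2.34, 2.03, 2.70)
z_stat = log_hr / se  # Wald statistic for the null hypothesis HR = 1
print(round(log_hr, 3), round(se, 4), round(z_stat, 1))
```

The resulting Wald statistic is far above the threshold for P < 0.001, in agreement with the reported significance level. The same conversion is what allows study-level HRs to be combined by inverse-variance weighting.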

Fig. 3 Forest plots of pooled HRs (95% CI) for GDF-15 levels in the prognostic datasets. a Pooled HR (95% CI) of OS data for DST; b pooled HR (95% CI) of OS data for CRC; c pooled HR (95% CI) of CSS/TSS data for CRC

Influence analysis and meta-regression

Influence analysis was conducted for both diagnostic and prognostic meta-analyses using STATA 12.0 software. One individual study [16] was identified as an outlier in the overall pooled diagnostic dataset for DST (Fig. 4a) and PC (Fig. 4b). However, no outlier studies were found at the upper or lower CI limit of the prognostic studies, indicating that the selected studies had relatively high homogeneity (Fig. 4c, d, and e).

Fig. 4 Influence analysis of outlier studies. a The overall pooled diagnostic dataset for DST; b the combined diagnostic dataset for GDF-15 levels in PC; c the overall prognostic dataset of OS for DST; d the combined prognostic dataset of OS and e CSS for GDF-15 in CRC

Meta-regression was performed to trace the causes of heterogeneity, using seven predefined covariates: ethnicity, sample size, control size, cancer type, test matrix, cut-off setting, and QUADAS score. As displayed in Additional file 5, QUADAS score yielded the lowest P-value (0.0349) among the covariates, suggesting that study quality is a likely source of heterogeneity among the diagnostic studies.

Publication bias

Deeks’ funnel plot asymmetry test demonstrated no clear publication bias in the overall diagnostic meta-analyses of DST and PC (Fig. 5a and b; n = 22 and 12, P = 0.375 and 0.479, respectively). Additionally, no significant publication bias, as assessed using Egger’s and Begg’s tests, was detected in the meta-analyzed prognostic data (all with P > 0.05) (Fig. 5c, d and e).
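
For readers unfamiliar with Egger’s test, it regresses each study’s standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero suggests funnel plot asymmetry. The following is a minimal sketch of that regression using ordinary least squares. It is not the STATA routine used in this study, the function name `egger_intercept` is hypothetical, and the input values are invented for illustration. Deeks’ test for diagnostic data uses a different regression and is not shown.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, its standard error, t statistic) for Egger's regression."""
    n = len(effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


# Hypothetical log HRs in which smaller studies (larger SE) show larger effects.
intercept, se_int, t_stat = egger_intercept([0.5, 0.7, 0.9, 1.2], [0.1, 0.2, 0.3, 0.4])
print(round(intercept, 3), round(t_stat, 2))
```

The t statistic is compared with a t distribution on n − 2 degrees of freedom to obtain the P-value; that step is omitted here to keep the sketch free of external dependencies.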

Fig. 5 Publication bias judged by Deeks’ funnel plot asymmetry test for the overall pooled diagnostic effect of (a) DST and (b) PC, and Begg’s funnel plot for the overall pooled prognostic effect of (c) DST, (d) PC, and (e) CRC

Discussion

Expression levels of growth differentiation factor 15 (GDF-15) are increased in most DSTs, including those of the colon [27], stomach [28], pancreas [29], liver [23, 24], and esophagus [21, 22]. Given its important role in DST, GDF-15 has the potential to be a promising biomarker in these tumors [6, 8,9,10,11,12,13,14,15,16,17,18,19, 21,22,23,24]. Nevertheless, the utility of GDF-15 as a biomarker has not been confirmed, owing to a lack of pooled, evidence-based data. In the present study, we performed a meta-analysis of the available evidence on GDF-15 as a diagnostic and prognostic biomarker in DSTs.

As expected, GDF-15 performed acceptably as a diagnostic biomarker in DST: the pooled sensitivity, specificity, and AUC of GDF-15 for detecting DST were 0.74, 0.83, and 0.84, respectively. Although the combined sensitivity was only moderate, the specificity and AUC were relatively high, indicating an acceptable diagnostic performance for GDF-15. The diagnostic odds ratio (DOR) is another measure of diagnostic effectiveness, with a value higher than 1.0 indicating that the test discriminates between patients and controls [30]. Herein, we obtained a DOR of 14.07, further suggesting that GDF-15 testing can be used to diagnose DST. The pooled PLR of 4.38, which is the ratio of the true-positive rate to the false-positive rate, indicated that a positive GDF-15 result was approximately four times as likely in patients with DST as in controls.

Several groups have demonstrated that GDF-15 may be used as a biomarker to assist in the detection of PC, EC, GC, and hepatocellular carcinoma (HCC) [12, 13, 15,16,17,18,19, 21,22,23,24]. In our stratified analysis, 4 groups of carcinomas had been evaluated repeatedly: the pooled AUCs of GDF-15 for detecting PC, EC, GC, and HCC were estimated to be 0.82, 0.84, 0.90, and 0.85, respectively, showing that GDF-15 testing achieved its highest discriminative efficacy for GC. In PC, GDF-15 testing had an AUC of 0.82 for differentiating PC from pancreatitis, which was higher than its AUC for distinguishing PC from healthy individuals. These data indicate that GDF-15 may also be a useful indicator for the differential diagnosis of PC and pancreatitis. Additionally, we observed matrix effects on test performance: serum-based GDF-15 testing yielded a better AUC than plasma-based testing, suggesting that serum samples may be more suitable than plasma samples for GDF-15 testing. We also found differences in diagnostic efficacy based on cut-off value: a cut-off setting of less than 2000 pg/mL exhibited better performance for all cancer types. Lastly, for data stratified by ethnicity, we found equal diagnostic efficacy of GDF-15 testing in Caucasians and Asians. However, these subgroup findings rest on limited data, and further investigation is needed.

We found that increased levels of GDF-15 were an independent prognostic marker for DST [8,9,10,11, 13,14,15, 21, 22]. Previously, whether GDF-15 could serve as a prognostic marker for OS, DFS, or RFS in cancer was considered controversial. In our prognostic analysis, 2106 patients with complete follow-up data were included. A clear association between increased GDF-15 levels and shorter OS was observed in patients with DST (HR = 2.34), as well as in those with colorectal cancer (HR = 2.27). We also included 11 individual studies that measured CSS or TSS in CRC, with results that showed a correlation between GDF-15 expression and poor CSS and TSS (HR = 2.33). These data suggest that GDF-15 could be used as an independent prognostic biomarker in DST. Previous studies have hypothesized that GDF-15 could be used to assist the prediction of cancer recurrence and metastasis in CRC [31, 32]. However, the data available on CRC recurrence and metastasis were insufficient and were therefore not analyzed in our study.

Study heterogeneity and bias are very common in meta-analyses [33]. We observed significant heterogeneity in our diagnostic meta-analyses and therefore attempted to identify its causes. Firstly, the included studies involved varying patient populations. Secondly, patients participating in these studies had different types of cancer and received a wide range of treatments. Moreover, the primary method of GDF-15 detection (ELISA) used a different cut-off value in each study; in particular, the cut-off points were markedly higher in gastric and liver cancers than in other malignancies. Whether these differences in cut-off points were due to cancer type or to the limited number of studies warrants further investigation. Collectively, these factors may have resulted in non-homogeneous conditions. We therefore conducted a sensitivity analysis and a meta-regression. The sensitivity analysis identified one outlier study, and the degree of heterogeneity decreased after we excluded the outlier data from the analysis. The univariate meta-regression showed that, among the covariates tested, only study quality (different QUADAS scores) appeared to be a source of heterogeneity.

This study has several limitations. Firstly, sample sizes were small for some cancer types, and few articles were available. Secondly, significant heterogeneity was observed in the diagnostic meta-analysis, which may compromise the accuracy of the overall pooled estimates. Lastly, GDF-15 expression was detected primarily by ELISA, which might not be the optimal method to detect GDF-15.

Conclusions

In summary, we meta-analyzed the diagnostic and prognostic value of GDF-15 in patients with DST. Our analysis provides evidence that elevated GDF-15 levels may be used as a novel diagnostic and prognostic biomarker for DST.