Background

Multiparametric magnetic resonance imaging (mpMRI) has become a cornerstone of prostate cancer (PC) diagnosis because it is cost-effective and highly accurate [1,2,3,4]. A major concern in PC management is possible overdiagnosis and overtreatment, because PCs differ widely in biological behavior, which is graded using the Gleason score (GS) [5,6,7]. GS is still one of the most important prognostic features in prostate cancer [6]. A cancer with a GS of 6 or lower is considered clinically insignificant and will most likely not result in cancer-related death. Therefore, it can in some cases be managed with clinical surveillance [5]. However, PC with a GS of 7 or higher is clinically significant and is associated with tumor-related morbidity and mortality [8].

In clinical routine, mpMRI is very beneficial because of its high negative predictive value [1]. However, mpMRI can also detect more lesions than the conventional diagnostic workflow, which might result in the detection of more insignificant cancers [9].

Diffusion-weighted imaging (DWI) is an important sequence of mpMRI. DWI reflects free water movement in tissues [10]. Furthermore, restriction of free water movement in tissues can be quantified by the apparent diffusion coefficient (ADC) [10]. ADC is associated with histological features that restrict the diffusion of water molecules, such as cell count and protein concentration in the extracellular space [11, 12]. Thus, ADC may aid in the discrimination of several tumors. Numerous studies have reported that malignant tumors have significantly lower ADC values than benign lesions [13, 14]. PC also had lower ADC values than benign prostatic tissue [15]. Therefore, DWI is an established technique for the detection of PC, especially in the peripheral zone [3].

Besides its diagnostic potential, DWI/ADC can also help characterize prostatic tumors. A recent meta-analysis showed that ADC values correlated inversely with GS [16]. In detail, a correlation coefficient of r = −0.45 between ADC and GS was reported across all PCs [16]. Furthermore, the correlation was stronger in PCs located in the peripheral zone (r = −0.48) than in PCs arising in the transitional zone (r = −0.22). Presumably, ADC may discriminate low-risk PCs from high-risk tumors. However, the published data are inconsistent and based on small single-center studies.

The purpose of the present systematic review and meta-analysis was to compare ADC values between clinically significant and non-significant PCs according to GS in a large patient sample.

Methods

Data acquisition

The MEDLINE, EMBASE and SCOPUS databases were screened for studies on associations between ADC and Gleason score in PC published up to May 2019. The paper acquisition is summarized in Fig. 1.

Fig. 1

PRISMA flow chart. An overview of the paper acquisition. Overall, 27 articles comprising 1633 patients were suitable for the analysis

The following search words were used: “prostate cancer OR prostatic carcinoma OR prostatic cancer OR prostate carcinoma AND DWI OR diffusion weighted imaging OR ADC OR apparent diffusion coefficient AND Gleason score AND Gleason”.

The primary endpoint of the systematic review was the ADC value of PC groups according to Gleason score.

Studies (or subsets of studies) were included if they satisfied all of the following criteria: (1) patients with PC confirmed by histopathology, (2) mpMRI with a DWI sequence quantified by ADC values, and (3) ADC values reported according to GS.

Exclusion criteria were (1) systematic reviews, (2) case reports, (3) treatment prediction or histopathology performed after treatment, (4) non-English language, and (5) experimental (xenograft or animal model) studies.

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement was used for the analysis [17]. In total, 26 studies were suitable for the analysis and included in the present study [18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43].

Quality assessment

The methodological quality of the acquired studies was independently evaluated by two readers (A.S. and H.J.M.) using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) instrument [44]. Results of QUADAS-2 assessments are shown in Fig. 2.

Fig. 2

QUADAS-2 quality assessment of the included studies. Most studies showed an overall low risk of bias

Statistical analysis

The meta-analysis was performed using RevMan 5.3 (2014; Cochrane Collaboration, Copenhagen, Denmark). Heterogeneity was calculated by means of the inconsistency index I² [45, 46]. Finally, DerSimonian and Laird [47] random-effects models with inverse-variance weights were used without any further correction.
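The pooling described above can be sketched in a few lines. The following is a minimal illustration of a DerSimonian and Laird random-effects estimate with inverse-variance weights, not the RevMan implementation used for the analysis. The function name `dersimonian_laird`, the 1.96 multiplier for the 95% CI, and the example inputs are assumptions made for illustration; the inputs are hypothetical study means and standard errors, not data from the included studies.

```python
import math


def dersimonian_laird(means, std_errs):
    """Pool study means with a DerSimonian-Laird random-effects model.

    means    -- per-study mean values (e.g. mean ADC)
    std_errs -- per-study standard errors of those means
    """
    variances = [se ** 2 for se in std_errs]
    # Fixed-effect inverse-variance weights.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum_w
    # Cochran's Q (reported by RevMan as Chi2) and its degrees of freedom.
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))
    df = len(means) - 1
    # Method-of-moments estimate of the between-study variance Tau2.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Inconsistency index I2 in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add Tau2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * m for wi, m in zip(w_re, means)) / sum_w_re
    se = math.sqrt(1.0 / sum_w_re)
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "q": q,
        "df": df,
        "i2": i2,
    }


# Hypothetical example: three studies, mean ADC in units of 10^-3 mm^2/s.
result = dersimonian_laird([1.0, 1.2, 0.9], [0.05, 0.05, 0.05])
print(result)
```

Because the between-study variance is added to every study's variance, studies are weighted more evenly than in a fixed-effect model when heterogeneity is high.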

Results

Of the 26 included studies, 8 (30.7%) were of prospective and 18 (69.3%) of retrospective design. Different 1.5 T scanners were used in 7 (26.9%) studies and 3 T scanners in 19 (73.1%) studies. An additional endorectal coil was used in 7 studies (26.9%), and bowel preparation was performed in 8 studies (30.7%).

Regarding the QUADAS-2 assessments, most studies had a low risk of bias. For patient selection, 8 studies (29.6%) had an unclear risk of bias, mainly because the inclusion and exclusion criteria of the patient sample were not sufficiently reported. For the reference standard, 7 studies (26.9%) had an unclear risk of bias because the histopathology standard or blinded reading of the pathologic specimen was insufficiently reported. Only small concerns were identified for the reported index tests (7.4% of studies with unclear risk of bias). For flow and timing, 8 studies (29.6%) had an unclear risk of bias because the time interval between biopsy and imaging was not sufficiently reported. The same reasons were identified in the groups for applicability of the results.

In all studies, the diagnosis was confirmed by histopathology. The histopathological diagnosis and scoring of PC were made on specimens after radical prostatectomy in 14 studies (53.8%), after transrectal ultrasound-guided biopsy in 10 studies (38.5%), and with both techniques in 2 studies (7.7%).

The acquired 26 studies comprised a total of 1633 lesions. Clinically significant PCs (Gleason score 7 and higher) were diagnosed in 1078 cases (66.0%) and insignificant PCs (Gleason score 5 and 6) in 555 cases (34.0%).

The pooled mean ADC value of the clinically significant PC was 0.86 × 10⁻³ mm²/s [95% CI 0.83–0.90, Tau² = 0.01, Chi² = 1078.47, df = 45, I² = 96%] and the pooled mean ADC value of insignificant PC was 1.1 × 10⁻³ mm²/s [95% CI 1.03–1.18, Tau² = 0.04, Chi² = 1234.54, df = 24, I² = 98%]. Figure 3 shows the distribution of ADC values for clinically significant and insignificant PC.
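The reported inconsistency index can be reproduced from the reported Chi² statistic and its degrees of freedom, using the standard definition I² = (Q − df) / Q, where Q is Cochran's Q (labelled Chi² in the RevMan output). The short sketch below applies this to the two pooled groups above; the helper name `i_squared` is an assumption made for illustration.

```python
def i_squared(q, df):
    """Inconsistency index I2 (percent) from Cochran's Q and its df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Chi2 and df values as reported for the two pooled groups.
reported = {
    "clinically significant PC": (1078.47, 45),
    "clinically insignificant PC": (1234.54, 24),
}

for label, (q, df) in reported.items():
    print(f"{label}: I2 = {i_squared(q, df):.1f}%")
```

The computed values are about 95.8% and 98.1%, which agree with the reported 96% and 98% after rounding.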

Fig. 3

a. Forest plots of the mean apparent diffusion coefficients of clinically insignificant PC comprising Gleason score 5 and 6. The pooled mean ADC value was 1.10 × 10⁻³ mm²/s [95% CI 1.03–1.18]. b. Forest plots of the mean apparent diffusion coefficients of clinically significant PC comprising Gleason score 7 and higher. The pooled mean ADC value was 0.86 × 10⁻³ mm²/s [95% CI 0.83–0.90]. c. Box plots of the mean ADC values of clinically insignificant and clinically significant PC. Clinically insignificant PC have higher ADC values than clinically significant PC.

Thereafter, PCs were divided into subgroups according to the GS as follows: GS 5 and 6 (n = 555, 34.0%), GS 7 (n = 258, 15.8%), GS 8 (n = 42, 2.6%) and GS 9 (n = 30, 1.8%). The pooled mean ADC values of the subgroups were as follows: GS 5 + 6 = 1.1 × 10⁻³ mm²/s [95% CI 1.03–1.18, Tau² = 0.04, Chi² = 1234.54, df = 24, I² = 98%], GS 7 = 0.87 × 10⁻³ mm²/s [95% CI 0.80–0.94, Tau² = 0.01, Chi² = 209.3, df = 10, I² = 95%], and GS 8 and 9 = 0.76 × 10⁻³ mm²/s [95% CI 0.71–0.82, Tau² = 0.01, Chi² = 235.03, df = 15, I² = 94%] (Fig. 4).

Fig. 4

a. Forest plots of the mean apparent diffusion coefficients of PC with Gleason score 7. The pooled mean ADC value was 0.87 × 10⁻³ mm²/s [95% CI 0.80–0.94]. b. Forest plots of the mean apparent diffusion coefficients of PC with Gleason score 8 and higher. The pooled mean ADC value was 0.76 × 10⁻³ mm²/s [95% CI 0.71–0.82]. c. Box plots of the mean ADC values of the clinically insignificant PC group (Gleason score 5 and 6), the Gleason score 7 group, and the Gleason score 8 and 9 group. There is a clear trend for PC with a higher Gleason score to have lower ADC values.

Furthermore, the GS 7 group was divided into cancers with a primary Gleason pattern 3 and a sum of 3 + 4 and those with a primary Gleason pattern 4 and a sum of 4 + 3. GS 3 + 4 was reported in 7 studies with a total of 170 lesions. The pooled mean ADC value was 0.91 × 10⁻³ mm²/s [95% CI 0.82–1.01, Tau² = 0.02, Chi² = 155.92, df = 6, I² = 96%]. GS 4 + 3 was reported in 4 studies with a total of 88 lesions. The pooled mean ADC value was 0.80 × 10⁻³ mm²/s [95% CI 0.69–0.91, Tau² = 0.01, Chi² = 41.97, df = 3, I² = 93%] (Fig. 5).

Fig. 5

a. Forest plots of the mean apparent diffusion coefficients of PC with Gleason score 3 + 4. The pooled mean ADC value was 0.91 × 10⁻³ mm²/s [95% CI 0.82–1.01]. b. Forest plots of the mean apparent diffusion coefficients of PC with Gleason score 4 + 3. The pooled mean ADC value was 0.80 × 10⁻³ mm²/s [95% CI 0.69–0.91]. c. Box plots of the mean ADC values of Gleason score 3 + 4 and Gleason score 4 + 3. Gleason score 4 + 3 PC have lower ADC values than Gleason score 3 + 4 PC.

Subgroup analyses

To explore the high heterogeneity of the results, we performed subgroup analyses.

Clinically insignificant PC

The pooled mean ADC value of the clinically insignificant PC (GS ≤ 6) was 1.16 × 10⁻³ mm²/s [95% CI 1.01–1.31, Tau² = 0.04, Chi² = 228.4, df = 7, I² = 97%] in the studies that used endorectal coils and 1.07 × 10⁻³ mm²/s [95% CI 0.98–1.15, Tau² = 0.04, Chi² = 1030.75, df = 18, I² = 98%] in the reports without (Fig. 6a).

Fig. 6

a. Forest plots of the mean apparent diffusion coefficients of the subgroup analysis of clinically insignificant PC according to endorectal coil use. The pooled mean ADC value was 1.16 × 10⁻³ mm²/s [95% CI 1.01–1.31] for lesions investigated with endorectal coils and 1.07 × 10⁻³ mm²/s [95% CI 0.98–1.15] for tumors investigated without. b. Forest plots of the mean apparent diffusion coefficients of the subgroup analysis of clinically insignificant PC according to histopathology specimen. For PC investigated histopathologically after radical prostatectomy, the pooled mean ADC value was 1.10 × 10⁻³ mm²/s [95% CI 1.01–1.19], and it was 1.12 × 10⁻³ mm²/s [95% CI 0.95–1.28] for lesions classified based on biopsy specimens. c. Forest plots of the mean apparent diffusion coefficients of the subgroup analysis of clinically insignificant PC according to field strength. The pooled mean ADC value was 1.10 × 10⁻³ mm²/s [95% CI 1.01–1.20] for lesions investigated by 3 T scanners and 1.06 × 10⁻³ mm²/s [95% CI 0.97–1.16] for PC investigated by 1.5 T scanners. d. Forest plots of the mean apparent diffusion coefficients of the subgroup analysis of clinically significant PC according to endorectal coil use. The pooled mean ADC value was 0.89 × 10⁻³ mm²/s [95% CI 0.80–0.98] for lesions investigated with endorectal coils and 0.84 × 10⁻³ mm²/s [95% CI 0.81–0.88] for PC investigated without endorectal coils. e. Forest plots of the mean apparent diffusion coefficients of the subgroup analysis of clinically significant PC according to histopathological specimens. For PC investigated histopathologically after radical prostatectomy, the pooled mean ADC value was 0.85 × 10⁻³ mm²/s [95% CI 0.80–0.91, Tau² = 0.02, Chi² = 728.04, df = 25, I² = 97%], and it was 0.87 × 10⁻³ mm²/s [95% CI 0.82–0.92, Tau² = 0.01, Chi² = 310.2, df = 21, I² = 93%] for lesions investigated on biopsy specimens. f. Forest plots of the mean apparent diffusion coefficients of the subgroup analysis of clinically significant PC according to field strength. The pooled mean ADC value was 0.87 × 10⁻³ mm²/s [95% CI 0.83–0.92] for PC investigated by 3 T scanners and 0.83 × 10⁻³ mm²/s [95% CI 0.76–0.89] for tumors analyzed by 1.5 T scanners.

In PC that were investigated by histopathology after radical prostatectomy, the pooled mean ADC value was 1.10 × 10⁻³ mm²/s [95% CI 1.01–1.19, Tau² = 0.03, Chi² = 453.26, df = 13, I² = 97%]. It was 1.12 × 10⁻³ mm²/s [95% CI 0.95–1.28, Tau² = 0.07, Chi² = 581.36, df = 10, I² = 98%] in tumors that were investigated after prostate biopsy (Fig. 6b).

Furthermore, the pooled mean ADC value was 1.10 × 10⁻³ mm²/s [95% CI 1.01–1.20, Tau² = 0.04, Chi² = 1147.69, df = 17, I² = 99%] for lesions investigated by 3 T scanners and 1.06 × 10⁻³ mm²/s [95% CI 0.97–1.16, Tau² = 0.02, Chi² = 110.13, df = 8, I² = 93%] for tumors investigated by 1.5 T scanners (Fig. 6c).

Clinically significant PC

Regarding clinically significant PC, the pooled mean ADC value was 0.89 × 10⁻³ mm²/s [95% CI 0.80–0.98, Tau² = 0.03, Chi² = 287, df = 15, I² = 95%] for investigations with endorectal coils and 0.84 × 10⁻³ mm²/s [95% CI 0.81–0.88, Tau² = 0.01, Chi² = 840.88, df = 33, I² = 96%] for studies without use of endorectal coils (Fig. 6d).

The pooled mean ADC value in PC analyzed histopathologically after radical prostatectomy was 0.85 × 10⁻³ mm²/s [95% CI 0.80–0.91, Tau² = 0.02, Chi² = 728.04, df = 25, I² = 97%]. It was 0.87 × 10⁻³ mm²/s [95% CI 0.82–0.92, Tau² = 0.01, Chi² = 310.2, df = 21, I² = 93%] for cases investigated histopathologically after prostate biopsy (Fig. 6e).

Regarding field strength, the pooled mean ADC value was 0.87 × 10⁻³ mm²/s [95% CI 0.83–0.92, Tau² = 0.01, Chi² = 825.4, df = 30, I² = 96%] for PC investigated by 3 T scanners and 0.83 × 10⁻³ mm²/s [95% CI 0.76–0.89, Tau² = 0.02, Chi² = 330.24, df = 19, I² = 94%] for tumors investigated by 1.5 T scanners (Fig. 6f).

Discussion

The present work is the first systematic review and meta-analysis comparing ADC values of clinically significant and insignificant PCs classified according to GS. Because it is based on a large cohort, it provides robust data on the quantitative analysis of DWI for distinguishing between different PCs.

GS is still one of the most important prognostic factors in PC and provides a robust and durable method to stratify patients [6, 48]. GS is significantly associated with biochemical recurrence-free survival [49]. As already mentioned, there is a need to identify clinically insignificant PCs (GS 6 and lower), which are in almost every case sufficiently treated with radical prostatectomy, whereas cancers with GS 7 and higher are defined as clinically significant, with a possibility of recurrence and tumor-related death [48]. Predicting GS non-invasively by mpMRI might be crucial because mpMRI is increasingly used in clinical routine. Thus, more cancers will be detected, which might result in overdiagnosis and overtreatment when more clinically insignificant tumors are found.

As reported previously, DWI/ADC can reflect tissue microstructure in several tumor entities, including PC [11]. In most studies, ADC correlated inversely with cellularity [11]. This is explained by the fact that extracellular protons mainly produce the MRI signal. Thus, in cell-rich tumors, extracellular water movement is reduced and, correspondingly, the ADC value is lower.

However, it is also important to consider that DWI is sensitive on multiple spatial scales [50]. The diffusion time of the DWI measurement affects how likely each water molecule is to encounter the tissue microstructure. For long diffusion times, structural heterogeneity on the smallest scales is averaged out and the signal attenuation depends primarily on large-scale tissue features [50]. Moreover, the presented results depend on the b-values used in each study. Another important aspect is that the robustness of ADC values in clinical routine depends on fitting quality, repeatability of fitted parameters, robustness against measurement noise and clinically useful information [51]. Moreover, the present analysis only evaluated ADC values fitted with the monoexponential model. There are other methods based on non-monoexponential models, such as diffusion kurtosis imaging, which might better reflect the microstructure of PC and correlate better with GS. However, there are some indications that the monoexponential model predicts the GS better and with higher repeatability than the intravoxel incoherent motion imaging model [52].
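To make the dependence on b-values concrete, the monoexponential model assumes that the DWI signal decays as S(b) = S0 · exp(−b · ADC). With two b-values, the ADC follows directly from the signal ratio. The sketch below is a minimal illustration under that assumption; the function name `adc_two_point`, the b-values of 0 and 800 s/mm², and the simulated signal intensities are hypothetical and are not taken from the included studies.

```python
import math


def adc_two_point(s_low, s_high, b_low, b_high):
    """Monoexponential ADC (mm^2/s) from signals at two b-values (s/mm^2).

    s_low  -- signal intensity at the lower b-value
    s_high -- signal intensity at the higher b-value
    """
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signal intensities must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must be greater than b_low")
    # S(b) = S0 * exp(-b * ADC)  =>  ADC = ln(S_low / S_high) / (b_high - b_low)
    return math.log(s_low / s_high) / (b_high - b_low)


# Hypothetical voxel: simulate a signal with a known ADC, then recover it.
true_adc = 0.86e-3                        # mm^2/s, assumed for illustration
s0 = 1000.0                               # arbitrary signal at b = 0
s800 = s0 * math.exp(-800 * true_adc)     # simulated signal at b = 800 s/mm^2

estimate = adc_two_point(s0, s800, 0, 800)
print(f"ADC = {estimate * 1e3:.2f} x 10^-3 mm^2/s")
```

In real tissue the decay is not perfectly monoexponential, so the estimate changes with the chosen b-values, which is one reason why ADC values from different protocols are not directly comparable.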

These factors might also be responsible for the large heterogeneity of the ADC values in the present study. We performed subgroup analyses, but there were no substantial differences between ADC values obtained under different conditions, such as the use of an endorectal coil and field strength. In addition, no differences in ADC values were found between tumors diagnosed after prostatectomy and PC diagnosed by prostate biopsy.

As reported previously, in PC, not only cell density is important, but also the glandular structure and formation of the tissue, which is also the most important factor for GS grading [6, 48, 49]. According to the literature, besides cellularity, ADC can also reflect other histopathological features in PC including proliferation index, vascular endothelial expression and hypoxia 1-alpha expression [53, 54]. In fact, it was shown that ADC values are positively correlated with the amount of glandular lumen (r = 0.688) and negatively correlated with cell count alone (r = −0.598) [53].

Consequently, research has shown weak to moderate inverse correlations between ADC values and GS, which further supports the view that ADC values can reflect tumor microstructure non-invasively, with possible translational benefit in daily clinical routine [16].

In a meta-analysis pooling 13 studies with 1107 tumor foci dated up to 2015, a sensitivity of 76.9% and a specificity of 77% were calculated for the discrimination of clinically significant from insignificant PC based upon ADC values. In a subgroup analysis, a higher sensitivity was identified for studies employing high b-values of 2000 s/mm² [55].

The present meta-analysis showed that the ADC values of different PCs overlapped considerably. However, clinically significant PC, defined as PC with GS 7 and higher, had lower ADC values than insignificant PCs. Moreover, the pooled ADC values of clinically insignificant PCs were no lower than 0.75 × 10⁻³ mm²/s.

However, the present results cannot aid in proposing an ADC threshold for clinical routine because of differences in MRI technique across studies [56, 57], mainly in b-values, field strength and echo time. Therefore, the ADC threshold needs to be evaluated for every institution.
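An institution evaluating its own ADC cutoff would typically compare the cutoff against the histopathological GS and report sensitivity and specificity. The following sketch shows one way this could be done. The function name `sensitivity_specificity`, the rule that a lesion is called clinically significant when its ADC is at or below the cutoff, and all lesion values are assumptions made for illustration; none of the numbers are data from the present analysis.

```python
def sensitivity_specificity(adc_values, is_significant, cutoff):
    """Evaluate an ADC cutoff against the histopathological reference.

    A lesion is predicted to be clinically significant when its ADC
    is less than or equal to the cutoff.
    """
    tp = fn = tn = fp = 0
    for adc, significant in zip(adc_values, is_significant):
        predicted = adc <= cutoff
        if significant and predicted:
            tp += 1
        elif significant:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical lesions: ADC in units of 10^-3 mm^2/s, GS >= 7 marked True.
adc = [0.70, 0.82, 0.95, 0.88, 1.05, 1.20, 0.90, 1.10]
significant = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(adc, significant, cutoff=0.90)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Repeating the calculation over a range of cutoffs gives the points of a receiver operating characteristic curve, from which a local threshold could be chosen.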

Recent literature suggests that GS 7 tumors comprise biologically heterogeneous PCs. GS 7 cancers can be subdivided into 3 + 4 and 4 + 3 [58,59,60,61]. In the first group, the well-differentiated cancer pattern is predominant. In contrast, in 4 + 3 lesions, the less differentiated pattern is predominant. This might also reflect different tumor behavior. For example, 4 + 3 cancers are more likely to have a higher pathologic stage and a larger total tumor volume [58]. Our data corroborate the notion that GS 7 cancers are heterogeneous in terms of their ADC values. In fact, GS 3 + 4 tumors had higher ADC values than GS 4 + 3 cancers. Presumably, ADC values can help stratify GS 7 cancer, although further studies are needed to confirm these results.

Of note, in clinical routine, clinically significant cancer is defined not by GS alone but also by the length and number of positive biopsy cores and the tumor volume of the prostatectomy specimen. Moreover, seminal vesicle invasion and lymph node metastasis are key findings to define clinically significant cancer [62]. In the present analysis, however, only the GS could be used to define clinically significant cancers.

Interestingly, some previous studies indicated that conventional image analysis by PIRADS scoring, a clinically used scoring system to predict the probability of malignancy, cannot discriminate between clinically significant and non-significant PC [63]. PIRADS scoring relies only on a qualitative assessment of DWI, T2-weighted imaging, and dynamic contrast-enhanced MRI [3]. ADC values are not quantitatively assessed in this system. Presumably, ADC values might harbor crucial information regarding GS in PC that is not currently considered in clinical practice. In fact, Pierre et al. suggested that ADC quantification might aid in diagnosing PC beyond the qualitative DWI assessment [64].

The present meta-analysis has several limitations. Firstly, it mainly comprises retrospective studies with possible known bias. Secondly, it was not possible to further stratify the patient samples according to tumor localization. Recently, a meta-analysis showed that ADC values of cancers arising from the transitional zone correlated more weakly with GS, which might have an influence on the present analysis. Thirdly, we could not divide the patient sample according to biopsy and radical prostatectomy grading. It was shown that both methods might result in slightly different GS. Moreover, the prostatectomy specimen is considered the diagnostic gold standard, which was used in only 55.6% of the investigated studies. Fourthly, no exact threshold values and sensitivity/specificity could be established for the discrimination of clinically significant and non-significant cancers. This reflects one limitation of ADC values, namely variability due to hardware, including different MRI scanners, sequence parameters, and interreader variability, which hinders the establishment of clear threshold values for clinical routine. However, as shown, the pooled ADC values of clinically insignificant PCs were no lower than 0.75 × 10⁻³ mm²/s. Fifthly, our results might be affected by publication bias, because negative studies that could not identify an inverse relationship between ADC and GS might not have been published.

Clearly, further prospective studies based on large samples are needed to confirm our present results.

Conclusion

Clinically significant PC showed lower ADC values than non-significant PC. The pooled ADC values of clinically insignificant PCs were no lower than 0.75 × 10⁻³ mm²/s. This value may be proposed as a threshold for distinguishing clinically significant from insignificant PCs. The quantitative assessment of ADC should be included in the stratification of PCs in clinical practice.