Introduction

Worldwide, transarterial chemoembolization (TACE) is the most frequently applied treatment for patients with unresectable hepatocellular carcinoma (HCC) [1]. Survival outcomes following TACE are highly variable, and individual patient outcomes, ranging between 11 and 45 months, are difficult to predict [2, 3]. It is widely acknowledged that a significant degree of variation in clinical outcomes can be attributed to differences in tumor stage (i.e., tumor size and number) and biological tumor behavior [3]. Current staging systems for HCC, including the guideline-endorsed Barcelona Clinic Liver Cancer (BCLC) staging system [2, 4], rely heavily on parameters such as performance status (Eastern Cooperative Oncology Group Performance Status (ECOG PS)), liver function (Child–Pugh score), and radiological tumor burden. The current standard for radiological response evaluation following TACE, i.e., the modified Response Evaluation Criteria in Solid Tumors (mRECIST) [2, 5], focusses on the change in tumoral contrast enhancement on post-TACE CT or MRI. Improved staging and evaluation of response to TACE may be possible by more accurately assessing tumor biology. Histopathological and molecular subtyping of HCC is a promising technique to assess tumor biology [6], but this requires an invasive biopsy and needs further investigation prior to implementation in clinical practice.

Diffusion-weighted imaging (DWI) is an MRI technique that detects motion of free water molecules (Brownian movement) [7]. In hypercellular tissues (e.g., fast-growing tumors), the movement of water is ‘restricted’ by the cellular compartments, whereas in tissue with a lower cellular density or cell necrosis, there is ‘unrestricted’ water diffusion. The diffusion restriction, which can be quantified into an apparent diffusion coefficient (ADC), is therefore a way of non-invasively measuring the cellularity of (tumor) tissue. Prior studies have found an inverse correlation between ADC value and histological tumor grade and microvascular invasion [8,9,10,11]. Kamel et al. first demonstrated the feasibility of DWI for response assessment after TACE by showing that an increased tumoral ADC at 4–6 weeks after TACE corresponded to radiological and histological response [12, 13]. These findings have been confirmed by several studies using DWI at various time points after TACE (24 h to 6 weeks) in association with various outcomes of TACE response (RECIST, mRECIST, histopathological necrosis, progression-free survival [PFS]) [14,15,16,17,18,19]. Fewer studies have described the association of pre- and post-TACE ADC values with overall survival (OS) [20,21,22,23], which remains the most robust endpoint in HCC. Moreover, the added value of DWI during routine clinical follow-up, which is often done by mRECIST at 4–8 weeks, remains unknown. Therefore, the aim of this study was to assess whether (1) pre- or post-TACE ADC is a predictor of OS and (2) ADC measurements can improve the current standard of response evaluation using mRECIST.

Methods

Study population

This retrospective study was approved by the Institutional Review Board (reference number W17_420#17.488), which waived the need for informed consent. This study included patients with HCC treated with TACE at our center between January 2008 and December 2017, who were identified by searching the institutional patient registration system and radiology archive. Patients lacking an assessable MRI with the institutional DWI protocol (b values 50, 400, 800 s/mm2) ≤ 3 months prior to TACE or ≤ 12 weeks after first TACE were excluded from the pre-TACE and post-TACE analysis, respectively.

Diagnostic work-up and TACE procedure

According to the European Association for the Study of the Liver (EASL) guidelines, HCC was diagnosed by histopathological or imaging criteria [2]. All patients were discussed in a multidisciplinary team (MDT) and considered for TACE according to the Barcelona Clinic Liver Cancer (BCLC) algorithm [2, 4].

Accordingly, patients with intermediate stage (BCLC-B) HCC, or those with early stage (BCLC-A) not eligible for surgery or radiofrequency ablation (RFA), were considered for TACE. Segmental portal vein invasion (BCLC-C) was not considered an absolute contra-indication for TACE.

TACE was performed using a standardized technique as described previously [24] using drug-eluting beads (DEB-TACE) loaded with doxorubicin (75–150 mg) (DC Beads 100–300 µm, Terumo Europe, Belgium) or conventional TACE (cTACE) using an emulsion of doxorubicin (50–75 mg/m2) and lipiodol followed by gelatin sponge particles. Depending on the size, number, and distribution (uni- or bilobar) of HCC lesions, TACE was done in one or more consecutive sessions. Tumor feeders were catheterized as selectively as possible with a microcatheter, followed by chemoembolization until intra-arterial stasis was observed. If it was not technically feasible to achieve complete devascularisation in one session, an additional TACE procedure was scheduled. First follow-up was done 6 weeks after the first TACE and included clinical assessment, biochemistry, and multiphasic MRI. The radiological response was assessed by expert abdominal radiologists using the mRECIST criteria [25]. Patients were reevaluated in the MDT after each follow-up visit and additional TACE was done ‘on demand,’ followed by clinical and radiological response assessment every 3 months.

MRI technique and ADC measurement

Pre- and post-TACE MRI were performed on a 1.5 T system (Siemens Healthcare, Erlangen, Germany) with combined body and spine phased-array coils. The scan protocol consisted of a T1W (axial gradient echo in- and out-of-phase) sequence, an axial and coronal T2W (fat-saturated) sequence, and axial T2W (long and short echo time) sequences, followed by an axial 2D echo planar imaging (EPI) DWI sequence with 3 b values (50, 400, 800 s/mm2) before the contrast series (Fig. 1). Details of the DWI sequence are provided in Supplementary Table 1.

Fig. 1

Pre- and post-TACE contrast-enhanced (CE) MRI of a 75-year-old man with cirrhosis due to hemochromatosis and HCC. Pre-TACE MRI shows a 3.9 cm HCC nodule (arrows) in segment 7 of the liver with arterial enhancement (a) and delayed phase washout (c) on T1-weighted imaging. The lesion shows diffusion restriction on diffusion-weighted imaging (DWI) using b-values of 50, 400, 800 s/mm2 (e). Post-TACE imaging shows a complete response of the HCC to TACE with complete necrosis (b, d). Only a peripheral enhancing rim, representing inflammatory change, remains visible. The lesion has an increased signal on DWI (f) using the same b-values, caused by T2 shine-through. The tumor ADC increased from 959 mm2/s (g) to 1564 mm2/s (h), reflecting a ΔADC of 63%

ADC measurements were retrospectively conducted by a trained investigator (TL), supervised by an expert abdominal radiologist (OVD, with 20 years of experience), on a high-resolution PACS workstation. The ADC value of a maximum of 3 HCC lesions, ≥ 10 mm in diameter, was measured by delineating whole-lesion regions of interest (ROIs) at the level of the maximum transverse tumor diameter in the arterial phase. From the ADC measurements of all target lesions, the mean ADC was calculated. As a reference measurement, the ADC value of non-tumor liver parenchyma was measured by drawing a circular ROI (ø 2 cm) on the contralateral liver lobe, while avoiding large vessels and biliary structures.

ADC values of HCC lesions and non-tumor liver parenchyma were measured on pre-TACE and post-TACE MRI. Increase in ADC value, ΔADC (%), was defined as ΔADC = ((ADCpost − ADCpre)/ADCpre) × 100.
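As a worked illustration of the ΔADC definition, the following Python sketch computes the relative change for a single patient. The function name is hypothetical and chosen for illustration only; the study's measurements were made on a PACS workstation, not with this code. The input values are those of the Fig. 1 example.

```python
def delta_adc_percent(adc_pre, adc_post):
    """Relative ADC change in percent: ((post - pre) / pre) * 100."""
    if adc_pre <= 0:
        raise ValueError("pre-TACE ADC must be positive")
    return (adc_post - adc_pre) / adc_pre * 100.0


# Values from the Fig. 1 example: 959 before and 1564 after TACE.
example = delta_adc_percent(959, 1564)
print(f"{example:.0f}%")  # 63%
```

A negative result would indicate a decrease in ADC after TACE.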

Several cut-offs for ΔADC have been reported in the literature [12, 15, 19, 20, 22, 23]; we therefore analyzed two frequently used cut-points (13.6% and 25%).

Because TACE is often performed repeatedly, the ADC values at time of best mRECIST response were also measured. Best response was assessed on a patient basis, selecting the first radiological evaluation following TACE that observed the best mRECIST response category (Complete response [CR] > Partial response [PR] > Stable disease [SD] > Progressive disease [PD]) for that patient.
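The selection rule for best response can be sketched in Python as follows. This is a minimal illustration under assumptions: the data layout (a list of time/category pairs per patient) and the function name are hypothetical, and the visit data are invented for the example.

```python
# Rank of each mRECIST category, best first: CR > PR > SD > PD.
MRECIST_RANK = {"CR": 0, "PR": 1, "SD": 2, "PD": 3}


def first_best_response(evaluations):
    """Return the earliest evaluation with the best mRECIST category.

    `evaluations` is a hypothetical list of (weeks_after_first_tace, category)
    tuples for one patient.
    """
    if not evaluations:
        raise ValueError("at least one evaluation is required")
    # Sort key: best category first, then earliest time point.
    return min(evaluations, key=lambda e: (MRECIST_RANK[e[1]], e[0]))


# Invented example: progression at first follow-up, CR after repeated TACE.
visits = [(6, "PD"), (18, "CR"), (30, "CR")]
print(first_best_response(visits))  # (18, 'CR')
```

The ADC value at best response would then be read from the MRI at the returned time point.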

Statistical analysis

Categorical data were described as frequencies with percentages and continuous data as medians with interquartile ranges (IQR). The change in ADC value of HCC lesions and non-tumor liver parenchyma was assessed using the paired Wilcoxon test.

The primary endpoint, overall survival (OS), was defined as the time from the date of first TACE to the date of death or the last date the patient was known to be alive. Survival status was verified using the municipal records database on 4 May 2018. If TACE was used as bridging treatment to liver resection or transplantation, OS was censored at the date of surgery. Survival curves were plotted using the Kaplan–Meier method and compared using the log-rank test. The association between OS and pre-TACE ADC, ΔADC, and known prognostic factors [3] was assessed in a univariable Cox proportional hazards analysis. The subsequent multivariable analyses included factors that were associated with survival in univariable analysis (p < 0.1) and pre-TACE ADC (pre-TACE model) or ΔADC (post-TACE model). As there are no validated cut-offs for pre-TACE ADC, we dichotomized the population into high (above median) and low (equal to or below median) pre-TACE ADC subgroups. Additionally, the correlation between pre-TACE ADC and OS was expressed using Spearman’s rank correlation coefficient (ρ).
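To illustrate how censoring enters the survival estimate, the following Python sketch implements the Kaplan–Meier product-limit estimator. It is not the study's analysis, which was run in SPSS; the function names are hypothetical and the follow-up data are invented for the example.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times  -- follow-up in months from first TACE
    events -- 1 if the patient died, 0 if censored (alive at last contact
              or bridged to surgery)
    Returns a list of (time, survival_probability) at each death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Invented data: six patients, two of them censored (event = 0).
months = [5, 8, 12, 12, 20, 24]
died = [1, 0, 1, 1, 0, 1]
print(median_survival(kaplan_meier(months, died)))  # 12
```

Censored patients leave the risk set without lowering the survival estimate, which is why a patient bridged to surgery contributes follow-up time only up to the date of surgery.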

To assess the association between ADC values and radiological response, patients were categorized into two subgroups according to the mRECIST response at first follow-up. Patients with partial or complete response (PR/CR) were classified as responders, whereas non-responders had stable or progressive disease (SD/PD). The Mann–Whitney test was used to compare pre- and post-TACE ADC and ΔADC between responders and non-responders. The area under the receiver operating characteristic (AUROC) curve was used to assess the accuracy of ΔADC in predicting mRECIST response and to quantify the sensitivity and specificity of the ΔADC cut-points.
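The discrimination measures described above can be sketched in Python as follows. This is an illustrative sketch, not the study's SPSS analysis: the function names are hypothetical, the ΔADC values are invented, and the rule that a ΔADC strictly above the cut-off counts as a predicted responder is an assumption.

```python
def auroc(scores_pos, scores_neg):
    """AUROC as the probability that a responder has a higher score than a
    non-responder (ties count one half); equals Mann-Whitney U / (n1 * n2)."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(scores_pos, scores_neg, cutoff):
    """Treat score > cutoff as 'predicted responder' (assumed convention)."""
    tp = sum(1 for s in scores_pos if s > cutoff)
    tn = sum(1 for s in scores_neg if s <= cutoff)
    return tp / len(scores_pos), tn / len(scores_neg)


# Invented delta-ADC values (%) for mRECIST responders and non-responders.
responders = [30.0, 10.0, 40.0, 26.0]
non_responders = [5.0, 28.0, 12.0, 20.0]

print(auroc(responders, non_responders))                          # 0.75
print(sensitivity_specificity(responders, non_responders, 25.0))  # (0.75, 0.75)
print(sensitivity_specificity(responders, non_responders, 13.6))  # (0.75, 0.5)
```

In this invented example, lowering the cut-off from 25% to 13.6% reduces specificity without improving sensitivity, which illustrates the trade-off the two cut-points are meant to probe.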

For all statistical tests, a two-tailed p value of < 0.05 was considered statistically significant. All analyses were performed using IBM SPSS Statistics for Windows Version 24.0 (IBM Corp., Armonk, NY, USA).

Results

Patient characteristics

Of the 203 consecutive HCC patients treated with TACE between January 2008 and December 2017, 113 patients were excluded due to non-available (i.e., CT imaging) or poor-quality pre-TACE MRI images (n = 105) or different b-values of the DWI sequence (n = 8) (Fig. 2). One patient was not assessable because of extensive tumor necrosis after embolization of tumor bleeding prior to first TACE. Consequently, 89 patients treated with TACE between September 2008 and November 2017 formed the pre-TACE study cohort. Patient characteristics and treatment details are summarized in Table 1. The majority of patients had 1–3 tumors (75%) with a median largest tumor size of 42 mm (IQR 35–58). The median number of TACE procedures was 2 (range 1–7). Fourteen patients were bridged to surgical resection or liver transplantation. At first follow-up after TACE, 52 (58%) patients were assessed by MRI including a DWI sequence and were available for post-TACE ADC analysis.

Fig. 2

CONSORT flow diagram

Table 1 Baseline characteristics of patients

Pre- and post-TACE ADC values

The MRI was conducted a median of 5 weeks (IQR 4–8) before and 6 weeks (IQR 5–7) after first TACE, respectively. The ADC characteristics prior to and after TACE are shown in Table 2 and Fig. 3. The pre-TACE ADC measurement was not possible in 5/89 (6%) patients or 41/178 (23%) of the target lesions because these lesions were not visible using DWI (12%) or too small (< 1 cm) for a reliable measurement (11%). At first follow-up, HCC lesions were not visible in 4/52 (8%) patients or 18/98 (18%) target lesions, whereas 8/98 (8%) target lesions were too small for ADC measurement. Consequently, adequate ADC measurements were possible in 85 patients with 137 target lesions pre-TACE, and 48 patients with 72 target lesions post-TACE.

Table 2 ADC characteristics at first and best response

Fig. 3

Box plot distributions of pre- and post-TACE ADC values in a tumor lesions, b liver parenchyma and c tumor lesions, stratified according to mRECIST response after first TACE. d Box plot distribution of difference in ADC (ΔADC) according to mRECIST response after TACE. Boxes represent the 25th–75th percentiles of data values. Line in box represents median value

The median pre-TACE ADC was 1075 mm2/s (IQR 944–1187). In the HCC lesions with both pre- and post-TACE imaging, there was a significant increase between pre-TACE ADC value (1081 mm2/s, IQR 964–1225) and post-TACE ADC value (1328 mm2/s, IQR 1197–1560) (p < 0.001), reflecting a median increase of 22.4% (IQR 11.3–38.0%). Non-tumor liver parenchyma did not show a significant increase in ADC value (965 to 1001 mm2/s, p = 0.181). According to the mRECIST criteria, there was CR in 10 (19%), PR in 25 (48%), SD in 4 (8%), and PD in 13 patients (25%). Responders (CR/PR) showed a higher ADC value (1375 vs. 1198 mm2/s, p = 0.019) and ΔADC (25.7 vs. 14.4%, p = 0.048) after first TACE than non-responders (SD/PD). ΔADC could moderately discriminate mRECIST responders and non-responders with an AUROC of 0.68 (95% CI 0.53–0.83) (Supplementary Fig. 1). For example, at the 25% cut-off, ΔADC had 52% sensitivity and 81% specificity for predicting mRECIST response, whereas this was 71% sensitivity and 50% specificity at the 13.6% cut-off for ΔADC.

ADC at best response according to mRECIST

Using repeated TACE, eight patients (15%) achieved response after initial non-response, including PD → CR (n = 4), PD → PR (n = 2), and SD → CR (n = 2). Consequently, the best response according to mRECIST was CR in 29 (56%), PR in 14 (27%), SD in 3 (6%), and PD in 6 (12%) patients. The best mRECIST response was observed after 1, 2, and 3 TACE procedures in 28 (54%), 19 (37%), and 5 (10%) patients, respectively. The MRI showing best response was conducted a median of 10 weeks (IQR 6–22 weeks) after first TACE. Compared with first response, there was a lower, albeit still significant (p < 0.001), increase from pre-TACE ADC value (1081 mm2/s, IQR 964–1225) to best-response ADC value (1278 mm2/s, IQR 1155–1564), reflecting a median increase of 15.7% (IQR 5.7–43.6%). When patients were stratified according to best mRECIST response, there was no significant difference in ΔADC between non-responders and responders (11.5 vs. 16.5%, p = 0.433).

ADC value and overall survival

At the time of final survival follow-up, after a median follow-up period of 29.8 months (95% CI 17.3–42.3), 53 of 89 (60%) patients had died. The median OS was 21.7 months (95% CI 17.6–25.9). There was no significant correlation between OS and pre-TACE ADC value (ρ = 0.085, p = 0.443) (Fig. 4a). When stratifying patients according to high (> 1075 mm2/s) or low (≤ 1075 mm2/s) pre-TACE ADC, there were no significant differences in OS (23.8 months versus 21.7 months, p = 0.111) (Fig. 4b). When included in the multivariable analysis together with known pre-TACE predictors of OS, only tumor number and serum alpha-fetoprotein (AFP) remained independent predictors (Table 3).

Fig. 4

The association between pre- and post-TACE ADC measurements and overall survival. a Correlation between pre-TACE ADC and overall survival. b Overall survival according to high (> 1075 mm2/s) or low (≤ 1075 mm2/s) ADC prior to first TACE. c Overall survival according to mRECIST response; complete or partial response versus stable or progressive disease. d Overall survival according to ADC response; high (> 25%) or low (≤ 25%) ΔADC

Table 3 Univariable and multivariable Cox-regression analysis for overall survival

At first follow-up, there was a significant difference in OS between responders and non-responders according to mRECIST criteria (24.5 vs. 18.3 months, log-rank p = 0.038) (Fig. 4c). There were no significant differences in OS between patients with low or high ΔADC after first TACE, at either the 13.6% (HR 1.06, 95% CI 0.49–2.30) or the 25% cut-off (HR 0.71, 95% CI 0.32–1.58) (Table 3, Fig. 4d). In a second multivariable model including baseline predictors, ΔADC (> 25% vs. ≤ 25%) and mRECIST response at first follow-up, response according to mRECIST remained independently associated with improved OS (HR 0.24, 95% CI 0.08–0.74), whereas ΔADC was not (HR 1.60, 95% CI 0.60–4.25). A model including the 13.6% cut-off for ΔADC yielded similar results (data not shown).

Discussion

In our study of HCC patients treated with TACE, we confirmed that patients with radiological tumor response as assessed per mRECIST criteria have an increased ADC after TACE. However, neither pre-TACE ADC nor ΔADC at the routine response evaluation after first TACE was significantly associated with OS, whereas response based on the mRECIST criteria was confirmed as a significant prognostic factor.

In concordance with prior studies, there was a significantly higher ΔADC in patients who responded to TACE according to the mRECIST criteria [12, 14, 15, 19, 22]. Prior studies mainly showed the superiority of evaluating the post-TACE tumor ADC value and contrast enhancement (mRECIST) over the tumor size alone (RECIST) [12, 14, 15] or underscored the significant association between increased ADC and prolonged progression-free survival (PFS) [15, 16]. However, endpoints like PFS or time to progression (TTP) have limited accuracy in representing TACE benefit in terms of OS [2]. Overall survival remains the most robust endpoint for TACE benefit, but this requires a long follow-up (> 24 months). Given the increasing number of subsequent liver-directed and systemic treatment options after TACE failure, there is an increasing clinical need for validated methods to predict OS benefit following TACE.

In our study, neither pre-TACE ADC nor ΔADC after first TACE was significantly associated with OS, whereas mRECIST was an independent predictor of OS. This confirms prior studies showing the prognostic value of mRECIST response after TACE [21, 26,27,28,29]. Despite a multitude of studies on ADC measurements in HCC patients treated with TACE [8,9,10,11,12, 14, 15, 17,18,19,20,21,22,23], few other studies reported on the association between ADC and OS [20,21,22,23]. One Chinese study with limited sample size (n = 23) found a significant association between pre-TACE ADC and OS [23], and 4 studies showed a significant association between ΔADC and OS [20,21,22,23]. The differences in outcomes with the present study should be considered in light of the differences in study methodology. Three prior studies only reported a univariable association between ΔADC and OS [20, 21, 23], as opposed to the study by Corona et al. [22] and the present study, in which more robust multivariable analyses were performed. Moreover, the timing of follow-up imaging after TACE was earlier in prior studies (3 h to 4 weeks) than in our study (6 weeks). Kamel et al. described the serial changes of ADC value following TACE, showing that the largest difference in ADC value was at 1–2 weeks, which declined after 3–4 weeks [30]. Serial measurements showed that the change in arterial and venous enhancement remained stable during 1–4 weeks after TACE. Consequently, the response evaluation performed in this study may have detected these (stable) differences in contrast enhancement (mRECIST), but the decreased differences in ADC value at 6 weeks could have diluted the association with OS. This was underscored by our ADC measurements at the best response to TACE (median of 10 weeks), showing a lower median ADC value than the measurement after first TACE.

As ADC quantification prior to TACE and after 6 weeks did not provide better survival prediction than mRECIST in this study, more studies are needed to validate the value of earlier DWI (e.g., < 4 weeks or < 24 h) for this purpose. DWI is a short MRI sequence which does not require contrast injection, making it a promising technique for early response evaluation or in patients who are unable to receive contrast (e.g., due to renal failure). Replacing mRECIST at regular follow-up intervals appears not yet feasible due to the moderate accuracy (AUROC 0.68) of ΔADC in predicting treatment response according to mRECIST. Improved DWI quality or earlier response evaluation using DWI might improve the prognostic performance of DWI and ADC measurements. However, some limitations to DWI and ADC quantification should be noted. As reflected by the significant number of non-evaluable patients (6%) and target lesions (23%), ADC measurements are limited by inconsistent image quality and decreasing accuracy in measuring small lesions (< 1 cm). Moreover, it is currently poorly defined how to handle the significant intra-patient (i.e., mixed tumor response) and intra-tumoral (i.e., viable and necrotic parts of lesions) heterogeneity in response evaluation using ADC. This may cause inter-observer variability and reduce the reproducibility of ADC values, especially in diffuse infiltrating or partially responding tumors. According to RECIST and mRECIST criteria, mixed response is accounted for by measuring the sums of (arterial enhancing) tumor diameters [5]. Similarly, in this study we assessed TACE response on a patient basis by selecting the largest tumor diameter of a maximum of three target lesions and calculating the average ADC before and after TACE. Lastly, standardized DWI settings (e.g., b values) and validated cut-offs for pre-TACE ADC or ΔADC values are lacking. Similar to prior retrospective DWI studies [14], a large number of patients had to be excluded from this study due to differences in imaging modality (n = 102) or differences in DWI protocol (n = 8). Future prospective studies with standardized protocols are needed to address these issues prior to implementation in clinical practice.

In conclusion, in HCC patients treated with TACE, we confirmed the significant increase in ADC value in patients responding according to the mRECIST criteria. mRECIST response was independently associated with improved OS, but pre-TACE ADC and change in ADC at first routine follow-up were not significantly associated with OS. Therefore, HCC staging or response assessment at routine clinical follow-up based on ADC measurements is not ready to replace the current staging parameters or response assessment using the mRECIST criteria.