Response prediction of hepatocellular carcinoma undergoing transcatheter arterial chemoembolization: unlocking the potential of CT texture analysis through nested decision tree models

Objectives To investigate if nested multiparametric decision tree models based on tumor size and CT texture parameters from pre-therapeutic imaging can accurately predict hepatocellular carcinoma (HCC) lesion response to transcatheter arterial chemoembolization (TACE). Materials and methods This retrospective study (January 2011–September 2017) included consecutive pre- and post-therapeutic dynamic CT scans of 37 patients with 92 biopsy-proven HCC lesions treated with drug-eluting bead TACE. Following manual segmentation of lesions according to modified Response Evaluation Criteria in Solid Tumors criteria on baseline arterial phase CT images, tumor size and quantitative texture parameters were extracted. HCCs were grouped into lesions undergoing primary TACE (VT-lesions) or repeated TACE (RT-lesions). Distinct multiparametric decision tree models to predict complete response (CR) and progressive disease (PD) for the two groups were generated. AUC and model accuracy were assessed. Results Thirty-eight of 72 VT-lesions (52.8%) and 8 of 20 RT-lesions (40%) achieved CR. Sixteen VT-lesions (22.2%) and 8 RT-lesions (40%) showed PD on follow-up imaging despite TACE treatment. Mean of positive pixels (MPP) was significantly higher in VT-lesions compared to RT-lesions (180.5 vs 92.8, p = 0.001). The highest AUC in ROC curve analysis and accuracy was observed for the prediction of CR in VT-lesions (AUC 0.96, positive predictive value 96.9%, accuracy 88.9%). Prediction of PD in VT-lesions (AUC 0.88, accuracy 80.6%), CR in RT-lesions (AUC 0.83, accuracy 75.0%), and PD in RT-lesions (AUC 0.86, accuracy 80.0%) was slightly inferior. Conclusions Nested multiparametric decision tree models based on tumor heterogeneity and size can predict HCC lesion response to TACE treatment with high accuracy. They may be used as an additional criterion in the multidisciplinary treatment decision-making process. 
Key Points
• HCC lesion response to TACE treatment can be predicted with high accuracy based on baseline tumor heterogeneity and size.
• Complete response of HCC lesions undergoing primary TACE was correctly predicted with 88.9% accuracy and a positive predictive value of 96.9%.
• Progressive disease was correctly predicted with 80.6% accuracy for lesions undergoing primary TACE and 80.0% accuracy for lesions undergoing repeated TACE.
Supplementary Information The online version contains supplementary material available at 10.1007/s00330-020-07511-3.


Introduction
Hepatocellular carcinoma (HCC), which accounts for more than 90% of primary liver cancers, is the sixth most common cancer by incidence and the fourth most common cause of cancer-related mortality worldwide [1][2][3]. In accordance with clinical practice guidelines, patients diagnosed with intermediate or advanced stage disease are not amenable to curative surgical resection but are allocated to loco-regional interventional treatment or, alternatively, protein kinase inhibitor therapy such as sorafenib [3,4]. Particularly in Barcelona Clinic Liver Cancer (BCLC) stage B patients, transcatheter arterial chemoembolization (TACE) represents the standard of care in many institutions [3,5,6]. Furthermore, TACE is the most widely used bridging therapy in BCLC stage A patients awaiting liver transplantation. The efficacy of TACE has been demonstrated in randomized controlled trials [7,8]. Noticeable differences in overall survival, however, suggest that not all treated lesions respond effectively to TACE, as HCC patients show a wide spectrum of potential short- and long-term outcomes following treatment [9].
In recent years, efforts have been made to identify HCC biomarkers that may predict lesion response to TACE treatment, with the aim of supporting the decision whether a patient should undergo primary or repeated TACE or be treated by different means. The predictive algorithms used were based on either laboratory results and clinical scores [10] or imaging parameters derived from CT or MRI texture analysis [11,12]. These concepts reflect the emerging trend towards precision medicine in patients with focal liver disease and show that the assessment of HCC lesions can in principle be extended beyond current classification systems.
Although previously proposed prediction approaches take several parameters into account, each parameter is generally used individually and dichotomously. Nesting multiple factors in a decision tree model, potentially with different thresholds for the same factor at different locations within the tree, may further increase the accuracy of treatment response prediction.
The aim of our study was to investigate the value of histogram-based CT texture analysis-derived nested decision tree models for the prediction of HCC lesion response to TACE treatment according to modified Response Evaluation Criteria in Solid Tumors (mRECIST) criteria in order to demonstrate that accurate prediction of complete response and progressive disease prior to both primary and repeated TACE is feasible.

Study sample
This retrospective study was approved by the institutional review board; patients gave written informed consent. All patients with histopathologically proven HCC who were treated with TACE during the observation window between January 2011 and September 2017 were included. Exclusion criteria were (1) no baseline dynamic contrast-enhanced CT of the liver, (2) no follow-up CT imaging at 4 weeks after treatment at the earliest, and (3) non-diagnostic CT images. The final study sample consisted of 37 patients with a total of 92 individually treated HCC lesions (Fig. 1).
Both BCLC stage B patients undergoing palliative TACE and BCLC stage A patients being poor candidates for surgery or receiving TACE as a bridging treatment while awaiting liver transplantation were included. Due to the initial heterogeneity of disease stages, this study solely focused on target response prediction according to mRECIST and did not assess long-term outcome parameters, e.g., overall survival.

CT imaging
All patients underwent four-phase CT according to the institutional standard liver imaging protocol. Imaging acquisitions were as follows: unenhanced, late arterial phase (AP), portal venous phase (PVP) and 3-min delayed phase (DP). CT examinations were acquired on a 128-slice (Somatom Definition Edge, Siemens Healthineers; tube settings 100 kV, 110 eff. mAs; collimation 128 mm × 0.6 mm, pitch 0.6, rotation time 0.5 s, slice thickness 1.5 mm) or 256-slice (Somatom Definition Flash, Siemens Healthineers; tube settings 100 kV, 65 eff. mAs; collimation 128 mm × 0.6 mm, pitch 0.6, rotation time 0.5 s, slice thickness 1.5 mm) scanner system. Following the unenhanced scan, 1.2 ml/kg of 370 mg I/ml iopromide (Ultravist® 370, Bayer Pharma) was injected intravenously at a flow rate of 4 ml/s by a power injector (Ulrich Medical). Using bolus tracking technique, AP images were acquired 18 s after reaching 100 HU in the descending aorta at the level of the celiac trunk. PVP and DP images were obtained 70 s and 180 s after reaching the scan initiation threshold. Timepoints of CT examinations were as follows: baseline imaging within 1 week prior to TACE, first follow-up imaging at 4 weeks after TACE and subsequent follow-ups at 3-month intervals.
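The weight-based contrast dosing described above can be illustrated with a short arithmetic sketch. The dose (1.2 ml/kg), iodine concentration (370 mg I/ml), and flow rate (4 ml/s) are taken from the protocol; the function name and the 75 kg body weight are hypothetical and serve only as an example.

```python
def contrast_protocol(weight_kg, dose_ml_per_kg=1.2,
                      iodine_mg_per_ml=370.0, flow_ml_per_s=4.0):
    """Derive contrast volume, iodine load, and injection time from body weight."""
    volume_ml = dose_ml_per_kg * weight_kg
    return {
        "volume_ml": volume_ml,                              # injected volume
        "iodine_g": volume_ml * iodine_mg_per_ml / 1000.0,   # total iodine load
        "injection_s": volume_ml / flow_ml_per_s,            # injection duration
    }


# Hypothetical 75 kg patient (not a value from the study)
example = contrast_protocol(75.0)
print(example)
```

For this hypothetical patient, the injected volume is about 90 ml, which corresponds to roughly 33 g of iodine delivered over about 22.5 s.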

Transcatheter arterial chemoembolization procedure
TACE procedures were performed by one of two interventional radiologists with > 10 years of interventional experience (C.J.Z.). The right femoral artery was punctured using Seldinger technique, and a 4-French cobra or sidewinder catheter was inserted into the celiac trunk and common hepatic artery, respectively. In case of variant vascular anatomy, the superior mesenteric artery was catheterized additionally. To visualize feeding arteries of the tumor, digital subtraction angiography was performed and feeding vessels were superselectively intubated with a highly flexible 2.7-French microcatheter (ProGreat, Terumo). For embolization, doxorubicin-coated 100-μm beads (Tandem Beads, Embozene, now Boston Scientific) were slowly injected under fluoroscopic guidance up to a maximum dose of 150 mg doxorubicin. If stasis in the feeding vessel and disappearance of tumor staining was observed earlier, injection was terminated at a lower total doxorubicin dose. A closing DynaCT was obtained to assess treatment success.

Image and texture analysis
Image datasets of all patients and timepoints were imported into mint Lesion™ 3.0 software (Mint Medical GmbH; commercially available) for post-processing. Every lesion was manually segmented on arterial phase images (axial plane, slice thickness 1.5 mm) according to mRECIST criteria by one radiologist with 2 years of experience in abdominal imaging (J.V.). All segmentations were reviewed by one radiologist specialized in abdominal imaging with > 15 years of experience (D.T.B.) and one radiologist specialized in abdominal imaging and interventional radiology with > 15 years of experience (C.J.Z.). Measurement discrepancies were resolved by consensus. CT texture analysis of segmented regions of interest (ROIs) was performed automatically by mint Lesion™ software based on gray-level histograms and included the following parameters: entropy, kurtosis, skewness, mean of positive pixels (MPP), and uniformity of positive pixel (UPP) distribution.
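To make the histogram-based parameters more concrete, the following is a minimal sketch of how they are commonly defined as first-order statistics. The exact formulas implemented in mint Lesion™ are not given in this paper, so the definitions below (excess kurtosis, base-2 entropy, one histogram bin per integer HU value, "positive" meaning values above 0) are assumptions, and all function names are hypothetical.

```python
import math
from collections import Counter


def histogram_probabilities(values):
    """Relative frequency of each gray level (assumed: one bin per integer HU)."""
    counts = Counter(round(v) for v in values)
    return [c / len(values) for c in counts.values()]


def texture_parameters(pixels):
    """First-order texture parameters for a list of ROI pixel values."""
    if not pixels:
        raise ValueError("ROI must contain at least one pixel")
    n = len(pixels)
    mean = sum(pixels) / n
    # Population central moments of the gray-level distribution
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0  # excess kurtosis (assumed)
    entropy = -sum(p * math.log2(p) for p in histogram_probabilities(pixels))
    # MPP and UPP consider only pixels with values above zero
    positive = [p for p in pixels if p > 0]
    mpp = sum(positive) / len(positive) if positive else 0.0
    upp = sum(p ** 2 for p in histogram_probabilities(positive)) if positive else 0.0
    return {"entropy": entropy, "kurtosis": kurtosis, "skewness": skewness,
            "mpp": mpp, "upp": upp}


# Toy ROI (synthetic values, not study data)
roi = [-10, 10, 10, 20, 20]
params = texture_parameters(roi)
print(params)
```

In this toy ROI the single negative pixel is ignored for MPP and UPP, so the two parameters describe only the four positive values.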
Response to TACE treatment as per mRECIST criteria was calculated after manual segmentation of lesions' enhancing portions on baseline and follow-up CT examinations; mRECIST timepoint response evaluation criteria were the following: complete response (CR), disappearance of any intratumoral arterial enhancement; partial response (PR), at least 30% decrease in the sum of diameters of viable tumor; stable disease (SD), any cases not qualifying for either partial response or progressive disease; and progressive disease (PD), an increase of at least 20% in the sum of the diameters of viable tumor. If a lesion did not achieve CR after TACE and the institutional multidisciplinary gastrointestinal tumor board decided for repeated TACE treatment, a new baseline was set for the remaining enhancing portions of the lesion. This new baseline was used for subsequent response assessment after repeated TACE.
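The response categories listed above can be written as a small classification rule. This is a simplified sketch: it uses the baseline sum of viable tumor diameters as the reference for both the 30% decrease and the 20% increase, which is an assumption made for brevity, and the function name is hypothetical.

```python
def mrecist_response(baseline_mm, followup_mm):
    """Classify target response from sums of viable (enhancing) tumor diameters in mm."""
    if baseline_mm <= 0:
        raise ValueError("baseline must contain viable tumor")
    if followup_mm == 0:
        return "CR"  # no intratumoral arterial enhancement left
    change = (followup_mm - baseline_mm) / baseline_mm
    if change <= -0.30:
        return "PR"  # at least 30% decrease
    if change >= 0.20:
        return "PD"  # at least 20% increase
    return "SD"      # neither PR nor PD


# Synthetic examples (not study data)
for baseline, followup in [(40, 0), (40, 20), (40, 50), (40, 38)]:
    print(baseline, followup, mrecist_response(baseline, followup))
```

When a new baseline is set before repeated TACE, the same rule would simply be applied with the remaining enhancing portion as `baseline_mm`.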
All CT texture analysis parameters, mRECIST timepoint responses, and lesion measurements, specifically short- and long-axis diameters (in mm) and area (in mm²), were extracted.

Statistical analysis and graphical visualization
Data was analyzed using SPSS 14 (IBM Corporation) for descriptive statistics and JMP® 14.0 (SAS Institute, Inc.) for calculation of prediction models, both commercially available.
HCCs were divided into two groups: (1) previously untreated lesions undergoing primary TACE (VT-lesions) and (2) lesions receiving repeated TACE (RT-lesions) due to incomplete response or progressive disease after the first TACE. This grouping was chosen to determine whether the first treatment session affects the texture of the remaining viable tumor portions and therefore changes the parameter thresholds needed for accurate response prediction. Baseline characteristics were compared using an independent Student's t test, chi-square test, and Fisher's exact test (significance level p < 0.05).
The two datasets were imported in the Prediction Profiler module in JMP®. CR or PD was defined as outcome parameters. The module created varying testing and confirmation datasets during the modeling process and calculated prediction models based on tumor size (area), surrounding hepatic parenchyma (cirrhotic vs non-cirrhotic liver), and parameters from CT texture analysis, resulting in four different decision tree models with automatically generated optimal discrimination thresholds for the respective parameters in each model: (1) VT-lesions with CR as goal of prediction, (2) VT-lesions with PD as goal of prediction, (3) RT-lesions with CR as goal of prediction, and (4) RT-lesions with PD as goal of prediction. Categorization of lesions was based on binary splitting. The minimum split size at each node was set at ten lesions to avoid overfitting.
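The principle of binary splitting with a minimum split size can be sketched in a few lines. This is not the JMP® implementation: the splitting criterion used by the software is not stated here, so Gini impurity is used as a stand-in, the minimum split size is interpreted as the minimum number of lesions in each child node, and the data are synthetic. All function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcome labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels, min_size):
    """Find the (feature, threshold) pair with the lowest weighted Gini impurity."""
    best, best_score = None, gini(labels)
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0
            left = [y for row, y in zip(rows, labels) if row[feature] < threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] >= threshold]
            if len(left) < min_size or len(right) < min_size:
                continue  # enforce the minimum split size
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (feature, threshold), score
    return best


def grow(rows, labels, min_size=10):
    """Recursively grow a binary tree; leaves store the outcome fraction."""
    split = best_split(rows, labels, min_size)
    if split is None:
        return {"leaf": True, "p": sum(labels) / len(labels), "n": len(labels)}
    feature, threshold = split
    left = [i for i, row in enumerate(rows) if row[feature] < threshold]
    right = [i for i, row in enumerate(rows) if row[feature] >= threshold]
    return {"leaf": False, "feature": feature, "threshold": threshold,
            "left": grow([rows[i] for i in left], [labels[i] for i in left], min_size),
            "right": grow([rows[i] for i in right], [labels[i] for i in right], min_size)}


def predict(tree, row):
    """Follow the splits down to a leaf and return its outcome fraction."""
    while not tree["leaf"]:
        branch = "left" if row[tree["feature"]] < tree["threshold"] else "right"
        tree = tree[branch]
    return tree["p"]


# Synthetic lesions: the outcome depends on area in alternating bands
rows = [{"area": 10 * i, "mpp": 100 + 50 * (i % 2)} for i in range(1, 41)]
labels = [1 if (i <= 10 or 20 < i <= 30) else 0 for i in range(1, 41)]
tree = grow(rows, labels, min_size=10)
print(tree["feature"], tree["threshold"])
```

On this synthetic dataset the tree splits on area three times at different thresholds, which illustrates how a nested model can reuse the same parameter at several levels.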
Each parameter's contribution to the model, receiver operating characteristic (ROC) curves, and confusion matrices depicting model performance were generated. Based on the confusion matrices, positive predictive value (PPV), negative predictive value (NPV), and accuracy were calculated.
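The three performance measures follow directly from the four cells of a confusion matrix, as the sketch below shows. The counts are illustrative: they were chosen so that they reproduce the percentages reported in the Results for the PD model in RT-lesions (8 PD among 20 lesions), but they are not taken from the study's confusion tables. The function name is hypothetical.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute PPV, NPV, and accuracy from confusion matrix counts."""
    total = tp + fp + fn + tn
    return {
        "ppv": tp / (tp + fp),           # correct among predicted positives
        "npv": tn / (tn + fn),           # correct among predicted negatives
        "accuracy": (tp + tn) / total,   # correct among all lesions
    }


# Illustrative counts (assumption, see lead-in)
metrics = confusion_metrics(tp=5, fp=1, fn=3, tn=11)
print({name: round(100 * value, 1) for name, value in metrics.items()})
```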

Baseline characteristics
Of the 92 included lesions, 17 were found in women and 75 in men. Mean age at baseline imaging was 70.3 years (± 9.3, range 49-88). Eighty-seven percent (80/92) of lesions arose from cirrhotic livers, while 13% (12/92) were located in non-cirrhotic liver parenchyma. The mean number of HCC lesions per patient was 3 (range 1-7). Distribution of lesions between the different anatomical liver segments was heterogeneous, and the majority was found in the right liver lobe (67/92, 67.8%), especially in segment VIII (31/92, 33.7%). All baseline characteristics are summarized in Table 1. Seventy-two HCC lesions in our study were treated with a single TACE, while 20 tumors required repeated treatment sessions (range 2-4 TACE treatments). MPP was significantly higher in VT-lesions compared to RT-lesions (180.5 ± 213.6 vs 92.8 ± 21.6, p = 0.001). No other significant differences in baseline CT texture parameters were observed. Data is summarized in Table 2.

Response to TACE Treatment
In the primary TACE group, 38 of the 72 lesions (52.8%) showed CR on post-therapeutic imaging (example in Fig. 2); 16 lesions (22.2%) showed PD on follow-up CT. In the repeated TACE group, 8 of 20 lesions (40%) were rated as CR, while 8 lesions (40%) showed PD on follow-up imaging.
The course of each lesion over time and target response at follow-up imaging timepoints is illustrated as a swimmer plot (supplemental material). Mean time frame for overall disease follow-up was significantly longer for VT-lesions, compared to RT-lesions (268 ± 235 days vs 171 ± 110 days, p = 0.01). Timepoints of disease-related death or liver transplantation are visualized in the swimmer plot.

Prediction of lesion response to primary TACE
The calculated decision tree model to predict CR in VT-lesions had eight binary splits and used six parameters to

Prediction of lesion response to Re-TACE
The decision tree model to predict CR in RT-lesions had three binary splits with only one contributing parameter (area; total effect 0.986). The model's AUC was 0.83. Correct target response was predicted in 15 of 20 lesions (PPV 80.0%, NPV 83.3%, resultant accuracy 75.0%). The model is visualized in Fig. 5.
The decision tree model to predict PD in RT-lesions had three binary splits and three contributing parameters: area (0.97), MPP (0.125), and kurtosis (0.068). The AUC of the model was 0.86, and correct response was predicted for 16 of 20 lesions (PPV 83.3%, NPV 78.6%, resultant accuracy 80.0%). The model is visualized in Fig. 6. Parameter effects for each model are listed in Table 3.
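For readers less familiar with the AUC values quoted above, the following sketch computes the area under the ROC curve as the probability that a lesion with the outcome receives a higher model score than a lesion without it, counting ties as one half. The scores and labels are hypothetical and are not data from this study; the function name is also an assumption.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: fraction of positive/negative pairs ordered correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical leaf probabilities from a small tree and the observed outcomes
scores = [0.9, 0.9, 0.6, 0.6, 0.1, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

Because a decision tree assigns the same score to all lesions in a leaf, ties between lesions are common, which is why the tie handling matters in this setting.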

Discussion
The aim of our study was to assess the feasibility of generating decision tree models based on pre-therapeutic CT texture parameters to predict complete response or progressive disease of HCC lesions to TACE treatment according to mRECIST criteria. Our results demonstrate that both target responses can be predicted with high accuracy for HCCs undergoing both primary TACE and repeated TACE. To our knowledge, this is the first description of nested multiparametric prediction models for this tumor entity.
Fig. 5 a Decision-tree model based on texture parameters, size, and surrounding liver parenchyma (cirrhotic vs non-cirrhotic) to predict complete response prior to repeated transcatheter arterial chemoembolization (TACE) treatment. b ROC curve with AUC values for the model
Our study sample represents the two patient populations typically triaged to TACE treatment: individuals suffering intermediate stage disease for palliative loco-regional treatment and patients with early stage of disease being poor surgical candidates or undergoing bridging TACE while awaiting liver transplantation. Mean age and gender distribution of our study sample match the typical HCC epidemiology in the Western world [13].
Image-based texture analysis is an emerging methodology to gain additional quantitative information on lesion heterogeneity based on gray-level histograms. It is used to extract and process pixel distribution within a region of interest. Both CT and MRI are generally suitable modalities for analyses, potentially serving as non-invasive imaging biomarkers for prognosis and treatment response [14,15]. Many studies assessed the feasibility of this method for various tumor entities with promising results, e.g., endometrial cancer [16], pancreatic cancer [17], and non-small cell lung cancer [18].
Several published studies already focused on prediction of therapeutic response of HCCs to transcatheter arterial chemoembolization [11,12,19,20]. Strong arterial enhancement, smaller tumor size, and lower homogeneity were found to be significant predictors of a complete response outcome. Our results confirm this observation, since especially the size of a lesion was the parameter with the highest total effect in all prediction models. Tumor heterogeneity, represented by the parameters uniformity and MPP, also proved to have relevant effects when aiming to predict target response in lesions undergoing primary TACE.
When targeting response prediction to repeated TACE, arterial enhancement seemed to be of less importance though, since it only had an effect in the model predicting progressive disease of RT-lesions, but not in the complete response model. We interpret this finding as resultant change in the underlying lesion vascularization caused by the primary TACE treatment. Due to the desired synergistic effects of vessel blockage and cytotoxic chemotherapeutic agent, the vascular bed of tumors has been damaged sufficiently to impact parenchyma perfusion. Success of repeated TACE hence does not primarily
Fig. 6 a Decision-tree model based on texture parameters, size, and surrounding liver parenchyma (cirrhotic vs non-cirrhotic) to predict progressive disease prior to repeated transcatheter arterial chemoembolization (TACE) treatment. b ROC curve with AUC values for the model
A study that supports our hypothesis was performed by Fujita et al [21] in 2008, who revealed discrepancies between arterial enhancement on pretherapeutic CT scans and uptake of ethiodized oil (Lipiodol). In their study, 14.5% of tumors with poor to no enhancement on baseline CT images showed, however, moderate to complete accumulation of Lipiodol, emphasizing that success of treatment does not always correlate with lesion enhancement on baseline CT or hepatic angiography. This likely affects prediction of therapeutic effects of TACE treatment and underlines the benefit of nested multiparametric models, which place impacting tissue characteristics in an additive matrix, over single-parameter-based attempts of prediction in patients suffering from hepatocellular carcinoma.

Our prediction models and especially decision tree visualization with thresholds for the utilized parameters allowing binary splits are a novel paradigm to transfer CT texture analyses of HCCs into future clinical practice. Decision trees are popular in a wide range of medical and non-medical professions for a variety of reasons, easy interpretability probably being the most important advantage. In contrast to other artificial intelligence models, which also increasingly find their way into clinical practice but are usually based on nontraceable neural networks, decision tree models are comprehensible and reproducible for the user.
Besides clinical relevance, implementability into routine workflows is one of the main challenges when aiming to transfer research innovations into clinical practice. We extracted CT texture parameters from software that radiologists in our department use for longitudinal follow-up imaging in oncologic patients. As the segmentation of target lesions is performed routinely for the radiology report and CT texture parameters are computed automatically, no additional tasks would have to be performed by the interpreting radiologist. This is an advantage over analysis with additional software solutions, resulting in increased workload in daily routine. However, some of these texture analysis tools offer additional parameters, e.g., gray-level co-occurrence matrices, which may further increase prediction model accuracies.
Our study has several limitations. The study sample was rather small; however, by analyzing all treated HCCs of patients, we reached a sufficient number of lesions for modeling. CT scans were performed on two different scanner systems. Resulting possible slight differences in CT attenuation values may have caused a bias in texture parameters. Since both scanners were however manufactured by the same vendor and we consistently used the same imaging protocol, we rate this possible error of lesser importance. Segmentation of lesions was performed manually which is always prone to errors. Consensus reading of the segmented ROIs by three radiologists, two of them with long-term experience in liver imaging, should have reduced this error to a minimum. Finally, our analysis is based on 2D segmentation on axial image datasets. This is attributed to the analysis according to mRECIST criteria, even though 3D analysis of lesions would possibly have been more accurate in terms of tumor heterogeneity.
In conclusion, our study provides strong evidence that CT texture analyses of HCC lesions at baseline imaging prior to TACE may be used to accurately predict therapeutic response when using nested multiparametric decision tree models, which are easily understandable for everyone involved in the decision process of triaging patients to TACE.
Funding Open access funding provided by University of Basel.

Compliance with ethical standards
Guarantor The scientific guarantor of this publication is Daniel T. Boll.

Conflict of interest
The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.

Statistics and biometry
No complex statistical methods were necessary for this paper.
Informed consent Written informed consent was obtained from all subjects (patients) in this study.
Ethical approval Institutional review board approval was obtained.

Methodology
• retrospective
• diagnostic or prognostic study
• performed at one institution
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.