Abstract
Objectives
To independently validate, internally and externally, three previously published CT-based radiomics models for predicting local tumour progression (LTP) after thermal ablation of colorectal liver metastases (CRLM).
Materials and methods
Patients with CRLM treated with thermal ablation were retrospectively collected from two institutions to form new, independent internal and external validation cohorts. Ablation zones (AZ) were delineated on portal venous phase CT 2–8 weeks post-ablation. Radiomics features were extracted from the AZ and a 10 mm peri-ablational rim (PAR) of liver parenchyma around the AZ. Three previously published prediction models (clinical, radiomics, combined) were tested without retraining. LTP was defined as new tumour foci appearing next to the AZ up to 24 months post-ablation.
Results
The internal cohort included 39 patients with 68 CRLM and the external cohort 52 patients with 78 CRLM. 34/146 CRLM developed LTP after a median follow-up of 24 months (range 5–139). The median time to LTP was 8 months (range 2–22). The combined clinical-radiomics model yielded a c-statistic of 0.47 (95%CI 0.30–0.64) in the internal cohort and 0.50 (95%CI 0.38–0.62) in the external cohort, compared to 0.78 (95%CI 0.65–0.87) in the previously published original cohort. The radiomics model yielded c-statistics of 0.46 (95%CI 0.29–0.63) and 0.40 (95%CI 0.28–0.52), and the clinical model 0.51 (95%CI 0.34–0.68) and 0.51 (95%CI 0.39–0.63) in the internal and external cohort, respectively.
Conclusion
The previously published results for prediction of LTP after thermal ablation of CRLM using clinical and radiomics models were not reproducible in independent internal and external validation.
Clinical relevance statement
Local tumour progression after thermal ablation of CRLM cannot yet be predicted with the use of CT radiomics of the ablation zone and peri-ablational rim. These results underline the importance of validation of radiomics results to test for reproducibility in independent cohorts.
Key Points
• Previous research suggests CT radiomics models have the potential to predict local tumour progression after thermal ablation in colorectal liver metastases, but independent validation is lacking.
• In internal and external validation, the previously published models were not able to predict local tumour progression after ablation.
• Radiomics prediction models should be investigated in independent validation cohorts to check for reproducibility.
Graphical Abstract
![](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs00330-023-10417-5/MediaObjects/330_2023_10417_Figa_HTML.png)
Introduction
The preferred treatment for colorectal liver metastases (CRLM) is resection, but not all metastases or patients are eligible for resection. An alternative and complementary strategy is thermal ablation, including microwave ablation (MWA) and radiofrequency ablation (RFA) [1, 2]. After thermal ablation of CRLM, local tumour progression (LTP) rates of 6–46% have been reported [3,4,5,6,7,8]. LTP is defined as the recurrence of tumour foci at the edge of the ablation zone after initial follow-up imaging showing adequate ablation [9, 10]. The detection of LTP can be challenging since post-ablation effects and recurrent disease have comparable densities on contrast-enhanced (ce) CT [11]. This results in a sensitivity of 53% for ceCT for the detection of LTP [4]. Imaging at multiple subsequent time points may therefore be necessary to detect LTP, which delays its detection and treatment.
To overcome this delay, we recently performed a study to predict LTP in CRLM with the use of radiomics of the post-ablation CT images [12]. If the prediction of LTP is successful, patients with a high risk for LTP can undergo complementary treatment without delay, and a de-intensified follow-up schedule can be considered for low-risk patients. In the previously published original study, we developed and compared three prediction models, including clinical parameters, radiomics features of both the ablation zone (AZ) and the peri-ablational rim (PAR), as well as a combination of clinical and radiomics parameters. The combined clinical-radiomics model yielded the highest performance with a concordance (c-) statistic of 0.78 (95% confidence interval (95%CI) 0.65–0.87). The performances were retrieved with leave-one-out cross-validation (LOOCV), i.e., the models were not validated on independent patient cohorts. To evaluate whether results can be applied to other populations, external validation is crucial [13]. Therefore, the aim of the current study is to validate the clinical-radiomics prediction models from the original study to predict LTP after thermal ablation of CRLM using both independent internal and external validation cohorts.
Material and methods
Patient selection
This multicentre retrospective study was approved by the Institutional Review Board of both institutions (IRBd18.066/MEC-2019–0850), and informed consent was waived. A data license agreement was established to transfer all data to the primary research centre. For the internal validation cohort, medical records were reviewed from April 2018 until August 2021 in the same institution (The Netherlands Cancer Institute Amsterdam) where the original study was performed. In the original study, patients were included up until April 2018. For the external validation cohort, medical records were searched from January 2007 until October 2019 in the second institution (Erasmus Medical Centre Rotterdam).
The patient selection process was in line with the original study in order to select a comparable patient cohort. The original inclusion criteria comprised (1) patients successfully treated with thermal ablation for CRLM; (2) histopathological confirmation of CRLM; (3) portal venous phase (PVP) CT available 2–8 weeks after ablation. The exclusion criteria were (1) < 6 months of follow-up without LTP; (2) > 5 CRLM; (3) unclear origin of liver metastases; (4) ablated CRLM of size > 3 cm; (5) history of diffuse liver disease; (6) history of liver treatment which could affect the parenchyma (such as stereotactic body radiation therapy (SBRT), portal vein embolisation (PVE), transarterial chemoembolisation (TACE)); (7) incomplete ablation (including residual disease, ablation margins < 5 mm and re-ablations); (8) missing clinical data (e.g. no pre-ablation imaging available); (9) delineation problems including artefacts, air or abscess within the AZ and insufficient scan quality. Because the inclusion period was relatively short compared to the external and original cohorts, the number of eligible patients for the internal validation was small. Hence, to increase the sample size for the internal cohort, the exclusion criterion ‘> 5 CRLM’ was changed to ‘> 5 CRLM ablated’. This adjustment was deemed not to influence the results, under the assumption that the AZ texture is not correlated with the number of CRLM present in one liver. A flowchart of the patient selection process is depicted in Fig. 1. Patient characteristics were collected from the medical records and are presented per cohort in Table 1.
Ablation procedures
Ablation procedures were performed either percutaneously under CT or ultrasound guidance or open, guided by intraoperative ultrasound. All percutaneous ablations were performed by an interventional radiologist under sedation analgesia, epidural, or general anaesthesia. The open ablations were performed under general anaesthesia by a liver surgeon, either with or without the assistance of an interventional radiologist. The choice between RFA and MWA was based on availability and physician preference. Three different systems were used for RFA: the Cool-tip™ RF Ablation System E Series (Medtronic), the StarBurst® Radiofrequency Ablation system (AngioDynamics), and the AMICA Microwave and RF system (HS Hospital Service). For MWA, the NeuWave™ Microwave Ablation System of Ethicon (Johnson & Johnson), the Emprint™ Ablation System with Thermosphere™ Technology (Medtronic), and the AMICA Microwave and RF system (HS Hospital Service) were used. Procedures were carried out in accordance with the CIRSE Standards of Practice [10].
CT image acquisition
Contrast-enhanced CT images were acquired on a total of 19 different CT scanners. Intravenous contrast was injected at a rate of 3 ml/s followed by a 30 ml saline flush. Both bolus triggering software and fixed delay times (70 s post-injection for PVP) were used, depending on the CT scanner. Detailed information on scanning parameters is displayed in Table 2.
Standard of reference to establish LTP
LTP was defined as any new tumour foci occurring in a 10 mm vicinity of the AZ on follow-up imaging within 24 months after thermal ablation [9]. Lesions were categorised as no LTP if the patient developed (1) no new CRLM; (2) new CRLM > 10 mm distance to the AZ; or (3) new CRLM within 10 mm of the AZ after > 24 months. Follow-up imaging consisted of regular follow-up ceCT, scheduled every 3 months in the first year, and 6 monthly thereafter until 5 years after ablation. In case of doubt, magnetic resonance imaging (MRI) or positron emission tomography (PET)-CT was used as a problem-solver. All liver imaging until the end of follow-up was checked for disease progression.
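The labelling rule above can be written out as a short function. This is only an illustrative sketch in Python: the function name, the argument names, and the simplification to a single (nearest, earliest) new tumour focus per ablated lesion are assumptions made here, not part of the study protocol. The thresholds are taken from the definition above and treated as inclusive.

```python
from typing import Optional

LTP_MAX_DISTANCE_MM = 10.0  # new focus must lie within 10 mm of the AZ
LTP_MAX_MONTHS = 24.0       # and appear within 24 months after ablation


def classify_lesion(distance_mm: Optional[float],
                    months_after_ablation: Optional[float]) -> bool:
    """Return True if the ablated lesion counts as LTP, False otherwise.

    Both arguments describe a single new tumour focus (hypothetical
    simplification); pass None for both when no new CRLM developed.
    """
    if distance_mm is None or months_after_ablation is None:
        return False  # (1) no new CRLM
    if distance_mm > LTP_MAX_DISTANCE_MM:
        return False  # (2) new CRLM > 10 mm from the AZ
    if months_after_ablation > LTP_MAX_MONTHS:
        return False  # (3) new CRLM near the AZ, but after > 24 months
    return True


# Example: a focus 5 mm from the AZ found 8 months after ablation is LTP.
is_ltp = classify_lesion(distance_mm=5.0, months_after_ablation=8.0)
```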
Delineation and radiomics features
The manual delineations, pre-processing steps, and features extraction process were similar to the original study [12]. An example of the delineations is displayed in Fig. 2.
Prediction models and analysis
Baseline patient characteristics were compared between the cohorts using the Kruskal-Wallis test and the chi-square test. p values ≤ 0.05 were considered statistically significant. The included features per model are presented in Table 3. For the two validation cohorts, the discriminative power of all three models was assessed using the c-statistic. ComBat harmonisation was applied to the radiomics features to harmonise between the three cohorts [14]. All statistical analyses were performed using RStudio software v1.4.1103. To assess the quality of this study, the Radiomics Quality Score (RQS) was calculated [15]. The methods of this study and the original study are schematically presented in Fig. 3.
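For readers less familiar with the c-statistic, a minimal sketch of its calculation is given here. It assumes a binary outcome per lesion (LTP yes/no) and a predicted risk per lesion; whether the study used this binary form or a time-to-event variant is not stated above, and the function name and toy values are illustrative only. The analyses in the study itself were run in R, not with this code.

```python
from typing import Sequence


def c_statistic(risk: Sequence[float], ltp: Sequence[int]) -> float:
    """Concordance for a binary outcome.

    Probability that a randomly chosen lesion with LTP (label 1) received a
    higher predicted risk than a randomly chosen lesion without LTP (label 0).
    Tied predictions count as half-concordant. 0.5 corresponds to chance.
    """
    pos = [r for r, y in zip(risk, ltp) if y == 1]
    neg = [r for r, y in zip(risk, ltp) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one lesion with and one without LTP")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Toy example (made-up risks): 3 of the 4 LTP/no-LTP pairs are ranked correctly.
toy_c = c_statistic([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])  # 0.75
```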
Results
Patient and lesion characteristics
The internal validation cohort included 68 CRLM in 39 patients. LTP was found in 11/68 CRLM (16%). The median time to LTP was 8 months (range 2–22), and the median follow-up for CRLM without LTP was 25 months (range 8–50). The external cohort comprised 78 CRLM in 52 patients. Twenty-three out of 78 CRLM (29%) developed LTP with a median time to LTP of 10 months (range 2–22). The CRLM without LTP had a median follow-up of 29 months (range 6–139). The median ablation-to-CT interval was 31 days (range 14–50, IQR 24–44) and 42 days (range 14–56, IQR 20–48) for the internal and external cohort, respectively. Patient and lesion characteristics were similar in terms of sex, primary tumour characteristics, and chemotherapy treatment. A higher mean age (66 vs 61 and 63) was found in the external validation cohort (p = 0.047). Larger CRLM were ablated (p = 0.047) in the original cohort (18 ± 6 mm), compared to the internal (11 ± 7 mm) and external cohorts (13 ± 7 mm). Significantly more metachronous metastases were included in the validation cohorts compared to the original cohort (21 and 23% vs 45%, p < 0.01). Lastly, all CRLM (100%) were treated with MWA in the internal cohort, while the majority were treated with RFA in the original and external cohorts (80% and 87%, respectively, p < 0.01).
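As a quick arithmetic check, the per-cohort LTP rates and the pooled count quoted in the abstract (34/146) follow directly from the lesion counts above. The variable names in this small Python sketch are illustrative.

```python
# (lesions with LTP, ablated lesions) per validation cohort, as reported above
ltp_counts = {"internal": (11, 68), "external": (23, 78)}


def percentage(events: int, total: int) -> float:
    """Events as a percentage of the total."""
    return 100.0 * events / total


# Rounded to whole percentages, as in the text
rates = {name: round(percentage(e, t)) for name, (e, t) in ltp_counts.items()}

# Pooled over both validation cohorts
pooled_events = sum(e for e, _ in ltp_counts.values())
pooled_total = sum(t for _, t in ltp_counts.values())
pooled_rate = round(percentage(pooled_events, pooled_total))
```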
Model performance
For the internal validation cohort, a c-statistic of 0.47 (95%CI 0.30–0.64) was found for the combined model. The radiomics model showed a c-statistic of 0.46 (95%CI 0.29–0.63) and the clinical model 0.51 (95%CI 0.34–0.68). In external validation, the combined model yielded a c-statistic of 0.50 (95%CI 0.38–0.62), the radiomics model 0.40 (95%CI 0.28–0.52), and the clinical model 0.51 (95%CI 0.39–0.63). ComBat harmonisation yielded no improvement in the combined or radiomics models. Results are presented in Table 4. This study reached an RQS of 50%. The distribution of RQS points is displayed in Supplementary Table 1.
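One way to read these numbers is to check whether each 95%CI contains 0.5, the c-statistic expected from chance alone. The sketch below does this for the values reported in this article; the dictionary layout and helper name are illustrative choices, not part of the original analysis.

```python
CHANCE = 0.5  # c-statistic of a model without discriminative ability

# (c-statistic, lower 95%CI bound, upper 95%CI bound), copied from the text
reported = {
    ("original", "combined"): (0.78, 0.65, 0.87),
    ("internal", "combined"): (0.47, 0.30, 0.64),
    ("internal", "radiomics"): (0.46, 0.29, 0.63),
    ("internal", "clinical"): (0.51, 0.34, 0.68),
    ("external", "combined"): (0.50, 0.38, 0.62),
    ("external", "radiomics"): (0.40, 0.28, 0.52),
    ("external", "clinical"): (0.51, 0.39, 0.63),
}


def ci_includes_chance(lower: float, upper: float, chance: float = CHANCE) -> bool:
    """True if the confidence interval is compatible with chance-level ranking."""
    return lower <= chance <= upper


compatible_with_chance = {
    key: ci_includes_chance(lower, upper)
    for key, (_, lower, upper) in reported.items()
}
```

Under this reading, every validation interval includes 0.5, whereas the interval of the original combined model (estimated with LOOCV) does not.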
Discussion
This study evaluated the reproducibility of three previously published clinical-radiomics models to predict LTP after thermal ablation of CRLM. The models were validated in an independent internal and external validation cohort, and poor performance was found (c-statistics 0.40–0.51). The poor validation performance is most probably explained by overfitting: the models were fitted too closely to the training data and probably (also) captured image noise or random fluctuations rather than true differences between the studied groups [16, 17]. In the original study, LOOCV was applied after model development. However, this tests the fit to the training data rather than the quality of the model, which can result in an overoptimistic estimate of performance [18].
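The optimism of cross-validation applied after model development can be illustrated with a toy simulation on pure noise. The sketch below is not the pipeline of the original study: the "model" is deliberately reduced to picking the single best-separating feature and its direction, all data are random, and every name and parameter is an assumption made for illustration. In the first variant the feature is chosen once on all lesions and the leave-one-out loop merely re-scores it; in the second variant the choice is repeated inside every fold, so the held-out lesion never influences it.

```python
import random
from typing import List, Sequence, Tuple


def c_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (LTP, no-LTP) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_feature(X: List[List[float]], y: List[int], rows: List[int]) -> Tuple[int, float]:
    """Pick the feature (and sign) that separates the classes best on `rows`."""
    best_j, best_sign, best_gap = 0, 1.0, -1.0
    sub_y = [y[i] for i in rows]
    for j in range(len(X[0])):
        c = c_statistic([X[i][j] for i in rows], sub_y)
        if abs(c - 0.5) > best_gap:
            best_j, best_sign, best_gap = j, (1.0 if c >= 0.5 else -1.0), abs(c - 0.5)
    return best_j, best_sign


def loocv_c(X: List[List[float]], y: List[int], select_inside_loop: bool) -> float:
    """Leave-one-out c-statistic of the pooled held-out scores."""
    rows = list(range(len(y)))
    fixed = best_feature(X, y, rows)  # sees every lesion, including each held-out one
    held_out = []
    for i in rows:
        train = [r for r in rows if r != i]
        j, sign = best_feature(X, y, train) if select_inside_loop else fixed
        held_out.append(sign * X[i][j])
    return c_statistic(held_out, y)


def simulate(seed: int = 0, n_per_class: int = 15, n_features: int = 100) -> Tuple[float, float]:
    """Return (selection outside loop, selection inside loop) on noise-only data."""
    rng = random.Random(seed)
    y = [1] * n_per_class + [0] * n_per_class
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in y]
    return loocv_c(X, y, False), loocv_c(X, y, True)


outside, inside = simulate(seed=0)
```

Because the features carry no signal, any c-statistic clearly above 0.5 in the first variant reflects selection on the evaluated lesions rather than predictive value. The second variant is expected to scatter around 0.5 over repeated seeds, although a single run can deviate in either direction.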
We hypothesise our radiomics models overfitted on image noise caused by acquisition differences. Multiple studies show that acquisition parameters affect the values of the radiomics features [19,20,21,22,23]. Our cohorts were heterogeneous in terms of CT acquisition parameters, with 19 different CT scanners involved in validation and 5 scanners in the original study. In an attempt to account for the variability between scanners, we applied ComBat harmonisation to the three cohorts. The features were only marginally adjusted without a relevant effect on the performance, possibly because each batch already included multiple scanners. Preferably, the radiomics features would have been harmonised per CT scanner, but the number of patients allocated per batch was insufficient to allow for such harmonisation. Other acquisition differences were less likely to contribute to the low validation performance, such as the difference in iodine concentration per contrast agent or the tube current and voltage [23]. The differences in slice thickness were corrected by image resampling. Furthermore, additional steps, such as testing the intra-observer correlation of the segmentations or harmonising the features across scanners, could have been undertaken to enhance the reproducibility during model development.
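To make the idea of batch-wise harmonisation concrete, the sketch below aligns the mean and standard deviation of one feature across batches. It is a deliberately simplified location-scale adjustment and not ComBat itself, which additionally pools information across features with empirical Bayes estimation and can preserve covariate effects. The function name and toy values are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Hashable, List, Sequence


def harmonise_location_scale(values: Sequence[float],
                             batches: Sequence[Hashable]) -> List[float]:
    """Shift and scale one feature so every batch shares the pooled mean and SD.

    Simplified stand-in for batch harmonisation (NOT ComBat): each value is
    z-scored within its batch and mapped back to the pooled distribution.
    """
    grand_mean = mean(values)
    grand_sd = pstdev(values)
    by_batch = defaultdict(list)
    for v, b in zip(values, batches):
        by_batch[b].append(v)
    stats = {b: (mean(vs), pstdev(vs)) for b, vs in by_batch.items()}
    harmonised = []
    for v, b in zip(values, batches):
        batch_mean, batch_sd = stats[b]
        # A batch without spread carries no scale information; centre it only.
        z = (v - batch_mean) / batch_sd if batch_sd > 0 else 0.0
        harmonised.append(grand_mean + grand_sd * z)
    return harmonised


# Toy feature with a systematic offset between two hypothetical batches
feature = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batch = ["A", "A", "A", "B", "B", "B"]
adjusted = harmonise_location_scale(feature, batch)
```

The sketch also shows why the choice of batch matters: when a batch is a whole cohort that itself mixes several scanners, only the cohort-level offset is removed and scanner-level differences within the cohort remain.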
Clinical heterogeneity between the cohorts might have contributed to the failure of the clinical model in validation. Despite the similar selection methodology, differences may have occurred due to (1) variations in hospital protocols and (2) adjustments over time due to treatment and scanner development. Both centres follow the Dutch clinical guidelines on the treatment of CRLM, but still, hospital variation occurs [24]. Especially, the eligibility of patients for thermal ablation based on ‘CRLM size’ and ‘number of CRLM ablated’ has evolved over the years. The use of MWA has rapidly increased over the last years, which resulted in technique differences between the cohorts. However, we do not think this is the reason for the low validation performance since the original study showed that the ablation technique did not significantly influence the radiomics features [12]. Moreover, two out of three parameters in the clinical model were ‘patient-specific’ (adjuvant chemotherapy and T-stage), while the prediction of LTP is a ‘lesion-specific’ outcome. A study exploring the risk factors for LTP found only ‘lesion-specific’ parameters were associated with LTP, and none of the ‘patient-specific’ parameters investigated were predictive for LTP [25]. This raises the question of how robust ‘patient-specific’ characteristics can be for the prediction of a ‘lesion-specific’ outcome.
Our study has several limitations. Firstly, the study design was retrospective and included a relatively small sample. Secondly, the LTP rates in our study were relatively high, which could be attributed to the long inclusion period, considering LTP rates were higher 15 years ago. The diagnosis of LTP was based on imaging, and the absence of histopathological evaluation could be considered a limitation, but it resembles how LTP is detected in clinical practice. Next, the minimum follow-up period of 6 months might have resulted in a small subset of patients being allocated to the wrong outcome group, given the median time to LTP of 8 months. Lastly, an arbitrary cut-off of 24 months was applied for the detection of LTP, as LTP after 24 months is rare and possibly involves new metastases rather than residual tumour clusters.
Given the risk that the original models were overfitted, we cannot draw any conclusions on the feasibility of LTP prediction based on CT radiomics. This study emphasises the need to assess the reproducibility of radiomics prediction models in independent patient cohorts. It underlines that no definite conclusions can be drawn from studies without proper internal and external validation. Future research aiming to explore radiomics in a similar setting should strive to minimise heterogeneity between and within patient cohorts, both in terms of clinical differences and imaging acquisition.
Abbreviations
- AZ: Ablation zone
- ceCT: Contrast-enhanced computed tomography
- CRLM: Colorectal liver metastases
- c-statistic: Concordance statistic
- LOOCV: Leave-one-out cross-validation
- LTP: Local tumour progression
- MRI: Magnetic resonance imaging
- MWA: Microwave ablation
- PAR: Peri-ablational rim
- PET: Positron emission tomography
- PVP: Portal venous phase
- RFA: Radiofrequency ablation
- RQS: Radiomics quality score
References
Takahashi H, Kahramangil B, Kose E, Berber E (2018) A comparison of microwave thermosphere versus radiofrequency thermal ablation in the treatment of colorectal liver metastases. HPB (Oxford) 20:1157–1162
Meijerink MR, Puijk RS, van Tilborg A et al (2018) Radiofrequency and microwave ablation compared to systemic chemotherapy and to partial hepatectomy in the treatment of colorectal liver metastases: a systematic review and meta-analysis. Cardiovasc Intervent Radiol 41:1189–1204
Liu M, Huang GL, Xu M et al (2017) Percutaneous thermal ablation for the treatment of colorectal liver metastases and hepatocellular carcinoma: a comparison of local therapeutic efficacy. Int J Hyperthermia 33:446–453
Samim M, Molenaar IQ, Seesing MFJ et al (2017) The diagnostic performance of (18)F-FDG PET/CT, CT and MRI in the treatment evaluation of ablation therapy for colorectal liver metastases: a systematic review and meta-analysis. Surg Oncol 26:37–45
Takahashi H, Berber E (2020) Role of thermal ablation in the management of colorectal liver metastasis. Hepatobiliary Surg Nutr 9:49–58
Groeschl RT, Pilgrim CH, Hanna EM et al (2014) Microwave ablation for hepatic malignancies: a multiinstitutional analysis. Ann Surg 259:1195–1200
Kurilova I, Bendet A, Petre EN et al (2021) Factors associated with local tumor control and complications after thermal ablation of colorectal cancer liver metastases: a 15-year retrospective cohort study. Clin Colorectal Cancer 20:e82–e95
Shady W, Petre EN, Do KG et al (2018) Percutaneous microwave versus radiofrequency ablation of colorectal liver metastases: ablation with clear margins (A0) provides the best local tumor control. J Vasc Interv Radiol 29:268-275.e261
Ahmed M, Solbiati L, Brace CL et al (2014) Image-guided tumor ablation: standardization of terminology and reporting criteria–a 10-year update. Radiology 273:241–260
Crocetti L, de Baére T, Pereira PL, Tarantino FP (2020) CIRSE standards of practice on thermal ablation of liver tumours. Cardiovasc Intervent Radiol 43:951–962
Maas M, Beets-Tan R, Gaubert JY et al (2020) Follow-up after radiological intervention in oncology: ECIO-ESOI evidence and consensus-based recommendations for clinical practice. Insights Imaging 11:83
Staal FCR, Taghavi M, van der Reijd DJ et al (2021) Predicting local tumour progression after ablation for colorectal liver metastases: CT-based radiomics of the ablation zone. Eur J Radiol 141:109773
Staal FCR, van der Reijd DJ, Taghavi M, Lambregts DMJ, Beets-Tan RGH, Maas M (2020) Radiomics for the prediction of treatment outcome and survival in patients with colorectal cancer: a systematic review. Clin Colorectal Cancer. https://doi.org/10.1016/j.clcc.2020.11.001
Horng H, Singh A, Yousefi B et al (2022) Generalized ComBat harmonization methods for radiomic features with multi-modal distributions and multiple batch effects. Sci Rep 12:4493
Lambin P, Leijenaar RTH, Deist TM et al (2017) Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol 14:749–762
Mayerhoefer ME, Materka A, Langs G et al (2020) Introduction to Radiomics. J Nucl Med 61:488–495
Wagner MW, Namdar K, Biswas A, Monah S, Khalvati F, Ertl-Wagner BB (2021) Radiomics, machine learning, and artificial intelligence-what the neuroradiologist needs to know. Neuroradiology 63:1957–1967
Demšar J, Zupan B (2021) Hands-on training about overfitting. PLoS Comput Biol 17:e1008671
Hu HT, Shan QY, Chen SL et al (2020) CT-based radiomics for preoperative prediction of early recurrent hepatocellular carcinoma: technical reproducibility of acquisition and scanners. Radiol Med 125:697–705
Hu P, Wang J, Zhong H et al (2016) Reproducibility with repeat CT in radiomics study for rectal cancer. Oncotarget 7:71440–71446
Li Y, Reyhan M, Zhang Y et al (2022) The impact of phantom design and material-dependence on repeatability and reproducibility of CT-based radiomics features. Med Phys 49:1648–1659
Kalendralis P, Traverso A, Shi Z et al (2019) Multicenter CT phantoms public dataset for radiomics reproducibility tests. Med Phys 46:1512–1518
Espinasse M, Pitre-Champagnat S, Charmettant B et al (2020) CT texture analysis challenges: influence of acquisition and reconstruction parameters: a comprehensive review. Diagnostics (Basel) 10
Elfrink AKE, Nieuwenhuizen S, van den Tol MP et al (2021) Hospital variation in combined liver resection and thermal ablation for colorectal liver metastases and impact on short-term postoperative outcomes: a nationwide population-based study. HPB (Oxford) 23:827–839
Han K, Kim JH, Yang SG et al (2021) A single-center retrospective analysis of periprocedural variables affecting local tumor progression after radiofrequency ablation of colorectal Cancer Liver Metastases. Radiology 298:212–218
Funding
The authors state that this work has not received any funding.
Ethics declarations
Guarantor
The scientific guarantor of this publication is Monique Maas.
Conflict of interest
Doenja Lambregts and Regina Beets-Tan are members of the European Radiology Scientific Editorial Board. They have not taken part in the review and decision process for this article. The remaining authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.
Statistics and biometry
S. Roberti (PhD Candidate at the Department of Epidemiology and Biostatistics, Antoni van Leeuwenhoek - The Netherlands Cancer Institute) kindly provided statistical advice for this manuscript.
Informed consent
Written informed consent was waived by the Institutional Review Board.
Ethical approval
Institutional Review Board approval was obtained.
Center 1: Antoni van Leeuwenhoek – The Netherlands Cancer Institute; IRBd18.066
Center 2: Erasmus MC Cancer Institute, University Hospital Rotterdam; MEC-2019-0850
Study subjects or cohorts overlap
No study subjects or cohorts have been previously reported.
Methodology
• retrospective
• diagnostic study
• multicentre study
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
van der Reijd, D.J., Guerendel, C., Staal, F.C.R. et al. Independent validation of CT radiomics models in colorectal liver metastases: predicting local tumour progression after ablation. Eur Radiol 34, 3635–3643 (2024). https://doi.org/10.1007/s00330-023-10417-5
DOI: https://doi.org/10.1007/s00330-023-10417-5