Diffusion-weighted imaging-based radiomics model using automatic machine learning to differentiate cerebral cystic metastases from brain abscesses

Objectives To develop a radiomics model based on diffusion-weighted imaging (DWI) utilizing an automated machine learning method to differentiate cerebral cystic metastases from brain abscesses. Materials and methods A total of 186 patients with cerebral cystic metastases (n = 98) and brain abscesses (n = 88) from two clinical institutions were retrospectively included. The dataset from institution A (n = 129) was randomly partitioned into a training set (75%) and an internal test set (25%). Radiomics features were extracted from DWI images using two subregions of the lesion (cystic core and solid wall). A thorough image preprocessing method was applied to DWI images to ensure the robustness of radiomics features before feature extraction. Then the Tree-based Pipeline Optimization Tool (TPOT) was utilized to search for the optimal machine learning pipeline, using fivefold cross-validation in the training set. The external test set (57 from institution B) was used to evaluate the model’s performance. Results Seven distinct TPOT models were optimized to distinguish between cerebral cystic metastases and abscesses, based either on different feature combinations or on the use of wavelet transform. The optimal model demonstrated an AUC of 1.00, an accuracy of 0.97, sensitivity of 1.00, and specificity of 0.93 in the internal test set, based on the combination of cystic core and solid wall radiomics signature using wavelet transform. In the external test set, this model reached 1.00 AUC, 0.96 accuracy, 1.00 sensitivity, and 0.93 specificity. Conclusion The DWI-based radiomics model established by TPOT exhibits a promising predictive capacity in distinguishing cerebral cystic metastases from abscesses. Supplementary Information The online version contains supplementary material available at 10.1007/s00432-024-05642-4.


Introduction
Cerebral cystic metastases and abscesses present similar patterns on conventional magnetic resonance imaging (MRI), making it difficult to distinguish between them (Muccio et al. 2014). However, accurate differential diagnosis is crucial for appropriate clinical management due to the different prognoses and treatment options for each condition (Bodilsen et al. 2023; Aizer et al. 2022). Advanced MRI techniques may provide additional information to aid in distinguishing between these two entities (Lai et al. 2019; Martín-Noguerol et al. 2021; Falk Delgado et al. 2019). Among them, diffusion-weighted imaging (DWI) is the most widely used due to its accuracy and convenience. Brain abscesses typically exhibit markedly hyperintense signals in cavities with restricted diffusion of contents on DWI, while the cavities of cystic brain tumors generally show hypointense signal.
However, some cystic brain metastases have been reported to present high signal intensity on DWI with low apparent diffusion coefficient (ADC) values because of highly viscous mucin or many inflammatory cells in the cystic cavity (Sakatani et al. 2019; Takayasu et al. 2018; Pérez-Riverola et al. 2023; Hartmann et al. 2001; Yikilmaz et al. 2009). Duygulu et al. (2010) reported that 19.7% of intracerebral metastases showed hyperintensity on DWI in a larger patient cohort. Additionally, 5-21% of untreated abscesses display low DWI signal within the central portion, mimicking necrotic tumors (Reddy et al. 2006). In short, differentiating cerebral cystic metastases from abscesses with DWI sometimes remains a challenge, and it is necessary to explore a more accurate and effective method.
Numerous studies have demonstrated that radiomics exhibits superior diagnostic capabilities compared to visual analysis in the diagnosis, classification, and outcome prediction of brain lesions (Rudie et al. 2019; Abdel Razek et al. 2021; Forghani 2020; Kalasauskas et al. 2022). Radiomics has the potential to better differentiate cerebral cystic metastases and abscesses. Although radiomics has advantages for various applications, several challenges still need to be addressed (Lohmann et al. 2022). One challenge is the issue of robustness, which arises due to the use of different image datasets. To address this, an image preprocessing pipeline has been proposed to overcome the problem of incomparability among datasets. In the past, machine learning required manual testing to select appropriate features and models, which was cumbersome and often relied heavily on human expertise. However, automated machine learning tools have been developed to improve this process. The tree-based pipeline optimization tool (TPOT) is an example of a tool that can automatically optimize the machine learning pipeline using genetic algorithms (Le et al. 2020). Recent studies have demonstrated TPOT's superior ability to construct radiomic models, outperforming standard manual machine learning analysis (Peng et al. 2022; Zhang et al. 2021; Su et al. 2020; Radzi et al. 2021).
The aim of our study was to establish a radiomics model based on DWI with TPOT using dual-center MRI datasets and to evaluate its diagnostic accuracy in distinguishing cerebral cystic metastases from abscesses. Furthermore, we aimed to validate the reliability and resilience of our image preprocessing methodology in bolstering the validity of our conclusions.

Patients
This retrospective study received approval from the institutional review board, and informed consent was waived.
We searched the data of 382 patients with cerebral cystic metastases and brain abscesses identified by MRI on picture archiving and communication systems (PACS) from institution A and institution B between January 2012 and January 2021. The inclusion criteria were as follows: (1) cerebral cystic metastasis was confirmed by the pathological diagnosis of the primary tumor and clinical materials; brain abscess diagnosis depended on pathological findings and laboratory tests; (2) the pattern of cerebral cystic metastasis was solitary or multiple lesions appearing as rim-enhancing masses that were completely cystic, namely an enhancing wall and a cystic fluid core; all cases of brain abscesses were in the capsule stage; (3) patients underwent plain and enhanced brain MRI scans before surgery or systemic medication. The exclusion criteria included: (1) lesions with a maximum diameter of less than 1 cm; (2) poor quality images; (3) brain metastases depicted on MRI as a large cyst with a small nodule or with only partial cystic change; abscess cavity containing air. Representative cases are shown in Fig. 1.
This study involved a total of 186 patients who were diagnosed with either cerebral cystic metastases (n = 98) or brain abscesses (n = 88). Among the cases of cerebral cystic metastases, the primary tumors were identified as lung carcinoma (n = 87), esophageal carcinoma (n = 4), hepatic carcinoma (n = 2), renal carcinoma (n = 1), endometrial carcinoma (n = 1), breast carcinoma (n = 1), gastric carcinoma (n = 1), and rectal carcinoma (n = 1). The 88 patients with brain abscesses were categorized according to their pathogen, which included 64 bacterial, 8 fungal, 6 tubercular, 1 mixed infection of bacterial and fungal abscess, and 9 cases with unknown pathogens. The patients were divided into three groups: a training set (n = 96 patients) from institution A, an internal test set (n = 33 patients) from institution A, and an external test set (n = 57 patients) from institution B. The enrollment process for the study is depicted in Fig. 2.
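The random 75%/25% partition of the institution A cases can be sketched as follows. This is a minimal illustration and not the authors' code: the case IDs, the seed, and the helper name `split_train_test` are hypothetical, and the source does not state whether the split was stratified by diagnosis.

```python
import random


def split_train_test(case_ids, train_fraction=0.75, seed=0):
    """Shuffle case IDs reproducibly and cut them into training and test lists."""
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)
    # Flooring the training size gives 96 of 129 cases, leaving 33 for testing.
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 129 institution A patients.
institution_a = [f"A{i:03d}" for i in range(129)]
train_ids, internal_test_ids = split_train_test(institution_a)
print(len(train_ids), len(internal_test_ids))
```

With 129 cases, flooring the 75% share reproduces the 96/33 split reported above; the 57 institution B cases stay untouched as the external test set.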

Image acquisition
MRI examinations were conducted at institution A using three imaging systems, including Achieva 1.5 T and 3.0 T MRI scanners (Philips Healthcare, Best, The Netherlands) and a Verio 3 T MRI scanner (Siemens Healthcare, Erlangen, Germany). The independent external data were gathered at institution B on the following MRI scanners: Signa HDxt 1.5 T and Signa HDX 3.0 T (GE Healthcare, Milwaukee, USA), and Skyra 3 T (Siemens Healthcare, Erlangen, Germany). The scan sequence involved axial T2-weighted imaging (T2WI), T1-weighted imaging (T1WI), DWI, and contrast-enhanced T1WI (CE-T1WI). The DWI had b values of 0 and 1000 s/mm², with the latter being used for analysis. The ADC maps were generated automatically by MRI scanners or manually reconstructed on the MRI scanner's post-processing workstation. Please refer to Table 1 for detailed parameters.

Clinical and conventional MR analysis
Two neuroradiologists (reader A and reader B), with 6 and 18 years of experience, respectively, independently reviewed all MRI scans. The radiologists were blinded to clinical and pathological data and reached a consensus. In cases where multiple lesions were present, analysis was based on the largest lesion. MRI features were assessed based on the following criteria: (1) location (lobe, basal ganglion and thalamus, brain stem, cerebellum, multiple); (2) presence of hypointense rims on T2-weighted images; (3) pattern of wall enhancement (smooth inner and outer walls, rough inner and smooth outer walls, smooth inner and rough outer walls, or rough inner and outer walls); (4) thickness of the enhancement wall (< 3 mm or ≥ 3 mm); (5) degree of edema (none, slight, or obvious); (6) ADC value of the wall; (7) ADC value of the core; and (8) maximum diameter of the mass. The degree of edema was divided into none, slight (less than 10 mm), and obvious (at or above 10 mm), based on the classification suggested by Schoenegger et al. for glioblastoma (Schoenegger et al. 2009). ADC values were computed using the post-processing workstation. The region of interest (ROI) for the enhancement wall and cystic fluid core was delineated separately at the largest sectional area of the mass and its two adjacent layers on the ADC map. In cases where the mass was too small for the three layers, the ROI was outlined three times on the maximum cross section of the mass. Three ADC values for the core and wall were calculated by two neuroradiologists, and the average value was determined. The maximum diameters were measured independently on CE-T1WI, and the average was taken. Clinical features such as age, sex, presence or absence of fever, and leukocytosis were obtained from the medical records.

Image annotation
Image segmentation was carried out using the open-source software 3D Slicer 4.11.0, based on DWI sequences. The 3D ROI was manually delineated slice by slice on the DWI images (b = 1000 s/mm²) to cover the core and wall, with reference to CE-T1WI, without any prior medical information. All manual segmentations were performed by reader A and the results were verified by reader B. To assess intra- and inter-observer agreement with the intraclass correlation coefficient (ICC), reader A performed the segmentation of 30 randomly selected cases twice at a 3-month interval, and reader B independently performed the segmentation of 30 patients following the same procedure. Features with ICCs greater than 0.75 were selected for subsequent analysis.
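The ICC-based feature filter can be illustrated with a short sketch. The source does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption here; the feature names and values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per case and one column per segmentation."""
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: cases, segmentations, residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def stable_features(feature_table, threshold=0.75):
    """Return the names of features whose ICC exceeds the threshold."""
    return [name for name, ratings in feature_table.items()
            if icc_2_1(ratings) > threshold]


# Hypothetical values: each row is one case, each column one segmentation pass.
feature_table = {
    "stable_feature": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2]],
    "unstable_feature": [[1.0, 3.0], [2.0, 1.0], [3.0, 2.0], [4.0, 1.5]],
}
print(stable_features(feature_table))
```

The same function serves both comparisons: the two columns are reader A's repeated segmentations for intra-observer agreement, or reader A's and reader B's segmentations for inter-observer agreement.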

Image preprocess
In this study conducted across two institutions, a thorough preprocessing method was developed for the analysis of various types of brain MR images. The method involved three steps: skull stripping to remove the skull, resampling to normalize spacing heterogeneity, and histogram normalization to reduce histogram distribution variance. The deep learning-based software HD-BET (Isensee et al. 2019) was used to extract the brain and strip the skull in the DWI sequence. The DWI images were then resampled to a consistent voxel spacing of 1 × 1 × 1 mm using the Python SimpleITK package, and the ROI masks were resampled simultaneously.
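The histogram normalization step can be illustrated with quantile-based histogram matching, in which each intensity is replaced by the reference intensity at the same empirical quantile. This is a simplified, standard-library sketch on flat intensity lists, not the authors' implementation; the function name `match_histogram` and the sample values are hypothetical.

```python
from bisect import bisect_right


def match_histogram(source, template):
    """Map each source intensity to the template intensity at the same quantile."""
    src_sorted = sorted(source)
    tpl_sorted = sorted(template)
    n_src, n_tpl = len(src_sorted), len(tpl_sorted)
    matched = []
    for value in source:
        # Rank of the value in the source: number of intensities <= value.
        rank = bisect_right(src_sorted, value)
        # Template index at the same quantile, i.e. ceil(rank * n_tpl / n_src) - 1.
        idx = (rank * n_tpl + n_src - 1) // n_src - 1
        matched.append(tpl_sorted[idx])
    return matched


# Hypothetical intensities from two scanners with different scaling.
source_voxels = [5, 1, 3]
template_voxels = [10, 20, 30]
print(match_histogram(source_voxels, template_voxels))
```

The mapping preserves the rank order of the source voxels while moving their values onto the template's intensity scale, which is what makes features from different scanners more comparable.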

Feature extraction and automated clinical and radiomics model
Radiomics features were extracted from two subregions on diffusion-weighted MR images, namely the cystic fluid core and solid wall (short: core and wall). Subregions (core, wall), individually or in combination, were assigned to three groups (core, wall, combination of core and wall). Combination of core and wall referred to extracting features from the core and wall, respectively, and subsequently combining these features. We compared the performance of the radiomic model using different combinations of features from different groups to identify the most significant features. TPOT is a Python-based automated machine learning tool for constructing radiomics and clinical models. During the training phase, the features extracted from the DWI sequence in the training dataset were fed into TPOT in Python to search for the optimal machine learning pipeline through fivefold cross-validation. Subsequently, the best machine learning pipeline was tested on the internal and external datasets to assess its generalizability. We placed equal importance on clinical and radiomic features. Thus, we conducted comparative experiments on both types of features using TPOT to identify the most significant machine learning pipeline. The model's performance was evaluated by calculating the accuracy, sensitivity, specificity, and receiver operating characteristics area under the curve (ROC-AUC) values on the internal and external test datasets. DeLong's test was used to compare the AUC values of the clinical model and all the radiomics models. The workflow of this study is shown in Fig. 4.
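The evaluation metrics named above can be computed as in the following sketch. It assumes binary labels with 1 as the positive class and a 0.5 decision threshold, neither of which is stated in the source, and it uses the rank-based (Mann-Whitney) form of the AUC; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed score threshold."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical predicted probabilities for four test cases.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(classification_metrics(labels, scores))
print(roc_auc(labels, scores))
```

Unlike accuracy, sensitivity, and specificity, the AUC does not depend on the threshold, which is why a model can reach an AUC of 1.00 while its specificity at a fixed cut-off is lower.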

Clinical characteristics
Table 2 presents the clinical and radiological characteristics of the cerebral cystic metastases and brain abscess groups. Our statistical analysis revealed significant differences (p < 0.05) in age, leukocytosis, hypointense rims on T2WI, ADC value of the core, and ADC value of the wall between the two conditions in the training set. There were significant differences (p < 0.05) in fever, pattern of enhancement wall, and ADC value of the core between the two conditions in the internal test set. In the external test set, age, fever, location, hypointense rims on T2WI, pattern of enhancement wall, degree of edema, ADC value of the core, and the maximum diameter of the mass differed significantly (p < 0.05) between the two conditions.

Image preprocessing
Images normalized by the preprocessing method showed better feature performance than non-processed images in both the internal (AUC 1.00 vs. 0.86) and external test sets (AUC 0.98 vs. 0.55), as shown in Fig. 5.

Feature extraction and automated model building
On the manual segmentation, the intra- and inter-observer ICC values were 0.96 and 0.95, respectively, as shown in Supplemental Fig. 1. A total of 107 basic features, including first-order statistical features, shape features, and gray-level features, were extracted after MR image preprocessing. To further enhance the model performance, we utilized wavelet filters to extract more subtle features, resulting in an additional 744 features. Seven TPOT models were created to distinguish cerebral cystic metastases from brain abscesses (Table 3).
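The feature counts are consistent with a three-dimensional wavelet decomposition into eight sub-bands, as the short calculation below shows. The figure of 14 shape features, which are not recomputed on filtered images, is an assumption based on the default PyRadiomics feature set and is not stated in the source.

```python
from itertools import product

# Each axis is filtered with a low-pass (L) or high-pass (H) filter: 2**3 sub-bands.
subbands = ["".join(filters) for filters in product("LH", repeat=3)]

N_BASIC = 107  # basic features reported in the text
N_SHAPE = 14   # assumption: shape features, computed once on the original mask
per_subband = N_BASIC - N_SHAPE
n_wavelet = len(subbands) * per_subband

print(subbands)
print(per_subband, n_wavelet, N_BASIC + n_wavelet)
```

Under this assumption each sub-band contributes 93 features, giving the 744 additional wavelet features reported above and 851 features per subregion in total.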

Comparison between the clinical and radiomics model
No significant differences were observed between the clinical model and all the radiomics models in the internal testing

Discussion
In this research, we have identified 12 commonly observed clinical and imaging characteristics to develop a clinical prediction model. Additionally, we have extracted features from three groups including core, wall, and combined regions on DWI to establish six radiomics models using an automatic machine learning method. The objective of this study was to differentiate between cystic brain metastases and abscesses.
Our findings indicate that both the clinical and radiomics models achieved high AUCs. The optimal radiomics model demonstrated excellent predictive value in distinguishing cerebral cystic metastases from abscesses, with AUCs of 1.00 in both the internal and external test sets. Previous research (Muccio et al. 2014) suggested that certain features of routine MRI sequences and clinical signs can aid in the differential diagnosis of cystic brain metastases and brain abscesses, and the DWI signal or ADC value has been particularly useful in increasing diagnostic effectiveness. However, related studies have shown varying sensitivities (64-100%) and specificities (77-100%) for DWI in this regard (Xu et al. 2014). Additionally, these studies have been limited by small sample sizes and a combination of few characteristics (Salice et al. 2016; Kolakshyapati et al. 2019; Schwartz et al. 2006; Alam et al. 2012). In this study, we incorporated 12 clinical and imaging features to build a clinical model. Our results showed that age, fever, leukocytosis, location, hypointense rims on T2WI, pattern of enhancement wall, degree of edema, ADC value of the core, ADC value of the wall, and maximum diameter of the mass were significantly different between the two conditions in the training and/or test sets (p < 0.05). The clinical model performed well, achieving an AUC of 0.93 in the internal test set and an AUC of 0.97 in the external test set. The larger sample size and increased number of characteristic combinations likely contributed to the improved performance of routine clinical data in distinguishing between cystic brain metastases and brain abscesses.
Fig. 5 ROC curves of features' performance before and after normalization in the internal and external test sets. The normalized images show better feature performance than the original images in both the internal (AUC 1.00 vs. 0.86) and external (AUC 0.98 vs. 0.55) test sets
Some studies have shown that radiomics models based on DWI or ADC have higher values and benefits in differential diagnosis, evaluating biological factors, and predicting tumor prognosis (Xu et al. 2020; Park et al. 2020; Hu et al. 2022; Kim et al. 2022; Wang et al. 2021). In this study, we analyzed DWI as the single MRI sequence to build a radiomics model, which demonstrated superior diagnostic values for these two conditions. Our work showed that the optimal DWI-based radiomics model had AUCs of 1.00 in both the internal and external test sets, indicating its high efficiency in the differential diagnosis of cerebral cystic metastases and abscesses. To our knowledge, this marks the first instance of radiomics being utilized for the differentiation of these two conditions. Unlike most radiomics research (Su et al. 2021; Priya et al. 2021; Li et al. 2022a, b), which manually conducted feature selection and chose standard machine learning models, we used an automatic machine learning method for automatic feature selection, model selection, and parameter optimization. By intelligently exploring thousands of possible pipelines, TPOT automates the tedious part of machine learning and identifies the best pipeline (Le et al. 2020). In Wang's study (Wang et al. 2023), TPOT was used to identify IDH-mutant TERT promoter-mutant gliomas, reaching an AUC of 0.952 in the independent validation set. In another study (Liu et al. 2022), TPOT was shown to differentiate brain metastases from glioblastoma with a higher AUC of 0.988 than using other classifiers. In our study, TPOT also showed excellent ability in differentiating cerebral cystic metastases from abscesses.
Combining features from the cystic fluid core and solid wall improved the accuracy, sensitivity, specificity, and AUC. The top ten radiomics features were all first-order features, which describe the histogram distribution of voxel intensity in the image region. The features of mean, median, 90th percentile, and maximum mainly reflected the average and high voxel intensity. Skewness reflects the asymmetry of the data distribution. Root mean squared indicates the magnitude of image values. These parameters representing the density of the pathological lesion have the potential to quantify micro-architectural properties of tissues. Seven of the features were further processed using wavelet transforms, allowing for a comprehensive and accurate reflection of the original image. Although these features are difficult to identify with the naked eye, radiomics can make full use of them for disease identification. Notably, the weight of wall_wavelet-HLL_first-order_Mean, extracted from the solid wall, was the largest among all the radiomics features. Meanwhile, the wall-wavelet model also demonstrated excellent performance in the internal and external test sets. The result indicated that the DWI characteristics of the solid wall, which might otherwise be overlooked in most studies of the two conditions, have offered added value to the current radiomics study. Although there was no statistical difference between the optimal radiomics model and the clinical model, the radiomics model from a single DWI sequence yielded a higher AUC value than the clinical model, which combined many clinical features with multiple-sequence MRI characteristics. In future studies, the inclusion of a more heterogeneous group of cerebral metastases and abscesses in large samples will highlight the advantages of radiomics and may help to reach statistical differences.
Notably, MR images can exhibit significant variations depending on the scanning equipment, acquisition parameters, and inherent acquisition artifacts. These factors can cause instability in radiomics features (Cui and Yin 2022; Veres et al. 2022). Additionally, susceptibility artifacts and chemical shift artifacts due to signal acquisition methods can impact the performance of DWI (Hu et al. 2022). Preprocessing methods can improve the reproducibility and stability of quantitative MRI analysis, leading to more reliable radiomics feature values (Moradmand et al. 2020). Our study utilized pyradiomics in Python to develop a preprocessing method that effectively reduces discrepancies in image data, resulting in improved robustness of feature extraction and model establishment.
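The first-order features discussed above (mean, median, 90th percentile, maximum, skewness, and root mean squared) are simple statistics of the voxel intensities inside an ROI. The sketch below is a standard-library illustration rather than the extraction code used in the study; the percentile uses linear interpolation and the skewness uses population moments, both of which are assumptions, and the sample intensities are hypothetical.

```python
import math


def first_order_features(values):
    """Compute a few first-order statistics from a flat list of ROI intensities."""
    xs = sorted(values)
    n = len(xs)
    mean = sum(xs) / n
    median = xs[n // 2] if n % 2 else 0.5 * (xs[n // 2 - 1] + xs[n // 2])

    # 90th percentile by linear interpolation between neighbouring order statistics.
    pos = 0.9 * (n - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, n - 1)
    p90 = xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    # Skewness as the third central moment over the 1.5 power of the second.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0

    rms = math.sqrt(sum(x * x for x in xs) / n)
    return {
        "mean": mean,
        "median": median,
        "percentile_90": p90,
        "maximum": xs[-1],
        "skewness": skewness,
        "root_mean_squared": rms,
    }


# Hypothetical DWI intensities sampled from a lesion wall ROI.
print(first_order_features([1, 2, 3, 4, 5]))
```

In the wavelet variants, the same statistics are computed on a filtered copy of the image, so a name such as wall_wavelet-HLL_first-order_Mean denotes the mean intensity of the wall ROI in the HLL sub-band.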
The current study has some limitations. Firstly, it was conducted retrospectively and included a limited amount of data from only two medical centers. To enhance the generalizability and effectiveness of the model in clinical practice, it is recommended to conduct large-scale and prospective studies across multiple centers. Secondly, there was some selection bias in the retrospective study, as most patients with cerebral metastases had primary lung tumors, and only a small number had tumors originating from other sites, such as the esophagus, kidney, or colon. Additionally, most brain abscess cases were confirmed to be bacterial, with fungal and tubercular abscesses being rare. The predictive accuracy of the model may be affected by a more heterogeneous group of cerebral metastases and abscesses. Therefore, further research is needed to include more patients with these types of tumors.

Conclusion
In summary, we have successfully constructed a high-performing radiomics model, utilizing automated machine learning techniques, that can effectively differentiate cerebral cystic metastases from abscesses based on DWI. Furthermore, our preprocessing methodology has improved the dependability and durability of the initial results, which could greatly facilitate the practical applications of this model in clinical settings.

Fig. 1
Fig. 1 Representative examples of cerebral cystic metastases and brain abscesses of the capsular stage. Cerebral cystic metastases (a-d) and brain abscesses of the capsular stage (e-h) exhibit similar rim enhancement on CE-T1WI. a, b A 51-year-old male with cerebral cystic metastasis from lung adenocarcinoma. The core of the lesion presented hyperintensity on DWI. c, d A 56-year-old
Finally, the DWI sequences were histogram-normalized using a histogram matching algorithm based on the histogram of a template collected from a brain MRI image of a normal case at institution A. The pipeline of preprocessing is illustrated in Fig. 3. The importance of the DWI sequence preprocessing in radiomic model performance was further validated by comparing the baseline of the radiomic model before and after preprocessing.

Fig. 2
Fig. 2 Study enrollment flowchart
Table 3 displays the classifiers and parameters for all the models. All prediction models performed reasonably well during training with the current best internal CV score greater than 0.80. With the exception of the radiomics model based on the solid wall of the lesion without wavelet transform, all TPOT models demonstrated excellent performance, with high accuracy and favorable AUC on both the internal and external test sets. The results

Fig. 6 Fig. 7
Fig. 6 The ROC curves of seven distinct TPOT models.a The ROC curve of seven distinct TPOT models in the internal test set.b The ROC curve of seven distinct TPOT models in the external test set

Table 1
MR scan protocols

Table 2
Clinical and radiological features in the training and test set

Table 3
The classifiers and parameters of TPOT models

Table 4
TPOT models performance with the internal test dataset