Abstract
Objectives
To evaluate the robustness of radiomics features among photon-counting detector CT (PCD-CT) and dual-energy CT (DECT) systems.
Methods
A texture phantom consisting of twenty-eight materials was scanned twice with one PCD-CT and four DECT systems (dual-source, rapid kV-switching, dual-layer, and sequential scanning) at each of three dose levels. Thirty sets of virtual monochromatic images at 70 keV were reconstructed. Regions of interest were delineated for each material with a rigid registration. Ninety-three radiomics features were extracted using PyRadiomics. The test-retest repeatability between repeated scans was assessed by Bland-Altman analysis. The intra-system reproducibility between dose levels, and inter-system reproducibility within the same dose level, were evaluated by intraclass correlation coefficient (ICC) and concordance correlation coefficient (CCC). Inter-system variability among five scanners was assessed by coefficient of variation (CV) and quartile coefficient of dispersion (QCD).
Results
The test-retest repeatability analysis showed that 97.1% of features were repeatable between scan and rescan. For intra-system reproducibility between different dose levels, the mean ± standard deviation ICC and CCC were 0.945 ± 0.079 and 0.945 ± 0.079, respectively, and 86.0% and 85.7% of features had ICC > 0.90 and CCC > 0.90, respectively. For inter-system reproducibility within the same dose level, the mean ± standard deviation ICC and CCC were 0.157 ± 0.174 and 0.157 ± 0.174, respectively, and none of the features had ICC > 0.90 or CCC > 0.90. For inter-system variability among the five CT systems, 6.5% and 12.8% of features had CV < 10% and QCD < 10%, respectively.
Conclusion
The radiomics features were non-reproducible with significant variability in values among different CT techniques.
Clinical relevance statement
Radiomics features are non-reproducible with significant variability in values among photon-counting detector CT and dual-energy CT systems, necessitating careful attention to improve the cross-system generalizability of radiomic features before implementation of radiomics analysis in clinical routine.
Key Points
- CT radiomics stability should be guaranteed before the implementation in the clinical routine.
- Radiomics robustness was on a low level among photon-counting detectors and dual-energy CT techniques.
- Limited inter-system robustness of radiomic features may impact the generalizability of models.
Introduction
Radiomics derives objective and quantifiable imaging biomarkers from medical images to provide insights beyond subjective and qualitative image analysis [1, 2]. This pixel-level image analysis approach reveals additional in-depth features that are invisible to the naked eye, creates new possibilities, and holds promise for fostering big data trends in healthcare [3, 4]. Radiomics has shown potential in tumor classification [5], response prediction [6], and risk stratification [7] in oncologic imaging. Its potential has also been demonstrated in non-oncologic diseases, such as coronary plaque [8], pancreatitis [9], pneumonia [10], Crohn’s disease [11], and kidney stones [12]. The number of academic papers on radiomics research is large and still increasing [13, 14]. However, radiomics analysis has not been widely implemented in clinical routine [15,16,17,18], since it is currently not supported by adequate scientific evidence.
One of the most significant challenges of radiomics analysis is the lack of robustness [19,20,21,22,23,24]. Acquisition and reconstruction parameters, including scan system, radiation dose level, voxel size, reconstruction algorithm, and reconstruction kernel, have been demonstrated to affect the robustness of CT radiomics features in conventional single-energy CT (SECT) systems [25,26,27]. Dual-energy CT (DECT) systems have introduced more potential influencing factors on radiomics features, as heterogeneous techniques are used to realize DECT scans, such as dual-source dual-energy CT (dsDECT), rapid kV-switching dual-energy CT (rsDECT), dual-layer dual-energy CT (dlDECT), sequential scanning dual-energy CT (ssDECT), and split filter dual-energy CT systems [28]. Radiomics features were reproducible neither between conventional CT and DECT scans nor across different DECT techniques [29,30,31,32,33], even though the parameters were carefully adjusted. Photon-counting detector CT (PCD-CT) systems are believed to allow high feature stability and better characterization of disease [34,35,36,37,38,39] since they directly convert photons into electric pulses without the intermediate step of visible light [40]. Nevertheless, PCD-CT may lead to extra differences in radiomics features due to its higher image resolution compared to traditional energy-integrating CT systems [34,35,36]. As PCD-CT systems are not yet widely available, it is important to determine whether images generated using different CT techniques are consistent enough for radiomics analysis.
Therefore, this study aimed to evaluate the robustness of radiomics features on texture phantom scans using one PCD-CT system and four DECT systems.
Materials and methods
The workflow of the study is presented in Fig. 1. Institutional ethics approval and written informed consent were not required since this was a phantom study.
Study workflow. This study consisted of three steps: image acquisition, image processing, and statistical analysis. A homemade texture phantom was scanned on one PCD-CT system and four types of DECT systems, at three dose levels of 5, 10, and 20 mGy. The raw data were reconstructed into VMIs at 70 keV. PyRadiomics was employed to extract 18 first-order and 75 texture radiomics features from ROIs segmented with a rigid registration. Test-retest repeatability between repeated scans was assessed by Bland-Altman analysis. The intra-system reproducibility between dose levels, and inter-system reproducibility within the same dose level, were evaluated by ICC and CCC. Inter-system variability among five scanners was estimated by CV and QCD.
Phantom
We established a texture phantom consisting of twenty-eight different materials as shown in Fig. 2. There were five wood blocks and twenty-three bottles filled with different materials [25, 41]. Each wood block was a cuboid with a size of 150 mm × 55 mm × 45 mm. The cuboid part of each bottle had a size of 130 mm × 55 mm × 45 mm and was filled with material as tightly as possible. The materials were selected to provide varying textures. They were positioned to avoid beam-hardening artifacts and were kept unchanged throughout all the scans in the study. The details of the set-up of the texture phantom are presented in Supplementary Note S1.
Phantom construction and image segmentation. A The inserts of the homemade phantom were made of wood blocks and bottles filled with different materials. B The inserts were placed in a foam plastic box and kept stable across all scans. The inserts used in the current study were: (1) air, (2) mesoporous sponge, (3) iodize-free salt, (4) granulated sugar, (5) flour, (6) iodized salt, (7) coarse-pore sponge, (8) nutritive soil for succulent plants, (9) rosewood, (10) chicken wing wood, (11) beechwood, (12) zebra wood, (13) basswood, (14) sand, (15) microporous sponge, (16) coix seed, (17) buckwheat, (18) sago, (19) cat litter, (20) oat, (21) sawdust, (22) soybean, (23) red bean, (24) mung bean, (25) rice, (26) quinoa, (27) millet, and (28) chia seed. C CT image of a representative axial slice of each material in the phantom. D A total of twenty-eight regions of interest were manually contoured on the reference scan and then copied to all other scans with a rigid registration
Image acquisition and reconstruction
The phantom was scanned on five CT scanners from two centers, including one PCD-CT system (NAEOTOM Alpha, Siemens Healthineers) and four dual-energy CT systems (dsDECT, SOMATOM Force, Siemens Healthineers; rsDECT, Revolution CT Apex, GE Healthcare; dlDECT, Hawk spectral CT, Philips Healthcare; and ssDECT, Aquilion ONE, Canon Medical Systems). The comparable acquisition and reconstruction parameters for the scans are presented in Table 1. Each scan was repeated several minutes apart to allow scan-rescan repeatability assessment. The field of view (500 mm × 500 mm), reconstruction matrix (512 × 512), and slice thickness (5 mm) were kept unchanged to maintain a stable voxel size. The milliamperage, rotation time, and pitch value were adjusted to meet the volume CT dose index of 5, 10, and 20 mGy. The tube voltage, iterative reconstruction method, and reconstruction kernel were selected to represent a typical abdominopelvic examination. All the images were reconstructed into virtual monoenergetic images (VMIs) at the energy level of 70 keV on vendor-specific workstations relying on comparable linear energy blending approaches. The energy level of 70 keV was chosen because it was used as a clinical standard of reference at our institution and has been suggested to be comparable to conventional images [42,43,44].
Segmentation and feature extraction
The images were exported in Digital Imaging and Communications in Medicine (DICOM) format, and then converted to Neuroimaging Informatics Technology Initiative (NIFTI) format using MRIcroGL version 1.2.20220720b (https://www.nitrc.org/frs/?group_id=889). The images were loaded into ITK-SNAP version 4.0.2 (http://www.itksnap.org/pmwiki/pmwiki.php) for segmentation by a radiologist with 5 years of experience in radiology and radiomics phantom research [30,31,32,33]. Twenty-eight regions of interest (ROIs) of 35 pixels (approximately 34 mm) in diameter were put at the center of each wood block or bottle with different materials following a rigid registration to avoid unexpected variations [25]. The ROIs were placed on one reference scan and then copied to other scans. Each ROI was copied to the continuous middle five layers of the image of each wood block or bottle with different materials for radiomics feature extraction. We did not perform any pre-processing steps before the feature extraction. Python version 3.12.1 (https://www.python.org) with PyRadiomics package version 3.0.1 (https://pyradiomics.readthedocs.io/en/latest/) was used to extract 18 first-order features and 75 texture features, namely 24 gray-level co-occurrence matrix (GLCM), 14 gray-level run length matrix (GLRLM), 16 gray-level zone length matrix (GLZLM), 16 gray-level dependence matrix (GLDM), and 5 neighborhood gray-tone difference matrix (NGTDM) features [45]. The 26 shape features were not included since the ROIs were fixed in this study. The settings of feature extraction and calculated features are presented in Supplementary Note S2.
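The ROI geometry stated above can be checked with a short calculation. The following is a minimal sketch in Python that uses only the values reported in the text (field of view, matrix, ROI diameter in pixels, slice thickness, and number of slices); the variable names are illustrative and are not part of the study's extraction settings.

```python
# Values taken from the text: 500 mm field of view, 512 x 512 matrix,
# 35-pixel ROI diameter, 5 mm slice thickness, five consecutive slices per ROI.
FOV_MM = 500.0
MATRIX = 512
ROI_DIAMETER_PX = 35
SLICE_THICKNESS_MM = 5.0
N_SLICES = 5

# In-plane pixel size follows from field of view divided by matrix size.
pixel_spacing_mm = FOV_MM / MATRIX
# Physical ROI diameter in millimetres.
roi_diameter_mm = ROI_DIAMETER_PX * pixel_spacing_mm
# Extent covered along the scan axis by the five copied slices.
roi_extent_z_mm = N_SLICES * SLICE_THICKNESS_MM

print(f"pixel spacing: {pixel_spacing_mm:.3f} mm")
print(f"ROI diameter:  {roi_diameter_mm:.1f} mm")
print(f"z extent:      {roi_extent_z_mm:.0f} mm")
```

The resulting diameter of about 34 mm agrees with the value given in the text and fits within the 55 mm × 45 mm cross-section of the inserts.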
Radiomics robustness analysis
The test-retest repeatability was assessed using the middle five layers of images from two repeating scans with unchanged acquisition and reconstruction parameters from the same system. The intra-system reproducibility between different dose levels was evaluated between images from 5 vs. 10 mGy, 5 vs. 20 mGy, and 10 vs. 20 mGy scans, respectively, within the same system. The inter-system reproducibility was calculated using images acquired at three dose levels of 5, 10, and 20 mGy scans, respectively, between each two out of the five CT systems. The inter-system variability at three dose levels of 5, 10, and 20 mGy scans was estimated across five systems for each of the twenty-eight materials. The robustness of radiomics was also analyzed according to five feature types. The signal-to-noise ratio of each scan was calculated.
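The comparisons described above can be enumerated explicitly. The sketch below is an illustration only, assuming that every system was scanned twice at every dose level as stated in the Methods; the list and variable names are hypothetical.

```python
from itertools import combinations, product

systems = ["PCD-CT", "dsDECT", "rsDECT", "dlDECT", "ssDECT"]
doses_mgy = [5, 10, 20]
repeats = [1, 2]

# Every reconstructed image set: system x dose level x repetition.
scans = list(product(systems, doses_mgy, repeats))

# Test-retest: scan vs. rescan for each system at each dose level.
test_retest = list(product(systems, doses_mgy))

# Intra-system: each pair of dose levels within one system.
intra_system = [(s, a, b) for s in systems for a, b in combinations(doses_mgy, 2)]

# Inter-system: each pair of systems within one dose level.
inter_system = [(d, a, b) for d in doses_mgy for a, b in combinations(systems, 2)]

print(len(scans), len(test_retest), len(intra_system), len(inter_system))
```

Under these assumptions the design yields 30 image sets, 15 scan-rescan pairs, 15 dose-level pairs, and 30 system pairs, which matches the thirty sets of VMIs reported in the abstract.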
Statistical analysis
The statistical analysis was performed using R language version 4.1.3 (https://www.r-project.org/) within RStudio version 1.4.1106 (https://posit.co/). The mean relative change of the radiomics features across the different datasets was calculated. The test-retest repeatability was assessed using Bland-Altman analysis with a cutoff of 90% [46, 47]. The intra-system reproducibility between different dose levels, and inter-system reproducibility within the same dose level, were evaluated by intraclass correlation coefficient (ICC) of two-way mixed effects, single rater, absolute agreement type [48] and concordance correlation coefficient (CCC) [49]. The inter-system variability among the five systems was assessed by coefficient of variation (CV) [50] and quartile coefficient of dispersion (QCD) [51]. The ICC and CCC values were interpreted as follows: poor, < 0.50; moderate, 0.50–0.75; good, 0.75–0.90; or excellent, ≥ 0.90, while the CV and QCD values were interpreted as follows: acceptable, < 10%; moderate but still adequate, 11%–20%; and too high and inadequate, ≥ 20% [52].
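To illustrate the two agreement measures, the following minimal Python sketch computes Lin's CCC and a two-way, single-rater, absolute-agreement ICC from the usual mean-square formulation. It is not the R code used in the study. The toy values, the function names, and the half-open handling of the category boundaries are assumptions made for illustration, and constant inputs (zero variance) are not handled.

```python
from statistics import fmean

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient using population moments."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)

def icc_a1(table):
    """Two-way, single-rater, absolute-agreement ICC.

    table: one row per subject (here, material), one column per rater (scan).
    """
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def grade(value):
    """Interpretation bands from the text; boundary handling is an assumption."""
    if value < 0.50:
        return "poor"
    if value < 0.75:
        return "moderate"
    if value < 0.90:
        return "good"
    return "excellent"

# Toy data: one hypothetical feature in four materials on two systems.
system_a = [1.0, 2.0, 3.0, 4.0]
system_b = [v + 10.0 for v in system_a]  # same ranking, constant offset

same = (lin_ccc(system_a, system_a), icc_a1([[v, v] for v in system_a]))
shifted = (lin_ccc(system_a, system_b),
           icc_a1([list(pair) for pair in zip(system_a, system_b)]))
print(same, [grade(v) for v in same])
print(shifted, [grade(v) for v in shifted])
```

In the toy example the two systems rank the materials identically, yet the constant offset alone drives both coefficients into the poor range. This is the reason absolute-agreement measures, rather than a simple correlation, are informative when feature values are to be transferred between systems.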
Results
Test-retest repeatability of radiomics features
The percentage of repeatable features ranged from 82.8 to 100.0%, and the overall percentage ± standard deviation of repeatable features was 97.1 ± 6.2%, according to Bland-Altman analysis (Supplementary Table S1 and Supplementary Fig. S1). The signal-to-noise ratio of each scan (Supplementary Table S2) and the mean relative change of the radiomics feature in reference to PCD-CT (Supplementary Table S3) were calculated. The results were also summarized according to five feature types (Supplementary Tables S4 to S7).
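For readers unfamiliar with the approach, a Bland-Altman check for a single feature can be sketched as follows. This is a hedged illustration in Python, not the analysis code of the study: the toy values are invented, and the rule that a feature counts as repeatable when at least 90% of scan-rescan differences fall within the limits of agreement is an assumed reading of the 90% cutoff.

```python
from statistics import mean, stdev

def bland_altman(scan1, scan2):
    """Return bias, lower and upper limits of agreement, and the
    fraction of paired differences lying within those limits."""
    diffs = [a - b for a, b in zip(scan1, scan2)]
    bias = mean(diffs)
    sd = stdev(diffs)
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, inside

# Toy data: one hypothetical feature measured in 28 materials, scan vs. rescan,
# with small alternating differences of about +/- 0.1.
scan = [10.0 + i for i in range(28)]
rescan = [v + (0.1 if i % 2 == 0 else -0.1) for i, v in enumerate(scan)]

bias, lower, upper, inside = bland_altman(scan, rescan)
# Assumed criterion (not confirmed by the source): repeatable if >= 90% inside.
repeatable = inside >= 0.90
print(f"bias={bias:.3f}, limits=({lower:.3f}, {upper:.3f}), inside={inside:.2f}")
```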
Intra-system reproducibility among three dose levels
The overall mean ± standard deviation ICC and CCC values for intra-system reproducibility were 0.945 ± 0.079 and 0.945 ± 0.079, respectively (Table 2), and the percentages of features with ICC > 0.90 and CCC > 0.90 were 86.0% and 85.7%, respectively (Fig. 3). The mean ± standard deviation ICC and CCC values of the five CT systems ranged from 0.916 ± 0.112 to 0.978 ± 0.041, and from 0.915 ± 0.112 to 0.977 ± 0.041, respectively. The percentages of features with ICC > 0.90 and CCC > 0.90 ranged from 76.3% to 95.0%, and from 76.3% to 95.0%, respectively. The results for each feature are summarized in Supplementary Fig. S2.
Percentage of robust radiomics features according to intra-system reproducibility. The percentage of robust features in terms of intra-system reproducibility among three dose levels of (A) 5 vs. 10 mGy, (B) 5 vs. 20 mGy, and (C) 10 vs. 20 mGy, according to ICC and CCC values. The ICC and CCC values were interpreted as follows: poor, < 0.50; moderate, 0.50–0.75; good, 0.75–0.90; or excellent, ≥ 0.90
Inter-system reproducibility within the same dose level
The overall mean ± standard deviation ICC and CCC values for inter-system reproducibility were 0.157 ± 0.174 and 0.157 ± 0.174, respectively (Table 3). None of the features had ICC > 0.90 or CCC > 0.90, while 92.6% and 92.7% of features had ICC < 0.50 and CCC < 0.50, respectively (Fig. 4). Only the comparison between the dsDECT and rsDECT systems showed features in the good range, with 8.2% and 8.2% of features having ICC of 0.75–0.90 and CCC of 0.75–0.90, respectively. The results for each feature are summarized in Supplementary Fig. S3.
Percentage of robust radiomics features according to inter-system reproducibility. The percentage of robust features in terms of inter-system reproducibility among five scanners within the same dose level of (A) 5 mGy, (B) 10 mGy, and (C) 20 mGy, according to ICC and CCC values. The ICC and CCC values were interpreted as follows: poor, < 0.50; moderate, 0.50–0.75; good, 0.75–0.90; or excellent, ≥ 0.90
Inter-system variability among five scanners
The overall mean ± standard deviation CV and QCD values for inter-system variability were 88.8 ± 478.3% and 91.8 ± 2797.5%, respectively (Table 4 and Supplementary Fig. S4). The percentages of features with CV < 10% and QCD < 10% were 6.5% and 12.8%, respectively (Fig. 5). The inter-system variability was heterogeneous among different materials, with mean ± standard deviation CV values ranging from 44.0 ± 42.1% to 437.6 ± 344.2%, and mean ± standard deviation QCD values ranging from 25.6 ± 21.5% to 641.6 ± 182.2%. The percentages of features with CV < 10% and QCD < 10% ranged from 3.2% to 15.1%, and from 4.3% to 35.5%, respectively.
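The two dispersion measures can be illustrated with a short Python sketch. It assumes the usual definitions, CV as the sample standard deviation divided by the mean and QCD as (Q3 − Q1)/(Q3 + Q1), both expressed as percentages. The quartile method, the use of absolute values in the denominators, and the toy feature values are assumptions for illustration and do not reproduce the study's R analysis.

```python
from statistics import mean, quantiles, stdev

def cv_percent(values):
    """Coefficient of variation: sample SD relative to the absolute mean."""
    return 100.0 * stdev(values) / abs(mean(values))

def qcd_percent(values):
    """Quartile coefficient of dispersion: (Q3 - Q1) / |Q3 + Q1|."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return 100.0 * (q3 - q1) / abs(q3 + q1)

# Toy data: one hypothetical feature measured on five systems.
stable = [100.0, 102.0, 98.0, 101.0, 99.0]   # values far from zero
near_zero = [-2.0, -1.0, 0.5, 2.0, 3.0]      # values scattered around zero

for name, vals in (("stable", stable), ("near_zero", near_zero)):
    print(f"{name}: CV={cv_percent(vals):.1f}%, QCD={qcd_percent(vals):.1f}%")
```

Because both measures divide by a quantity that can approach zero, features whose values scatter around zero can yield values of several hundred percent even when the absolute spread is modest. This may partly explain the very large standard deviations of the CV and QCD values reported above.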
Percentage of robust radiomics features according to inter-system variability. The percentage of robust features in terms of inter-system variability among five scanners within the same dose level of A 5 mGy, B 10 mGy, and C 20 mGy, according to CV and QCD values. The CV and QCD values were interpreted as follows: acceptable, < 10%; moderate but still adequate, 11%–20%; and too high and inadequate, ≥ 20%
Discussion
Our study showed that the repeatability of radiomics features was heterogeneous among CT techniques, with PCD-CT, dsDECT, and rsDECT showing relatively higher repeatability. In contrast, differences in radiation dose level had less impact on the radiomics features. Notably, the radiomics features derived from images acquired with different CT techniques were not reproducible across systems and showed significant variability in feature values, despite the use of carefully adjusted protocols.
The DECT technique has a great impact on the robustness of radiomics features. A phantom study showed that different DECT techniques led to variability of radiomics features even though comparable parameters were used [30]. Deep learning image reconstruction algorithms cannot harmonize the variability of radiomics features due to different DECT techniques [31]. However, they showed potential for minimizing the radiomics variability related to differences in radiation dose level [33]. One opportunity for improving radiomics robustness across DECT systems is synchronizing the energy levels of VMIs to reach similar CT number values [32], although the potential influence of this approach on diagnostic performance has not been estimated yet. Further studies in patients supported the phantom finding that the reproducibility of radiomic features across different DECT systems was low, but the features found to be robust in patients were not reflected in the phantom experiment using the same parameters [29]. These studies compared dsDECT, rsDECT, and dlDECT systems, while our study further strengthened the results by including the ssDECT system. We used acquisition and reconstruction parameters that were as comparable as possible between the systems because we aimed to focus on the differences that are specific to DECT systems. Therefore, the main sources of the low inter-system reproducibility were considered to be the multi-energy acquisition or material decomposition techniques [29]. These techniques are known to have a significant impact on the quantification of iodine [53,54,55]. Further, our study applied a phantom with materials of heterogeneous texture to validate the results. Taken together, radiomics features were difficult to reproduce across DECT systems, whether the phantoms were homogeneous, heterogeneous, or imitating physiological organ parenchyma [29, 41].
Image noise can be an important source of variability of radiomics features among CT systems. Previous studies showed that PCD-CT systems can provide high radiomics feature repeatability [37, 39] and high reproducibility between different radiation dose levels [34, 37], but heterogeneous reproducibility between VMIs at different keV levels [38]. Our study supported the high repeatability and the high reproducibility of PCD-CT between different radiation dose levels. These results were in accordance with those in organic phantoms, in which the repeatability after repositioning and the reproducibility between different tube currents were high [37]. This is reasonable since PCD-CT has allowed for improved visualization and quantification even with ultra-low-dose imaging and in obese patients [56], as it can remove the electronic background noise [57]. This is theoretically beneficial for radiomics analysis since this pixel-level image analysis approach is sensitive to slight differences in images [25, 30]. In contrast, the repeatability in DECT systems can be suboptimal depending on the material decomposition technique [58, 59]. However, the dose levels in our study were not low enough for electronic background noise to challenge some of the DECT systems. Therefore, the comparably high repeatability can be partially attributed to the relatively high radiation dose used in the study, in addition to the technique itself. Moreover, the same dose level does not guarantee the same signal-to-noise ratio, as different DECT systems rely on very different technologies, and it may not be possible to obtain the same signal-to-noise ratio among different CT systems with comparable acquisition parameters. Our study found that first-order radiomics features had less variation than texture features, which is in accordance with previous phantom studies [30,31,32,33]. This is related to the fact that texture features might be changed by image quality differences among CT systems.
In addition, the pre-processing parameters also have a great impact on radiomics analysis. Although the influence of these parameters is beyond the scope of this work, we believe the selection of the bin size is especially important in our study. The bin size can significantly influence the image noise and thereby affect the radiomics features in DECT systems [30]. However, the images from PCD-CT systems are less likely to be influenced by the bin size, since PCD-CT systems are capable of removing the electronic background noise [57]. This should be investigated in future studies to improve the reproducibility between different CT systems [60, 61].
It is believed that radiomics features may benefit from the higher spatial resolution, higher contrast-to-noise ratio, and improved detection of lower-energy photons of PCD-CT systems for better pathology characterization [34,35,36]. A phantom study of pulmonary nodules indicated that the estimation of morphological features may be better in PCD-CT than in conventional CT systems [36], since the higher resolution of the PCD-CT system allows better delineation of the nodule. Another organic phantom study identified differences of more than fifty percent in 13 out of 14 selected radiomics features between PCD-CT and dsDECT systems [34]. On the other hand, a patient study comparing radiomics features of non-scarred left ventricular myocardium suggested that first-order features were nearly comparable between PCD-CT and dsDECT systems, but texture features changed strongly [35]. Our study compared PCD-CT with four DECT systems and extended these results by showing that radiomics features cannot be expected to be comparable between PCD-CT and DECT systems. We considered that the repeatability and reproducibility of radiomics features may not be substantially changed by some acquisition parameters, such as radiation dose level [25, 27, 34, 37]. In contrast, acquisition and reconstruction parameters such as material decomposition technique, spatial resolution, and reconstruction kernel have a substantial impact on radiomics feature values and result in a great decrease in the robustness of radiomics features [26, 27, 34, 35, 38, 39]. It is necessary to investigate the influence of reconstruction parameters on radiomics features within the PCD-CT system. However, it remains unknown whether phantom experiments can reflect the reproducibility of radiomics features in clinical patient scans with various protocols. The agreement between phantom and patient experiments in the context of radiomics may be limited due to the difference in texture and the use of contrast media [38].
The following limitations of our study should be addressed. First, this was a phantom study without validation on human data. The texture of phantoms may differ from that of physiological human parenchyma or pathological tissues [29, 41, 62]. Further, it is still unclear whether the low reproducibility will impair the representation of biological phenomena in a clinical study. It may not be important if the change in radiomics features between different biological phenomena is large enough. Therefore, the results may not be directly transferable to patients. An improved organic phantom model with a specific disease would be preferable. However, our study gives an important insight into the variation of radiomics features derived from images using heterogeneous CT techniques, as patient data of multiple scans on different CT systems are not always available [29]. Second, we did not include DECT systems using the split filter technique, and only one PCD-CT system was included in our study. These issues should be addressed in further studies to show whether PCD-CT systems allow more stable radiomics features than traditional energy-integrating CT systems [34,35,36] and whether radiomics analysis among PCD-CT systems is more generalizable. Third, we did not include traditional SECT systems in our study. This may reduce the translational value of our work. The DECT-like 70 keV VMI is recommended by the vendor for clinical use in abdominal scans instead of SECT-like low-energy-threshold polychromatic images [30,31,32,33]. Therefore, the 70 keV VMI from the PCD-CT was selected as the reference in our study. Accordingly, we chose the 70 keV VMI from DECT systems for comparison. In a future study, we will compare the SECT-like low-energy-threshold polychromatic images with traditional SECT images in terms of radiomics features. Fourth, the lowest radiation dose level in this study was relatively high. We did not test the stability of radiomics features at extra-low radiation dose levels to demonstrate the advantage of PCD-CT systems in providing stable quantification without disruption from electronic background noise [56]. It is expected that PCD-CT systems allow reliable quantification within a wider range of radiation dose levels. Fifth, only VMIs at 70 keV were compared in this study. Both PCD-CT and DECT systems allow post-processing with material decomposition and linear blending to generate VMIs at different keV levels, iodine maps, and virtual unenhanced images [38, 39]. The robustness of radiomics features derived from these images is also of interest because different CT techniques have an influence on the quantification of iodine [53,54,55,56]. Sixth, we applied two-dimensional ROIs instead of three-dimensional ROIs. This selection has a potential impact on reproducibility. However, the influence of two-dimensional versus three-dimensional ROIs may be relatively small [63] and does not change the conclusion of the current study. Finally, this study did not evaluate the relationship between radiomics robustness and characterization ability. This should be considered in later clinical studies, as radiomics analysis based on PCD-CT systems may change the clinical interpretation or classification of pathologies with rich textures or small volumes [34, 37].
To conclude, this study outlined the variability of radiomics features derived from VMIs generated using one PCD-CT system and four traditional energy-integrating DECT systems, despite using comparable protocols. Different radiation dose levels did not substantially change radiomics features, while the repeatability of radiomics features was heterogeneous across CT techniques. Radiomics analysis based on one CT technique should not be directly transferred to others without validation. Future investigations are encouraged to mitigate radiomics variability due to CT techniques.
Abbreviations
- CCC: Concordance correlation coefficient
- CV: Coefficient of variation
- DECT: Dual-energy computed tomography
- dlDECT: Dual-layer dual-energy CT
- dsDECT: Dual-source dual-energy CT
- ICC: Intraclass correlation coefficient
- PCD-CT: Photon-counting detector CT
- QCD: Quartile coefficient of dispersion
- ROI: Region of interest
- rsDECT: Rapid kV-switching dual-energy CT
- SECT: Single-energy CT
- ssDECT: Sequential scanning DECT
- VMI: Virtual monochromatic image
References
Lambin P, Rios-Velazquez E, Leijenaar R et al (2012) Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 48:441–446. https://doi.org/10.1016/j.ejca.2011.11.036
Gillies RJ, Kinahan PE, Hricak H (2016) Radiomics: images are more than pictures, they are data. Radiology 278:563–577. https://doi.org/10.1148/radiol.2015151169
Lambin P, Leijenaar RTH, Deist TM et al (2017) Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol 14:749–762. https://doi.org/10.1038/nrclinonc.2017.141
Huang EP, O’Connor JPB, McShane LM et al (2023) Criteria for the translation of radiomics into clinically useful tests. Nat Rev Clin Oncol 20:69–82. https://doi.org/10.1038/s41571-022-00707-0
Lin P, Lin YQ, Gao RZ et al (2023) Integrative radiomics and transcriptomics analyses reveal subtype characterization of non-small cell lung cancer. Eur Radiol 33:6414–6425. https://doi.org/10.1007/s00330-023-09503-5
Kawahara D, Murakami Y, Awane S et al (2024) Radiomics and dosiomics for predicting complete response to definitive chemoradiotherapy patients with oesophageal squamous cell cancer using the hybrid institution model. Eur Radiol 34:1200–1209. https://doi.org/10.1007/s00330-023-10020-8
Deniffel D, McAlpine K, Harder FN et al (2023) Predicting the recurrence risk of renal cell carcinoma after nephrectomy: potential role of CT-radiomics for adjuvant treatment decisions. Eur Radiol 33:5840–5850. https://doi.org/10.1007/s00330-023-09551-x
Feng C, Chen R, Dong S et al (2023) Predicting coronary plaque progression with conventional plaque parameters and radiomics features derived from coronary CT angiography. Eur Radiol 33:8513–8520. https://doi.org/10.1007/s00330-023-09809-4
Acknowledgements
The authors would like to express their gratitude to their colleagues who drank the juice, cleaned the bottles, and provided various materials, in order to establish the texture phantom used in the current study. The authors would like to thank Ms. Hongyan Huang for her English editing. This study is supported by TRILOGY, a group of young radiologists from the Department of Imaging, Tongren Hospital, Shanghai Jiao Tong University School of Medicine, who work, learn, and play together.
Funding
This study has received funding from the National Natural Science Foundation of China (82302183, 82101986), Yangfan Project of Science and Technology Commission of Shanghai Municipality (22YF1442400), Research Fund of Health Commission of Changning District, Shanghai Municipality (2023QN01), Laboratory Open Fund of Key Technology and Materials in Minimally Invasive Spine Surgery (2024JZWC-YBA07), Research Fund of Tongren Hospital, Shanghai Jiao Tong University School of Medicine (TRKYRC-XX202204, TRYJ2021JC06, TRYXJH18, TRYXJH28), and Guangci Innovative Technology Launch Plan of Ruijin Hospital, Shanghai Jiao Tong University School of Medicine (YW20220014). The funders played no role in the study design, data collection or analysis, decision to publish, or manuscript preparation.
Ethics declarations
Guarantor
The scientific guarantor of this publication is Dr. Jingyu Zhong from Department of Imaging, Tongren Hospital, Shanghai Jiao Tong University School of Medicine.
Conflict of interest
Dr. Jingyu Zhong acknowledges his position as a member of the Scientific Editorial Board of European Radiology. He has not taken part in the review or selection process of this paper. All other authors of this manuscript have no competing interests to declare.
Statistics and biometry
One of the authors has significant statistical expertise.
Informed consent
Written informed consent was not required because this was a phantom study.
Ethical approval
Institutional Review Board approval was not required because this was a phantom study.
Study subjects or cohorts overlap
The abstract of this article, entitled “Robustness of radiomics among five CT systems: a texture phantom study” (Control Number 4368), has been accepted as a poster presentation at the 110th Scientific Assembly and Annual Meeting of the Radiological Society of North America, December 1–5, 2024, Chicago, Illinois. The presenting author of this abstract is Dr. Jingyu Zhong.
Methodology
- Prospective
- Experimental
- Multicenter study
Additional information
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhu, L., Dong, H., Sun, J. et al. Robustness of radiomics among photon-counting detector CT and dual-energy CT systems: a texture phantom study. Eur Radiol (2024). https://doi.org/10.1007/s00330-024-10976-1
DOI: https://doi.org/10.1007/s00330-024-10976-1