Low-dose liver CT: image quality and diagnostic accuracy of deep learning image reconstruction algorithm

Objectives To perform a comprehensive within-subject image quality analysis of abdominal CT examinations reconstructed with deep learning image reconstruction (DLIR) and to evaluate diagnostic accuracy compared to the routinely applied adaptive statistical iterative reconstruction (ASiR-V) algorithm. Materials and methods Oncologic patients were prospectively enrolled and underwent contrast-enhanced CT. Images were reconstructed with DLIR with three intensity levels of reconstruction (high, medium, and low) and ASiR-V at strength levels from 10 to 100% with a 10% interval. Three radiologists characterized the lesions and two readers assessed diagnostic accuracy and calculated signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), figure of merit (FOM), and subjective image quality, the latter with a 5-point Likert scale. Results Fifty patients (mean age: 70 ± 10 years, 23 men) were enrolled and 130 liver lesions (105 benign lesions, 25 metastases) were identified. DLIR_H achieved the highest SNR and CNR, comparable to ASiR-V 100% (p ≥ .051). DLIR_M returned the highest subjective image quality (score: 5; IQR: 4–5; p ≤ .001) and a significant median increase (29%) in FOM (p < .001). Differences in detection were identified only for lesions ≤ 0.5 cm: 32/33 lesions were detected with DLIR_M and 26 lesions were detected with ASiR-V 50% (p = .031). Lesion accuracy was 93.8% (95% CI: 88.1, 97.3; 122 of 130 lesions) for DLIR and 87.7% (95% CI: 80.8, 92.8; 114 of 130 lesions) for ASiR-V 50%. Conclusions DLIR yields superior image quality and provides higher diagnostic accuracy compared to ASiR-V in the assessment of hypovascular liver lesions, in particular for lesions ≤ 0.5 cm. Clinical relevance statement Deep learning image reconstruction algorithm demonstrates higher diagnostic accuracy compared to iterative reconstruction in the identification of hypovascular liver lesions, especially for lesions ≤ 0.5 cm.
Key Points • Iterative reconstruction algorithm impacts image texture, with negative effects on diagnostic capabilities. • Medium-strength deep learning image reconstruction algorithm outperforms iterative reconstruction in the diagnostic accuracy of ≤ 0.5 cm hypovascular liver lesions (93.9% vs 78.8%), also granting higher objective and subjective image quality. • Deep learning image reconstruction algorithm can be safely implemented in routine abdominal CT protocols in place of iterative reconstruction. Supplementary Information The online version contains supplementary material available at 10.1007/s00330-023-10171-8.


Introduction
Computed tomography (CT) is considered the reference standard for diagnosis, staging, and monitoring response to therapy of abdominal oncologic disease, owing to its fast execution, high availability, and consistent reproducibility [1]. Oncologic patients need to undergo strict follow-up consisting of multiple CT examinations [2]; in this scenario, it is crucial to minimize radiation dose and cumulative effective dose.
Filtered back-projection (FBP) has been the conventional image reconstruction algorithm for over 30 years, owing to its good performance at standard radiation dose levels. However, increased awareness of radiation exposure, along with rapid progress in computational power, paved the way for iterative reconstruction (IR) algorithms to replace FBP. Although this technology is effective in reducing image noise and, consequently, in enabling low-dose CT examinations, many radiologists have complained about the "unnatural" and "unfamiliar" appearance of the images in clinical practice [3]. The steady rise in computing power enabled the implementation of deep learning image reconstruction (DLIR) algorithms, based on neural network models [4] and capable of learning from input data; DLIR exploits the capabilities of artificial intelligence to overcome IR limitations and further improve image quality. Preliminary studies [5–7] have shown the DLIR algorithm to be effective in improving image quality without producing unnatural images at lower radiation doses in cardiovascular [8] and chest imaging [9] and in detecting abdominal lesions [10]. Recent studies evaluating the differences between DLIR and adaptive statistical iterative reconstruction (ASiR-V) showed that DLIR datasets acquired at low dose displayed improved image noise, signal-to-noise ratio (SNR), and contrast-to-noise ratio (CNR) compared to iterative images at standard-dose CT, and were favored by most readers [6, 11–14]. However, to the best of our knowledge, a broad comparison of DLIR and ASiR-V at their respective full sets of strength levels has not yet been reported.
Thus, the aim of our study was to perform a comprehensive within-subject image quality analysis of abdominal CT examinations reconstructed with DLIR and to evaluate diagnostic accuracy compared to the routinely applied ASiR-V algorithm.

Study population
This prospective randomized study was conducted at Sant'Andrea University Hospital, Rome, Italy, and was approved by the Institutional Review Board. Written informed consent was obtained from all patients.

CT image acquisition
All patients underwent contrast-enhanced CT (CECT) on a 128-slice CT scanner (GE Revolution EVO CT Scanner, GE Medical Systems) in supine position at full inspiration, in cranio-caudal direction, before and after contrast medium (CM) injection.
The volume of CM was calculated based on lean body weight (LBW), using the James formula [15]. Each patient was injected with 0.7 g of iodine per kilogram of LBW, which was then divided by the concentration of CM, as follows:

$$V_{\mathrm{CM}}\ (\mathrm{mL}) = \frac{0.7\ \mathrm{gI/kg} \times \mathrm{LBW}\ (\mathrm{kg})}{0.4\ \mathrm{gI/mL}}$$

A non-ionic contrast medium (400 mgI/mL iomeprol, Iomeron 400; Bracco Imaging) was intravenously injected in all patients at a flow rate of 3/3.5 mL/s through an 18-gauge antecubital access, by means of a triple-syringe power injector (MEDRAD® Centargo CT Injection System; Bayer AG), followed by 50 mL of saline solution at the corresponding flow rate.
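The dosing rule above can be illustrated with a short sketch. It assumes the commonly cited form of the James formula (weight in kg, height in cm); the function names and the example patient are hypothetical and not taken from the study.

```python
def lean_body_weight_james(weight_kg: float, height_cm: float, male: bool) -> float:
    """Lean body weight (kg), commonly cited James formula (assumed form)."""
    ratio_sq = (weight_kg / height_cm) ** 2
    if male:
        return 1.10 * weight_kg - 128.0 * ratio_sq
    return 1.07 * weight_kg - 148.0 * ratio_sq


def contrast_volume_ml(lbw_kg: float,
                       iodine_dose_g_per_kg: float = 0.7,
                       concentration_g_per_ml: float = 0.4) -> float:
    """CM volume (mL) = iodine dose (gI) divided by CM concentration (gI/mL)."""
    return iodine_dose_g_per_kg * lbw_kg / concentration_g_per_ml


# Hypothetical patient: man, 80 kg, 180 cm
lbw = lean_body_weight_james(80.0, 180.0, male=True)
volume = contrast_volume_ml(lbw)
print(f"LBW = {lbw:.1f} kg, CM volume = {volume:.0f} mL")
```

For this hypothetical patient the sketch gives an LBW of about 62.7 kg and a CM volume of about 110 mL of the 400 mgI/mL agent.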
Scan delay was set with a dedicated bolus-tracking application (SmartPrep, GE Healthcare), by placing a region of interest (ROI) with a 120 HU threshold within the abdominal aorta at the level of the celiac axis. A 15 s delay was used for the arterial phase, a 60 s delay for the portal venous phase, and a 180 s delay for the delayed phase.
Patients were scanned with a low-dose protocol with the following parameters: tube voltage of 80 kVp for the arterial phase and 100 kVp for the portal venous and delayed phases, automatic current modulation range 100-240 mA

CT image reconstruction
Raw data were reconstructed at a scan field of view (FOV) of 50 cm and a variable display FOV (DFOV) of 34/36 cm, using a standard abdominal kernel, a 512 × 512 matrix, and a slice spacing and thickness of 1.25 mm, with two different algorithms: iterative reconstruction (ASiR-V; GE Healthcare) at strength levels from 10 to 100% with a 10% interval, and DLIR (TrueFidelity™, GE Healthcare) with three intensity levels of reconstruction (high, medium, and low); therefore, a total of thirteen datasets were generated for each examination.

Objective image quality analysis
Objective image quality was evaluated in the portal venous phase by a reader with 16 years of experience in abdominal imaging on a dedicated workstation (ADW 4.7, GE Healthcare), for each patient and in all reconstructed datasets. On axial slices, liver attenuation values (HU) were calculated by placing three circular ROIs of identical size (1 cm²) in hepatic segments IVb, V, and VI, avoiding intrahepatic vessels, and then averaged. The standard deviation (SD) of the ROI drawn in the left latissimus dorsi muscle was defined as image noise. All ROIs were placed three times, and measurements were averaged to minimize measurement inaccuracies. Consistency of ROI placement across the datasets was ensured by using the copy-paste tool of the workstation.
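The measurement procedure above can be sketched as follows. The CNR expression follows the definition reported in the text; the SNR expression (mean liver attenuation divided by image noise) is an assumption based on the usual definition, and all ROI values are hypothetical.

```python
from statistics import mean


def objective_metrics(liver_placements, muscle_hu, muscle_sd):
    """Average repeated ROI placements and derive noise, SNR, and CNR.

    liver_placements: one list per placement, each holding the mean HU of
                      the three liver ROIs (segments IVb, V, VI)
    muscle_hu:        mean HU of the latissimus dorsi ROI, per placement
    muscle_sd:        SD of the latissimus dorsi ROI, per placement
    """
    liver = mean(mean(rois) for rois in liver_placements)
    muscle = mean(muscle_hu)
    noise = mean(muscle_sd)           # image noise = SD of the muscle ROI
    snr = liver / noise               # assumed definition, not from the source
    cnr = (liver - muscle) / noise    # definition reported in the text
    return {"liver_hu": liver, "noise": noise, "snr": snr, "cnr": cnr}


# Hypothetical measurements, three placements each
metrics = objective_metrics(
    liver_placements=[[110, 112, 108], [111, 109, 110], [109, 111, 110]],
    muscle_hu=[60, 61, 59],
    muscle_sd=[10, 11, 9],
)
print(metrics)
```

With these hypothetical values the liver attenuation is 110 HU, the noise is 10 HU, the SNR is 11, and the CNR is 5.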
Signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) were then calculated. CNR was calculated as follows:

$$\mathrm{CNR} = \frac{\mathrm{HU}_{\mathrm{liver}} - \mathrm{HU}_{\mathrm{muscle}}}{\mathrm{noise}}$$

Subjective image quality analysis
Subjective image quality analysis was performed by two readers with 12 and 10 years of experience in abdominal imaging, blinded to the reconstruction protocol, on ASiR-V 50%, ASiR-V 100%, DLIR_M, and DLIR_H datasets, in consensus reading. The analysis was limited to these datasets based on the results of the objective image quality analysis (ASiR-V 100% and DLIR_H), routine clinical practice (ASiR-V 50%), and vendor recommendations (DLIR_M). Images were evaluated with a standard window setting (width, 350 HU; level, 40 HU), which readers could freely adjust to suit their preferences. Ambient light was kept constant at approximately 35–40 lx.
To minimize recall bias, images were randomly assessed and no more than two different reconstructed datasets from each patient were analyzed during each interpretation, maintaining a time interval of 7 days between sessions.

Figure of merit
The dose-length product (DLP) of the arterial and delayed phases was recorded for each patient. The effective radiation dose (ED) was calculated for each patient by multiplying the DLP by a conversion factor k of 0.015 mSv·mGy⁻¹·cm⁻¹ [16,17].
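As a worked illustration of this conversion, the sketch below applies the conversion factor to the mean DLP reported in the radiation dose results (786.3 mGy·cm); the function name is hypothetical.

```python
K_MSV_PER_MGY_CM = 0.015  # conversion factor k stated in the text


def effective_dose_msv(dlp_mgy_cm: float, k: float = K_MSV_PER_MGY_CM) -> float:
    """Effective dose (mSv) = DLP (mGy·cm) x k (mSv·mGy^-1·cm^-1)."""
    return dlp_mgy_cm * k


ed = effective_dose_msv(786.3)
print(f"ED = {ed:.1f} mSv")  # prints "ED = 11.8 mSv"
```

The result agrees with the mean ED of 11.8 mSv reported for the study population.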
Since acquisitions in the arterial phase and in the delayed phase were performed at different tube voltages (80 kVp vs 100 kVp, respectively), the SNR and figure of merit (FOM) of the latissimus dorsi muscle were calculated in order to evaluate differences in objective image quality independently of the ED [18].
Muscle was preferred over liver parenchyma due to its stable density measurement after contrast medium injection [19].
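The formulas for the muscle SNR and FOM are not reproduced in this text. The sketch below therefore assumes a widely used dose-normalized definition, FOM = SNR² / ED, with the muscle SNR taken as mean muscle attenuation divided by its SD; both definitions and all numbers are assumptions for illustration, not the study's stated method.

```python
def muscle_snr(muscle_hu: float, muscle_sd: float) -> float:
    """Muscle SNR, assumed here as mean attenuation divided by its SD."""
    return muscle_hu / muscle_sd


def figure_of_merit(snr: float, ed_msv: float) -> float:
    """Dose-normalized image quality, assumed here as SNR^2 / ED."""
    return snr ** 2 / ed_msv


def percent_change(new: float, old: float) -> float:
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


# Hypothetical values: same acquisition (same ED), two reconstructions
ed = 12.0                                                 # mSv
fom_a = figure_of_merit(muscle_snr(60.0, 10.0), ed)       # noisier dataset
fom_b = figure_of_merit(muscle_snr(60.0, 8.0), ed)        # less noisy dataset
print(fom_a, fom_b, percent_change(fom_b, fom_a))
```

Because ED is identical for two reconstructions of the same acquisition, any difference in FOM under this definition reflects the reconstruction alone.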

Reference standard and lesion detection
The reference standard was established by three radiologists with 38, 27, and 26 years of experience in abdominal imaging, in consensus, using all clinical data and cross-sectional imaging examinations available at our institution; liver lesions were classified in a dichotomous fashion as benign or malignant. Benign lesions scored ≥ 3 on the malignancy scale were deemed false-positive; malignant lesions that either scored ≤ 2 on the malignancy scale or were not identified were considered false-negative [20].
Two board-certified radiologists, with 12 and 10 years of experience in abdominal radiology, respectively, performed lesion detection on the portal venous phase, blinded to patients' information except cancer diagnosis. DLIR and ASiR-V datasets of each patient were assessed in a randomized order, in five sessions; to minimize recall bias, DLIR and ASiR-V datasets of the same patient were always assessed in different sessions. Hypoattenuating liver lesions measuring ≥ 2 mm were marked and characterized with a 5-point Likert scale (1, definitely benign; 2, likely benign; 3, malignancy not excluded; 4, likely malignant; 5, definitely malignant); diagnostic confidence was also assessed with a 5-point Likert scale (from 1: very low confidence to 5: very high confidence) [21].
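The scoring rules above can be sketched as a per-lesion classification. The lesion list is hypothetical, and counting an undetected benign lesion as a true negative is an assumption, since the text only specifies how undetected malignant lesions are handled.

```python
from collections import Counter
from typing import Optional


def classify(reference_malignant: bool, score: Optional[int]) -> str:
    """Map a reader's malignancy score (1-5, None if not detected) to
    TP/FP/TN/FN against the consensus reference standard."""
    called_malignant = score is not None and score >= 3
    if reference_malignant:
        return "TP" if called_malignant else "FN"  # missed lesions are FN
    # Assumption: an undetected benign lesion is counted as TN
    return "FP" if called_malignant else "TN"


def per_lesion_accuracy(lesions):
    """Return (accuracy, counts) for (reference_malignant, score) pairs."""
    counts = Counter(classify(ref, score) for ref, score in lesions)
    return (counts["TP"] + counts["TN"]) / len(lesions), counts


# Hypothetical readings: (reference malignant?, reader score)
lesions = [(False, 1), (False, 2), (False, 3), (True, 5),
           (True, 2), (True, None), (True, 4), (False, 1)]
acc, counts = per_lesion_accuracy(lesions)
print(acc, dict(counts))

# Per-lesion accuracies reported in the abstract
print(f"{122 / 130:.1%}", f"{114 / 130:.1%}")  # prints "93.8% 87.7%"
```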

Radiation dose
The CTDIvol and DLP were recorded for each examination; ED was then calculated as previously mentioned [16]. Continuous variables were expressed as mean ± SD or as median and interquartile range (IQR), according to their distribution; categorical variables were expressed as median and IQR.

Statistical analysis
Liver attenuation values, image noise, objective image quality, and lesion confidence score were compared using the repeated-measures ANOVA test or Friedman test, as appropriate. The Wilcoxon signed-rank test was conducted to assess the differences in FOM between DLIR_M and ASiR-V 50% reconstructions. Differences in subjective image quality among the different reconstruction datasets were assessed with the Kruskal-Wallis H test. Diagnostic accuracy differences between DLIR_M and ASiR-V 50% were assessed with the McNemar test. A p-value < 0.05 was considered to indicate a statistically significant result; Bonferroni correction was applied to adjust post hoc pairwise comparisons.
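As an illustration of the paired comparison, the sketch below implements the exact (binomial) form of the McNemar test using only the standard library. The text does not state whether the exact or the chi-square version was used, and the discordant-pair counts are not reported; the split used here (six lesions detected only with DLIR_M, none only with ASiR-V 50%) is a hypothetical one that is merely compatible with the reported 32 vs 26 detections among lesions ≤ 0.5 cm.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b, c: discordant pair counts (detected by one method only).
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant pairs: 6 seen only with DLIR_M, 0 only with ASiR-V 50%
p = mcnemar_exact(6, 0)
print(p)  # prints 0.03125
```

Only discordant pairs enter the test: lesions detected with both reconstructions, or with neither, carry no information about a difference between them.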

Patient population
Patient characteristics are listed in

Objective image quality
Full objective image quality scores are displayed in Table 2.

Radiation dose
The mean CTDIvol and DLP were 24.1 ± 8.6 mGy and 786.3 ± 291.7 mGy·cm, respectively, for an estimated mean ED of 11.8 ± 4.4 mSv.

Discussion
Our investigation demonstrates that DLIR at medium strength improves liver lesion detection rate compared to ASiR-V 50% (p = 0.016). While the two algorithms detected a comparable number of lesions larger than 0.5 cm, DLIR outperformed ASiR-V in the detection of liver lesions smaller than 0.5 cm (p = 0.031). Additionally, DLIR obtained a higher overall diagnostic accuracy (p = 0.039) and a higher lesion confidence score (p < 0.001) compared to ASiR-V 50%. Along with better diagnostic performance, our investigation documented higher objective and subjective image quality of DLIR compared to ASiR-V: while DLIR at high strength achieved the highest SNR and CNR, DLIR at medium strength obtained the highest subjective quality score.
Full exploitation of iterative reconstruction algorithms is hampered by their detrimental effect on image texture, especially at high strength levels, resulting in the generation of oversmoothed images [22,23], which ultimately might have a negative effect on diagnostic capabilities. In contrast, the DLIR algorithm does not have a detrimental impact on image texture [7], returning higher objective image quality at the same radiation dose levels and comparable image quality when used to reconstruct low-dose CT acquisitions. As a result, DLIR is currently under investigation in different clinical settings, outperforming IR in terms of image noise and image quality in abdominal [14,24,25], cardiac [8], and chest imaging [6,9,26,27]; focusing on abdominal imaging, its higher performance compared to IR in terms of image quality and lesion conspicuity has also been demonstrated in the setting of dual-energy CT [28]. Therefore, its implementation in clinical practice is constantly growing, heralding a gradual replacement of iterative reconstruction algorithms [13].
Our investigation demonstrated that DLIR is effective in achieving a significantly higher FOM compared to ASiR-V, despite a 29% lower radiation dose. The possibility of substantially lowering radiation exposure without sacrificing the diagnostic yield of a CT examination is strictly related to the specific clinical task [29,30], and abdominal studies are typically quite sensitive to radiation dose due to the intrinsically low contrast differences between abdominal organs. In particular, high image quality is mandatory in liver imaging in order to identify and adequately characterize liver lesions, especially small ones, whose evaluation might be compromised by even a modest radiation dose reduction not counterbalanced by iterative reconstruction algorithms [31]. In this regard, DLIR, already proven effective in maintaining noise texture and adequate low-contrast liver lesion detectability at low-dose settings [32], might enable further dose optimization in abdominal CT with no detrimental impact on diagnostic performance [33].
Jensen et al. demonstrated that DLIR applied to reduced-dose CT preserved detection of liver lesions larger than 0.5 cm when compared to standard-dose CT reconstructed with FBP, while the latter outperformed DLIR in detecting smaller lesions [20]. Our investigation transfers these results to iterative reconstruction, demonstrating similar performance between DLIR and ASiR-V in the detection of lesions larger than 0.5 cm. On the other hand, our findings demonstrated that DLIR outperformed ASiR-V in the detection of lesions smaller than 0.5 cm. These differences might be explained by differences in study design, since our investigation compared the two reconstruction algorithms in the same CT acquisition.
These findings indicate that DLIR can be safely implemented in routinely used clinical protocols in place of iterative reconstruction algorithms. In contrast, particular attention should be paid to the design of dedicated low-dose DLIR CT protocols, since the benefits of a low dose burden might not be matched by adequate diagnostic performance of the new algorithm in the detection of small liver lesions, making it unsuitable for clinical practice. Hence, large prospective trials should be performed in order to establish adequate and robust low-dose scan protocols clinically suitable for DLIR reconstruction.
Our investigation should be evaluated in light of some limitations. First, although the study population consisted of oncologic patients, the characterization of liver lesions was based on clinical data and cross-sectional imaging examinations; nevertheless, the creation of a consensus-based reference standard is robust and consistent with earlier studies [20,31]. Second, the sample size is relatively small, and further studies with a larger number of participants are highly advisable to strengthen and expand upon our results. Third, this investigation analyzed the performance of a single vendor's algorithm, specifically in liver parenchyma; therefore, our results might not be directly applicable to other vendors or to different body regions. Investigations comparing the diagnostic performance of different DLIR algorithms would indeed be of great interest.
In conclusion, DLIR yields superior image quality and provides higher diagnostic accuracy compared to ASiR-V in the assessment of hypovascular liver lesions, in particular for lesions smaller than 0.5 cm. This higher diagnostic performance allows the design of low-dose acquisition protocols able to maintain current diagnostic accuracy with a lower radiation burden. Nevertheless, further investigations are needed to establish appropriate radiation dose levels.
Statistical analyses were performed with dedicated software (IBM SPSS Statistics for Macintosh, Version 25.0; IBM Corp., released 2017). Normality of data distribution was assessed with the Kolmogorov-Smirnov test.

Fig. 2
Axial CT images reconstructed with ASiR-V from 10 to 100%, with 10% intervals (A to L), and with DLIR at low (M), medium (N), and high (O) strength levels. ASiR-V, hybrid iterative reconstruction algorithm; DLIR, deep learning image reconstruction algorithm

Table 1
Patient characteristics *

Table 2
Objective image quality scores of ASiR-V and DLIR reconstruction

Table 3
Subjective image quality scores of ASiR-V and DLIR reconstructions, with related pairwise comparisons
* Non-statistically significant p-values

Table 4
Diagnostic accuracy
Performance data are per lesion. Numbers in parentheses are 95% CIs; numbers in brackets are numbers of lesions.