Introduction

Coronary computed tomography angiography (CCTA) plays a pivotal role as a non-invasive tool in the diagnosis of obstructive coronary artery disease (CAD), owing to its widespread availability, high diagnostic accuracy, excellent negative predictive value, and continuous technical advancements [1,2,3]; additionally, thanks to a thorough assessment of coronary atherosclerosis burden, CCTA is able to predict future major adverse cardiac events [4]. Nevertheless, despite a trend toward radiation dose reduction, radiation exposure still represents a major concern and there is ample room for improvement in patient safety [5,6,7].

Filtered back-projection (FBP) represented the standard reconstruction algorithm for over three decades, ensuring robust and time-effective CT image reconstruction. However, this method does not maintain adequate image quality when dose reduction strategies are applied. These limitations were overcome in 2009 by the implementation of iterative reconstruction (IR) algorithms. IR techniques, either hybrid (combined with FBP) or model-based (stand-alone), allow low-dose CT examinations with acceptable image noise. Nevertheless, along with their denoising capabilities, IR algorithms are burdened by modification of image texture, which ultimately leads to over-smoothed images and prevents their full exploitation [8,9,10].

Deep learning image reconstruction (DLIR) algorithms, based on deep convolutional neural networks, have recently been released by vendors, holding promise for shorter reconstruction times and significantly reduced noise while preserving image texture [11,12,13]. DLIR applied to CCTA is currently under active investigation in different tasks, such as image optimization, classification, segmentation, prognosis and outcome prediction [14,15,16]; in particular, DLIR is achieving promising results compared to IR at specific strength levels [17,18,19]. Nevertheless, to the best of our knowledge, no previous investigations have assessed DLIR image quality in a broad comparison with IR and FBP.

Thus, the purpose of our study was to perform a comprehensive intraindividual objective and subjective image quality evaluation of CCTA reconstructed with DLIR and to assess its correlation with the routinely applied hybrid iterative reconstruction algorithm.

Materials and methods

Patient population

This prospective, single-center study was approved by the local institutional review board, and written informed consent was obtained from all patients. Consecutive patients who underwent clinically indicated CCTA for known or suspected CAD were enrolled from April to December 2021. Exclusion criteria were: (a) severe motion artifacts on CCTA, (b) contraindication to contrast medium injection, and (c) heart rate > 90 bpm.

An intravenous β-blocker (Metoprolol, 5 mg) was administered to patients with heart rate > 75 bpm, after exclusion of contraindications. Nitrates (Trinitrine, 0.8 mg) were administered sublingually to all patients in order to induce vasodilatation for a better evaluation of the coronary arteries.

Image acquisition

All CCTAs were performed in a cranio-caudal direction during end-inspiration, with retrospective ECG-gating, on a 128-slice CT scanner (GE Revolution EVO, GE Medical Systems, Milwaukee, WI). The following parameters were applied: detector collimation width of 0.625 mm, gantry rotation time of 0.6 s, spiral pitch automatically adjusted on heart rate and ranging from 0.16 to 0.30, and matrix of 512 × 512 pixels. Tube voltage and tube current were fixed according to the patient’s body mass index (BMI): 80 kV and 150 mA for patients with BMI < 30, 100 kV and 200 mA for patients with BMI > 30.
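The BMI-based selection of tube settings described above can be written as a small lookup. This is only an illustrative sketch: the names `TubeSettings` and `select_tube_settings` are hypothetical, and because the text gives no rule for a BMI of exactly 30, the sketch raises an error for that value instead of assuming one.

```python
from typing import NamedTuple


class TubeSettings(NamedTuple):
    """Tube voltage (kV) and tube current (mA) for one acquisition."""
    voltage_kv: int
    current_ma: int


def select_tube_settings(bmi: float) -> TubeSettings:
    """Return the BMI-based tube settings stated in the text."""
    if bmi < 30:
        return TubeSettings(voltage_kv=80, current_ma=150)
    if bmi > 30:
        return TubeSettings(voltage_kv=100, current_ma=200)
    # The text gives no rule for a BMI of exactly 30, so none is invented here.
    raise ValueError("protocol for BMI == 30 is not specified in the text")


print(select_tube_settings(26.6))
```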

A fixed amount (50 mL) of non-ionic high-iodine concentration contrast medium (400 mgI/mL iomeprol, Iomeron 400; Bracco Imaging, Italy) was intravenously injected at a fixed flow rate of 5 mL/s through an 18-gauge antecubital access, using an automated triple-syringe power injector (MEDRAD® Centargo CT Injection System; Bayer AG, Berlin, Germany), followed by a saline chaser bolus of 40 mL at the same flow rate. Scan delay was determined using a bolus-tracking software program (SmartPrep, GE Healthcare): CCTA acquisition started after the automatic minimum diagnostic delay as soon as the trigger attenuation threshold (100 HU) was reached within a region of interest (ROI) placed in the ascending aorta at the level of the pulmonary arteries.

Image reconstruction

Every examination was reconstructed at a thickness of 0.625 mm by means of three different algorithms: (1) FBP, (2) hybrid IR (ASiR-V, GE Healthcare) at strength levels from 10% to 100% in 10% increments, and (3) DLIR (TrueFidelity™, GE Healthcare) at three strength levels: low, medium, and high (DLIR_L, DLIR_M, and DLIR_H, respectively). Thus, fourteen different image datasets were generated for each CT examination. TrueFidelity™ applies a deep neural network, previously trained with high-quality FBP datasets, that is able to discern image noise from signal and to reconstruct CT images by selectively suppressing noise [13].
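The count of fourteen datasets per examination follows from one FBP reconstruction, ten ASiR-V strength levels, and three DLIR strength levels. The short sketch below simply enumerates them; the function name `reconstruction_labels` and the label strings are illustrative choices, not identifiers used by the scanner software.

```python
def reconstruction_labels() -> list[str]:
    """List the reconstructions generated for each examination."""
    # ASiR-V strength levels from 10% to 100% in 10% increments.
    asir_v = [f"ASiR-V {pct}%" for pct in range(10, 101, 10)]
    # DLIR at low, medium, and high strength.
    dlir = ["DLIR_L", "DLIR_M", "DLIR_H"]
    return ["FBP", *asir_v, *dlir]


labels = reconstruction_labels()
print(len(labels))  # 1 FBP + 10 ASiR-V + 3 DLIR = 14
```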

Objective image quality analysis

For each patient, quantitative measurements were performed on all fourteen reconstructed image datasets by a radiologist with 5 years of experience in cardiovascular imaging, on a dedicated workstation (Advantage Workstation 4.7, GE Healthcare).

In axial sections, ROIs were drawn in the left pectoral muscle, ascending aorta (at the origin of the left main coronary artery), left main artery, left anterior descending artery, circumflex artery, and right coronary artery, carefully avoiding inclusion of the vessel wall and atherosclerotic plaques. Image noise was defined as the standard deviation (SD) of the ROI drawn in the pectoral muscle.

All ROIs were placed three times, and measurements were averaged to minimize measurement inaccuracies. Consistency of ROI placement throughout the datasets was ensured by applying the copy-and-paste function of the workstation.

Signal-to-noise ratio (SNR) was calculated as follows:

$${\text{SNR}} = \frac{{{\text{HU}}_{{{\text{artery}}}} }}{{{\text{SD}}_{{{\text{muscle}}}} }}$$

Contrast-to-noise ratio (CNR) was calculated as follows:

$${\text{CNR}} = \frac{{{\text{HU}}_{{{\text{artery}}}} - {\text{HU}}_{{{\text{muscle}}}} }}{{{\text{SD}}_{{{\text{muscle}}}} }}$$
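As a minimal worked sketch of the two formulas above, the following code averages three repeated ROI readings and then computes SNR and CNR. All numeric readings are hypothetical values chosen only to make the arithmetic easy to follow, and the function names `snr` and `cnr` are illustrative.

```python
from statistics import mean


def snr(hu_artery: float, sd_muscle: float) -> float:
    """Signal-to-noise ratio: arterial attenuation over muscle noise (SD)."""
    if sd_muscle <= 0:
        raise ValueError("image noise (SD) must be positive")
    return hu_artery / sd_muscle


def cnr(hu_artery: float, hu_muscle: float, sd_muscle: float) -> float:
    """Contrast-to-noise ratio: artery-muscle contrast over muscle noise (SD)."""
    if sd_muscle <= 0:
        raise ValueError("image noise (SD) must be positive")
    return (hu_artery - hu_muscle) / sd_muscle


# Hypothetical readings from three repeated ROI placements (not study data).
artery_hu = mean([498.0, 502.0, 500.0])  # mean arterial attenuation, HU
muscle_hu = mean([59.0, 61.0, 60.0])     # mean pectoral muscle attenuation, HU
muscle_sd = mean([9.5, 10.5, 10.0])      # image noise: SD of the muscle ROI, HU

print(snr(artery_hu, muscle_sd))             # 500 / 10 = 50.0
print(cnr(artery_hu, muscle_hu, muscle_sd))  # (500 - 60) / 10 = 44.0
```

Because both ratios share the same denominator, any reduction in muscle SD raises SNR and CNR proportionally as long as attenuation values are unchanged.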

Subjective image quality analysis

Two radiologists with 8 and 12 years of experience in CCTA, blinded to the reconstruction protocol, performed subjective image analysis on ASiR-V 50%, ASiR-V 100%, DLIR_M, and DLIR_H datasets, in consensus reading. Datasets were selected for subjective image quality analysis based on routine clinical practice (ASiR-V 50%), results of objective image quality analysis (ASiR-V 100% and DLIR_H), and vendor recommendations (DLIR_M). Images were displayed with a standard window setting (width, 1200 HU; level, 240 HU), which readers could freely adjust to suit their preferences. Ambient lighting was kept constant at approximately 35–40 lx.

To minimize recall bias, images were evaluated in a randomized order and no more than two different reconstructed datasets from each patient were analyzed during each interpretation, maintaining a time interval of 7 days between sessions.

Image quality focused on plaques was assessed using an ordinal 4-point Likert scale from 1 to 4 (1, poor; 2, adequate; 3, good; and 4, excellent contour delineation). Coronary segments were considered diagnostic when image quality was deemed adequate, good, or excellent (scores 2–4) [20]. In case of multiple plaques per segment, the plaque with the highest degree of stenosis was used for further analysis.

Correlation between DLIR and ASiR-V

The analysis was performed on ASiR-V 50% (dataset routinely used in clinical practice) and DLIR_M (dataset achieving the highest overall subjective image quality) by the two radiologists who had performed the subjective image quality analysis, in consensus reading, after a 2-week interval. The coronary artery tree was analyzed based on the segmentation described by the SCCT guidelines for the interpretation and reporting of CCTA [21]. Axial images and curved multiplanar reformats were used for image evaluation, and the window level setting was freely modifiable. Each coronary segment was visually scored for CAD as follows: absent (0% luminal narrowing), non-obstructive CAD (1–49% luminal narrowing), and obstructive CAD (≥ 50% luminal narrowing) [22]; in case of multiple lesions per segment, the coronary segment was classified based on the lesion with the highest degree of stenosis.

Statistical analysis

Statistical analyses were performed using commercially available software (IBM Corp. Released 2017. IBM SPSS Statistics for Macintosh, Version 25.0. Armonk, NY: IBM Corp). The Kolmogorov–Smirnov test was used to assess the normality of data distribution.

Continuous variables were expressed as mean ± SD if normally distributed and as median and interquartile range (IQR) if non-normally distributed; categorical variables were expressed as median and IQR.
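The two summary conventions above can be sketched with the Python standard library. In this sketch the normality decision is passed in as a flag rather than computed, the function name `summarize` is hypothetical, and the quartile method (`"inclusive"`) is an assumption: statistical packages differ in how they interpolate quartiles, so IQR bounds may not match SPSS output exactly.

```python
from statistics import mean, median, quantiles, stdev


def summarize(values, normally_distributed: bool):
    """Return (mean, SD) for normal data, else (median, Q1, Q3)."""
    if normally_distributed:
        return (mean(values), stdev(values))
    # Quartiles via linear interpolation; the method is an assumption,
    # since the text does not state which quartile definition was used.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (median(values), q1, q3)


# Hypothetical values, for illustration only.
sample = [1, 2, 3, 4, 5]
print(summarize(sample, normally_distributed=True))
print(summarize(sample, normally_distributed=False))
```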

Vascular attenuation values, image noise, and image quality of the different reconstruction datasets were compared using repeated-measures ANOVA test or Friedman test, as appropriate.

Correlation between DLIR_M and ASiR-V 50% datasets was measured by means of Pearson correlation coefficients. A P value < 0.05 was considered to indicate a statistically significant result; post hoc pairwise comparisons were adjusted for multiple comparisons by the Bonferroni correction.
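The two computations named above can be illustrated in a few lines. This is a sketch under stated assumptions, not the SPSS implementation: the Bonferroni adjustment is written in the common form that multiplies each raw P value by the number of comparisons and caps the result at 1, and the coding of stenosis categories as 0, 1, and 2 for the Pearson coefficient is an assumption, since the text does not state how the categories were encoded. The function names and the example grades are hypothetical.

```python
from math import sqrt


def bonferroni(p_values):
    """Bonferroni-adjusted P values: raw P times the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-segment grades (0 absent, 1 non-obstructive, 2 obstructive).
asir_v_50 = [0, 0, 1, 2, 1, 0]
dlir_m = [0, 0, 1, 2, 2, 0]
print(round(pearson_r(asir_v_50, dlir_m), 3))

# Hypothetical raw P values from three pairwise comparisons.
print(bonferroni([0.01, 0.2, 0.5]))
```

Under this convention, large raw P values are capped, so adjusted values of exactly 1 can appear in pairwise comparison tables.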

Results

Patient population

Patient characteristics are reported in Table 1, and the corresponding flow diagram is depicted in Fig. 1.

Table 1 Patient characteristics
Fig. 1 Flow diagram of patient population. [CM, contrast medium]

Of 55 patients initially identified, 1 was excluded due to severe motion artifacts, 1 due to contraindication to CM injection, and 2 due to heart rate > 90 bpm. Hence, the final population eligible for image quality analysis consisted of 51 patients (22 females), with a mean age of 64 ± 15 years (range 18–84 years) and a mean BMI of 26.6 ± 4.9 kg/m² (range 16.3–37.2). The mean heart rate measured during CCTA acquisitions was 63 ± 9 bpm.

Objective image quality

Comprehensive objective image quality scores are summarized in Table 2.

Table 2 Objective image quality scores of FBP, ASiR-V, and DLIR reconstructions

A total of 714 datasets were analyzed. The DLIR algorithm did not affect vascular attenuation values compared with FBP and every ASiR-V reconstruction (P ≥ 0.374). Graphical representations of image noise and objective image quality are shown in Figs. 2 and 3, respectively.

Fig. 2 Box-and-whisker plots for image noise. DLIR_H achieved the lowest image noise, followed by ASiR-V 100%, DLIR_M, DLIR_L, and all the remaining ASiR-V datasets. Boxes represent the middle 50% of the data and solid lines represent the median; whiskers represent minimum and maximum values

Fig. 3 Box-and-whisker plots for SNR and CNR. DLIR_H reached the highest SNR and CNR, comparable with ASiR-V 100% and significantly higher than DLIR_M, DLIR_L, and all the remaining ASiR-V datasets. DLIR_M showed comparable values with ASiR-V 80% and ASiR-V 90%, and higher values than DLIR_L. The DLIR_L dataset showed comparable values with ASiR-V 60% and ASiR-V 70%. Boxes represent the middle 50% of the data and solid lines represent the median; whiskers represent minimum and maximum values. [CNR: contrast-to-noise ratio; SNR: signal-to-noise ratio]

The lowest noise was obtained by DLIR_H (median: 9.7; IQR: 4.5–15.5), comparable with ASiR-V 100% (median: 10.4; IQR: 5.0–17.9; P = 1) and significantly lower than DLIR_M (median: 13.9; IQR: 5.6–21.6; P = 0.011), DLIR_L (median: 15.8; IQR: 6.5–24.7; P < 0.001), and all the remaining ASiR-V datasets (P ≤ 0.021). The DLIR_M dataset showed comparable image noise with ASiR-V 80% and ASiR-V 90% (P = 1), and lower noise than DLIR_L (P = 0.023). The DLIR_L dataset exhibited comparable image noise with ASiR-V 60% and ASiR-V 70% (P = 1). The highest image noise was measured with the FBP dataset (median: 41.7; IQR: 17.9–72.6), significantly different from every DLIR reconstruction (P < 0.001).

The highest SNR was achieved by DLIR_H (median: 65.1; IQR: 37.7–159.4), comparable with ASiR-V 100% (median: 55.7; IQR: 32.6–144.0; P = 0.139) and significantly higher than DLIR_M (median: 49.5; IQR: 28.3–121.4; P < 0.001), DLIR_L (median: 41.5; IQR: 23.5–103.1; P < 0.001), and all the remaining ASiR-V datasets (P < 0.001). The DLIR_M dataset showed comparable SNR with ASiR-V 80% and ASiR-V 90% (P = 1), and higher SNR than DLIR_L (P < 0.001). The DLIR_L dataset showed comparable SNR with ASiR-V 60% (P = 0.157) and ASiR-V 70% (P = 1). The lowest SNR was measured with FBP reconstruction (median: 13.7; IQR: 8.4–37.6), significantly different from every DLIR reconstruction (P < 0.001).

The highest CNR was achieved by DLIR_H (median: 59.1; IQR: 34.0–153.4), comparable with ASiR-V 100% (median: 51.0; IQR: 30.0–134.6; P = 0.075) and significantly higher than DLIR_M (median: 43.7; IQR: 25.7–114.0; P < 0.001), DLIR_L (median: 36.7; IQR: 21.3–101.4; P < 0.001), and all the remaining ASiR-V datasets (P < 0.001). The DLIR_M dataset showed comparable CNR with ASiR-V 80% (P = 0.281) and ASiR-V 90% (P = 1), and higher CNR than DLIR_L (P < 0.001). The DLIR_L dataset showed comparable CNR with ASiR-V 60% (P = 0.113) and ASiR-V 70% (P = 1). The lowest CNR was measured with FBP reconstruction (median: 12.2; IQR: 7.4–32.8), significantly different from every DLIR reconstruction (P < 0.001). Pairwise comparisons between groups are reported in Table 3 and in Table 4, for CNR and SNR, respectively.

Table 3 Pairwise comparisons of CNR achieved by FBP, ASiR-V, and DLIR reconstructions
Table 4 Pairwise comparisons of SNR achieved by FBP, ASiR-V, and DLIR reconstruction

Subjective image quality

Subjective image quality scores and corresponding pairwise comparisons are shown in Table 5. No examination reconstructed with the DLIR datasets was deemed of non-diagnostic image quality. DLIR_M returned the highest overall median image quality (score: 4; IQR: 4–4), significantly higher than all the other reconstructions (P ≤ 0.001). DLIR_H and ASiR-V 50% datasets achieved comparable results (scores: 3; P = 0.085), followed by ASiR-V 100% (score: 2; IQR: 2–3) (Fig. 4).

Table 5 Subjective image quality scores of ASiR-V and DLIR reconstructions, with related pairwise comparisons
Fig. 4 A 64-year-old male with family history of CAD. Curved multiplanar reformations reconstructed with ASiR-V 50% (a), ASiR-V 100% (b), DLIR_M (c), and DLIR_H (d) show obstructive (> 50%) soft plaque of the proximal LAD; DLIR_M achieved the highest subjective image quality score. [ASiR-V: hybrid iterative reconstruction algorithm; CAD: coronary artery disease; DLIR: deep learning image reconstruction algorithm; LAD: left anterior descending artery; window width, 1200 HU; level, 240 HU]

Correlation between DLIR and ASiR-V

A total of 721 coronary segments were assessed for the presence of coronary stenosis. With the routinely applied ASiR-V 50% dataset, no stenoses were reported in 584 segments, stenoses of 1–49% were found in 101 segments, and stenoses ≥ 50% were found in 36 segments. With DLIR_M, no stenoses were reported in 577 segments, stenoses of 1–49% were found in 107 segments, and stenoses ≥ 50% were found in 37 segments, with a concordance between the two datasets of 98%, 73%, and 76%, respectively, and an overall very strong correlation (r = 0.874; P = 0.001) (Fig. 5).
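The segment counts above can be checked for internal consistency, and a per-category agreement can be sketched in code. The text does not define how concordance was computed, so the definition used here (among segments given a category on the reference dataset, the fraction given the same category on the comparison dataset) is an assumption, as are the function name and the toy gradings; the per-segment cross-tabulation of the study is not available in the text and is not reconstructed.

```python
from collections import Counter

# Category totals as reported in the text.
ASIR_V_50_COUNTS = {"absent": 584, "non-obstructive": 101, "obstructive": 36}
DLIR_M_COUNTS = {"absent": 577, "non-obstructive": 107, "obstructive": 37}

# Both readings must account for all 721 assessed segments.
print(sum(ASIR_V_50_COUNTS.values()), sum(DLIR_M_COUNTS.values()))


def per_category_agreement(reference, comparison):
    """Fraction of reference segments per category with the same comparison grade."""
    if len(reference) != len(comparison):
        raise ValueError("both readings must grade the same segments")
    totals = Counter(reference)
    matches = Counter(r for r, c in zip(reference, comparison) if r == c)
    return {category: matches[category] / totals[category] for category in totals}


# Toy paired gradings for eight segments (hypothetical, not study data).
toy_reference = ["absent"] * 4 + ["non-obstructive"] * 2 + ["obstructive"] * 2
toy_comparison = [
    "absent", "absent", "absent", "non-obstructive",
    "non-obstructive", "absent",
    "obstructive", "obstructive",
]
print(per_category_agreement(toy_reference, toy_comparison))
```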

Fig. 5 An 81-year-old female with family history of CAD. Curved multiplanar reformations of the ASiR-V 50% dataset (a) and DLIR_M dataset (b) show a concentric mixed plaque determining non-obstructive (1–49%) stenosis of the proximal RCA (arrow). The two datasets show an overall very strong correlation in the assessment of CAD. [CAD: coronary artery disease; DLIR_M: medium-strength deep learning image reconstruction algorithm; RCA: right coronary artery; window width, 1200 HU; level, 240 HU]

Discussion

The aim of our study was to perform a comprehensive intraindividual image quality evaluation of CCTA reconstructed with DLIR and to assess the correlation between ASiR-V and DLIR in the diagnosis of CAD. Our investigation demonstrated that DLIR did not affect vascular attenuation compared with ASiR-V and FBP. DLIR_H showed the lowest noise, comparable with ASiR-V 100% and significantly lower than every other reconstruction. DLIR_H also achieved the highest objective image quality, with SNR and CNR comparable with ASiR-V 100%. DLIR_M returned the highest subjective image quality, significantly higher than all the other reconstructions. Additionally, a very strong correlation was found between DLIR_M and ASiR-V 50% datasets in the diagnosis of CAD.

An adequate tradeoff between diagnostic image quality and radiation dose has always been a crucial aim of CT technical advancements. A reduction of the radiation output of the X-ray tube that is not counterbalanced by compensating strategies translates into non-diagnostic images, due to an unacceptable increase in image noise. Lack of computational power prevented IR technology from being clinically feasible until 2009, when IR algorithms started replacing FBP as the reconstruction modality of choice [23]. Despite providing significant radiation dose reduction and an overall increase in image quality, IR algorithms tend to alter image texture, generating over-smoothed “plastic-like” images. Image over-smoothing increases as a function of IR strength [24]; therefore, middle strength levels (usually from 50% to 70%) are commonly implemented in clinical practice, representing a good tradeoff between image noise and texture. The constant increase in computational power, along with the increasing availability of big data, paved the way for DLIR algorithms. Two major CT vendors, GE Healthcare and Canon Medical Systems, had their DLIR algorithms cleared by the FDA; both are based on a deep neural network, trained with high-quality FBP images [13] and model-based IR datasets [12], respectively. Both algorithms are under extensive investigation and are achieving promising results: recent clinical studies have documented the capability of DLIR to generate images with lower image noise and superior image quality compared to IR; favorable results have been obtained in CCTA [17, 25,26,27], abdominal CT [28,29,30,31,32], chest CT examinations [33,34,35], and brain CT scans [36, 37].

DLIR is a more stable reconstruction method than IR, since its performance is less influenced by variations in dose and reconstruction level, and it has proven effective in reducing image noise and improving image quality without altering the typical FBP noise texture [38]. In a phantom experiment, DLIR_H proved effective in achieving 37% noise reduction compared to DLIR_L and 40% noise reduction compared to ASiR-V 50%; our investigation translated these results in vivo with solid consistency, demonstrating a denoising power of up to 39% compared to DLIR_L and 40% compared to ASiR-V 50%. Our broad comparison of fourteen different datasets sheds further light on DLIR performance in reducing image noise, demonstrating the better performance of DLIR_H compared to FBP (77% noise reduction) and ASiR-V from 10% to 90% (65% to 13% noise reduction). These findings are also in accordance with recent studies focused on CCTA, which reported a DLIR_H denoising performance of 54% compared with ASiR-V 60% [26] and 43% compared with ASiR-V 70% [17].
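The noise reduction percentages relative to FBP and DLIR_L can be retraced from the median noise values reported in the Results. The sketch below assumes the reduction is computed as the relative difference of medians with the comparator as denominator; the function name `noise_reduction_pct` is illustrative. The median for ASiR-V 50% is not given in the running text, so the 40% figure is not recomputed here.

```python
# Median image noise per reconstruction, as reported in the Results.
MEDIAN_NOISE = {
    "FBP": 41.7,
    "ASiR-V 100%": 10.4,
    "DLIR_L": 15.8,
    "DLIR_M": 13.9,
    "DLIR_H": 9.7,
}


def noise_reduction_pct(reference: float, test: float) -> float:
    """Percentage by which `test` noise is lower than `reference` noise."""
    if reference <= 0:
        raise ValueError("reference noise must be positive")
    return 100.0 * (reference - test) / reference


dlir_h = MEDIAN_NOISE["DLIR_H"]
for name in ("FBP", "DLIR_L"):
    reduction = noise_reduction_pct(MEDIAN_NOISE[name], dlir_h)
    print(f"DLIR_H vs {name}: {reduction:.1f}% lower noise")
# DLIR_H vs FBP: 76.7% lower noise
# DLIR_H vs DLIR_L: 38.6% lower noise
```

Rounded to whole numbers, these values match the 77% and 39% quoted in the paragraph above.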

As a result of the denoising efficacy of DLIR, our investigation demonstrated that SNR and CNR increased as a function of DLIR strength, peaking at DLIR_H: median image quality measured with DLIR_H was roughly 44% higher than with ASiR-V 60% and 39–40% higher than with ASiR-V 70%, in accordance with existing literature [17, 26]. DLIR_H also reached greater objective image quality than stronger ASiR-V levels; however, despite achieving a 14% higher score than ASiR-V 100%, the two values were statistically comparable.

Our study also demonstrated that DLIR at medium strength level provided the highest subjective image quality. No examinations reconstructed with DLIR datasets were deemed of unacceptable image quality; the median image quality of DLIR_M was deemed excellent, followed by DLIR_H and ASiR-V 50%, with no statistical differences between these two datasets. These results differ from those of Benz and colleagues [17], who reported DLIR_H to be superior or comparable to DLIR_M. Nevertheless, in our investigation readers pointed out minor blurring of small plaque contours with DLIR_H, ultimately leading to loss of image details. These subtle effects might explain the superior performance of DLIR_M, also considering that similar blurring has been reported for the DLIR_M algorithm in the abdominal setting [29]. The alteration of noise texture and consequent image over-smoothing is a well-known limitation of IR algorithms, representing the main reason for opting for lower strength levels in clinical practice. In contrast to ASiR-V, the investigated DLIR, which is trained with FBP data, has little to no impact on noise texture and reconstructs crisper datasets characterized by equal diagnostic accuracy.

The high image quality ensured by DLIR, along with its time-effective reconstruction process (≤ 50 s for axial CCTAs [38]), paves the way for its implementation in routine clinical practice. Dedicated CCTA acquisition protocols can be designed to exploit the capability of DLIR to reconstruct high-quality images from low-dose examinations [17, 18, 27]. The improved image quality ensured by DLIR might also allow the use of dedicated low-volume contrast media injection protocols, particularly useful in elderly individuals or in patients with heart failure and impaired renal function. Hence, the application of the DLIR algorithm might be beneficial in clinical practice to allow lower radiation dose protocols without detrimental effect on image quality and diagnostic accuracy.

The findings of this study should be seen in light of some limitations. First, the investigated DLIR algorithm and the obtained results are vendor-specific; therefore, our findings might not be directly transposable to other DLIR algorithms. However, all investigations performed so far have achieved promising preliminary results. Second, although we included patients who had undergone coronary stenting, we did not perform a subgroup analysis to specifically test DLIR performance on stented segments. Furthermore, diagnostic accuracy was not evaluated, because invasive coronary catheterization was not routinely available as a reference standard; therefore, our results need to be strengthened with larger patient cohorts and multi-institutional investigations.

In conclusion, the DLIR algorithm at medium strength level significantly improves CCTA image quality and has a very strong correlation with the routinely applied ASiR-V 50% dataset in the diagnosis of CAD.