Artificial intelligence based image quality enhancement in liver MRI: a quantitative and qualitative evaluation

Purpose To compare liver MRI with AIR Recon Deep Learning™(ARDL) algorithm applied and turned-off (NON-DL) with conventional high-resolution acquisition (NAÏVE) sequences, in terms of quantitative and qualitative image analysis and scanning time. Material and methods This prospective study included fifty consecutive volunteers (31 female, mean age 55.5 ± 20 years) from September to November 2021. 1.5 T MRI was performed and included three sets of images: axial single-shot fast spin-echo (SSFSE) T2 images, diffusion-weighted images(DWI) and apparent diffusion coefficient(ADC) maps acquired with both ARDL and NAÏVE protocol; the NON-DL images, were also assessed. Two radiologists in consensus drew fixed regions of interest in liver parenchyma to calculate signal-to-noise-ratio (SNR) and contrast to-noise-ratio (CNR). Subjective image quality was assessed by two other radiologists independently with a five-point Likert scale. Acquisition time was recorded. Results SSFSE T2 objective analysis showed higher SNR and CNR for ARDL vs NAÏVE, ARDL vs NON-DL(all P < 0.013). Regarding DWI, no differences were found for SNR with ARDL vs NAÏVE and, ARDL vs NON-DL (all P > 0.2517).CNR was higher for ARDL vs NON-DL(P = 0.0170), whereas no differences were found between ARDL and NAÏVE(P = 1). No differences were observed for all three comparisons, in terms of SNR and CNR, for ADC maps (all P > 0.32). Qualitative analysis for all sequences showed better overall image quality for ARDL with lower truncation artifacts, higher sharpness and contrast (all P < 0.0070) with excellent inter-rater agreement (k ≥ 0.8143). Acquisition time was lower in ARDL sequences compared to NAÏVE (SSFSE T2 = 19.08 ± 2.5 s vs. 24.1 ± 2 s and DWI = 207.3 ± 54 s vs. 513.6 ± 98.6 s, all P < 0.0001). Conclusion ARDL applied on upper abdomen showed overall better image quality and reduced scanning time compared with NAÏVE protocol.


Introduction
Medical imaging plays a central role in modern medicine and technical improvements are fundamental to achieve the best diagnostic performances with the most efficient imaging protocol. In recent years Artificial Intelligence (AI) allowed huge improvements in different aspects such as image acquisition/reconstruction, image post-processing, image analysis, image storage, and integration of complex data for medical decision-making process [1][2][3][4][5][6][7].
Focusing on image acquisition, several in-house AI and deep-learning (DL) tools have been developed to improve image quality and their application since now has been mainly in research field [8].
In September 2020 one of the first AI-derived tool has been approved for clinical use for all anatomies at 1.5 T, the AIR™ Recon DL (ARDL), GE Healthcare, Waukesha, WI [9]. Unlike post-processing-based approaches that might alter image detail, this method is a deep learning-based reconstruction model applied directly on k-space-based raw data for maximum image quality, by preserving non-DL data acquired. In addition, to improve signal-to-noise ratio (SNR), this technology has a unique intelligent ringing suppression that preserves fine image details, helping address two common delicate aspects for radiologists and technologists that are image noise and ringing. In addition, with ARDL is possible to obtain simultaneously a dataset, called NON-DL, resulted from the same MRI parameters set without the intervention of ARDL on the sequence.
Until now, MRI acquisition has been a compromise between image quality and scan time; in fact, to achieve high-quality images with improved SNR and/or spatial resolution, necessitated long scan times. On the contrary, shorter scan acquisition time enables to improve patient comfort and productivity, with reduced image quality. With new DL algorithms, the possibility to achieve good image quality in reduced acquisition time is becoming a reality. Moreover, it is important to compare the new algorithms not only with standard protocol but also with high-quality protocol, more similar to the ground truth.
Up to now, a few studies has assessed the impact of AI on MRI acquisition in clinical setting and in particular on upper abdomen [10][11][12][13].
The aim of the study is to compare ARDL-protocol, NON-DL dataset and high-resolution NAÏVE protocol applied to upper abdomen district, in terms of quantitative and qualitative image analysis and acquisition time.

Study design and patient population
This prospective single-center study included 50 volunteers from September 2021 to November 2021, in accordance with local IRB, and informed consent was signed by all participants.
All volunteers underwent upper abdomen MRI scan and both protocols (ARDL and NAÏVE) were performed; details of the two protocols and three datasets obtained are described below in Deep-learning algorithm and Imaging protocol sections Volunteers with contraindications to perform MRI (e.g. non-compatible MRI devices, claustrophobia), and volunteers who underwent upper abdomen surgery were excluded; also MRI acquisitions with severe motion artefacts or magnetic susceptibility artefacts related to abdominal metallic devices were not included in the analysis.

Deep-learning algorithm
The specific DL algorithm applied during image acquisition is the AIR™ Recon DL, GE Healthcare, Waukesha, WI. It consists of a feed-forward deep convolutional neural network (CNN) that enables to reconstruct images with higher SNR, reduced truncation artifacts, and higher spatial resolution [14]. This innovative CNN works directly integrated with the standard reconstruction pipeline, on the raw data obtained during MRI examination. A supervised learning approach with over 4.4 million parameters in over 10.000 kernels has been used to train this CNN. AIR™ Recon DL works by accepting raw, unfiltered input and to provide improved output in terms of SNR and image artefacts; another important aspect is the possibility to set the range of applicability of the CNN before the acquisition to generate images with a different CNN application strength (Low, Medium and High).
From each acquisition are also obtained two different dataset available simultaneously during the scanning, without different reconstruction time. In particular, one dataset is reconstructed with ARDL applied and, the other one, defined as NON-DL, resulted from the same parameters set without the ARDL reconstruction on k-space. It is important to underline that NON-DL dataset returns the conventional reconstruction with non-optimized parameters so, without the ARDL reconstruction. As a result, images will appear noisier, due to the lack of the usual parameters that are set to balance image quality and acquisition time.
An image artefact that the AIR™ Recon DL is trained to recognize and reduce is the truncation (Gibbs) artefact, usually present near structures with transition between regions of high and low signal intensity such as spinal cord and cerebral-spinal fluid or liver parenchyma and abdominal fat [15]. The CNN is able to recognize Gibbs artefact in proximity of sharp edges and reduce consistently the ringing artefact to improve image sharpness. As a result, ARDL tool has the possibility to acquire MRI images with less compromise than usually in terms of MRI parameters (e.g. reduced NEX, thickness) and acquisition time.

Imaging protocol
Each acquisition has been performed on the same scanner (1.5 T Signa Voyager, GE Healthcare, Waukesha, WI) with the same protocol. All scans have been performed in supine position, with a dedicated 16-channel highly flexible Adaptive Image Receiver (AIR, GE Healthcare, Waukesha, WI) coil and, no contrast medium administration. Field of View (FOV) will include upper abdomen coverage from pulmonary bases to the upper margin of iliac bones, with the same number of slices for both protocols.
Acquisition protocol includes high-quality NAÏVE protocol and ARDL protocol, the latter with two reconstruction dataset (ARDL and NON-DL datasets) as described above.

Quantitative Image analysis
A total of three dataset have been obtain for each sequence as follows: ARDL dataset, NAÏVE dataset and NON-DL dataset, each for T2 SSFSE sequence, DWI and for ADC maps. Each sequence was anonymized in order to avoid bias. Two readers (B.M. and M.Z., with 3 and 8 years of experience in abdominal MRI imaging respectively) assessed objective image quality in consensus and simultaneously. In particular, to quantify signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) three circular regions of interest (area 1cm 2 ) were placed in the liver parenchyma at the V, VI and VIII segments, by avoiding vessels and biliary tree, and liver edge, and then, the averaged measurements of the three was reported. In addition and a single region of interest (ROI) in the background and in the gallbladder were placed. Explicatory single slice of the ROIs placement is provided in Fig. 1.
SNR and CNR were calculated with the following formulas modified by previous study [12]: where S represents mean signal intensity while SD represents signal intensity standard deviation.

Qualitative image analysis
Two readers (M.P. and G.G.) with 6 and 7 years of experience in abdominal MRI imaging respectively, independently assessed subjective image quality. Subjective image quality has been assessed with a five-point Likert scale considering: (a) upper abdomen parenchyma edge sharpness: 1 = poor; 2 = mild; 3 = moderate; 4 = good; 5 = very good.

Statistical analysis
Continuous variables were compared using the paired-samples t-test or the Wilcoxon test according to the normality of data distribution priorly assessed with Kolmogorov-Smirnov Test. Multiple tests were assessed by One way repeated measures ANOVA test for the SNR and the CNRand, for the qualitative analysis, with Bonferroni correction of the P values. The t-test or Wilcoxon signed-rank test will be adopted to compare the acquisition times between ARDL and NAÏVE MRI protocols.
A two-sided P value < 0.05 was considered statistically significant.

Patient population
From an initial population of 57 volunteers, 5 were excluded for ferromagnetic artifacts due to surgery, one for incomplete SNR = S liver ∕SD background CNR = |S gall -bladder −S liver |∕ SD background MRI protocol due to claustrophobia and one due to several motion artifact. The final population included 50 volunteers (19 male, 31 female), mean age 55.5 ± 20 years old. ADC maps CNR showed no significant differences, for ARDL protocol and NAÏVE protocols (7.97 ± 5.34 vs. 7.19 ± 3.89, P = 1), ARDL protocol compared with NON-DL group (7.97 ± 5.34 vs. 7.55 ± 5.45, P = 0.32) and NAÏVE protocol in comparison with NON-DL dataset (7.19 ± 3.89 vs. 7.55 ± 5.45, P = 0.5733). All results are listed in Table 1.  Overall image quality for ADC maps resulted higher in ARDL protocol compared with NAÏVE one (4.17 ± 0.77 vs.
Interrater agreement regarding image analysis was excellent (k = 0.8273 for SSFSE T2, k = 0.8143 for DWI and k = 0.8165 for ADC maps).

Scanning time
Scanning time resulted significantly lower for Axial SSFSE T2 ARDL compared to NAÏVE with an average time of 19.08 ± 2.58 s compared with 24.11 ± 2.03 s (P < 0.001), with 31% of time reduction for ARDL protocol compared to NAÏVE protocol.
Similar results are appreciable for DWI sequence with a mean acquisition time of 207.33 (3.45 min) ± 54.03 s vs 513.60 (8.56 min) ± 98.69 s (P < 0.001), with 60% of time reduction for ARDL protocol compared to NAÏVE protocol.

Discussion
The present study compared image quality of deep-learning MRI sequences with non-deep learning sequences. Results showed how SNR and CNR are higher for SSFSE Very positive results were also obtained for qualitative analysis that showed how ARDL protocol was superior to NAÏVE and NON-DL datasets for both SSFSE T2, DWI and ADC. In addition, acquisition time resulted significantly lower in ARDL protocol for both sequences tested.
Significant results in terms of quantitative and qualitative analysis for SSFSE T2 sequence are in line with data reported by Wang and colleagues [16] that reported how deep-learning sequences showed better image quality and fewer artifacts in prostate MRI compared with non-deep learning protocol.
Diffusion-weighted imaging analysis showed interesting results that need some consideration. DWI acquired without ARDL protocol was performed with high-quality parameters and it is possible to consider them as the ground truth. These acquisition parameters have returned a high-quality image with the disadvantages of a long acquisition time. In fact, quantitative analysis showed how there were no significant differences in terms of SNR and CNR with ARDL protocol acquired with a mean time of three minutes, 60% less than NAÏVE protocol. Moreover, the deep learning protocol showed improved image quality compared with the NAÏVE one. To the best of our knowledge, there are no studies performed on abdominal DWI; however, Misaka et al. [17] conducted an interesting study on female pelvis. They built a deep learning algorithm to improve single-shot turbo spin-echo (SSTSE) sequences and compare them with TSE, considered as ground truth, and SSTSE. Results showed how DL SSTSE were superior in terms of contrast ratio and SNR to SSTSE while no differences were observed with TSE. Image quality was higher for DL protocol compared to the others. Even though the sequences and the district are dissimilar, the approach of the analysis is comparable and in line with our results.
Similar results for DWI were obtained in other anatomical districts [18,19]. An example is provided by Kaye and colleagues on prostate MRI [19]. They performed a comparison study on DWI obtained with a high average number with a retrospective guided denoising CNN that worked on low b values. Results performed on 118 patients divided into training and validation datasets of prostate MRI showed higher peak signal-to-noise ratio, higher structural similarity index and lower normalized mean square error compared to standard images. Also image quality resulted better for guided denoised CNN compared with the reference standard. In addition, the ADC map values obtained from denoised DW images had good agreement with the reference ADC values. Despite the retrospective method used and the different algorithm tested, our results are in line in terms of image quality and reduction of acquisition time and this aspect strength the role of AI and DL for clinical imaging. In fact, reduced acquisition time without affecting image quality would allow to improve clinical imaging scheduling.
Another aspect of the results obtained concerns the NON-DL dataset. In fact, in all the sequences analyzed, NON-DL dataset resulted inferior in terms of quantitative image analysis and image quality. Results are coherent with the actual meaning of the NON-DL dataset, which represents the acquisition obtained with non-optimized MRI parameters without the application of ARDL. NON-DL dataset is usually kept in the examinations for radiologists' consultation. In fact, at the beginning radiologists might need some time to get confident with ARDL algorithm and the idea of non-DL images helps in that direction, even if they actually do use the ARDL dataset after a few scans.
The last aspect that needs important consideration is represented by the impact of ARDL on scanning time; it was faster for ARDL protocol compared with NAÏVE protocol, from 31% for SSFSE to 60% for DWI. Our results are in line with literature present with an average reduction of scanning time ranging from 30 to 60% [18,20]. By reducing scanning time without affecting image quality represents an important goal achievable for AI. By so doing it would be possible to increase number of patients in the scheduling, without affecting the quality of examination, with a net improvement of healthcare process.
Despite the interesting results, our study has some limitations such as: lack of analysis of pathologic entities; only two sequences of the MRI upper abdomen protocol were tested also because ARDL for now has been approved for clinical use only for 2D sequences; lack of comparison with a standard NAÏVE protocol because we aimed to test a higher image quality protocol with DL to assess image quality; in fact, some acquisition parameters such as matrix or average for b values are different in SSFSE T2 and DWI acquisitions respectively.

Conclusion
In conclusion, deep learning algorithm applied on MRI sequence acquisition for SSFSE T2 and DWI resulted better for image quality and comparable in terms of signal-to-noise ratio and contrast-to-noise ratio with high quality non deep learning protocol, by enabling a considerable reduction of acquisition time.
Author Contributions All authors contributed, read and approved the final manuscript.
Funding Open access funding provided by Università degli Studi di Roma La Sapienza within the CRUI-CARE Agreement. The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Ethics approval This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of Sant'Andrea University Hospital, Rome, Italy.
Consent to participate Informed consent was obtained from all individual participants included in the study.

Consent to publish
The authors affirm that human research participants provided informed consent for publication of the images in Figs. 1 and 2.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.