Deep learning-based ultrasound transducer induced CT metal artifact reduction using generative adversarial networks for ultrasound-guided cardiac radioablation

In US-guided cardiac radioablation, a possible workflow includes simultaneous US and planning CT acquisitions, which can result in US transducer-induced metal artifacts on the planning CT scans. To reduce the impact of these artifacts, a metal artifact reduction (MAR) algorithm called Cycle-MAR, based on a deep learning generative adversarial network (GAN), has been developed and compared with iMAR (Siemens), O-MAR (Philips), MDT (ReVision Radiology), and CCS-MAR (Combined Clustered Scan-based MAR). Cycle-MAR was trained with a supervised learning scheme using sets of paired clinical CT scans with and without simulated artifacts. It was then evaluated on CT scans of an anthropomorphic phantom with real artifacts, and on sets of clinical CT scans with simulated artifacts which were not used for Cycle-MAR training. Image quality metrics and HU value-based analysis were used to evaluate the performance of Cycle-MAR compared to the other algorithms. The proposed Cycle-MAR network effectively reduces the negative impact of the metal artifacts. For example, the calculated HU value improvement percentage for the cardiac structures in the clinical CT scans was 59.58%, 62.22%, and 72.84% after MDT, CCS-MAR, and Cycle-MAR application, respectively. The application of MAR algorithms reduces the impact of US transducer-induced metal artifacts on CT scans. In comparison to iMAR, O-MAR, MDT, and CCS-MAR, the developed Cycle-MAR network performs better in reducing these metal artifacts on CT scans.


Introduction
Cardiac radioablation is a new non-invasive modality for the treatment of cardiac arrhythmias. This treatment method is based on delivering a radiation dose to the arrhythmogenic tissues using external beam radiation therapy [1][2][3]. A typical treatment workflow includes acquiring a planning computed tomography (CT) scan, which is used to delineate the arrhythmogenic tissue (target) and the organs-at-risk (OARs) and to calculate the planned radiation dose from the Hounsfield Unit (HU) values. At treatment, the dose is then delivered to the target while sparing the OARs as much as possible. However, the complex cardiorespiratory motion may impact the accuracy of dose delivery [4,5]. This makes real-time monitoring of the cardiorespiratory motion of paramount importance to achieve safe and effective treatment delivery.
A possible candidate image modality for real-time guidance in cardiac radioablation is transthoracic ultrasound (US) imaging [6,7]. This approach relies on identifying the cardiac tissue position using US imaging at both the simulation and the treatment delivery stages. Comparing the two makes it possible to compensate for displacements of cardiac structures at treatment. To minimize clinical workflow steps and ease treatment planning, it may be favourable to acquire the US scans simultaneously with the planning CT scan at the simulation stage. However, this approach is prone to creating transducer-induced metal artifacts on the CT scans, caused by the internal metal components of the US transducers [8,9].
Metal artifacts may generate improper representations of anatomical structures and incorrect HU values, resulting in potentially inaccurate radiation dose calculation [10][11][12]. Several algorithms for metal artifact reduction (MAR) have been developed, mainly focusing on the artifacts generated by implanted metal structures. The majority of these MAR algorithms, both commercially available and research-based, follow a conventional analytical approach. Recently, new works based on deep learning have been proposed [11,12]. Commercially available MAR algorithms use an iterative approach of correction of CT projection data [12]. Among them, the Orthopaedics Metal Artifact Reduction (O-MAR, Philips Health System) and the iterative Metal Artifact Reduction (iMAR, Siemens Healthcare) algorithms were widely investigated for the improvement of radiation therapy planning [13][14][15][16][17][18][19][20][21]. Among the research-based MAR algorithms, Metal Deletion Technique (MDT, ReVision Radiology) [22] was the one most often compared to the commercially available algorithms [23][24][25][26]. All these algorithms typically suffer from improper restoration of HU values, from distortion of anatomical structures, and even from creation of secondary artifacts [11]. Recently, our group [27] investigated a newly developed MAR algorithm, Combined Clustered Scan-based MAR (CCS-MAR), which followed the traditional analytical approach, and compared its performance to commonly used commercial and research-based MAR algorithms. The results of this study revealed that further development or improvements were needed to reduce residual artifacts and further improve HU value restoration capabilities.
In recent years, the development of deep learning-based algorithms for metal artifact reduction in CT imaging has gained significant interest [28][29][30][31][32][33][34][35][36][37][38]. In general, to learn the complex metal artifact patterns and propagation in a supervised manner [39,40], these algorithms require data including CT scans with artifacts (CT art) and corresponding artifact-free (CT ref) scans. In the absence of adequate paired data, typical metal artifacts which resemble real clinical scenarios can be simulated [28,30,31,38]. For our work, we have used generative adversarial networks (GAN) [41]. Typically, these architectures consist of a generator and a discriminator, which are types of convolutional neural networks (CNN) [41,42]. The generator and the discriminator are trained in an adversarial manner to perform the transformation. We have focused in particular on their extension CycleGAN [43], which utilizes two GAN architectures and has already been studied in the literature for the transformation of CT art scans into artifact-reduced CT (CT cor) scans for radiation therapy applications [28,29,31]. This work aims to develop a deep learning-based MAR algorithm for the reduction of US transducer-induced metal artifacts on CT scans. The proposed algorithm, Cycle-MAR, was designed following a supervised learning scheme with a CycleGAN architecture using paired clinical CT scans. It was then evaluated on phantom CT scans with real artifacts and clinical CT scans with simulated artifacts. In addition, the performance of Cycle-MAR has been compared with the performance of iMAR, O-MAR, MDT and CCS-MAR.

Clinical CT scans
Paired clinical CT scans with and without US transducer-induced metal artifacts were used. In particular, DICOM CT scans of the thoracic region were utilized from the "COVID19-CT-Dataset" online database [44]. Initially, CT scans from 180 patients were downloaded and visually inspected for suitability. CT scans or CT slices with large COVID19-induced density changes in the lung region, with suboptimal quality due to low resolution, and/or with metal artifacts resulting from external foreign bodies were excluded. This resulted in CT scans from 84 patients, and from each of these CT scans, 22-100 axial slices covering the cardiac structure from the apex to the base were selected. These CT scans consisted of 512 × 512-pixel slices with 1.5 or 3 mm slice thickness [45].
Sakamoto et al. [46] developed a MATLAB package, based on the study by Zhang et al. [38], to simulate metal artifacts on CT scans. This package was modified in our work for the specific simulation of US transducer-induced metal artifacts on the selected COVID19-CT-Dataset (See Fig. 1 a). In our procedure, the pixels identifying the US transducers on the phantom CT art scans were first manually segmented and stored separately. Then, the segmented US transducers were copied to the clinical CT scans and positioned on the scans using rigid rotation-translations so that they would be imaging the cardiac structures. A threshold value of 2000 HU was applied on the imported US transducer to extract metal components, which were then saved as binary images. The metal extraction threshold value of 2000 HU was chosen according to previous research published in the literature [27,47,48]. Then, pre-defined HU thresholds were used to segment the bone, lung, and water-equivalent tissues on the clinical CT ref slices. To convert the HU values of the pixels into linear attenuation coefficients for varying X-ray energies, corresponding mass attenuation coefficients were taken from the NIST [49] database. Subsequently, the polychromatic projection data for the corresponding X-ray energies were simulated from the segmented bone, lung, and water-equivalent tissue and from the metal binary image. As the metal components of a US transducer consist of lead, zirconate, and titanate [50], the average mass attenuation coefficient value of these metals was used to generate the projection data. Finally, metal-containing projection data were created from those simulated projections with a Poisson distribution and used for the reconstruction of a CT art slice.
To check the correctness of this artifact simulation method, phantom CT scans were utilized (See Fig. 1 b). From a particular phantom CT scan, CT ref and the corresponding CT art slices were first selected. Then, the US transducer-induced metal artifacts were simulated on the phantom CT ref slices based on the procedure described above. The simulated CT art slices were visually validated by comparing them to the corresponding phantom CT art slices containing the real US transducer-induced metal artifacts.
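Three steps of this simulation procedure can be illustrated with a minimal Python sketch. It is not the MATLAB package used in the study: it assumes a single effective X-ray energy instead of the polychromatic model with NIST coefficients, and the water attenuation value, the photon count, and all function names are illustrative assumptions.

```python
import numpy as np

METAL_THRESHOLD_HU = 2000.0  # metal extraction threshold stated in the text
MU_WATER_PER_CM = 0.19       # assumed illustrative value for one effective energy


def extract_metal_mask(ct_hu):
    """Return a binary image of the pixels above the metal threshold."""
    return np.asarray(ct_hu, dtype=np.float64) > METAL_THRESHOLD_HU


def hu_to_mu(ct_hu, mu_water=MU_WATER_PER_CM):
    """Invert HU = 1000 * (mu - mu_water) / mu_water (monochromatic assumption)."""
    mu = mu_water * (1.0 + np.asarray(ct_hu, dtype=np.float64) / 1000.0)
    return np.clip(mu, 0.0, None)  # attenuation cannot be negative


def add_poisson_noise(line_integrals, photons=1e5, rng=None):
    """Apply Poisson counting statistics to noise-free line integrals.

    `photons` is a hypothetical incident photon count per detector element.
    """
    rng = np.random.default_rng() if rng is None else rng
    expected_counts = photons * np.exp(-np.asarray(line_integrals, dtype=np.float64))
    counts = np.maximum(rng.poisson(expected_counts), 1)  # avoid log(0)
    return -np.log(counts / photons)
```

The forward projection and reconstruction steps are omitted here; in practice they would come from a tomography toolbox, and the noisy line integrals returned by `add_poisson_noise` would be passed to the reconstruction.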

Phantom CT scans
Table 1 shows the combinations of CT scanners, anthropomorphic phantoms, and US transducers used in this work. In particular, three types of adult anthropomorphic phantoms were scanned with and without a total of four types of US transducers. The utilized anthropomorphic phantoms were an ART-211 male phantom (ART, Radiology Support Devices, Long Beach, CA, USA); an ATOM® male phantom (CIRS, Model-701, Norfolk, VA, USA); and a CT torso phantom (CT Torso, Model CTU-41, Kyoto Kagaku Ltd, Japan). These anthropomorphic phantoms were constructed using tissue-equivalent epoxy materials that mimic the density and attenuation characteristics of human tissues. They include a range of components, such as cardiac structures, air-equivalent materials for simulating lungs, and bone-like materials with simulated air pockets. To obtain the paired CT scans, each phantom was CT scanned with and without a US transducer, resulting in a CT art scan and the corresponding CT ref scan, respectively (See Fig. 2 for an example of the procedure). The US transducers were positioned on the phantoms at various angles suitable for proper imaging of the heart. The dimensions of US transducers, including the size and width of their metal components, can have an impact on the creation of metal artifacts on CT scans. Among the US transducers used, the linear volume array transducer was the largest and had the widest metal component, which was measured to be 6 cm using CT

Cycle-MAR network
The CycleGAN model [43] has been proposed in the literature for unpaired training data. However, we used paired data in this work to enforce the restoration accuracy of anatomical structures and HU values [51]. The workflow of the developed Cycle-MAR network is illustrated in Fig. 3.
The CycleGAN translates the metal artifact domain ($X$) into an artifact-free domain ($Y$) by using an adversarial loss ($\mathcal{L}_{adv}$), a cycle consistency loss ($\mathcal{L}_{cycle}$), and an identity loss ($\mathcal{L}_{identity}$). This process includes two mapping functions, $G_Y: X \to Y$ and $G_X: Y \to X$. The mapping function $G_Y$ translates a CT art slice into a CT cor slice, whereas $G_X$ translates a CT ref slice into a CT art slice. The network also consists of two adversarial discriminators, $D_X$ and $D_Y$, which aim to identify the translated images as fake. $D_X$ aims to distinguish $x$ from $G_X(y)$, and $D_Y$ aims to distinguish $y$ from $G_Y(x)$. For $G_Y$, $\mathcal{L}_{adv}$ is the mean squared error (MSE) between the output $G_Y(x)$ and the target domain $Y$. $\mathcal{L}_{cycle}$ calculates the translation error between $x$ and $G_X(G_Y(x))$ through the translation $X \to Y \to X$, and vice versa for $Y \to X \to Y$. $\mathcal{L}_{identity}$ was introduced to regularize $G_X$ and $G_Y$ so that they do not induce any changes when $x$ and $y$, respectively, are their inputs.
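How the three loss terms combine for the generators can be sketched in a few lines of Python. This is a sketch under assumptions rather than the implementation used in this work: it assumes the least-squares form of the adversarial loss (MSE between the discriminator prediction and the "real" label), L1 distances for the cycle consistency and identity terms as in the original CycleGAN, and hypothetical names for the generators and discriminators, which are passed in as plain callables.

```python
import numpy as np

LAMBDA_CYCLE = 10.0     # weight of the cycle consistency loss reported in the text
LAMBDA_IDENTITY = 15.0  # weight of the identity loss reported in the text


def mse(a, b):
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))


def l1(a, b):
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))


def generator_objective(ct_art, ct_ref, gen_art_to_cor, gen_ref_to_art,
                        disc_art, disc_ref):
    """Total generator loss for one batch (all callable names are hypothetical).

    gen_art_to_cor maps the artifact domain to the artifact-free domain,
    gen_ref_to_art maps the other way; disc_art and disc_ref score images
    of the artifact and artifact-free domains, respectively.
    """
    fake_cor = gen_art_to_cor(ct_art)
    fake_art = gen_ref_to_art(ct_ref)

    # Adversarial term: each generator wants its output to be scored as real (1).
    pred_cor = disc_ref(fake_cor)
    pred_art = disc_art(fake_art)
    adv = mse(pred_cor, np.ones_like(pred_cor)) + mse(pred_art, np.ones_like(pred_art))

    # Cycle consistency: art -> cor -> art and ref -> art -> ref should round-trip.
    cycle = l1(gen_ref_to_art(fake_cor), ct_art) + l1(gen_art_to_cor(fake_art), ct_ref)

    # Identity: an image already in a generator's target domain should be unchanged.
    identity = l1(gen_art_to_cor(ct_ref), ct_ref) + l1(gen_ref_to_art(ct_art), ct_art)

    return adv + LAMBDA_CYCLE * cycle + LAMBDA_IDENTITY * identity
```

In a real training loop the callables would be neural networks in a deep learning framework and the discriminators would be updated with their own loss in alternation; the sketch only shows how the weighted sum is assembled.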
In this study, regularization parameter values of $\lambda_{cycle} = 10$ for $\mathcal{L}_{cycle}$ and $\lambda_{identity} = 15$ for $\mathcal{L}_{identity}$ were chosen among several examined parameter sets. To implement Cycle-MAR, the ResNet [52] and the PatchGAN [42] architectures were used as the generator and the discriminator, respectively. The network was trained using the Adam optimizer [53]. The HU values on the clinical CT scans were clipped between −1000 and 1000, and the remaining pixel values were normalized between −1 and 1 to improve the training efficiency [28].
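The clipping and normalization step can be written compactly. The following sketch assumes a linear min-max mapping of the clipped HU window onto [−1, 1]; the function names are illustrative and not taken from the study.

```python
import numpy as np


def normalize_hu(ct_hu, hu_min=-1000.0, hu_max=1000.0):
    """Clip HU values to [hu_min, hu_max] and map them linearly onto [-1, 1]."""
    clipped = np.clip(np.asarray(ct_hu, dtype=np.float64), hu_min, hu_max)
    return 2.0 * (clipped - hu_min) / (hu_max - hu_min) - 1.0


def denormalize_hu(image, hu_min=-1000.0, hu_max=1000.0):
    """Map network output in [-1, 1] back to HU values."""
    return (np.asarray(image, dtype=np.float64) + 1.0) / 2.0 * (hu_max - hu_min) + hu_min
```

The inverse mapping is needed whenever HU values are measured on the network output, since the generator operates on the normalized range.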

Comparison with commercial and research-based MAR algorithms
The performance of the developed Cycle-MAR network was compared with commercially available MAR algorithms and research-based MAR algorithms. iMAR and O-MAR were directly applied during the reconstruction of the phantom CT scans by the scanners. iMAR was not applied to the Siemens PET-CT scans, because it was not available on this particular scanner. In addition to this, MDT, CCS-MAR, and Cycle-MAR were also applied to all phantom and clinical CT scans.

Image quality metrics analysis
Structural similarity (SSIM) index, root mean square error (RMSE) of the HU values, and peak signal-to-noise ratio (PSNR) [38,55] were calculated to evaluate the performance of the Cycle-MAR network for metal artifact reduction and image quality improvement. These metrics were calculated for the CT art and the CT cor scans compared to the CT ref scans. The analysis was performed using the overall mean values of these image quality metrics calculated for all CT art and CT cor scans.
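For orientation, the three metrics can be sketched as follows. The sketch makes two assumptions that are not taken from the study: the `data_range` argument (the HU span treated as the dynamic range) has to be chosen by the user, and the SSIM shown here is the single-window (global) form of the index, whereas common implementations average the same expression over local windows.

```python
import numpy as np


def rmse(ref, test):
    """Root mean square error between a reference and a test image."""
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    return float(np.sqrt(np.mean((ref - test) ** 2)))


def psnr(ref, test, data_range):
    """Peak signal-to-noise ratio in dB for a given dynamic range."""
    err = rmse(ref, test)
    if err == 0.0:
        return float("inf")
    return float(20.0 * np.log10(data_range / err))


def global_ssim(ref, test, data_range):
    """SSIM computed over the whole image as one window (a simplification)."""
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2  # standard stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mu_r, mu_t = ref.mean(), test.mean()
    var_r = np.mean((ref - mu_r) ** 2)
    var_t = np.mean((test - mu_t) ** 2)
    cov = np.mean((ref - mu_r) * (test - mu_t))
    numerator = (2.0 * mu_r * mu_t + c1) * (2.0 * cov + c2)
    denominator = (mu_r ** 2 + mu_t ** 2 + c1) * (var_r + var_t + c2)
    return float(numerator / denominator)
```

Each function takes the CT ref image first and the CT art or CT cor image second, so that the same call can be used before and after a MAR algorithm is applied.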

HU value restoration evaluation
For the clinical CT scans, HU value measurements on specific regions were performed on the CT art scans, the CT cor scans and the CT ref scans. The contour-based mean HU values and standard deviation (STD) were calculated for the entire heart, lungs, and bone regions from all CT slices using MATLAB (The MathWorks Inc, USA) (See Fig. 5). The percentage of mean HU value improvement for CT cor scans was calculated using the following equation,
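A region-based improvement percentage of this kind can be sketched in Python. The definition used below, the relative reduction of the absolute mean HU deviation from CT ref, is an assumption made for illustration and is not taken from the study; the function names are likewise hypothetical.

```python
import numpy as np


def region_mean_hu(ct_hu, mask):
    """Mean HU value inside a contour given as a boolean mask."""
    values = np.asarray(ct_hu, dtype=np.float64)[np.asarray(mask, dtype=bool)]
    return float(values.mean())


def hu_improvement_percent(mean_ref, mean_art, mean_cor):
    """Assumed definition: share of the artifact-induced deviation that was removed."""
    dev_art = abs(mean_art - mean_ref)  # deviation before MAR
    dev_cor = abs(mean_cor - mean_ref)  # residual deviation after MAR
    if dev_art == 0.0:
        raise ValueError("no deviation from the reference to improve on")
    return 100.0 * (dev_art - dev_cor) / dev_art
```

With this definition, a region whose mean is 40 HU on CT ref, 140 HU on CT art and 65 HU on CT cor would score 75%, and a negative value would indicate that the correction moved the mean further from the reference.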

Phantom scans analysis
Figure 4 shows the CT ref scans, and the corresponding CT art and CT cor scans from the ART, ATOM®, and CT torso phantoms. In general, visual inspection showed that Cycle-MAR outperformed the other MAR algorithms in reducing the intense dark or bright regions near the US transducer. Residual streak artifacts were observed on CT cor after the Cycle-MAR application, especially in the ART phantom scans (red mark in Fig. 4). In the ATOM® phantom scans, O-MAR and MDT applications induced secondary dark streak artifacts (yellow marks in Fig. 4). The Cycle-MAR application on the CT art scans generally improved the calculated SSIM and PSNR values, while the RMSE values were decreased (Table 3).

Clinical scans analysis
Examples of CT art scans from three randomly selected patients, and the effect of MDT, CCS-MAR, and Cycle-MAR applications on them for metal artifact reduction, are shown in Fig. 5. Based on the visual inspection, Cycle-MAR application on CT art scans restored the soft tissue and bone details better than the MDT and CCS-MAR algorithms (red and green arrows in Fig. 5). The overall mean values of SSIM, PSNR, and RMSE for all clinical CT scans from 14 patients are shown in Table 4. Cycle-MAR application on CT art scans generally resulted in higher mean SSIM and PSNR values and lower mean RMSE values. Table 5 shows the overall mean (± STD) HU values for the CT ref scans, and the absolute differences of the overall mean and the differences of the standard deviation (STD) of HU values for the CT art scans and CT cor scans compared to the CT ref scans, for the heart, lungs, and bone regions on the clinical CT scans. The application of MAR algorithms improved the HU value measurements across all regions. Cycle-MAR application restored the mean HU values for the heart and lung region better than MDT and CCS-MAR.
The percentage of mean HU value improvement for the heart, lungs and bone regions is shown in Fig. 6. For the heart region, in which the target is typically located, the HU value improvement percentage after MDT, CCS-MAR, and Cycle-MAR applications was 59.58%, 62.22%, and 72.84%, respectively. The lungs and bone regions were considered as OARs in this study; for these regions, the highest improvement percentage was found after the application of Cycle-MAR and MDT, respectively.

Discussion
In this study, a Cycle-MAR algorithm which used paired CT scans for training purposes was proposed to reduce the US transducer-induced metal artifacts on planning CT scans for US-guided cardiac radioablation. Cycle-MAR was evaluated for the improvement of image quality and HU value restoration compared to the commonly used commercial and research-based MAR algorithms.
Overall, the proposed model reduced the metal artifacts more effectively on the clinical CT scans than on the phantom CT scans. For the Cycle-MAR training, only clinical CT scans and no phantom CT scans were used. This might be a reason for the noticeable residual streaks on the phantom CT cor scans after the Cycle-MAR application (See Fig. 4). This can potentially be improved by adding separate sets of phantom scans to the algorithm training set. On the other hand, in the end, the algorithm will be used in the clinic and therefore good performance on phantom scans is less important. To further reduce the residual streaks, the CT cor scans which resulted from the training of Cycle-MAR may be added to the training data set and this is under consideration for future work.
All MAR algorithms restored the measured HU values for the cardiac structures within 21 HU, which is well below the tolerance accuracy of 30 HU for water-like material recommended by the American Association of Physicists in Medicine (AAPM) guidelines [56] for image-guided radiation therapy. Notably, the same guidelines recommend that the HU value deviation for the lung and bone be within 50 HU.
The application of conventional MAR algorithms on CT art scans modified the anatomical structures and induced a number of secondary artifacts (See Figs. 4 and 5). These MAR algorithms apply their correction on the projection data; therefore, small errors in local corrections in the projection data can affect the reconstructed CT scan globally [12]. In contrast, Cycle-MAR works in the image space and does not require any projection data for artifact corrections. This means that the local changes are applied only to a specific area on the CT scan.
To the best of our knowledge, this is the first study to investigate the application of a deep learning model, CycleGAN, for the reduction of US transducer-induced metal artifacts on CT scans and to compare it to state-of-the-art MAR algorithms. Even though Cycle-MAR generally reduced the metal artifacts well compared to other MAR algorithms, especially in the clinical CT scans, a reduction in image contrast was observed on the CT cor scans after Cycle-MAR application. A possible reason for this is the inherent limitation of the generator in the conversion and/or reassignment accuracy of pixel values while performing feature extraction and the image translation process. In addition, the direct optimization of pixel differences through the loss function may also result in reduced image contrast or a blurry appearance on CT scans [31,57]. Therefore, investigating a different generator, especially DenseNet (Densely Connected Network) [58,59] instead of ResNet, and also examining appropriate loss functions may solve this issue.
This work has a few limitations: the performance of Cycle-MAR was evaluated using clinical CT scans with simulated US transducer-induced metal artifacts. To draw final conclusions regarding the performance of the proposed MAR algorithm, further evaluation using clinical data with real artifacts is necessary. However, this can be a challenging task because acquiring an additional CT scan is difficult to justify ethically [one with the probe in place (CT art) and another one without it (CT ref)].
In addition to the evaluation of image quality improvement and HU value restoration, the dosimetric impact of the metal artifact reduction, including the accuracy in contouring and the calculation of dose distribution for the arrhythmogenic tissue (target) and OARs during the treatment planning, is also crucial. Future work will therefore include an evaluation of the dosimetric impacts of the application of the Cycle-MAR network.

Conclusion
This work developed a MAR network based on a deep learning CycleGAN which can be used to reduce metal artifacts resulting from the presence of a US transducer during CT scan acquisition. The performance of the proposed algorithm was evaluated for the metal artifact reduction abilities on phantom and clinical CT scans in comparison with commonly used commercial and research-based MAR algorithms. The results of the study have shown that the proposed Cycle-MAR considerably reduces the metal artifacts, while preserving the bone density and soft tissue details. Future challenges and analysis include exploring appropriate loss functions for the improvement of adversarial training, and dosimetric evaluations using clinical CT scans.
Acknowledgements The authors would like to thank F. Edward Boas, MD, Ph.D. for providing the MDT algorithm, and Giovanna Dipasquale and Nikolaos Koutsouvelis from the Division of Radiation Oncology at the Geneva University Hospital as well as Nicoletta Lomax and Roger Hälg from Aarau Kantonspital for their support during the data acquisition of the ART and ATOM® phantoms for this study. The authors are grateful to Peta Gray from the Herston Medical Research Institute (HIRF) for providing support to collect data from the CT torso phantom. Finally, the authors also acknowledge the important contribution of Damjan Vukovic from the Queensland University of Technology for providing assistance to access the GPU system.

Fig. 1 a The procedure for simulation of US transducer-induced metal artifact on clinical CT scans. The first and second row show the segmented US transducers from the phantom CT scans and the aligned US transducers with the suitable clinical CT slices, respec-

Fig. 2 Workflow to obtain the dataset pairs: in (a) the CT torso phantom is shown.The positioning of the phantom with (b) and without (c) the linear volume array transducer resulted in CT scans as shown in (d) and (e)

Fig. 3 Training workflow of the Cycle-MAR network for the reduction of the US transducer-induced metal artifacts. It has two mapping functions: generator ($G_Y$) transforms the CT art scan (domain X) into CT cor scan, while generator ($G_X$) transforms the CT ref scan (domain

Fig. 4 From top row to bottom: CT scans of the ATOM®, ART, and CT torso phantoms. The images from left to right show: the CT ref scan with the US transducer details, CT art scan, and the CT cor

Fig. 5 CT scans from three patients. The images from left to right show: the CT ref scan with the US transducer details, CT art scan, and the CT cor scans after MDT, CCS-MAR, and Cycle-MAR applications [Window level/width: 50/350]. The red and green arrows indicate the

Table 2 shows the data split strategy for the training and testing of Cycle-MAR. The network was trained using randomly selected paired clinical CT scans consisting of CT ref scans and the corresponding simulated CT art scans. Then it was tested on the phantom CT scans with real artifacts, and on the clinical CT scans with simulated artifacts.

Table 2 Data split strategy for training and testing of the Cycle-MAR network

Table 3 Mean values of SSIM, PSNR and RMSE for the CT scans of the ATOM®, ART and CT torso phantoms. The best performance is indicated with bold numbers

Table 4 The overall calculated mean values of SSIM, PSNR and RMSE for all clinical CT scans from 14 patients. The best performance is indicated with bold numbers

Fig. 6 HU value improvement percentage for the heart, lungs and bone regions on the CT cor scans after MDT, CCS-MAR and Cycle-MAR application

Table 5 The overall region-based Hounsfield unit (HU) value measurements and calculation using all clinical CT scans from 14 patients. |Δ mean| (± Δ STD) HU represents the absolute differences of the overall mean and the differences of STD between the CT ref scans and the corresponding CT art scans, as well as between the CT ref scans and the CT cor scans, respectively