Introduction

Both hip fractures and vertebral fractures are common in the aging patient population. Therefore, metal hip replacements and spinal stabilizations are often encountered in abdominal imaging. Metal generates typical high- and low-density streak artifacts on CT [1]. Dedicated iterative metal artifact reduction algorithms (iMAR) are available to reduce metal artifact for a variety of implants [2,3,4,5,6,7,8].

Another approach to reduce metal artifacts is the use of virtual monoenergetic images (VMIs) derived from dual-energy CT (DECT) [9,10,11]. They simulate an acquisition with a monoenergetic beam. Energies higher than that of the polyenergetic beam can be simulated without actually increasing the tube voltage of the acquisition [12]. Currently, different hardware solutions are available to acquire DECT images [13]. Multiple studies have shown the feasibility of metal artifact reduction with high keV VMIs for most DECT scanners [14,15,16,17,18,19,20,21,22,23]. Studies combining iMAR and VMIs on established DECT platforms have found different image reconstructions to be most favorable [8, 23,24,25,26,27,28,29]. Nevertheless, there is a paucity of studies on the metal artifact reduction capabilities of the split-filter DECT platform (Twin Beam DECT, sfDECT). SfDECT uses a filter made of equal parts of gold and tin in front of the tube output, splitting the beam into a high- and a low-energy part [30]. The tin part of the filter shifts the x-ray spectrum toward higher energies, leading to a higher dose efficiency and possibly aiding imaging of metal implants [31]. Because of the limited spectral separation of sfDECT [32], it remains unclear whether high keV VMIs will be of added value for spinal and hip implants if iMAR is used.

Previous studies of DECT have focused on high energy VMIs to reduce metal artifact, since low energy VMIs increase artifact. However, in abdominal imaging mostly low energy VMIs are employed to address clinical questions: to improve vascular contrast and create arterial phase-like images from venous phase images [33], to increase the conspicuity of pancreatic or hepatic lesions [34] and of bowel wall ischemia [35, 36], or to increase sensitivity in CT colonography [37]. Using iMAR may be a way to enable the use of low keV VMIs in patients with metal implants.

The aim of this study was to investigate whether iMAR, VMIs or their combination improve objective and subjective image quality over 120 kVp-equivalent images in split-filter abdominal DECT of patients with hip replacements or spinal hardware.

Materials and methods

Study population

This study was approved by our institutional review board and the need for informed consent was waived. Between June 2019 and October 2019, 154 consecutive adult patients with either hip or spinal metal implants who underwent portal venous phase (thoraco-)abdominal DECT for other clinical reasons were initially included in this retrospective study (Fig. 1).

Fig. 1

Study population: 154 consecutive patients with either hip or spinal implants, who underwent abdominal sfDECT between June and November 2019, were initially included. Of these 154 patients, 52 patients were excluded from analysis. The final study population for analysis comprised a total of 102 patients

DECT Imaging and postprocessing

All patients underwent portal venous phase (thoraco-)abdominal DECT and received a non-ionic contrast medium (Ultravist® 370 mg I/ml, Bayer HealthCare Pharmaceuticals, Berlin, Germany or Iopamiro® 370 mg I/ml, Bracco Suisse S.A., Manno, Switzerland) intravenously at a volume between 60 and 115 ml, adjusted to patient weight, with a flow rate of 1.5–3 ml/s. All exams were performed on sfDECT scanners (SOMATOM Definition Edge (n = 31) or SOMATOM Definition AS+ (n = 71), Siemens Healthineers, Erlangen, Germany). The following DECT acquisition parameters were applied: tube voltage 120 kVp, split-beam tube filtration with gold and tin, 420 average reference mAs, 64 × 0.6 mm collimation, 0.33 s rotation time and 0.3 pitch. All acquisitions were obtained with automatic tube current modulation (CARE Dose4D; Siemens Healthineers).

The dual-energy source images were reconstructed at the scanner console as a high- and a low-kVp dataset using ADMIRE 3 with a Q30f kernel at 1.5 mm slice thickness and 1 mm slice interval, with and without iMAR (hip or spine setting, depending on the implant present), and automatically transferred to a postprocessing software (syngo.via VB30A_HF04, Siemens Healthineers). In the “CT Dual-Energy” workflow, conventional (120 kVp-equivalent) images were generated with vendor-recommended settings as a linear blend of the high-energy (20%) and low-energy (80%) datasets, hereafter called Mixed images. Using the “Monoenergetic Plus” application with vendor-recommended settings (resolution = 6, Minimum [HU] = −950, Maximum [HU] = 3071), VMIs were reconstructed at levels ranging from 40 to 190 keV in 10 keV increments. In the syngo.via software, all datasets were reconstructed at 5 mm slice thickness with 2.5 mm slice interval (Table 1).
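The reconstruction scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `linear_blend`, the series labels, and the voxel values are hypothetical and do not reflect the vendor software; the 80%/20% weights and the 40 to 190 keV range in 10 keV steps are taken from the text.

```python
# Illustrative sketch only; function name, labels and sample values are hypothetical.

def linear_blend(low_hu, high_hu, low_weight=0.8):
    """Voxel-wise weighted average of the low- and high-energy datasets."""
    high_weight = 1.0 - low_weight
    return [low_weight * lo + high_weight * hi for lo, hi in zip(low_hu, high_hu)]

# VMI levels from 40 to 190 keV in 10 keV steps (16 levels).
kev_levels = list(range(40, 191, 10))

# Mixed plus 16 VMI levels, each with and without iMAR.
base_series = ["Mixed"] + [f"VMI{kev}keV" for kev in kev_levels]
reconstructions = [name + suffix for name in base_series for suffix in ("", "-iMAR")]

# Hypothetical attenuation values (HU) for three voxels.
mixed = linear_blend([100.0, 60.0, -90.0], [80.0, 50.0, -100.0])

print(len(reconstructions), mixed)
```

Counting the entries gives 34 series per patient: 17 reconstructions (Mixed plus 16 keV levels), each with and without iMAR.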

Table 1 Image acquisition and reconstruction parameters

Objective image quality measurements

For objective image quality measurements, images were transferred to a secondary imaging platform, “Nora Imaging” [38]. This enabled the placement of circular regions of interest (ROIs) in the exact same image position for all 34 different image reconstructions of each patient. ROIs were placed on axial image reconstructions of Mixed images without iMAR and then automatically copied to all other reconstructions. The size of the ROIs was set to 1 cm² and adjusted where necessary to capture only the artifact and only the specific underlying tissue. ROIs were placed in artifacts on clearly defined tissue as well as in reference tissue farther away that was not affected by artifacts. The specific ROI positions for the evaluation of hip prostheses and spinal implants are described in Table 2 and displayed in Fig. 2.

Table 2 ROI positions
Fig. 2

Example of ROI placements in a patient with (a) spinal implants and (b) bilateral hip prostheses. a Exemplary positioning of ROIs in the hyperdense artifacts in the inferior vena cava (blue circle) and psoas muscle (brown circle) and the slight hyperdense artifact in the kidney (red circle), as well as in the hypodense artifact in the abdominal aorta (green circle). ROIs in the subcutaneous fat were placed in a different slice not displayed in the image. b ROIs were placed in the hypodense artifact in the bladder (purple circle), as well as in subcutaneous tissue impaired by hyperdense (blue circle) and hypodense (green circle) artifact. The hyperdense artifact in muscle was measured in the iliopsoas muscle (red circle); the hypodense muscle artifact (beige circle) was measured in the internal obturator muscle. The ROI to measure the hyperdense artifact in the bladder was placed in a different slice not displayed in the image. All measurements were corrected by calculating the difference between artifact-impaired tissue and its reference tissue without artifact (not captured in this image)

In all ROIs, the mean attenuation in Hounsfield units (HU) and its standard deviation were automatically extracted. To account for the influence of monoenergetic reconstructions on measured HU values, the corrected attenuation was calculated as the difference between artifact-impaired tissue and its reference tissue without artifact, so that only the actual artifact was taken into account, as previously described in the literature [23]:

$$\text{Artifact}_{\text{corrected}}\,[\text{HU}] = \text{Artifact}_{\text{uncorrected}}\,[\text{HU}] - \text{Underlying tissue}\,[\text{HU}]$$

As a surrogate for image noise, the standard deviation (SD) of attenuation in HU was measured in tissues affected by artifact and corrected by measurements of the same tissue in areas without artifact, as previously suggested [39]:

$$\text{Image noise}\,[\text{HU}] = \text{SD of artifact}\,[\text{HU}] - \text{SD of same underlying tissue without artifact}\,[\text{HU}]$$
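The two correction formulas can be illustrated with a minimal Python sketch. The function names and the ROI voxel values below are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which SD definition the imaging platform reports.

```python
# Illustrative sketch only; names and voxel values are hypothetical.
from statistics import mean, stdev

def corrected_artifact(artifact_roi_hu, reference_roi_hu):
    """Mean HU in the artifact ROI minus mean HU in the reference ROI."""
    return mean(artifact_roi_hu) - mean(reference_roi_hu)

def corrected_noise(artifact_roi_hu, reference_roi_hu):
    """SD in the artifact ROI minus SD in the reference ROI.

    Assumption: sample SD (n - 1 denominator).
    """
    return stdev(artifact_roi_hu) - stdev(reference_roi_hu)

# Hypothetical voxel values (HU): muscle with hyperdense streak vs. unaffected muscle.
artifact_roi = [150.0, 170.0, 160.0, 180.0, 140.0]
reference_roi = [58.0, 62.0, 60.0, 61.0, 59.0]

artifact_hu = corrected_artifact(artifact_roi, reference_roi)
noise_hu = corrected_noise(artifact_roi, reference_roi)
print(artifact_hu, noise_hu)
```

With this sign convention, a corrected value near 0 HU means that the artifact ROI matches its reference tissue, positive values indicate residual hyperdense artifact, and negative values indicate hypodense artifact.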

Subjective image quality assessment

Two board-certified radiologists with 5 and 6 years of training rated subjective image quality. Readers were trained on 8 datasets from patients excluded from the analysis. Readings were performed on Mixed images and VMIs at 50, 70, 110, 140, and 190 keV, both with and without iMAR. Window width/level settings could be freely adjusted. Readers evaluated artifacts and vascular contrast on a five-point Likert scale, with a score of one representing only subtle artifact/excellent vascular contrast and five representing massive artifacts/vascular contrast similar to that without intravenous contrast. Depending on the evaluated tissue, individual scores were further specified for the readers (Supplementary Table 1).

Statistical analysis

To assess inter-reader agreement of the subjective reading, the intraclass correlation coefficient (ICC) was calculated using a two-way random-effects model for absolute agreement, based on the mean of the two raters, and interpreted as previously reported [40, 41]. Paired t-tests were used to compare quantitative artifact measurements of 40–190 keV vs 40–190 keViMAR at the same keV level, as well as Mixed vs MixediMAR. One-way ANOVA with Tukey honestly significant difference post-hoc test was used to compare quantitative artifact and image noise between Mixed and 40–190 keV, as well as between MixediMAR and 40–190 keViMAR.
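As an illustration of the agreement measure, the following Python sketch computes an average-measures ICC for absolute agreement from the two-way ANOVA mean squares. It assumes the commonly used form (MSR − MSE) / (MSR + (MSC − MSE)/n), where MSR, MSC and MSE are the mean squares for rated items, raters and residual error and n is the number of rated items; the function name and the ratings are hypothetical, and the text does not state which software routine was used.

```python
# Illustrative sketch only; function name and ratings are hypothetical.

def icc_absolute_agreement_average(ratings):
    """ratings: one row per rated item, one column per rater."""
    n = len(ratings)          # number of rated items
    k = len(ratings[0])       # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)

# Hypothetical Likert scores of two readers for three image series.
scores = [[1, 2], [3, 3], [5, 4]]
icc = icc_absolute_agreement_average(scores)
print(round(icc, 3))
```

Identical scores from both readers yield an ICC of 1, while systematic offsets between readers lower the value because absolute agreement, not just consistency, is assessed.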

Subjective readings were compared using the Kruskal–Wallis H test. Differences between individual image reconstructions were further investigated using pairwise Mann–Whitney tests with Benjamini–Hochberg adjustment to correct for multiple testing. Additionally, in the subgroup of patients in whom the metal artifact was at the site of the clinical question, differences in overall image quality were compared between Mixed and MixediMAR using the pairwise Mann–Whitney test. The level of statistical significance was defined as p < 0.05. Statistical analysis was performed using R (Version 4.0.5) [42].
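The Benjamini–Hochberg adjustment mentioned above can be sketched as follows. This is a minimal, hedged Python illustration of the standard step-up procedure, which should correspond to what `p.adjust(p, method = "BH")` computes in R; the function name and the p-values are hypothetical and are not results of this study.

```python
# Illustrative sketch only; function name and p-values are hypothetical.

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    # Indices sorted from the largest to the smallest p-value.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values from four pairwise comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

Each adjusted value is then compared against the significance level of 0.05 in place of the raw p-value.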

Results

Study population

Of the 154 consecutive patients initially included, a total of 52 patients were excluded from analysis for various reasons (Fig. 1).

Thus, the final study population for analysis comprised a total of 102 consecutive patients (female n = 55, male n = 47), with a mean age of 77 years (range 50–96 years). Clinical indications for the included scans varied: infection (n = 54), oncologic (n = 24), intestinal obstruction (n = 21), trauma (n = 2), ischemia (n = 1). Seventy-one patients had hip implants (unilateral n = 45, bilateral n = 26) and 31 patients had spinal implants. In 37% of patients (38/102), there was a specific clinical question and/or finding in a site affected by the metal implants (32 hip implants and 6 spinal implants). The average weight and body mass index (BMI) were 72.7 ± 17.9 kg and 26.0 ± 6.3 kg/m² for the hip implant group and 73.9 ± 19.5 kg and 25.5 ± 5.13 kg/m² for the spinal implant group, respectively. The average radiation dose for abdominal scans was: CTDIvol 10.90 ± 3.11 mGy and DLP 579.48 ± 202.39 mGy*cm.

Objective image quality

Hip implants

For images without iMAR, the VMI with the lowest artifact was VMI190keV for all measured ROIs (Table 3, Supplementary Table 2a).

Table 3 Quantitative artifact hip implants

Compared to Mixed images, MixediMAR showed decreased artifact in all ROIs. This difference reached statistical significance for all ROIs (p < 0.05), except for the hyperdense artifact in subcutaneous fat (p = 0.06). Among VMIiMAR, the lowest artifact was observed in VMI190keV-iMAR, although this was not statistically significant for any of the ROIs (for all p > 0.05).

Slight overcorrection of artifacts was seen in iMAR images in the hyperdense artifact in muscle (mean corrected artifact VMI40keV-iMAR: −15.26 HU) (Fig. 3a), as well as in the hypodense artifact in subcutaneous tissue (mean corrected artifact VMI190keV-iMAR: 2.66 HU).

Fig. 3

Boxplots of corrected Metal artifact for different reconstructions with and without iMAR. a Strongest hyperdense artifact in muscle in scans with hip prostheses. b Hypodense artifact in the internal obturator muscle in scans with hip implants. c Hyperdense artifact in the inferior vena cava in scans with spinal implants. d Hyperdense artifact in the kidney in scans with spinal implants

VMI50keV-iMAR showed less artifact in all ROIs compared to the optimal VMI images without iMAR (VMI190keV) (Fig. 3b). VMI50keV-iMAR images showed a stronger artifact reduction than Mixed images without iMAR. Comparison of VMI50keV-iMAR to VMI190keV-iMAR showed no statistically significant difference in artifact reduction (p > 0.83 for all ROIs).

Spinal implants

In images without iMAR, the lowest artifact was observed in VMI190keV (Supplementary Table 2b). However, there was no statistically significant difference compared to Mixed images (all p > 0.08) (Table 4).

Table 4 Quantitative artifact spinal implants

MixediMAR images showed significantly better artifact reduction compared to Mixed images without iMAR in hypodense artifacts in the abdominal aorta (p < 0.001), as well as in the hypodense artifact in subcutaneous tissue (p = 0.04). Between MixediMAR and VMI190keV-iMAR no significant difference in artifact reduction was observed in any ROI.

When comparing VMI50keV-iMAR images to Mixed images without iMAR, VMI50keV-iMAR showed significantly lower artifact in all hypodense artifacts. When comparing VMI50keV-iMAR to MixediMAR or VMI190keV-iMAR, no statistically significant difference in artifact severity was observed (all p > 0.05).

Overcorrection was seen in iMAR images in the hyperdense artifact in IVC and kidney as displayed in Fig. 4 and quantitatively shown in Fig. 3c and d.

Fig. 4

Axial images of a patient with spinal implants, window setting: window width 300 HU, window level 40 HU. a Mixed b MixediMAR. Overcorrection of an originally hypodense artifact into a hyperdense artifact in the aorta (red arrow). Additional new hypodense artifacts in the kidney and the psoas muscle (green arrows) and reduced organ margin sharpness between the duodenum and right kidney (white circle) or at the dorsal retroperitoneum (blue circle)

Image noise

For both hip and spinal implants in images without iMAR no significant difference in corrected image noise was observed between Mixed and VMI190keV in all ROIs (all p = 1.00) (Supplementary Table 3a, b). MixediMAR showed significantly lower corrected image noise compared to Mixed images in most tissues, for hip implants but not for spinal implants. VMI190keV-iMAR did not significantly reduce corrected image noise compared to the MixediMAR images (all p = 1.00) for both implant types.

Subjective image quality

Overall inter-rater agreement was good: ICC = 0.77 (95% confidence interval: 0.71–0.82).

Hip implants

iMAR reconstructions were rated better than corresponding images without iMAR for overall image quality (Supplementary Table 4a), as well as all other diagnostic criteria.

For overall diagnostic image quality, MixediMAR was rated best (Fig. 5a) and was significantly better than all other images with and without iMAR (for all p < 0.002). There was no significant difference among VMI70keV-iMAR through VMI190keV-iMAR (for all p > 0.07), suggesting that overall diagnostic image quality was not further improved by using higher keV levels.

Fig. 5

Subjective overall image quality for hip (a) and spinal (c) implants. Lower values describe lower artifact, see also supplementary Table 1. Subjective vascular contrast for hip implants (b). Subjective organ margin sharpness for spinal implants (d). Lower values describe better image quality/vascular contrast (see supplementary Table 4). Central bar shows the median, lower and upper hinges of the box correspond to the first and third quartiles, whiskers extend from the hinge to the largest value no further than 1.5 times the interquartile range

The evaluation of vascular contrast showed no significant difference between VMI images of the same keV level with and without iMAR (for all p > 0.52), except for VMI70keV (p = 0.03) (Fig. 5b).

While for overall image quality VMI50keV-iMAR images were rated worse than MixediMAR images and VMIiMAR of higher keV levels, VMI50keV-iMAR images were rated significantly better than images without iMAR, both Mixed and VMI of any keV level (for all p < 0.001). Additionally, VMI50keV-iMAR were rated best in terms of visualization of vascular contrast (Fig. 6). They were rated significantly better than Mixed or VMI of all other keV levels with iMAR (for all p < 0.001).

Fig. 6

Axial images of a patient with a unilateral hip implant. All images are shown with the same windowing: window width 600 HU, window level 150 HU. Images A without iMAR, images B with iMAR: (1) Mixed (2) VMI50keV (3) VMI190keV. Note the improved vascular contrast on VMI50keV (A.2 and B.2) and the reduced tissue contrast on VMI190keV (A.3 and B.3)

Spinal implants

MixediMAR and VMIiMAR images were rated better in terms of overall diagnostic image quality (Fig. 5c), as well as muscle, osseous and prevertebral structure evaluation, compared to the corresponding images without iMAR (Supplementary Table 4b).

VMI50keV-iMAR was not rated inferior to Mixed, VMI110keV, VMI140keV and VMI190keV images without iMAR (all p > 0.51) regarding overall diagnostic image quality.

While in the category organ margin sharpness, iMAR images were not rated better than corresponding images without iMAR (all p > 0.77) (Fig. 5d), for prevertebral structures iMAR images were significantly better than images without iMAR (all p < 0.005).

For the assessment of vascular contrast VMI50keV-iMAR and VMI50keV were rated better than all other images (all p < 0.001).

Patients with artifact at the site of clinical question

Thirty-two studies with hip implants had artifact in the area of interest. Subjective overall image quality and diagnostic quality improved from 3.78 ± 0.98 for the Mixed images to 1.67 ± 0.67 for the MixediMAR images (p < 0.001).

For the six patients with spinal implants, image quality improved from 3.43 ± 1.5 for the Mixed images to 2.29 ± 0.73 for the MixediMAR images (p = 0.006).

Discussion

In our study we assessed the image quality and value of a dedicated iMAR together with VMI of sfDECT in abdominal CT of patients with hip or spinal implants. We found that both for hip and spinal implants MixediMAR images are preferred. They showed quantitative artifact reduction in both hypo- and hyperdense artifact, as well as improved subjective image quality over Mixed images without iMAR in all diagnostic criteria. Additional high keV VMIiMAR did not further improve image quality over MixediMAR. However, due to lower artifact on iMAR images, it is possible to use low keV images (VMI50keV) to improve vascular and soft tissue contrast.

Many previous works have evaluated iterative metal artifact reduction algorithms and high keV VMI for reducing artifacts from hip and spinal implants, both in phantom and patient studies. However, studies that combined iMAR algorithms and VMI had different outcomes depending on the algorithm and DECT platform used. In a phantom with hip implants, using both dual-source and rapid-kVp-switching DECT, Andersson et al. showed that iMAR images are preferred over VMI in the visual analysis [43], while quantitative artifact was lower when both approaches were combined [44]. In a patient study with hip implants, Youe et al. found that the combination of iMAR and VMI provided the lowest artifact on rapid-kVp-switching DECT scanners [45]. The same was found for dual-layer spectral-detector DECT [24]. Bongers et al. showed similar results for dual-source DECT, but noted only an incremental value of adding high keV VMI to iMAR [46].

In contrast to these findings in hip implants, for spinal implants Wang et al. reported a preference for VMI over iMAR images when using rapid-kVp-switching DECT, due to massive overcorrection of artifacts on iMAR images [27]. Yet, for dual-source and dual-layer spectral-detector DECT the combination of iMAR and VMI was rated best [8, 25]. The only study on metal artifact reduction in sfDECT with iMAR investigated dental hardware in head and neck CT [47]. They found a greater impact of iMAR than VMI and only a slight benefit of combining both techniques.

While the majority of studies with hip implants preferred a combination of VMI and iMAR, our study indicates that for sfDECT MixediMAR images are preferred. Despite not improving image quality, VMI190keV-iMAR still showed the lowest quantitative artifact. This is in line with the results of Bongers et al. [46], where only an incremental value of adding VMI to iMAR was seen. The inferior spectral separation of sfDECT compared to dual-source DECT may explain why, in our study, this small increase in artifact reduction with VMI did not provide enough additional benefit to be perceived as helpful by the readers. For spinal implants, readers also preferred MixediMAR images in our study. This seems to differ from dual-source and dual-layer spectral-detector CT. Similar to the findings in rapid-kVp-switching DECT and other studies [47, 48], overcorrection of artifact was also observed in spinal implants in our study when using iMAR, although to a much lesser degree. This may be because the iMAR algorithm used in this study combines beam hardening correction, normalized sinogram inpainting and frequency split to address both the avoidance of new artifacts and the conservation of the original image impression [5, 49]. Our study found that low keV images with iMAR (VMI50keV-iMAR) increased vascular contrast while maintaining sufficient image quality, which is of special interest in abdominal imaging. This differs from the findings of the only previous study that assessed iMAR and low keV VMI [47]. The difference may be explained by the fact that we investigated abdominal CT examinations instead of head and neck CT, with a different radiation dose and field of view.

A subgroup analysis of our patient cohort revealed that over a third of patients (37%) had implant-based metal artifact directly at the site of clinical question. Subjective image quality improved significantly from markedly reduced diagnostic interpretability to only slight artifacts without impaired diagnostic interpretability. This underlines the clinical importance of this technique to reduce image artifacts.

There are several limitations to this study. First, the study had a retrospective design. Second, readers for subjective image quality were not blinded to the image reconstruction. iMAR reconstructions show a typical and recognizable appearance and have lower artifact, which made blinding of the readers not feasible. Third, to assess objective artifact reduction, corrected HU values were measured. While this allowed a fair comparison between Mixed images and VMI, it is not used clinically. Fourth, we did not distinguish between unilateral and bilateral hip prostheses. Previous studies showed that MAR algorithms yielded benefits over VMI alone especially in severe artifacts such as those from bilateral hip prostheses [50]. Finally, the results of our study are specific to split-filter dual-energy CT with the use of iMAR for hip and spinal implants; they do not transfer to other dual-energy CT scanner types or iterative metal artifact reduction algorithms.

Conclusion

For abdominal split-filter DECT of patients with spinal or hip implants, iMAR should be used to minimize metal artifacts. MixediMAR images provide the best image quality. While high keV VMIs can reduce quantitative artifact, they do not improve overall image quality. Low keV images (VMI50keV) may be used together with iMAR to improve vascular contrast while providing less metal artifact compared to non-iMAR images.