Introduction

Iron disorders differ in the degree and distribution of iron overload between organs, in their underlying etiologies, and in their treatment [1]. In the primary form of hereditary hemochromatosis, iron overload mainly affects the liver and, in later stages, the pancreas and the heart [2]. In subjects with primary or secondary iron loading, pancreatic iron accumulation has been described as a predictor of cardiac iron overload, while an iron-free pancreas virtually precludes increased cardiac iron [3, 4]. While the reticuloendothelial system is iron-deficient in hereditary hemochromatosis [5, 6], in secondary hemochromatosis iron also accumulates in macrophages in the liver, spleen, and bone marrow. Accordingly, genetic hemochromatosis is characterized by low iron content in the spleen, with the exception of ferroportin loss-of-function mutations, where iron is retained in macrophages and thus in the spleen [7]. In hematologic disorders and dysmetabolic syndromes, splenic iron has been reported to be neither decreased nor increased [8]. Therefore, in the diagnostic workup of patients with iron overload, the evaluation of other abdominal organs such as the spleen and the pancreas, in addition to quantification of liver iron, has been recommended as an important non-invasive step [9, 10].

Magnetic resonance (MR) R2* relaxometry was established in clinical routine as a reliable method to assess liver iron concentration (LIC). Originally, a variety of different MR sequences and post-processing methods were used, which were frequently individually developed and calibrated by the performing center [4]. Therefore, availability of MR relaxometry for LIC quantification was limited to specialized centers with appropriate expertise, and switching between different methods was found to be not advisable [9]. Meanwhile, different vendors have introduced commercial 3D chemical shift imaging sequences which simultaneously allow the quantification of liver R2* values for LIC quantification together with the quantification of hepatic fat fraction. Usually, these sequences do not necessitate separate off-line post-processing, but automatically provide R2* maps to determine hepatic iron content and proton density fat fraction (PDFF) maps to quantify liver fat content. Thus, these parameters have become easily available for a wide range of users and earlier studies have shown that these 3D chemical shift imaging sequences are a reliable tool for MR hepatic iron assessment and their performance has been proven to be comparable with established relaxometry methods [11]. It is important to note that because the underlying post-processing algorithms mostly rely on a multi-peak fat model of the liver [12,13,14], these sequences are only meant to be used in the liver. But, as mentioned earlier, apart from the liver, the determination of iron loading in the pancreas or the spleen is also of diagnostic importance. These organs are typically within the 3D acquisition volume used for the liver, so that it would be tempting to obtain also R2* values for the pancreas and the spleen from the same 3D chemical shift imaging data set, which actually has already been proposed in the literature [4, 15]. 
Unfortunately, it has not yet been validated whether R2* values obtained with the commercial 3D chemical shift imaging sequences for the pancreas or the spleen are reliable.

It was therefore the purpose of our study to compare R2* values obtained for the pancreas and the spleen from a commercial 3D chemical shift imaging sequence with values of an established R2* relaxometry method that does not rely on a liver-specific post-processing model [16].

Materials and methods

This study was approved by the local institutional review board (Medical University of Innsbruck). We retrospectively evaluated 143 MR examinations in 108 patients with respect to not only hepatic but also splenic and pancreatic iron overload. Patients were referred to our department between 05/2020 and 01/2022 for MR imaging of the upper abdomen including quantification of liver iron. All examinations were carried out on a 1.5-T whole body MR system (MAGNETOM AvantoFit, Siemens Healthcare) using an 18-element body matrix coil and 12–16 elements of the integrated 32-channel spine matrix coil. Scans were performed in supine position and images were acquired in transversal orientation during breath-holds at the end of expiration. For iron quantification, we used two different sequences: a commercial 3D chemical shift imaging sequence (qDixon) and a biopsy-calibrated, fat-saturated 2D multi-gradient echo sequence (ME-GRE) [16]. Parameters for both sequences are given in Table 1.

Table 1 MR parameters for both sequences routinely used for liver iron quantification

The qDixon sequence is based on a 3D multi-gradient-echo acquisition with 6 echoes and uses controlled aliasing undersampling (CAIPIRINHA) [17], which allows acquisition in a single breath-hold. During image calculation, the sequence applies inline processing with a multi-peak fat model and a multistep adaptive fitting approach to automatically calculate R2* and PDFF maps without the need for further post-processing. Any image-viewing software that allows region of interest (ROI)–based signal intensity measurements can be used for measuring R2* and PDFF values. For the qDixon sequence, the default parameters suggested by the vendor were used. Image analysis was performed using our clinical standard picture archiving and communication system (IMPAX; Agfa-Gevaert). One MR-experienced radiologist (MP) placed three ROIs with a mean area of 50 mm² (diameter approx. 8 mm) within the liver parenchyma, two ROIs of the same size in the pancreas (body and tail), and one ROI of 140 mm² (diameter approx. 13 mm) in the central spleen, always avoiding artifacts, large vessels, and focal lesions.

For the ME-GRE sequence, R2* maps were calculated using a custom-written plugin for ImageJ (Wayne Rasband, National Institutes of Health) by fitting on a pixel-wise basis with a single exponential truncation model [16]. ROI placement on R2* maps was then independently performed on co-registered areas by a physicist (CK) with longstanding experience in MRI post-processing.
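The idea behind a truncated mono-exponential fit can be illustrated with a minimal sketch. It is not the ImageJ plugin used in this study, whose exact algorithm is not described here; it assumes a simplified log-linear least-squares fit of S(TE) = S0 · exp(−R2* · TE) in which all echoes from the first one that reaches a noise floor onward are discarded. The function name, the echo times, and the noise floor value are hypothetical.

```python
import math

def fit_r2star_truncated(te_s, signal, noise_floor):
    """Estimate R2* (1/s) and S0 for one pixel.

    Assumes S(TE) = S0 * exp(-R2* * TE). Echoes are used only up to the
    first one whose signal is at or below `noise_floor` (truncation);
    the remaining points are fitted by log-linear least squares.
    """
    points = []
    for te, s in zip(te_s, signal):
        if s <= noise_floor:
            break  # truncate: later echoes are dominated by noise
        points.append((te, math.log(s)))
    if len(points) < 2:
        raise ValueError("need at least two echoes above the noise floor")

    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)

# Hypothetical example: 12 echoes (1 ms to 23 ms), true R2* = 200 1/s,
# S0 = 1000, with a signal plateau of 20 mimicking the noise floor.
te_s = [0.001 + 0.002 * i for i in range(12)]
signal = [max(1000.0 * math.exp(-200.0 * te), 20.0) for te in te_s]

r2_trunc, s0_trunc = fit_r2star_truncated(te_s, signal, noise_floor=25.0)
r2_all, _ = fit_r2star_truncated(te_s, signal, noise_floor=0.0)
```

In this synthetic example, the truncated fit recovers the simulated decay rate, whereas including the plateau echoes flattens the fitted slope and underestimates R2*. This is the effect that truncation is meant to avoid at high iron concentrations.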

Statistical analysis

For further analysis, R2* values for the liver, pancreas, and spleen obtained with both methods were stored in an Excel worksheet. Statistical calculations were then performed using the R Project for Statistical Computing [18]. Splenic and pancreatic R2* values (1/s) were compared between qDixon and ME-GRE using Bland–Altman plots, and concordance correlation coefficients (CCC) were calculated. To test the hypothesis that the obtained bias was equal to zero, a one-sample t-test was performed; p-values < 0.05 indicated a significant difference from zero. In addition, linear regression analysis was performed by fitting a linear model to the data. Finally, iron overload for the pancreas and the spleen was defined as R2* > 50 1/s. Using contingency tables, the agreement of both methods regarding iron overload classification was determined. Agreement was quantified as percent agreement and, to avoid paradoxical kappa values, as Gwet’s AC1 coefficient, calculated with the rel package for R [19,20,21].
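The agreement measures described above can be sketched as follows. This is an illustration only, not the R code used in the study: it assumes the conventional limits of agreement of bias ± 1.96 standard deviations of the paired differences and Lin's definition of the CCC, and it computes only the t statistic of the one-sample t-test (the p-value additionally requires the t distribution with n − 1 degrees of freedom). The function names and the R2* values are hypothetical.

```python
import math
import statistics

def bland_altman(a, b):
    """Bias, limits of agreement, and one-sample t statistic for paired data."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
        "t": bias / (sd / math.sqrt(n)),  # tests H0: bias == 0
    }

def lin_ccc(a, b):
    """Lin's concordance correlation coefficient (population moments, 1/n)."""
    n = len(a)
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return 2 * cov / (var_a + var_b + (mean_a - mean_b) ** 2)

# Hypothetical paired R2* values in 1/s (not data from this study).
qdixon = [30.0, 45.0, 60.0, 120.0, 200.0]
megre = [32.0, 43.0, 63.0, 115.0, 205.0]

ba = bland_altman(qdixon, megre)
ccc = lin_ccc(qdixon, megre)
```

Unlike the Pearson correlation coefficient, the CCC penalizes a systematic offset between the two methods through the squared mean difference in the denominator, which is why it is suited to method comparison.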

Results

Altogether, 143 examinations in 108 patients (65 male, 43 female) with a mean (median; range) age of 61.3 (64.0; 19–88) years were included in the initial analysis.

Due to wrong slice positioning (e.g., the pancreas not included in ME-GRE sequence), poor delimitation of the organ, or intense (motion) artifacts, only 140 examinations with an analyzable spleen and 132 with an evaluable pancreas remained.

Common data

Table 2 presents the pancreatic and splenic R2* values of all patients obtained with the qDixon sequence and the ME-GRE sequence.

Table 2 Pancreatic and splenic values of all patients using both sequences

Agreement spleen

Bland–Altman analysis of splenic R2* values between qDixon and ME-GRE resulted in a bias (mean difference in absolute units) of 2.12 1/s, with limits of agreement (LoA) of 49.62 and −45.38 1/s, and a CCC of 0.934 (0.909–0.952) (Fig. 1). The bias was not significantly different from zero (p = 0.302).

Fig. 1 Bland–Altman plot representing the absolute difference of splenic R2* values between qDixon and ME-GRE sequences

Linear regression analysis correlating splenic R2* values of qDixon and ME-GRE resulted in a correlation coefficient of 0.94 (p < 2.2e−16). The respective scatterplot is shown in Fig. 2.

Fig. 2 Scatterplot of splenic R2* values of ME-GRE and qDixon

Agreement pancreas

For the agreement of pancreatic R2* values between qDixon and ME-GRE, the absolute Bland–Altman plot (Fig. 3) showed a mean bias of 0.29 1/s, with LoA of 20.09 and −19.52 1/s, and a CCC of 0.714 (0.623–0.786). Again, the bias was not significantly different from zero (p = 0.743).

Fig. 3 Bland–Altman plot representing the absolute difference of pancreatic R2* values comparing qDixon and ME-GRE sequences

Linear regression analysis (Fig. 4) for the pancreas resulted in a correlation coefficient of 0.725 (p < 2.2e−16).

Fig. 4 Scatterplot of pancreatic R2* values of ME-GRE and qDixon

Analysis regarding iron overload detection

When using a threshold of R2* > 50 1/s for the presence of iron overload in the pancreas or the spleen, the iron assessment in the pancreas agreed between qDixon and ME-GRE in 123 examinations and resulted in different ratings (iron overload versus no iron overload) in only 9 examinations. This corresponds to an overall agreement of 93.18% and a Gwet AC1 of 0.92, indicating strong agreement. For the spleen, iron assessment agreed in 128 and differed in 12 examinations, corresponding to an overall agreement of 91.43% and a Gwet AC1 of 0.844, also indicating strong agreement. The corresponding contingency tables are shown in Table 3.
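The classification agreement figures can be reproduced with a short sketch. The percent agreement below uses the counts reported in the text. The Gwet AC1 function implements the standard two-rater, two-category formula and is not the rel package code used in the study. Because the individual cell counts of Table 3 are not restated in the text, the 2 × 2 table passed to it is purely hypothetical.

```python
def percent_agreement(n_agree, n_disagree):
    """Overall agreement in percent."""
    return 100.0 * n_agree / (n_agree + n_disagree)

def gwet_ac1_2x2(both_pos, a_only, b_only, both_neg):
    """Gwet's AC1 for two methods and two categories (overload yes/no).

    both_pos / both_neg: both methods agree on overload / no overload.
    a_only / b_only: only method A / only method B indicates overload.
    """
    n = both_pos + a_only + b_only + both_neg
    p_obs = (both_pos + both_neg) / n
    # Mean proportion of "overload" ratings across the two methods.
    pi_pos = ((both_pos + a_only) / n + (both_pos + b_only) / n) / 2
    # Chance agreement for two categories: 2 * pi * (1 - pi).
    p_chance = 2 * pi_pos * (1 - pi_pos)
    return (p_obs - p_chance) / (1 - p_chance)

# Counts reported in the text (examinations agreeing / disagreeing).
pancreas_agreement = percent_agreement(123, 9)
spleen_agreement = percent_agreement(128, 12)

# Hypothetical 2x2 table for illustration only (not Table 3).
ac1_example = gwet_ac1_2x2(both_pos=10, a_only=3, b_only=2, both_neg=85)
```

When one category is rare, as with iron overload here, the chance-agreement term of AC1 stays small. This avoids the paradoxically low kappa values that can occur despite high observed agreement.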

Table 3 Contingency tables of qDixon and ME-GRE examinations regarding splenic (top) and pancreatic (bottom) iron overload

Figure 5 demonstrates a patient example with iron overload in the spleen and normal iron load in the pancreas using the two different methods.

Fig. 5 Splenic iron overload with central ROI placement on R2* maps, showing a mean R2* of 160 1/s for the 3D qDixon sequence (A) and 186.6 1/s for the fat-saturated ME-GRE R2* relaxometry sequence (C). The bottom images show a normal mean iron level in the pancreas with both methods: with one ROI positioned in the pancreatic body and one in the tail, the mean R2* was 37.05 1/s for the 3D qDixon sequence (B) and 47.35 1/s for the ME-GRE sequence (D)

Discussion

In our study, we have shown that the commercial qDixon sequence, a 3D chemical shift imaging sequence, can be reliably used to assess not only hepatic but also splenic and pancreatic iron overload.

The qDixon sequence and similar 3D chemical shift imaging sequences are only approved for imaging of the liver. They typically acquire a series of 6 gradient echoes and use a multi-fat peak model to obtain hepatic R2* and PDFF simultaneously from the obtained signals. This underlying multi-fat peak model, however, is usually derived from in vivo liver spectroscopy [12, 14] and thus, strictly speaking, only holds true for liver tissue. However, as the spleen and the pancreas are included in the acquired 3D volume of the upper abdomen, these two organs could be evaluated simultaneously in one examination. This would provide important additional information regarding differential diagnosis and the need for further imaging [3] whenever managing hyperferritinemia [7] and evaluating patients with suspected iron overload.

The algorithms used for R2* and PDFF parameter estimation from commercial 3D chemical shift imaging sequences use complex fitting of multi-spectral hepatic fat models [13] and thus necessitate prior knowledge of hepatic triglyceride spectra. Therefore, deviation from these spectra potentially induces quantification bias when performing measurements in organs other than the liver. Although Pezeshkian et al observed regional differences in fat composition between epicardial and subcutaneous adipose tissues [22], Reeder et al state that spectral positions of fat peaks are relatively constant across different types of fat and that differences between them probably have minimal influence on fat fraction measurements [23]. This is in agreement with Fukui et al [24], who found a significant correlation between histological pancreatic fat fraction and PDFF values obtained with a 3D chemical shift sequence that was also based on a hepatic multi-fat peak model. Hong et al [25] investigated the effects of varying six-peak triglyceride spectral models on PDFF assessment and demonstrated robustness of PDFF estimation across the biologically plausible range of triglyceride spectra over a wide range of hepatic fat contents. Although they confirmed an increase of absolute estimation bias with higher PDFF, they underlined its small magnitude and therefore likely clinical insignificance. Moreover, they similarly found only minor bias in R2* estimation. This suggests that R2* quantification by 3D chemical shift imaging sequences is also reliable in organs other than the liver.

To date, using 3D chemical shift imaging sequences, only PDFF quantification in the spleen and the pancreas has been studied [25,26,27,28,29] mostly by comparing with MR spectroscopy (MRS) as a reference standard [30, 31]. 3D chemical shift imaging sequences have been used earlier to assess R2* in the spleen or the pancreas [26, 32], but to the best of our knowledge, so far, no study exists regarding the reliability of R2* values of these organs obtained with 3D multi-echo chemical shift imaging sequences. In our study, we compared R2* values obtained with the qDixon sequence for the pancreas and the spleen with values obtained with a 2D multi-gradient echo sequence (ME-GRE) which did not rely on prior knowledge of fat spectra but used simple magnitude fitting of a truncated exponential model. The used sequence applied fat suppression to improve the goodness-of-fit which might be reduced due to the confounding effect of fat. The used method closely corresponds to the biopsy-calibrated sequence used by Plaikner et al [33] who have shown very small differences between fat and non-fat suppression for R2* < 400 1/s which corresponds to the range of values observed in this study for the pancreas and the spleen.

Only modest agreement but strong correlation between R2* values of the compared methods was found for the pancreas (CCC = 0.714), while agreement was found to be excellent for the spleen (CCC = 0.934). In contrast, for the classification into iron overload or no iron overload based on R2* thresholds, strong agreement was found between both methods for the pancreas (overall agreement: 93.18%) as well as the spleen (overall agreement: 91.43%). The modest agreement for pancreatic R2* values might, among other reasons, be explained by the difference in pancreatic coverage of the sequences used. Whereas qDixon provides volume coverage with a slice thickness of 3.5 mm, pancreatic R2* maps for the 2D ME-GRE sequence were acquired only as a single slice of 8 mm thickness. Poor slice positioning and partial volume effects thus most probably explain the variability of individual R2* values. A similar effect was discussed by Coe et al [34], who obtained a modest level of agreement for PDFF values in the pancreas between a 3D chemical shift sequence and spectroscopy, stating that the “3D dimensionality” of the pancreas must be accounted for. In this context, potential measurement variability for the pancreas has already been reported [35]; it arises from the heterogeneous shape of the organ, from difficulty in delineating its contours, especially in the presence of atrophy or severe fat infiltration, and from possible susceptibility artifacts caused by adjacent intestinal gas [3].

Unlike the pancreas, the spleen shows fatty infiltration only in very rare cases [26, 36] and usually has no histologically detectable fat. This would suggest that the particular multi-fat-peak model used for parameter estimation of 3D chemical shift imaging sequences has only a limited effect, explaining the observed strong agreement for R2* values. For the spleen, Hong et al [26] found a slight overestimation (~ 2%) of PDFF values by 3D chemical shift MRI compared to MRS, which was attributed to artifacts from ghosting or aliasing or to noise floor effects. It should be mentioned that they also investigated the correlation between splenic R2* and PDFF values, but no comparison of R2* values between different methods was performed.

Finally, it must be noted that although hepatic R2* values can be converted into hepatic iron concentration using biopsy-based correlations [37], no such biopsy-based conversion currently exists for the pancreas or the spleen, because biopsy of these organs is either not feasible or not justifiable. In some human studies, e.g., for the spleen, a calibration equal to that of the liver was assumed [38, 39], while other studies in mice showed significant differences between liver and spleen calibrations [8]. Therefore, additional studies employing tissue samples are necessary for calibration.

This retrospective study has some limitations. Supplemental clinical information was not used for patient selection, resulting in a heterogeneous group with only a few patients with hepatic iron overload. Iron overload in the investigated organs was not confirmed by histology because, as mentioned above, biopsy is not justified for the spleen or the pancreas. We did not correlate R2* values between the different organs, and we did not correlate them with PDFF values, even though these data were provided by the qDixon sequence. We focused only on comparing the R2* values between our two MRI methods for the spleen and the pancreas, regardless of the patient’s disease. For the liver, such comparisons have already been published. Furthermore, differences in iron distribution patterns between the organs were not evaluated or correlated with pathology. Iron distribution patterns play an important role in the differential diagnosis of iron overload disease [2, 40]; nevertheless, this was beyond the aim of our study.

In conclusion, our data show good agreement between R2* values obtained with a commercial qDixon sequence and a validated ME-GRE relaxometry method for spleen and pancreas. Therefore, the qDixon sequence, primarily intended for liver assessment, seems to be a reliable tool for the additional evaluation of these organs in the upper abdomen enabling an optimal diagnostic workflow for further differential diagnosis and patient management regarding iron status.