Introduction

Detection and treatment of hepatic fibrosis at an early stage can prevent ongoing hepatocellular damage, progression towards cirrhosis, and complications such as portal hypertension and ascites. The combined examination of the liver and spleen has emerged as a promising field of research [1,2,3,4]. It has been shown that, in hepatic fibrosis, not only hepatic stiffness but also splenic stiffness increases, which is presumably caused by an increased pressure in splenic vasculature [1, 4, 5]. However, data regarding the staging of hepatic fibrosis based on mechanical tissue properties of both liver and spleen are still limited, and no investigation aimed at defining the optimal frequency range of elastography has been performed [5].

Magnetic resonance elastography (MRE) is a noninvasive imaging technique for staging hepatic fibrosis by assessing mechanical tissue properties [6,7,8,9]. It is based on the higher stiffness of fibrotic tissue induced by pathological changes such as the proliferation of collagen and cross-linking of free collagen branches [10]. Many currently available MRE techniques still suffer from limited anatomical resolution due to insufficient shear wave propagation and noise [11]. Tomoelastography by multifrequency MRE (MMRE) is a recently introduced advanced technique for generating full-field-of-view elastograms with pixel-wise detail resolution in a tomographic fashion [12]. It has been shown that tomoelastography outperformed previous elastograms generated by direct Helmholtz inversion in terms of detail resolution, noise robustness, and intra-tissue homogeneity [12]. With this technique, consistent mechanical tissue properties of small anatomical regions such as the spinal cord have been shown. So far, no diagnostic benefit of multifrequency over monofrequency MRE techniques has been established [13].

The primary aim of this study was to determine the diagnostic performance, cut-off values, and optimal drive frequency range for staging hepatic fibrosis using tomoelastography of the liver in patients with biopsy-proven fibrosis or imaging findings of cirrhosis. As a secondary aim, we investigated whether a combined analysis of the liver and spleen would further improve diagnostic performance.

Materials and methods

Subjects

The study was approved by the institutional review board, and written informed consent was obtained from all subjects. In this prospective monocenter study, a total of 61 subjects were consecutively enrolled: 45 patients (18 women) and 16 healthy volunteers (8 women). Patients were examined between June 2014 and April 2017. Inclusion criteria were the presence of chronic liver disease (CLD) and liver biopsy performed or planned within 1 year of enrollment. Patients without biopsy but definite imaging findings of cirrhosis with a nodular liver contour and segmental hypertrophy or atrophy were also included since histological sampling would have been unethical in these cases. Further inclusion and exclusion criteria are shown in Fig. 1. All 16 healthy volunteers were characterized by ultrasound elastography (Virtual Touch Quantification, Acuson S2000, Siemens Healthineers) in a previous study to account for the absence of a histological reference standard [14]. Patients and healthy volunteers had a mean age of 49 years (range 16–75 years) and 52 years (range 31–75 years), respectively.

Fig. 1

Flow diagram of study design and subjects. CLD, chronic liver disease

Tomoelastography data acquisition

Tomoelastography was performed on a 1.5 T MRI scanner (Magnetom Aera, Siemens Healthineers) with a 12-channel phased-array coil. We used a custom-designed piezoelectric driver and fast single-shot 3D wave-field acquisition at drive frequencies of 35 to 60 Hz with 5-Hz increments, as proposed by Hirsch et al [2]. Further imaging parameters were as follows: 9 slices, 8 time steps, 12 filter directions, 3 components, 78 × 100 matrix, 3 × 3 × 5 mm³ resolution, 2 averages, and 50-Hz motion-encoding gradient frequency. To avoid an increased postprandial hepatic blood flow and stiffness, subjects fasted for at least 4 h [15, 16]. The actuator was positioned at the level of the xiphoid process. Tomoelastography was performed in free breathing with a total image acquisition time of 4:30 min for all six frequencies combined. The total examination time was approximately 15–20 min and included patient and setup preparation, tomoelastography, and conventional MRI without contrast medium as follows: axial T1-weighted dual gradient-echo in-phase and out-of-phase sequence and axial and coronal T2-weighted half-Fourier acquisition single-shot turbo spin echo sequence. Examinations were performed by one of two radiologists with 5 (M.H.) and 7 (R.R.) years of experience in abdominal elastography. Image processing and evaluations were performed blinded to biopsy results by one observer (R.R.). Technical success of tomoelastography was evaluated by a visual assessment of the shear wave images (Fig. 2).
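The frequency protocol and its effect on scan time can be sketched as follows. This is a rough illustration only: it assumes that the 4:30 min acquisition time is shared equally among the six drive frequencies, which the protocol description does not state, and the function name `estimated_scan_time_s` is hypothetical.

```python
# Drive frequencies: 35 to 60 Hz in 5-Hz increments (from the protocol).
frequencies_hz = list(range(35, 61, 5))

# Total acquisition time for all six frequencies combined: 4:30 min.
total_scan_s = 4 * 60 + 30

# ASSUMPTION: every frequency takes the same share of the acquisition time.
seconds_per_frequency = total_scan_s / len(frequencies_hz)


def estimated_scan_time_s(selected_hz):
    """Estimated acquisition time (s) for a subset of drive frequencies."""
    return seconds_per_frequency * len(selected_hz)


# Example: restrict the protocol to the higher frequencies (45 Hz and above).
high_only = [f for f in frequencies_hz if f >= 45]
print(frequencies_hz, seconds_per_frequency, estimated_scan_time_s(high_only))
```

Under this equal-share assumption, dropping drive frequencies shortens the acquisition proportionally.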

Fig. 2

Tomoelastography of the upper abdomen. MRI T2w, conventional T2-weighted image with half-Fourier acquisition single-shot turbo spin echo (HASTE) sequence in axial orientation without contrast medium. Magnitude, morphological magnitude image derived from the multifrequency MR elastography (MMRE) sequence. MMRE (displacement), the shear wave image depicts tissue displacement in and out of the axial plane with red and blue colors. MMRE (c), compound multifrequency elastograms represent quantitative maps of shear wave speed (c) with bright and dark colors. Besides abdominal organs such as the liver, spleen, and kidneys, these maps also visualize smaller anatomical structures. The region of interest (red) indicates areas included in the analysis. V, hepatic vein/inferior vena cava; A, aorta; P, pancreas; K, kidney; S, stomach; C, kidney cyst; As, ascites. a A 61-year-old female healthy volunteer with mean c of 1.67 ± 0.25 m/s. b A 64-year-old male patient with chronic hepatitis C and cirrhosis (F4). Bright colors in the elastogram indicate a pathologically increased mean c of 2.59 ± 0.57 m/s. The left kidney (K) and a kidney cyst (C) are distinguishable from the spleen. c A 72-year-old male patient with chronic hepatitis C, cirrhosis (F4), and increased mean c of 2.46 ± 0.58 m/s. Shear wave propagation was not hindered by the presence of ascites (As)

Data processing

For image processing, we used a recently introduced tomoelastography pipeline, which is entirely available at https://bioqic-apps.charite.de and described in detail by Tzschätzsch et al [12]. Briefly, it has been demonstrated that complex wave number recovery and amplitude-weighted averaging of multiple harmonic frequencies in a compound map of stiffness outperform single-frequency maps in terms of detail resolution and noise robustness. Compound multifrequency processing improves elastograms by reducing areas of low shear wave displacement and wave nodes. Tomoelastography parameters are shear wave speed (c in m/s; c = wavelength × frequency) and penetration rate (a in m/s; a = penetration depth × frequency). While c relates to stiffness (the higher c, the stiffer the material), a relates to damping of shear waves (the higher a, the less attenuation is encountered). Full-field-of-view elastograms of the liver and spleen were derived from the same scan. Each slice of the elastograms was generated by compounding 216 images of MMRE raw data (12 spatiotemporal filter directions, 3 field components, 6 drive frequencies). Regions of interest were generated using a systematic approach: (i) The contours of liver and spleen were manually outlined using the magnitude image. (ii) A lower shear wave speed threshold of 1 m/s was applied to reduce boundary effects of major blood vessels and regions of insufficient shear wave excitation. Values above 1 m/s were included in the analysis. Consistent filter settings were used for all subjects. Liver fat content was calculated according to Fischer et al [17]. For patients, liver biopsy or imaging findings of cirrhosis were used as a reference standard. Histological fibrosis staging was performed according to Desmet et al [18].
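The parameter definitions and the region-of-interest rule described above can be sketched in a few lines. This is not the published pipeline: the function names, the flattened pixel list, and the example values are hypothetical and serve only to illustrate c = wavelength × frequency, the count of compounded images, and the 1 m/s lower threshold.

```python
from statistics import mean, stdev


def shear_wave_speed(wavelength_m, frequency_hz):
    """Shear wave speed c (m/s) = wavelength (m) x frequency (Hz)."""
    return wavelength_m * frequency_hz


def roi_summary(c_values, organ_mask, threshold=1.0):
    """Mean, SD, and pixel count of c inside the manual outline.

    Only pixels inside the outline AND above the lower threshold
    (1 m/s in the text) are kept.
    """
    kept = [c for c, inside in zip(c_values, organ_mask) if inside and c > threshold]
    return mean(kept), stdev(kept), len(kept)


# Raw images compounded per slice: filter directions x components x frequencies.
images_per_slice = 12 * 3 * 6

# Hypothetical flattened elastogram (m/s) and manual organ outline.
c_map = [0.4, 1.5, 1.7, 0.9, 1.6, 2.4]
mask = [True, True, True, True, True, False]
roi_mean, roi_sd, roi_n = roi_summary(c_map, mask)
print(images_per_slice, roi_mean, roi_sd, roi_n)
```

In this toy example, the pixels at 0.4 and 0.9 m/s are excluded by the threshold and the pixel at 2.4 m/s lies outside the outline, so three pixels enter the mean.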

Statistical data analysis

The Shapiro–Wilk test was used to assess normal distribution with a level of significance of p ≤ 0.01. Spearman’s rank correlation coefficient (Rs) was calculated for a pairwise comparison of all parameters with the stage of fibrosis. Pearson’s correlation coefficient (Rp) was calculated for hepatic and splenic shear wave speed. A two-sided t test was used to assess the differences between patients and healthy volunteers as well as hepatic and splenic shear wave speed. The level of significance was p ≤ 0.05. Sensitivity, specificity, negative and positive predictive values, area under the receiver operating characteristic curve (AUC) with 95% confidence intervals (CI), and optimized cut-off values using the Youden index were calculated for fibrosis staging, accounting for single drive frequencies as well as compound multifrequency processing. AUC with 95% CI for combined hepatic and splenic parameters was calculated using binary logistic regression and compared with hepatic AUC with a level of significance of p ≤ 0.05, as described by DeLong et al [19]. Mean AUC values of all fibrosis stages combined were determined to compare single drive frequencies and compound multifrequency processing. A second observer (C.B.) reevaluated all cases and an interobserver reproducibility assessment was conducted by calculating the intraclass correlation coefficient (ICC) with 95% CI. Statistical analysis was conducted by an expert statistician (M.U.) using Matlab version 9.0 R2016a (The Mathworks, Inc.).
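For illustration, the AUC and a Youden-optimized cut-off can be computed as in the following minimal sketch. It is not the Matlab analysis used in the study: the function names and the example shear wave speeds are hypothetical, the AUC is taken as the pairwise (Mann–Whitney) estimate, and the convention that a value at or above the cut-off counts as test-positive is an assumption. Confidence intervals and the DeLong comparison are not covered.

```python
def roc_auc(negatives, positives):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher value; ties count as one half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def youden_cutoff(negatives, positives):
    """Return (J, cut-off, sensitivity, specificity) for the cut-off that
    maximizes the Youden index J = sensitivity + specificity - 1.

    ASSUMPTION: a value >= cut-off is classified as test-positive.
    """
    best = None
    for cut in sorted(set(negatives) | set(positives)):
        sens = sum(p >= cut for p in positives) / len(positives)
        spec = sum(n < cut for n in negatives) / len(negatives)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best


# Hypothetical hepatic shear wave speeds (m/s), not study data.
below_stage = [1.40, 1.45, 1.50, 1.58]
at_or_above_stage = [1.55, 1.60, 1.70, 1.90, 2.30]

auc = roc_auc(below_stage, at_or_above_stage)
j, cutoff, sensitivity, specificity = youden_cutoff(below_stage, at_or_above_stage)
print(auc, cutoff, sensitivity, specificity)
```

Each candidate cut-off is one of the observed values, which is sufficient because sensitivity and specificity only change at those points.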

Results

Characteristics of subjects

The flow of subject enrollment is shown in Fig. 1. Mean (± SD) body mass index (BMI) was 25 ± 4 kg/m² in patients and 24 ± 4 kg/m² in healthy volunteers; mean liver fat content was 3 ± 7% and 3 ± 5%, respectively. Liver fat content was not available for 1 patient. Healthy volunteers had no known history of any liver disease. Patients had the following CLDs: chronic hepatitis B and C (n = 6 and 5, respectively), primary sclerosing cholangitis (n = 9), nonalcoholic steatohepatitis (n = 6), autoimmune hepatitis (n = 6), toxic liver disease (n = 5), primary biliary cholangitis (n = 2), Wilson’s disease (n = 2), cryptogenic fibrosis (n = 2), diffuse liver metastases from breast cancer (n = 1), and alcoholic liver disease (n = 1). The mean (± SD) time interval between biopsy and tomoelastography was 69 ± 99 days. Fibrosis stage distribution in the patients included in the analysis (n = 43) based on the reference standard was as follows: F0, n = 1; F1, n = 10; F2, n = 5; F3, n = 9; and F4, n = 18. The stage of fibrosis did not correlate with age, BMI, or liver fat content (Rs = 0.17, 0.04, and − 0.12; p = 0.19, 0.77, and 0.38, respectively).

Tomoelastography

Tomoelastography of the liver and spleen failed for 2 and 5 patients, respectively, due to insufficient shear wave propagation caused by technical difficulties with the custom-designed piezoelectric driver setup. The overall technical success rate was 96.7% for the liver and 91.8% for the spleen. Figure 2 shows full-field-of-view elastograms with high detail resolution, even in patients with ascites. Besides the liver and spleen, smaller anatomical structures such as the kidneys and kidney cysts, the pancreas, the aorta, and the portal vein and major hepatic veins are also displayed in a tomographic fashion. The Shapiro–Wilk test showed normal distribution for all shear wave speed (c) data but not for penetration rate (a) data. Mean c- and a-values of the liver and spleen are listed in Table 1. Boxplots of hepatic c-values of compound multifrequency processing are displayed in Fig. 3. A significant difference in mean hepatic c derived from compound multifrequency processing was evident between patients and healthy volunteers (1.89 ± 0.44 m/s and 1.44 ± 0.08 m/s, respectively; p ≤ 0.0002); in contrast, no significant difference was found for mean splenic c (2.03 ± 0.56 m/s and 1.79 ± 0.36 m/s, respectively; p = 0.13). Mean c was significantly higher in the spleen compared with that in the liver for all single drive frequencies (all p ≤ 0.05) and compound multifrequency processing (p = 0.02). There was a strong significant correlation between hepatic c and the stage of fibrosis (Table 2). For splenic c, only a weak correlation with the stage of fibrosis was found, which was significant for all frequencies, except 35 Hz (p = 0.076; Table 2). Between hepatic and splenic c, a weak to moderate significant correlation was evident, which became more pronounced towards higher frequencies (35 to 60 Hz with 5-Hz increments, and compound multifrequency: Rp = 0.32, 0.39, 0.46, 0.46, 0.46, 0.47, and 0.44, respectively; all with p ≤ 0.02).
For liver fat content, there was a weak significant correlation with hepatic a (Rp = − 0.33, p = 0.01); in contrast, no significant correlation was found for hepatic c (Rp = − 0.18, p = 0.17).
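The reported technical success rates can be reproduced with simple arithmetic. The sketch below assumes that the denominator is all 61 enrolled subjects (45 patients and 16 healthy volunteers), which is not stated explicitly but matches the reported percentages; the variable and function names are illustrative.

```python
# Enrolled subjects: 45 patients + 16 healthy volunteers.
patients, volunteers = 45, 16
total_subjects = patients + volunteers

# Failed examinations (all in patients): 2 for the liver, 5 for the spleen.
failed_liver, failed_spleen = 2, 5


def success_rate_percent(failed, total):
    """Technical success rate in percent, rounded to one decimal place."""
    return round(100 * (total - failed) / total, 1)


liver_rate = success_rate_percent(failed_liver, total_subjects)
spleen_rate = success_rate_percent(failed_spleen, total_subjects)

# Patients remaining for the hepatic analysis.
patients_analyzed = patients - failed_liver
print(liver_rate, spleen_rate, patients_analyzed)
```

The 43 patients left after excluding the two failed liver examinations correspond to the n = 43 used for the fibrosis stage distribution.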

Table 1 Mean shear wave speed (c) and penetration rate (a) values of the liver and spleen
Fig. 3

Boxplot of compound multifrequency (35–60 Hz) shear wave speed (c) of the liver and the stage of hepatic fibrosis. Median, upper, and lower quartile and whiskers of c-values are displayed. Statistically significant differences between groups of fibrosis stages are demarcated with asterisks: **p < 0.01; ***p < 0.001

Table 2 AUC values for staging hepatic fibrosis and correlation analysis

AUC values with 95% CI and Spearman’s rank correlation coefficients of the liver and spleen as well as binary logistic regression analysis of the liver and spleen combined are compiled in Tables 2 and 3, respectively. Drive frequencies with the highest mean AUC and Spearman’s rank correlation coefficients were 45 Hz, 50 Hz, 55 Hz, 60 Hz, and compound multifrequency processing (all with mean AUC = 0.95, Rs ≥ 0.86 with p < 0.001; Table 2). In comparison, binary logistic regression analysis of combined hepatic and splenic c-values showed an increased mean AUC of 0.97 at 60 Hz; however, statistical significance was only evident for stage F4 (p < 0.03, Tables 3 and 4). Optimized diagnostic cut-off values of hepatic c with corresponding sensitivity, specificity, and negative and positive predictive values are shown in Table 5. Despite their high diagnostic accuracy, 50-Hz cut-off values failed to differentiate moderate fibrosis (F2) from severe fibrosis (F3), which limits clinical usefulness substantially. Receiver operating characteristic curves for the most important parameters (45 Hz, 55 Hz, 60 Hz, and compound multifrequency) are displayed in Fig. 4. For compound multifrequency processing of the liver, cut-off and AUC (with 95% CI) values were as follows: F1, 1.52 m/s and 0.89 (0.81–0.95); F2, 1.55 m/s and 0.94 (0.89–0.99); F3, 1.67 m/s and 0.98 (0.96–1.00); and F4, 1.72 m/s and 0.98 (0.96–1.00).
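As an illustration of how the compound multifrequency values above relate to each other, the following sketch averages the four stage-wise AUC values and maps a hepatic shear wave speed to a stage using the reported cut-offs. It is not a clinical tool: the function name `stage_from_c` is hypothetical, and treating a value equal to a cut-off as belonging to the higher stage is an assumption.

```python
from bisect import bisect_right
from statistics import mean

# Compound multifrequency cut-offs (m/s) for >=F1, >=F2, >=F3, and F4.
cutoffs = [1.52, 1.55, 1.67, 1.72]

# Corresponding AUC values for F1 to F4.
auc_by_stage = [0.89, 0.94, 0.98, 0.98]

# Mean AUC of all fibrosis stages combined, rounded to two decimals.
mean_auc = round(mean(auc_by_stage), 2)


def stage_from_c(c_m_per_s):
    """Return the highest stage (0-4) whose cut-off is reached.

    ASSUMPTION: a value equal to a cut-off is assigned to the higher stage.
    """
    return bisect_right(cutoffs, c_m_per_s)


# Mean hepatic c of healthy volunteers, an intermediate value, and
# the cirrhotic example from Fig. 2b.
print(mean_auc, stage_from_c(1.44), stage_from_c(1.60), stage_from_c(2.59))
```

The narrow gap between the F1 and F2 cut-offs (1.52 and 1.55 m/s) means that small measurement differences can shift a subject between these stages.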

Table 3 AUC values for staging hepatic fibrosis using a combined analysis of liver and spleen
Table 4 P values for the comparison of two ROC curves: analysis of shear wave speed of liver and spleen versus liver alone
Table 5 Optimized cut-off values of hepatic shear wave speed (c) for staging fibrosis and corresponding sensitivity, specificity, and predictive values
Fig. 4

ROC curves for staging hepatic fibrosis based on shear wave speed (c) of the liver. Tomoelastography for the most important parameters: (a) compound multifrequency processing from 35–60 Hz as well as single drive frequencies of (b) 60 Hz, (c) 55 Hz, and (d) 45 Hz. Receiver operating characteristic (ROC) curves show values for any fibrosis (stage F1 or higher), moderate fibrosis (stage F2 or higher), severe fibrosis (stage F3 or higher), and cirrhosis (equivalent to stage F4)

For hepatic and splenic a, there was no consistent correlation with the stage of fibrosis, and diagnostic performance was poor with mean AUC values ranging from 0.35 to 0.70 (Table 2).

An excellent interobserver reproducibility with an ICC (95% CI) of 92% (85–96%) for the liver and 96% (93–98%) for the spleen was found.

Discussion

To our knowledge, this is the first study investigating MRE of both the liver and spleen for staging hepatic fibrosis. We aimed to determine the diagnostic performance, cut-off values, and optimal drive frequency range of tomoelastography for this indication. For hepatic c, high AUC values suggest an excellent discriminative ability for staging hepatic fibrosis while detail resolution was improved compared with available MRE techniques. Full-field-of-view elastograms show a pixel-wise detail resolution in a tomographic fashion, which replaces the need to superimpose elastograms with conventional morphological images to identify abdominal organs. The best mechanical drive frequencies for the liver in terms of diagnostic performance are 45 Hz, 55 Hz, 60 Hz, and compound multifrequency processing.

Our results suggest a better diagnostic performance for higher drive frequencies and for staging severe fibrosis (F3) or cirrhosis (F4), which is consistent with the literature [7, 20]. For staging fibrosis at 60 Hz—the single drive frequency used in most studies—our results suggest cut-off and AUC values as follows: F1, 1.62 m/s and 0.92; F2, 1.78 m/s and 0.93; F3, 1.82 m/s and 0.97; and F4, 1.85 m/s and 0.98. Diagnostic performance is in the same range as reported by other studies [20,21,22]. A meta-analysis by Singh et al reported cut-off and AUC values as follows (cut-off values were transformed from kPa to m/s for better comparison): F1, 1.86 m/s and 0.84; F2, 1.91 m/s and 0.88; F3, 2.03 m/s and 0.93; and F4, 2.17 m/s and 0.92. Higher cut-off values and lower AUC values might be attributable to the combination of MRE techniques from various groups and to the investigation of a more diversified population [7]. Another meta-analysis by Singh et al, investigating the detection of liver fibrosis in patients with nonalcoholic fatty liver disease, found a similar performance with AUC values from F1 to F4 as follows: 0.86, 0.87, 0.90, and 0.91 [23].
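The conversion between stiffness in kPa and shear wave speed in m/s is not spelled out in the text. A minimal sketch, assuming the common relation c = sqrt(stiffness / density) with a tissue density of 1000 kg/m³, is given below; the function names and the density value are assumptions, and the output is an approximate back-conversion of the m/s cut-offs quoted above, not values taken from the meta-analysis.

```python
import math

# ASSUMPTION: tissue density of 1000 kg/m^3 (approximately that of water).
DENSITY_KG_M3 = 1000.0


def kpa_to_speed(stiffness_kpa, density=DENSITY_KG_M3):
    """Shear wave speed (m/s) from stiffness (kPa): c = sqrt(stiffness / density)."""
    return math.sqrt(stiffness_kpa * 1000.0 / density)


def speed_to_kpa(speed_m_per_s, density=DENSITY_KG_M3):
    """Stiffness (kPa) from shear wave speed (m/s): stiffness = density * c^2."""
    return density * speed_m_per_s ** 2 / 1000.0


# Back-convert the meta-analysis cut-offs quoted in m/s for F1 to F4.
for stage, c in zip(("F1", "F2", "F3", "F4"), (1.86, 1.91, 2.03, 2.17)):
    print(stage, c, "m/s is approximately", round(speed_to_kpa(c), 2), "kPa")
```

Because stiffness scales with the square of the speed, differences between cut-offs appear larger in kPa than in m/s.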

Our current results demonstrate that the diagnostic performance of compound multifrequency processing is equivalent to that of higher single drive frequencies and not inferior as reported by Asbach et al [20]. Nevertheless, future studies could benefit from higher accuracy and shorter scan times when performing tomoelastography at higher frequencies only.

It is a stimulating result that the combined analysis of liver and spleen improved diagnostic performance in our study, which is in contrast to the results of an ultrasound elastography study by Leung et al [5]. However, the diagnostic benefit of combined elastography of the liver and spleen for fibrosis characterization strongly depends on the underlying systemic pathology, the presence of vascular obstructions, and portal hypertension. The fact that tomoelastography provides maps of the entire liver and spleen within a single scan will be of clinical relevance in many applications and renders MRE superior to complementary ultrasound-based elastography examinations.

For splenic c as well as hepatic and splenic a, low AUC values suggest a poor ability or failure to stage hepatic fibrosis. This poor sensitivity of a, as a representation of damping, for staging fibrosis has been shown previously [12, 20]. However, an even more pronounced significant correlation of hepatic a with steatosis has been demonstrated recently by a study investigating nonalcoholic fatty liver disease and should be implemented in future studies and data analysis for liver fat quantification [24]. Our results support their finding that hepatic damping increases with steatosis, although the liver fat content in our cohort was substantially lower.

Future studies should investigate the significance of tomoelastography for the assessment of focal liver lesions and fibrosis heterogeneity as an additional biomarker besides overall stiffness and compare the diagnostic performance of different MRE setups and image processing pipelines.

Our study has some limitations. First, we examined a small number of patients, especially in the F2 group, since liver biopsy is increasingly avoided in clinical routine in favor of noninvasive diagnostic tests. Moreover, our population was skewed, with a larger proportion of subjects in the F0 and F4 groups, which can lead to overestimation of diagnostic performance. Second, there was a long interval between tomoelastography and biopsy, which can lead to misclassification. However, a recent meta-analysis suggests a low risk of disease progression bias when the interval is less than 1 year [7]. Third, as a monocenter study, data were acquired mainly from the same population, which favors overestimation of diagnostic performance. Fourth, fibrosis was caused by CLD of different etiologies. Fifth, for healthy volunteers, no biopsy was available as a reference test. Instead, healthy volunteers were assessed with an established ultrasound elastography method as a less reliable reference test. Sixth, we did not perform a test–retest reproducibility assessment. However, another study investigating the feasibility of tomoelastography of the prostate found a good overall test–retest reproducibility for this technique [25]. Finally, the technical success rate was 96.7% for the liver and 91.8% for the spleen. Examinations failed due to insufficient shear wave penetration and amplitudes. Compressed air drivers as used in [16, 26] have proven to be a powerful alternative to our present piezo-based setup. This method was not available at the time of our study but will be implemented in future work.

In conclusion, tomoelastography provides cut-off values with excellent diagnostic accuracy for staging hepatic fibrosis. While diagnostic performance was comparable to that reported for other elastography techniques in prior studies, tomoelastography provided full-field-of-view elastograms of the abdomen with unprecedented pixel-wise detail resolution in a tomographic fashion. Our analysis of single-frequency tomoelastography suggests that scan time can be further reduced in future studies, making tomoelastography easier to implement in clinical routine.