Introduction

Diffusion-weighted imaging (DWI) is one of the most promising non-contrast techniques that can be readily implemented in standard liver magnetic resonance imaging (MRI) examinations, allowing for lesion detection and differentiation1. In routine clinical practice, the apparent diffusion coefficient (ADC) is usually calculated with b-values between 0 and 500–1000 s/mm², assuming a mono-exponential relationship between signal intensity and the b-value2. However, the ADC is not only influenced by molecular diffusion, but also by other (pseudo) random motion such as blood flow in small vessels within the tissue (perfusion). According to the intravoxel incoherent motion (IVIM) theory, diffusion and perfusion effects can be separated assuming a bi-exponential behavior of signal intensity, ultimately yielding the diffusion coefficient D, the pseudo-diffusion coefficient D* and the perfusion fraction f3,4,5,6,7. f is associated with microvessel density8,9. D* was negatively correlated with the interstitial fluid pressure (IFP), which influences blood flow10. The problems with IVIM in clinical liver MRI are long acquisition times and limited data quality caused by respiratory and cardiac motion and by low signal-to-noise ratio, which may lead to unstable fitting results, measurement errors and poor reproducibility11,12,13,14. Improved stability can be achieved by segmented fitting approaches, which reduce the degrees of freedom by determining the parameters step by step15,16,17,18,19, or by simplified IVIM, which uses numerically stable computation of IVIM parameter estimates from 4 b-values20,21,22,23,24,25,26,27.

For quantitative analysis of ADC and IVIM parameter maps in lesions, a region of interest (ROI) based approach is most commonly used28,29,30. However, there are different ROI-placement and analysis strategies, mostly investigated only for ADC: to place the ROIs into areas with most restricted diffusion (“hot spots”, focused ROIs), to average over multiple small ROIs placed into different regions, to place a large ROI on a central slice of a lesion, or to cover the whole lesion7,21,23. Usually, ROI analysis is done by averaging the voxel values within the ROI (mean). However, in order to address tumor heterogeneity, histogram-based approaches are also employed to subclassify different tumor diffusion and perfusion environments7,31.

The purpose of this study was to investigate whether there are differences in the diagnostic accuracy of ADC and IVIM parameters in the discrimination of liver lesions using different ROI placement and analysis strategies. We compared 2D- and 3D-volume ROIs, inclusion and exclusion of central necrosis, cystic components and scars, and ROI analysis by averaging and histogram metrics.

Materials and methods

Study cohort

This single-center retrospective study was approved by the ethics committee of the University Hospital of the Rheinische Friedrich-Wilhelms University Bonn, Germany, with a waiver for written informed consent. Data of consecutive patients with focal hepatic lesions ≥ 1 cm undergoing clinical MRI examination of the liver including 4 b-value DWI from 2013 to 2016 were used. A flowchart of patient inclusion and exclusion is given in Fig. 1. Finally, data of 109/73 patients at 1.5/3.0 T were analyzed (Table 1). These two patient groups had already been examined in previous studies21,23. In those studies basic investigations concerning simplified IVIM for liver lesion characterization had been performed. In the present study, the data were used to investigate the influence of different ROI placement and analysis methods concerning diagnostic accuracy.

Figure 1

Flow chart of inclusion and exclusion criteria of the study sample.

Table 1 Group composition and demographic data of included subjects at 3.0 and 1.5 T.

Diagnosis of liver lesions was undertaken within clinical routine. Cholangiocellular carcinomas (CCCs) were histologically proven. Hepatocellular carcinomas (HCCs) were either histologically proven or diagnosed according to the American Association for the Study of Liver Diseases MRI criteria32. Diagnosis of metastasis was based on typical imaging features in combination with histologically proven primary cancer. Diagnosis of focal nodular hyperplasia (FNH) or haemangioma was established on the basis of typical radiological findings on contrast-enhanced MRI and was confirmed by at least one follow-up examination.

Magnetic resonance imaging

Imaging was performed on clinical whole-body 1.5/3.0-T MRI systems (Ingenia, Philips Healthcare; 1.5/3.0-T gradient system: 45/45 mT/m maximum amplitude, 200/200 T/m/s maximum slew rate; 3.0-T system with dual source RF transmission) using 32-channel abdominal coils with a digital interface for signal reception. The standardized imaging protocol included a DWI sequence with a respiratory-triggered single-shot spin-echo echo-planar imaging variant with four b-values (0, 50, 250, 800 s/mm²) before contrast agent administration (Table 2). For each slice, an isotropic diffusion-weighted image was reconstructed from the three images obtained for the different diffusion directions.

Table 2 Parameters of the diffusion-weighted imaging (DWI) sequence.

Postprocessing

As described previously21,23, two different approximations of D and f were calculated from signal intensities S(b) and S(0) of the acquired b-values, one from b0 = 0, b1 = 50, b3 = 800 and one from b0 = 0, b2 = 250, b3 = 800 s/mm²:

$$D_{1} ^{\prime} = ADC(50,800) = \frac{{\ln (S(b_{1} )) - \ln (S(b_{3} ))}}{{b_{3} - b_{1} }}$$
(1)
$$D_{2} ^{\prime} = ADC(250,800) = \frac{{\ln (S(b_{2} )) - \ln (S(b_{3} ))}}{{b_{3} - b_{2} }}$$
(2)
$$f_{1} ^{\prime} = f(0,50,800) = 1 - \frac{{S(b_{1} )}}{S(0)} \cdot \exp \left( {D_{1} ^{\prime} \cdot b_{1} } \right)$$
(3)
$$f_{2} ^{\prime} = f(0,250,800) = 1 - \frac{{S(b_{2} )}}{S(0)} \cdot \exp \left( {D_{2} ^{\prime} \cdot b_{2} } \right)$$
(4)

From the four b-values, D* was approximated by using D2′ and f2′ and the reading for b1:

$$D^{*\prime} = D^{*}(0,50,250,800) = - \frac{1}{{b_{1} }} \cdot \ln \left[ {\frac{1}{{f_{2} ^{\prime}}} \cdot \left( {\frac{{S(b_{1} )}}{S(0)} - \left( {1 - f_{2} ^{\prime}} \right) \cdot \exp \left( { - D_{2} ^{\prime} \cdot b_{1} } \right)} \right)} \right]$$
(5)

D*′ cannot be determined for all voxels, because some voxels are not affected by perfusion. Voxels with undefined values were excluded from ROI analysis.
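As a brief sketch of why Eqs. (3)–(5) take this form and why D*′ can be undefined, one may assume the commonly used bi-exponential IVIM signal model below. This specific form is an assumption made here for illustration; the source only states that a bi-exponential behavior is assumed.

$$\frac{S(b)}{S(0)} = f \cdot \exp \left( { - b \cdot D^{*} } \right) + \left( {1 - f} \right) \cdot \exp \left( { - b \cdot D} \right)$$

For b-values large enough that the perfusion term has decayed, the first term can be neglected, so that

$$\frac{S(b)}{S(0)} \approx \left( {1 - f} \right) \cdot \exp \left( { - b \cdot D} \right)\quad \Rightarrow \quad f \approx 1 - \frac{S(b)}{S(0)} \cdot \exp \left( {D \cdot b} \right)$$

which has the form of Eqs. (3) and (4). Solving the full model at b = b1 for the perfusion term gives

$$\exp \left( { - b_{1} \cdot D^{*} } \right) = \frac{1}{f} \cdot \left( {\frac{{S(b_{1} )}}{S(0)} - \left( {1 - f} \right) \cdot \exp \left( { - D \cdot b_{1} } \right)} \right)$$

and taking the logarithm yields the form of Eq. (5). The logarithm exists only if the right-hand side is positive. When f approaches zero or noise makes the bracket non-positive, no value of D* can be computed for that voxel.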

Moreover, the conventional ADC was calculated:

$$ADC = ADC(0,800) = \frac{{\ln (S(b_{0} )) - \ln (S(b_{3} ))}}{{b_{3} - b_{0} }}$$
(6)

Parameter maps and ROI analyses were calculated offline using custom written software in MATLAB (MathWorks, Natick, MA).
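The voxel-wise computation in Eqs. (1)–(6) can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB software. The function name `simplified_ivim`, its argument names and the choice to return `None` for an undefined D*′ are assumptions made for this example.

```python
import math


def simplified_ivim(s0, s50, s250, s800, b=(0.0, 50.0, 250.0, 800.0)):
    """Simplified IVIM estimates for one voxel, following Eqs. (1)-(6).

    s0, s50, s250, s800: positive signal intensities at the four b-values.
    Returns a dict; 'd_star' is None where the logarithm in Eq. (5)
    has no defined value (hypothetical convention for this sketch).
    """
    b0, b1, b2, b3 = b
    adc = (math.log(s0) - math.log(s800)) / (b3 - b0)    # Eq. (6)
    d1 = (math.log(s50) - math.log(s800)) / (b3 - b1)    # Eq. (1)
    d2 = (math.log(s250) - math.log(s800)) / (b3 - b2)   # Eq. (2)
    f1 = 1.0 - (s50 / s0) * math.exp(d1 * b1)            # Eq. (3)
    f2 = 1.0 - (s250 / s0) * math.exp(d2 * b2)           # Eq. (4)

    d_star = None
    if f2 > 0.0:
        # Argument of the logarithm in Eq. (5); must be positive.
        arg = (s50 / s0 - (1.0 - f2) * math.exp(-d2 * b1)) / f2
        if arg > 0.0:
            d_star = -math.log(arg) / b1                 # Eq. (5)

    return {"adc": adc, "d1": d1, "d2": d2, "f1": f1, "f2": f2, "d_star": d_star}


def synthetic_signal(b_value, s0=1000.0, d=1.0e-3, f=0.2, d_star=50.0e-3):
    """Noise-free bi-exponential test signal (assumed model, illustrative values)."""
    return s0 * (f * math.exp(-b_value * d_star) + (1.0 - f) * math.exp(-b_value * d))


if __name__ == "__main__":
    signals = [synthetic_signal(bv) for bv in (0.0, 50.0, 250.0, 800.0)]
    print(simplified_ivim(*signals))
```

On noise-free synthetic data of this kind, D2′, f2′ and D*′ should closely reproduce the simulated D, f and D*, while ADC and D1′ are larger than D because they still contain perfusion contributions.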

Image analysis

Image analysis was performed by a radiologist (N.M.) with 3 years of experience and checked by a radiologist (C.C.P.) with 10 years of experience in abdominal imaging and a physicist (P.M.) with more than 20 years of experience in DWI. All were blinded to clinical information.

One reference lesion per lesion type was analyzed. For each included lesion, 2D- and 3D-volume ROI-based analyses were performed. ROIs were placed as large as possible using DWI with highest contrast between lesion and normal tissue and excluding areas close to the lesion rim to avoid partial-volume effects. After the anatomical position of each ROI had been visually cross-checked for pixel misalignments between images with different b-values, the ROI was analyzed in the related parameter maps.

For 2D-analysis, one hand-drawn ROI was placed centrally in each lesion on a single representative slice (reference slice), which was largely unaffected by motion and susceptibility artifacts and pixel misalignments. For the 3D-volume analysis, a hand-drawn ROI was placed on each slice of the lesion. Slices with artifacts and pixel misalignments as well as the first and the last slice (due to potential partial volume effect) were marked as “bad”. An example of ROI placement is given in Fig. 2. Data from all slices (“good” and “bad”) were combined into a whole-lesion 3D-volume ROI (3DA). Furthermore, a second 3D-volume ROI was calculated including only the “good” slices (3DG). Thus, in each lesion three different ROI-sizes were investigated (2D, 3DA, 3DG).
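The construction of the three ROI variants can be sketched as a pooling of per-slice voxel values. The data layout (a list of dicts with hypothetical keys `values` and `good`) and the function name `pool_rois` are assumptions for illustration only. Voxels with undefined parameter values are represented as `None` and dropped, in line with the exclusion rule stated for D*′.

```python
def pool_rois(slices, reference_index):
    """Build 2D, 3DA and 3DG voxel pools from per-slice ROI values.

    slices: list of dicts, one per lesion slice, each with
        'values': list of parameter values inside the slice ROI (None = undefined)
        'good':   True if the slice is largely free of artifacts and is
                  neither the first nor the last lesion slice
    reference_index: index of the reference slice used for the 2D analysis.
    """
    def defined(values):
        # Drop voxels without a defined parameter value.
        return [v for v in values if v is not None]

    roi_2d = defined(slices[reference_index]["values"])
    roi_3da = [v for s in slices for v in defined(s["values"])]              # all slices
    roi_3dg = [v for s in slices if s["good"] for v in defined(s["values"])]  # good slices only
    return {"2D": roi_2d, "3DA": roi_3da, "3DG": roi_3dg}


if __name__ == "__main__":
    lesion = [
        {"values": [1.9, 2.1], "good": False},       # first slice (partial volume)
        {"values": [1.2, 1.3, None], "good": True},
        {"values": [1.1, 1.0, 1.2], "good": True},   # reference slice
        {"values": [2.0], "good": False},            # last slice
    ]
    pools = pool_rois(lesion, reference_index=2)
    print({name: len(vals) for name, vals in pools.items()})
```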

Figure 2

A typical example of 2D and 3D DWI IVIM analysis in a hepatocellular carcinoma at 1.5 T. Original diffusion-weighted images with b = 0, 50, 250, 800 s/mm² are presented together with conventional ADC maps displayed as color-coded overlays over b800 images. For analysis, a region of interest (ROI) was selected on each tumor-containing slice, in which ADC and IVIM parameters (not shown) were analyzed. ADC values are given in units of 10⁻⁶ mm²/s. Slices largely unaffected by artifacts were defined as good (“G”); slices close to the lesion’s rim (partial volume) or with images affected by artifacts (see red x) due to motion, susceptibility or pixel misalignments were defined as bad (“B”). One central “good” slice served as reference (“REF”) for the 2D analysis (see green frame); slices in the lower part of the liver should be preferred because they are less affected by cardiac motion. For 3D analysis, the voxels of the 2D ROI were combined with voxels of the ROIs on other “good” slices (3DG); voxels of all ROIs were used for whole-lesion analysis (3DA).

For lesions with central necrosis, cystic components or scars (centrally deviating areas in DWI), the 2D- and 3D-ROI placements were repeated with exclusion of such areas. Two example analyses are given in Fig. 3. These measurements allowed the evaluation of different ROI sizes as well as of different lesion tissues included in the ROIs.

Figure 3

Typical examples of DWI IVIM analysis comparing inclusion and exclusion of necrosis in a metastasis of colorectal carcinoma (a) and of liquid in a hemangioma (b) at 1.5 T. For one central slice per lesion, original diffusion-weighted images with b = 0, 50, 250, 800 s/mm² are presented together with conventional ADC, diffusion-sensitive D1′ and D2′ parameter maps, and perfusion-sensitive f1′, f2′, D*′ parameter maps. The parameter maps are displayed as color-coded overlays over b = 800. Values of ADC, D1′, D2′ and D*′ are given in units of 10⁻⁶ mm²/s, those of f1′ and f2′ in 10⁻³. If poor data quality led to negative or undefined parameter values, these voxels were not colorized. When necrosis/cystic components were excluded (“Without”) from regions of interest (ROIs), the diffusion-sensitive parameters were significantly lower compared to inclusion (“With”). Perfusion-sensitive parameters remained unchanged because perfusion in the metastasis and hemangioma is low in any case.

Finally, a histogram analysis was performed for each 2D-ROI. The following histogram metrics were calculated: median, standard deviation, the 5th, 10th, 25th, 75th, 90th, 95th percentiles, skewness and kurtosis.
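The listed histogram metrics can be sketched for a single ROI as follows. The source does not state which percentile interpolation or which skewness and kurtosis definitions were used. This sketch assumes linear interpolation between order statistics, the sample standard deviation, and moment-based skewness and (non-excess) kurtosis. The function names are hypothetical.

```python
import math
import statistics


def percentile(sorted_vals, p):
    """p-th percentile (0-100) by linear interpolation (assumed convention)."""
    if not sorted_vals:
        raise ValueError("ROI contains no voxels")
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def histogram_metrics(values):
    """Mean and histogram metrics of the voxel values inside one ROI.

    Requires at least two voxels with non-identical values.
    """
    vals = sorted(values)
    n = len(vals)
    mean = statistics.fmean(vals)
    # Central moments (population form) for skewness and kurtosis.
    m2 = sum((v - mean) ** 2 for v in vals) / n
    m3 = sum((v - mean) ** 3 for v in vals) / n
    m4 = sum((v - mean) ** 4 for v in vals) / n
    metrics = {
        "mean": mean,
        "median": statistics.median(vals),
        "sd": statistics.stdev(vals),       # sample standard deviation
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,           # equals 3 for a normal distribution
    }
    for p in (5, 10, 25, 75, 90, 95):
        metrics[f"p{p}"] = percentile(vals, p)
    return metrics


if __name__ == "__main__":
    roi_values = [1100, 1150, 1200, 1250, 1300, 2400, 2600]  # illustrative only
    for name, value in histogram_metrics(roi_values).items():
        print(f"{name}: {value:.3f}")
```

In an ROI that mixes viable tumor with a few high-valued necrotic voxels, as in the illustrative list above, the low percentiles stay close to the values of the viable tissue, whereas the mean is pulled upward.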

Statistical analysis

Statistical analysis was performed using SPSS (Version 24.0, IBM) and the pROC package (Version 1.16.2) in R (Version 3.6.1)33. Receiver operating characteristic (ROC) analysis was performed for liver lesion discrimination. Youden’s index was used to determine the optimal cut-off of the ROC curve providing the best trade-off between sensitivity and specificity. The DeLong method was used to compare dependent ROC curves34. The area under the curve (AUC) based on mean ROI values was compared for the different ROI variants. Furthermore, it was investigated whether AUC values can be improved by using one of the histogram metrics instead of the mean value. These investigations were carried out for both types of ROIs, including and excluding centrally deviating areas. In order to investigate whether histogram analyses may replace manual exclusion of such areas, an additional comparison was performed using ROIs excluding such areas for mean values and including them for histogram metrics.
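The AUC and the Youden cut-off can be illustrated with a minimal sketch. It does not reproduce the pROC implementation used in the study and omits the DeLong comparison. It assumes that lower parameter values indicate malignancy and that candidate cut-offs are taken at observed values. The function names and example numbers are hypothetical.

```python
def roc_auc(benign, malignant):
    """AUC as the probability that a malignant lesion has a lower value
    than a benign lesion (ties count one half)."""
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m < b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant) * len(benign))


def youden_cutoff(benign, malignant):
    """Cut-off maximizing J = sensitivity + specificity - 1.

    A lesion is called malignant if its value is <= cut-off.
    Returns (J, cut-off, sensitivity, specificity); the first maximum wins.
    """
    best = (-1.0, None, None, None)
    for c in sorted(set(benign) | set(malignant)):
        sens = sum(m <= c for m in malignant) / len(malignant)
        spec = sum(b > c for b in benign) / len(benign)
        j = sens + spec - 1.0
        if j > best[0]:
            best = (j, c, sens, spec)
    return best


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    benign_values = [1.8, 2.0, 1.5, 2.2]
    malignant_values = [1.0, 1.1, 0.9, 1.6]
    print("AUC:", roc_auc(benign_values, malignant_values))
    print("Youden:", youden_cutoff(benign_values, malignant_values))
```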

Ethical approval and informed consent

The presented study was approved by the institutional review board of the University of Bonn and hence all methods were performed in compliance with the ethical standards set in the 1964 Declaration of Helsinki as well as its later amendments. Written informed consent was waived.

Results

At 1.5/3.0 T, 74/54 malignant and 35/19 benign liver lesions were analyzed (Table 1). Mean volume of malignant lesions was 96.6/76.6 cm³ (range: 1.3–1715.7/1.2–521.2 cm³) and of benign lesions 72.1/20.4 cm³ (range: 0.9–856.3/1.1–118.3 cm³). Of these 109/73 lesions, 36/11 had centrally deviating areas. In total, 1333 ROIs were placed. The mean values of ADC and IVIM parameters for the benign and malignant lesion groups together with the ROC analysis results for lesion differentiation are presented in Table 3. Figure 4 gives an overview of the obtained AUC values. In general, the values of diffusion- and perfusion-sensitive parameters were lower in malignant lesions than in benign lesions.

Table 3 Results of ADC and IVIM parameter value analysis within different regions of interest (ROIs) and receiver operating characteristic (ROC) analysis of benign and malignant liver lesions.
Figure 4

Overview of the obtained AUC values (a) at 1.5 T and (b) at 3.0 T for the different ROIs (2D, 3DG, 3DA) and with included and excluded central necrosis, cystic components or scars. Significant differences are marked by “*”.

The highest AUC values for lesion differentiation were found for ADC (0.967–0.911) and D1′ (0.941–0.857) followed by D2′ (0.919–0.816), f2′ (0.731–0.656), f1′ (0.673–0.616), and D*′ (0.563–0.515). For all parameters, diagnostic performance was compared for the different 2D- and 3D-ROI variants, for ROIs in- and excluding centrally deviating areas, and for mean values and histogram metrics.

Comparison of 2D- and 3D-ROIs

Table 4 presents the results of the AUC value comparisons with respect to the different ROI types (2D, 3DG, 3DA). With few exceptions, no significant differences were found, neither for ROIs that include centrally deviating areas nor for those excluding such areas. The exceptions were slightly larger AUC values for 3DA ROIs than for 3DG ROIs for f1′ and f2′ at 1.5 T (for ROIs including centrally deviating areas: 0.712 vs 0.620 with p = 0.049 and 0.761 vs 0.675 with p = 0.031, respectively; for ROIs excluding those areas: 0.712 vs 0.622 with p = 0.055 and 0.773 vs 0.688 with p = 0.029, respectively), and for D2′ at 3.0 T, but only for ROIs including centrally deviating areas (0.895 vs 0.825 with p = 0.029).

Table 4 Comparison of AUC values of the ROC curves obtained from 2D- and 3D-ROIs (see Table 2) at 1.5 T (a) and 3.0 T (b).

Comparison of ROIs with included and excluded central necrosis, cystic components or scars

Table 5 summarizes the results of the AUC value comparison with respect to included tissue. Exclusion of centrally deviating areas from ROIs yielded larger AUC values of ADC, D1′, and D2′ for all 2D- and 3D-ROI variants. Improvements were significant at 1.5 T; at 3.0 T, however, they sometimes appeared only as a tendency, potentially due to fewer cases with centrally deviating areas. For 2D-ROIs at 1.5 T, for example, AUC values of ADC improved from 0.925 to 0.958 (p = 0.01), of D1′ from 0.866 to 0.902 (p = 0.0081), and of D2′ from 0.822 to 0.864 (p = 0.00089). Perfusion parameters did not show any differences. Typical examples of DWI IVIM analysis comparing inclusion and exclusion of centrally deviating areas are presented in Fig. 3.

Table 5 Comparison of AUC values of the ROC curves obtained from ROIs including (incl) and excluding (excl) centrally deviating areas like necrosis, cystic components or scars (see Table 1) at 1.5 T (a) and 3.0 T (b).

Comparison of mean values versus histogram analysis

Table S1 gives the mean values and values of histogram metrics for the benign and malignant lesion group together with the ROC analyses results for lesion differentiation using 2D-ROIs. In Table S2 the results of the different AUC value comparisons are given.

At 1.5 T, the 5th and 10th percentiles of ADC and D1′ and the 25th percentiles of ADC, D1′ and D2′ led to significantly higher AUC values than the mean values for ROIs including centrally deviating areas. For example, by using the 10th percentile instead of the mean value, AUC values could be improved for ADC from 0.925 to 0.969 (p = 0.018), for D1′ from 0.866 to 0.926 (p = 0.0042), and for D2′ from 0.822 to 0.856 (p = 0.074). For ROIs excluding centrally deviating areas, these improvements were observed to a lesser degree. For example, by using the 10th percentile instead of the mean value, AUC values could only be improved for ADC from 0.958 to 0.975 (p = 0.13) and for D1′ from 0.902 to 0.935 (p = 0.038) and not for D2′. In the additional comparison using ROIs excluding centrally deviating areas for mean value analysis and including such areas for histogram analysis, no significant differences were found for ADC, D1′ and D2′. This means that the use of low percentiles can replace the elaborate manual exclusion of centrally deviating areas without reducing the diagnostic accuracy. At 3.0 T, where there were fewer cases with centrally deviating areas, similar results were obtained but with higher p-values.

At both field strengths, the 5th and 10th percentiles of D*′ led to significantly higher AUC values than the mean values, regardless of whether centrally deviating areas were included, excluded, or excluded only for mean value analysis. For example, by using the 5th percentile instead of the mean value, AUC values could be improved from 0.515 to 0.646 (p = 0.00085) at 1.5 T and from 0.559 to 0.717 (p = 0.0079) at 3.0 T for ROIs excluding centrally deviating areas. This behavior also tended to be observed for f1′. For example, by using the 5th percentile instead of the mean value, AUC values could be improved from 0.622 to 0.708 (p = 0.034) at 1.5 T and from 0.661 to 0.681 (p = 0.74) at 3.0 T for ROIs excluding centrally deviating areas. All other histogram metrics, including skewness and kurtosis, yielded lower or not significantly different AUC values compared to the ROI mean values.

Discussion

The main findings of the present study were: (1) No significant differences in diagnostic performance were found between 2D- and 3D-ROIs even if only slices with good image quality were included. (2) Differentiation was more accurate when centrally deviating areas were excluded from ROIs. (3) When such areas were included, diagnostic accuracy of diffusion sensitive parameters was improved by histogram analysis of the ROIs using low percentiles instead of mean values. (4) Diagnostic accuracy of perfusion parameters, especially of D*′ was improved by histogram analysis using low percentiles instead of mean values, regardless of whether centrally deviating areas were in- or excluded.

To our knowledge, to date no systematic evaluation of different ROI placement and analysis methods for liver lesion analysis by IVIM-derived DWI parameters has been performed. However, it is important for potential clinical use of IVIM DWI techniques for lesion characterization to establish an appropriate ROI placement and analysis strategy as simple as possible that leads to highest possible diagnostic accuracy.

The technically simplest way for ROI placement in clinical practice is to draw a single 2D-ROI on a representative slice encompassing the whole lesion including centrally deviating areas. In scientific studies, however, 3D-volume ROIs are often used, e.g. together with automated segmentation software. In the present work we performed comparisons with respect to ROI-type (2D on a reference slice, 3DA for whole-tumor volume, 3DG considering only “good” slices) and tumor tissue by inclusion and exclusion of centrally deviating areas. For different ROI-types, we did not find significant differences in diagnostic accuracy of ADC and IVIM parameters. Compared to 3D-whole-lesion ROIs (3DA), the inclusion of only “good” slices (3DG) or the selection of a ROI on a reference slice (2D) was expected to improve diagnostic accuracy due to less influence of artifacts, pixel misalignments and partial volume effects. However, this effect was not observed. One reason might be that in case of whole-tumor 3DA volumes negative influences by “bad” slices were compensated by improved statistics due to the higher number of included voxels compared to 3DG and 2D. More voxel averaging and thus better noise robustness was noticeable especially in small lesions (see Table S3). A previous study on prostate cancer also yielded no improved diagnostic performance using 3D-ROIs instead of 2D-ROIs35. Although further studies on a larger population with liver lesions are needed to confirm the findings of this study, the analysis of a central representative slice of “good” image quality seems to be sufficient for reliable lesion discrimination, is applicable in clinical practice and is less time consuming.

The exclusion of centrally deviating areas significantly improved the diagnostic accuracy of diffusion parameters, as expected. For perfusion parameters no differences were found. A previous study on breast lesions also found improved accuracy of differential diagnosis for ADC in ROIs including only viable tissue instead of whole tumor29. Necrosis, cystic areas and scars increase the diffusion coefficient of a lesion at random due to the admixture of varied proportions of high values. Especially in case of necrosis, the malignancy of tumors may be masked by measurement of a higher ADC due to varying amounts of necrotic tissue. Perfusion parameters, in contrast, are low in necrosis, which further reduces the already small values in malignant tumors. In liver metastases, a correlation was found between diffusion parameters and liver tumor necrosis, but not for perfusion parameters36.

For lesion assessment, the exclusion of centrally deviating areas is more time consuming and, therefore, not routine clinical practice, and it can be challenging for inexperienced radiologists. Thus, automated segmentation would be helpful. In this respect, histogram analysis can provide additional quantitative metrics beyond the mean value of a ROI, which reflect the heterogeneity of pathologic changes without additional imaging7. In our study, histogram analysis of ROIs including centrally deviating areas showed that low percentiles led to diagnostic accuracy for ADC and diffusion coefficients similar to that of mean value analysis of ROIs without such areas. Thus, this method may be of use to automatically determine voxels of viable tumor for ADC and IVIM analysis. Some other studies have also shown that diagnostic accuracy of ADC and D in whole-lesion ROI analysis was improved when low percentiles were used instead of mean values, e.g. in predicting microvascular invasion of hepatocellular carcinoma37, differentiating malignancy in breast and testicular lesions31,38, and differentiating grades of prostate cancer39 and gliomas40,41,42.

Furthermore, of special interest is the finding that for the perfusion parameters, especially D*, diagnostic accuracy in lesion discrimination was significantly improved by the use of low percentiles instead of mean values regardless of whether centrally deviating areas were included or excluded or excluded only in case of mean value analysis. Because D* depends on blood flow velocity and length of microvessel segments3,4, this may indicate that differences between benign and malignant lesions exist especially for small vessels. Other studies investigating histogram analysis for IVIM perfusion parameters in liver lesions are rare. There is one other study investigating hepatocellular carcinoma with and without microvascular invasion, but no significant differences were found for parameters D* and f, neither for mean values nor for low percentiles37.

This study has several limitations. First, it was a retrospective study with inherent methodological limitations. For example, due to the lack of raw data, no motion correction of the individual images43 could be performed before averaging. Second, although the total number of lesions included was relatively large, only common lesion types were analyzed, which may affect the generalizability of the results. Also, a relatively small number of patients underwent MRI examination on the 3.0-T MRI system and, therefore, statistical power was lower compared to 1.5 T. We included a typical clinical patient cohort of a large tertiary reference center so that not only large lesions were included. Therefore, a study including more large lesions may show differences between 2D- and 3D-volume measurements. On the other hand, not even tendencies concerning differences between 2D- and 3D-ROIs were found in the present study.

In conclusion, using representative 2D-ROIs seems to be sufficient for reliable liver lesion discrimination in routine clinical practice. Central necrosis, cystic components or scars should be excluded from ROIs either by hand or by computing low percentiles of diffusion coefficients instead of mean values.