Introduction

Crohn’s disease (CD) is a chronic relapsing and remitting disorder that can involve the entire gastrointestinal tract and cause irreversible intestinal damage [1]. Accurate assessment of disease activity is critical for therapeutic decisions, clinical management, and follow-up of CD patients [1,2,3,4]. Magnetic resonance enterography (MRE) is increasingly used for the diagnosis and assessment of CD because it involves no ionizing radiation and provides excellent soft tissue contrast. Several studies have demonstrated that the magnetic resonance index of activity (MaRIA) correlates well with the Crohn’s Disease Endoscopic Index of Severity (CDEIS) or the simplified endoscopic score for Crohn’s disease (SES-CD) [2, 5, 6] and is widely used for monitoring therapeutic response [2,3,4].

Diffusion-weighted imaging (DWI), the only non-invasive method for assessing the diffusion of water molecules in tissue, has been increasingly applied to evaluate pediatric patients and patients with renal failure. Several studies found that DWI is noninferior to contrast-enhanced MR imaging for the evaluation of inflammation and the diagnosis of complications in CD [7, 8]. The apparent diffusion coefficient (ADC), the quantitative parameter of DWI, has been increasingly used for evaluating disease activity in CD [8, 9]. However, the conventional ADC is based on the assumption that the displacement of water molecules follows an ideal Gaussian distribution without any restriction. In practice, water diffusion in complex biological tissue tends to deviate from a Gaussian distribution, owing to cellular microstructural barriers.

Diffusion kurtosis imaging (DKI) is an advanced DWI model that quantifies the non-Gaussian behavior of diffusion. It yields two additional parameters: the apparent kurtosis coefficient (Kapp), which reflects the degree of deviation from the ideal Gaussian curve, and the apparent diffusion coefficient corrected for non-Gaussian behavior (Dapp) [10]. In recent years, DKI has been widely applied to the preoperative diagnosis, grading, and postoperative surveillance of cancer [11,12,13]. Nevertheless, few studies have applied DKI to evaluate the activity of autoimmune inflammatory diseases, especially inflammatory bowel disease. Hence, the objective of this study was to explore the feasibility of DKI for the evaluation of inflammatory activity in CD.

Materials and methods

Patients

This was a retrospective observational study conducted from July 2016 to September 2018. Consecutive patients who underwent both MRE with a DKI sequence and enteroscopy at our hospital were selected. Informed consent was obtained from all patients. Inclusion criteria were the following: (a) an established diagnosis of CD; (b) an interval between MR examination and enteroscopy within 14 days; (c) good intestinal dilatation that did not affect the assessment. Exclusion criteria were the following: (a) an unconfirmed diagnosis of CD; (b) poor intestinal dilatation that could affect the assessment; (c) an interval between MR examination and enteroscopy exceeding 14 days; (d) bowel resection before MR examination or enteroscopy; (e) contraindications for MR examination. Clinical disease activity was assessed using the Harvey–Bradshaw Index (HBI); C-reactive protein (CRP) and erythrocyte sedimentation rate (ESR) were measured between the two examinations.

Sixty-seven patients were initially identified in our database. Of these, ten were excluded because of bowel resection before the two examinations, one because of a diagnosis of ulcerative colitis, one because the interval between the two examinations exceeded 14 days, and four because of a diagnosis of intestinal tuberculosis. Finally, 51 patients (37 males and 14 females) were included in the study.

MRE protocol

Patients were required to take polyethylene glycol electrolyte (PGE) solution for bowel cleansing the night before the procedure and to drink 1500–2000 ml of 2.5% mannitol solution (depending on the patient’s physique and comfort) 1 h before the procedure. Anisodamine (10 mg) was slowly injected intramuscularly into the buttocks 10 min before MRE to induce gastrointestinal hypotonia.

The MRI examinations were performed on a 3.0 T MR scanner (Magnetom Prisma, Siemens Healthcare, Erlangen, Germany) with two dedicated 32-channel abdominal phased-array coils. Patients were in the supine position. Coronal and axial fat-saturated T2-weighted half-Fourier single-shot turbo spin-echo (HASTE) and coronal HASTE sequences were acquired. Coronal T1-weighted volumetric interpolated breath-hold examination (VIBE) sequences were also obtained at baseline and then 30, 60, 90, and 180 s after injection of 0.2 ml/kg body weight of gadopentetate dimeglumine (10 ml: 4.69 g; Magnevist; Beijing Beilu Pharmaceutical Co., Ltd). DKI was performed in the axial plane during free breathing using a single-shot spin-echo echo planar imaging (SE-EPI) sequence in three orthogonal directions with five b values (200, 500, 1000, 1500, and 2000 s/mm²) before contrast agent administration. The total acquisition time of DKI was 4 min 32 s. The MR protocol and sequence parameters are summarized in Table 1.

Table 1 Scanning parameters of magnetic resonance enterography protocol and DKI

Enteroscopy examination and evaluation

Patients followed a bowel cleansing protocol consisting of oral ingestion of 2000 ml of polyethylene glycol (PEG, Fortrans, Beaufour Ipsen, Tianjin) on the evening before the examination and 2000 ml on the morning of the examination. Enteroscopy was performed under anesthesia with propofol (Fresenius Kabi Austria GmbH, Austria). All enteroscopies (the oral route reaching the jejunum about 350 cm beyond the pylorus, and the anal route reaching the ileum about 150 cm beyond the ileocecal valve) were performed by two experienced endoscopists (L.J. and C.M.) with a single-balloon enteroscope (SIF-Q260; Olympus, Tokyo, Japan). The severity and extent of enteroscopic lesions were retrospectively assessed by an experienced endoscopist (L.Q.), blinded to the MRE results, according to the simplified endoscopic score for Crohn’s disease (SES-CD) [14]. The SES-CD was applied to each segment (jejunum, proximal ileum, terminal ileum, cecum/right colon, transverse colon, left/sigmoid colon, and rectum) to obtain a segmental SES-CD. The terminal ileum was defined as the segment ≤ 10 cm from the ileocecal valve. The proximal ileum was defined as the segment extending from > 10 cm from the ileocecal valve to the boundary between the jejunum and ileum [4]. Each segment was graded as inactive (0–2), mild (3–6), or moderate–severe (≥ 7) by SES-CD. Inactive segments comprised normal segments (entirely normal intestinal mucosa) and segments with lesions but an SES-CD ≤ 2. Normal segments were not scored because the purpose of the study was to retrospectively assess lesion activity. There were 213 normal segments in the study. In total, enteroscopy identified 28 inactive, 49 mild, and 67 moderate–severe segments with lesions.

Imaging analysis

Two radiologists, one junior (C.J., with 3 years of experience in abdominal imaging and MRE) and one senior (X.G., with more than 15 years of experience in abdominal imaging and MRE), independently reviewed the anonymized MRE examinations on a picture-archiving and communication system (PACS) viewing station (Carestream Health Inc, Toronto, Canada). In case of discrepancies, consensus was reached after discussion between the two radiologists. The segmental analysis used the same division as enteroscopy. The following MRE variables were evaluated in each segment: bowel wall thickening (bowel wall thickness > 3 mm with good intestinal dilatation), presence of ulcers (deep depression in the mucosal surface of a thickened bowel segment), presence of mucosal edema (T2 hyperintensity relative to the psoas muscle), and relative contrast enhancement (RCE), which was calculated as RCE = [(SIpost − SIpre)/(SIpre)] × 100 × (SD noisepre/SD noisepost), where SIpre and SIpost denote wall signal intensity, and SD noisepre and SD noisepost denote the standard deviations of noise outside the body, before and after gadolinium injection, respectively. SIpost and SD noisepost were measured on the 3D VIBE images at 70 s after contrast enhancement. Three adjacent circular regions of interest (ROIs) were drawn in each selected segment of abnormal bowel. In case of dominantly intense mucosal enhancement, the ROIs were placed as close as possible to the inner mucosal surface. The mean value of the three ROIs was calculated for each segment. Each ROI had an area between 3 and 6 mm² [15]. To measure the background noise (standard deviation), a single ROI between 60 and 80 mm² was placed outside the abdomen in the same field of view [15]. The MaRIA was calculated as [6]: MaRIA = 1.5 × wall thickness + 0.02 × RCE + 5 × edema + 10 × ulcers.
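To make the arithmetic of the two formulas above concrete, the following Python sketch computes RCE and the segmental MaRIA. The function names and input values are hypothetical and chosen only for illustration; edema and ulcers are assumed to be coded as present (1) or absent (0), and wall thickness is assumed to be in millimetres.

```python
def relative_contrast_enhancement(si_pre, si_post, sd_noise_pre, sd_noise_post):
    """RCE = [(SIpost - SIpre) / SIpre] x 100 x (SD noise pre / SD noise post)."""
    return (si_post - si_pre) / si_pre * 100.0 * (sd_noise_pre / sd_noise_post)


def maria(wall_thickness_mm, rce, edema, ulcers):
    """Segmental MaRIA = 1.5 x thickness + 0.02 x RCE + 5 x edema + 10 x ulcers.

    edema and ulcers are treated as booleans (assumed coding: present = 1, absent = 0).
    """
    return 1.5 * wall_thickness_mm + 0.02 * rce + 5 * int(bool(edema)) + 10 * int(bool(ulcers))


# Hypothetical segment: wall signal doubles after contrast, noise level unchanged.
rce_example = relative_contrast_enhancement(si_pre=200.0, si_post=400.0,
                                            sd_noise_pre=10.0, sd_noise_post=10.0)
maria_example = maria(wall_thickness_mm=5.0, rce=rce_example, edema=True, ulcers=True)
```

In this hypothetical case the ulcer term alone contributes 10 points, which illustrates why the presence of ulcers dominates the segmental score.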

DKI and DWI data were analyzed using an in-house program written in MATLAB (version R2013b, MathWorks, MA, USA). The DKI parameters were fitted according to the following equation [10,11,12,13]:

$$\frac{S_{b}}{S_{0}} = \exp\left[ -b \times D_{\text{app}} + \frac{1}{6} \times b^{2} \times \left( D_{\text{app}} \right)^{2} \times K_{\text{app}} \right],$$
(1)

where S0 is the signal intensity at a b value of 0, and Sb is the signal intensity at a given b value (in s/mm²), which represents the strength of the diffusion weighting. Dapp is the ADC corrected by the non-Gaussian model, representing true diffusivity. Kapp is the kurtosis, indicating the extent to which water molecule diffusion deviates from a Gaussian distribution.
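As an illustration of how Eq. 1 can be fitted, the following Python sketch takes the logarithm of the model, so that ln Sb becomes a quadratic polynomial in b, and solves for the coefficients by ordinary least squares. This is not the in-house MATLAB program used in the study, whose fitting procedure is not described here; the function names `solve3` and `fit_dki` are hypothetical. Because the five b values listed in the protocol do not include b = 0, S0 is treated here as a free parameter, which is an assumption of this sketch.

```python
import math


def solve3(a, y):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, y)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def fit_dki(b_values, signals):
    """Log-linear fit of ln S = ln S0 - b*Dapp + (b^2 * Dapp^2 * Kapp) / 6.

    b_values are in s/mm^2; returns (S0, Dapp in mm^2/s, Kapp).
    """
    bs = [b / 1000.0 for b in b_values]  # rescale b to keep the system well conditioned
    ys = [math.log(s) for s in signals]
    cols = [[1.0] * len(bs), bs, [b * b for b in bs]]  # design columns: 1, b, b^2
    ata = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    aty = [sum(u * y for u, y in zip(ci, ys)) for ci in cols]
    c0, c1, c2 = solve3(ata, aty)
    d_scaled = -c1                         # Dapp in units of 10^-3 mm^2/s
    k_app = 6.0 * c2 / (d_scaled ** 2)     # from c2 = Dapp^2 * Kapp / 6
    return math.exp(c0), d_scaled * 1e-3, k_app
```

A log-linear fit weights noisy high-b signals differently from a nonlinear least-squares fit on the raw signal, so the two approaches can give slightly different Dapp and Kapp values on real data.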

The ADC was obtained by fitting the signals from all b values with a conventional mono-exponential model according to the following equation [11, 12]:

$$S_{b} = S_{0} \times \exp \left( { - b \times {\text{ADC}}} \right),$$
(2)

where S0 is the signal intensity at a b value of 0, Sb is the signal intensity at a given b value, and ADC is the apparent diffusion coefficient.
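For comparison, Eq. 2 reduces to a straight line after taking the logarithm, ln Sb = ln S0 − b × ADC, so the ADC is the negative slope of ln Sb against b. The sketch below shows this under the assumption of a simple least-squares line through all b values; the function name `fit_adc` is hypothetical, and the study's own MATLAB fitting routine may differ.

```python
import math


def fit_adc(b_values, signals):
    """Least-squares line through (b, ln S); returns (S0, ADC in mm^2/s)."""
    n = len(b_values)
    ys = [math.log(s) for s in signals]
    mean_b = sum(b_values) / n
    mean_y = sum(ys) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in zip(b_values, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_b
    return math.exp(intercept), -slope  # ADC is the negative slope
```

When the true signal follows the kurtosis model, the curve of ln Sb bends upward at high b values, so a single straight line through all five b values gives an ADC lower than Dapp, in line with the values reported in Fig. 1.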

Two radiologists (W.K., with 6 years of experience in abdominal imaging and MRE, and W.G., with more than 15 years of experience in abdominal imaging and MRE), who were blinded to the clinical and enteroscopy results, reviewed the MRE images. Because of the low resolution of the DKI data, the target segments were selected location by location based on MRE. Although we spared no effort to match the lesions of each segment location by location between endoscopy and MRE, some inactive (SES-CD ≤ 2) lesions detected on endoscopy were not detected on MRE, because MRE has difficulty displaying minimal lesions such as erosion, erythema, edema, and small ulcers. Compared with the enteroscopy results, 13 inactive and four mild segments were not detected by MRE, whereas all moderate–severe segments were detected. All lesions displayed on MR were also detected by enteroscopy. We therefore included only the lesions detected by both endoscopy and MRE. Each ROI was manually drawn along the border of the brightest signal region of the bowel wall on three consecutive slices (the slice showing the most prominent lesion section and the adjacent slices above and below) [16]. Then, all parameters, including mean ADC, mean Dapp, and Kapp, were calculated voxel by voxel using the whole-volume method.

Statistical analysis

Statistical analysis was performed using IBM SPSS Statistics for Windows, Version 25.0 (IBM Corp., Armonk, NY). All tests were two-sided, with the Type I error set at α = 0.05. Categorical variables are presented as frequencies and percentages. Continuous variables were tested for normality with the Kolmogorov–Smirnov test and for homogeneity of variance with Levene’s test, and are presented as mean ± standard deviation or median (interquartile range). ADC, Kapp, and Dapp were compared among the activity groups by one-way analysis of variance (ANOVA) or the Kruskal–Wallis test. Correlations between diffusion parameters and SES-CD or MaRIA were tested by the Spearman rank test. Receiver operating characteristic (ROC) curves were analyzed to assess the ability of the diffusion parameters to differentiate the inactive from the active group and the inactive–mild from the moderate–severe group, and areas under the ROC curves (AUCs) were calculated. The DeLong test was used to compare AUCs between parameters. The cutoff value was determined by maximizing the Youden index. Interobserver agreement of the parameters between the two readers was assessed using the intraclass correlation coefficient (ICC) with 95% CIs.
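The ROC steps described above can be sketched in a few lines of Python. This is only an illustration of the AUC and Youden index calculations, not the SPSS procedure used in the study; the function names and the example values are hypothetical. The sketch assumes that higher scores indicate the positive group, as for Kapp; for ADC or Dapp, where lower values indicate activity, the values would be negated first.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case scores higher than a negative one.

    Ties count as 0.5 (equivalent to the Mann-Whitney U statistic divided by n_pos * n_neg).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def youden_cutoff(scores_pos, scores_neg):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    A case is called positive when its score is >= cutoff.
    """
    best = None
    for c in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= c for s in scores_pos) / len(scores_pos)
        spec = sum(s < c for s in scores_neg) / len(scores_neg)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1:]


# Hypothetical kurtosis values for active (positive) and inactive (negative) segments.
active = [0.70, 0.80, 0.66, 0.90]
inactive = [0.50, 0.60, 0.68]
auc = roc_auc(active, inactive)
cutoff, sensitivity, specificity = youden_cutoff(active, inactive)
```

Only observed values are tried as candidate cutoffs here; statistical packages often report midpoints between adjacent observed values instead, which can shift the reported threshold slightly without changing sensitivity or specificity.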

Results

Baseline characteristics of the patients

A total of 51 patients (37 males and 14 females) and 127 bowel segments (15 inactive, 45 mild, and 67 moderate–severe) were included in the study. The segments comprised four jejunum, 32 proximal ileum, 43 terminal ileum, 18 cecum/right colon, nine transverse colon, 16 left/sigmoid colon, and five rectum segments. The baseline characteristics of the patients are summarized in Table 2.

Table 2 Characteristics of the Crohn’s disease patients (n = 51)

Correlation between diffusion parameters and SES-CD

Considering all 127 segments, ADC (r = − 0.627, p < 0.001), Dapp (r = − 0.381, p < 0.001) and Kapp (r = 0.641, p < 0.001) were correlated with SES-CD. ADC (r = − 0.563, p < 0.001), Dapp (r = − 0.306, p < 0.001) and Kapp (r = 0.581, p < 0.001) were correlated with MaRIA. Segmental MaRIA was correlated with segmental SES-CD (r = 0.741, p < 0.001).

Regarding jejunum–ileum segments (n = 79), ADC (r = − 0.691, p < 0.001), Dapp (r = − 0.409, p < 0.001) and Kapp (r = 0.686, p < 0.001) were correlated with SES-CD. ADC (r = − 0.631, p < 0.001), Dapp (r = − 0.297, p = 0.008) and Kapp (r = 0.627, p < 0.001) were correlated with MaRIA. Segmental MaRIA was correlated with segmental SES-CD (r = 0.732, p < 0.001).

Regarding colorectal segments (n = 48), ADC (r = − 0.486, p < 0.001), Dapp (r = − 0.374, p = 0.009) and Kapp (r = 0.510, p < 0.001) were correlated with SES-CD. ADC (r = − 0.425, p < 0.001), Dapp (r = − 0.338, p = 0.019) and Kapp (r = 0.530, p < 0.001) were correlated with MaRIA. Segmental MaRIA was correlated with segmental SES-CD (r = 0.755, p < 0.001).

Differences of the parameters of DKI and DWI among different active groups

Considering all 127 segments, ADC (Kruskal–Wallis test), Dapp, and Kapp (ANOVA) were significantly different (all p < 0.001) among the inactive, mild, and moderate–severe groups (Fig. 1a–i). The differences in ADC, Dapp, and Kapp among the inactive, mild, and moderate–severe groups are presented in Fig. 2a–c. When colorectal segments and jejunum–ileum segments were considered separately, ADC, Dapp, and Kapp still differed significantly among the groups (all p < 0.001).

Fig. 1

A 39-year-old woman with moderate–severe CD in the proximal ileum, with an SES-CD of 9 and a MaRIA of 16.3. Coronal and axial contrast-enhanced T1- (a, b) and T2-weighted imaging (c, d) show asymmetric bowel thickening along the mesenteric border (white arrows), sacculations (yellow triangles), and stenosis (white asterisk). Linear hyperintensity (yellow arrow) is seen on axial fat-saturated T2-weighted imaging. Hyperintensity is present on the axial DWI image (e) with a b value of 1000 s/mm². The endoscopic image of the proximal ileum (f) shows a longitudinal ulcer. ADC map (g): the mean ADC is 0.86 × 10⁻³ mm²/s. Diffusivity map (h): the mean Dapp is 1.49 × 10⁻³ mm²/s. Kurtosis map (i): the mean Kapp is 0.65

Fig. 2

ADC, Dapp, and Kapp showed significant differences among the inactive, mild, and moderate–severe groups. ADC (a) and Dapp (b) decreased with increasing disease activity, whereas Kapp (c) increased

ROC analysis for DKI and DWI to differentiate different active groups (Table 3)

Considering all 127 segments, ROC analysis showed that ADC had the highest accuracy (AUC = 0.884, p < 0.001) for differentiating inactive from active segments, with 93.3% sensitivity and 77.7% specificity at a threshold of 0.865 × 10⁻³ mm²/s. The accuracy of Kapp (AUC = 0.867, p < 0.001), with 67.9% sensitivity and 93.3% specificity at a threshold of 0.645, was close to that of ADC. However, Dapp (AUC = 0.726, p = 0.005), with 93.3% sensitivity and 55.4% specificity at a threshold of 1.365 × 10⁻³ mm²/s, was clearly inferior to ADC and Kapp (Fig. 3a).

Table 3 ROC analysis for DKI and DWI to differentiate from inactive to active and from inactive–mild to moderate–severe
Fig. 3

ROC analysis considering all 127 bowel segments. For differentiating the inactive from the active group (a), ADC had the highest accuracy (AUC = 0.884, p < 0.001), which was slightly higher than that of Kapp (AUC = 0.867, p < 0.001) and clearly higher than that of Dapp (AUC = 0.726, p = 0.005). For differentiating the inactive–mild from the moderate–severe group (b), ADC also had the highest accuracy (AUC = 0.846, p < 0.001), which was minimally higher than that of Kapp (AUC = 0.843, p < 0.001) and clearly higher than that of Dapp (AUC = 0.690, p < 0.001)

ADC (AUC = 0.846, p < 0.001), with 71.7% sensitivity and 89.6% specificity, and Kapp (AUC = 0.843, p < 0.001), with 71.6% sensitivity and 83.3% specificity, showed similar accuracy for differentiating inactive–mild from moderate–severe at thresholds of 0.825 × 10⁻³ mm²/s and 0.645, respectively. Dapp (AUC = 0.690, p < 0.001), with 66.7% sensitivity and 70.1% specificity at a threshold of 1.375 × 10⁻³ mm²/s, was clearly worse than ADC and Kapp for differentiating inactive–mild from moderate–severe (Fig. 3b). The DeLong test suggested that there were no significant differences between the AUCs of ADC and Kapp for differentiating inactive from active (p = 0.895) or inactive–mild from moderate–severe (p = 0.522).

Considering jejunum–ileum segments, the accuracy of Kapp (AUC = 0.877, p < 0.001) for differentiating inactive from active was minimally lower than that of ADC (AUC = 0.891, p < 0.001). Similar results were found for differentiating inactive–mild from moderate–severe for ADC (AUC = 0.868, p < 0.001) and Kapp (AUC = 0.856, p < 0.001). However, the accuracy of Dapp was clearly lower than that of ADC and Kapp for differentiating inactive from active (AUC = 0.739, p = 0.015) and for differentiating inactive–mild from moderate–severe (AUC = 0.664, p = 0.012). The DeLong test suggested that there were no significant differences between the AUCs of ADC and Kapp for differentiating inactive from active (p = 0.830) or inactive–mild from moderate–severe (p = 0.799).

Considering colorectal segments, Kapp had slightly higher accuracy than ADC for differentiating inactive from active (AUC = 0.879, p < 0.001 vs. AUC = 0.872, p < 0.001) and inactive–mild from moderate–severe (AUC = 0.822, p < 0.001 vs. AUC = 0.801, p < 0.001). Dapp was unable to differentiate inactive from active (p = 0.147) and had significantly lower accuracy than ADC and Kapp for differentiating inactive–mild from moderate–severe. The DeLong test suggested that there were no significant differences between the AUCs of ADC and Kapp for differentiating inactive from active (p = 0.953) or inactive–mild from moderate–severe (p = 0.774).

Interobserver agreement

Interobserver agreement between W.K. and W.G. was excellent for ADC [ICC = 0.903, (95% CI 0.891–0.932)], Dapp [ICC = 0.897, (95% CI 0.882–0.915)], and Kapp [ICC = 0.911, (95% CI 0.901–0.938)].

Discussion

In our study, ADC and Dapp values correlated inversely, and Kapp correlated positively, with disease activity. ADC and Kapp had similar AUCs for differentiating inactive from active and inactive–mild from moderate–severe disease; however, the AUCs of Dapp were clearly lower than those of ADC and Kapp, both when all segments were analyzed together and when jejunum–ileum and colorectal segments were analyzed separately.

The ADC derived from DWI quantitatively reflects water molecule diffusion in tissue under the assumption of a Gaussian distribution. Dapp is the true diffusivity, and Kapp represents the peakedness of the distribution of tissue diffusivity under the non-Gaussian model, which is believed to be associated with microstructural complexity in vivo [10]. With increasing CD lesion activity, lymphoid tissue and capillary proliferation and inflammatory cell infiltration become more prominent, which results in greater restriction of water molecule movement and manifests as a lower ADC. Likewise, these changes, coupled with nuclear heterogeneity and cellular complexity, result in lower Dapp and higher Kapp. Interestingly, in our study, the AUCs of Dapp were clearly lower than those of ADC and Kapp. Possibly, tissue complexity changes more markedly than water molecule diffusion with increasing CD lesion activity. Moreover, ADC calculated under the ideal Gaussian assumption cannot fully and precisely reflect water molecule diffusion in lesions. Clinically, DKI and DWI could be alternatives to intravenous administration of a gadolinium chelate for assessing disease activity. In addition, DKI could provide more information about lesions.

Recently, two studies [17, 18] reported that DKI is clinically feasible for evaluating the activity of autoimmune inflammatory diseases. Furthermore, Huang et al. [18] found that DKI was able to accurately grade disease activity in CD and was even superior to ADC. In our study, the optimal thresholds for differentiating inactive from active were 0.865 × 10⁻³ mm²/s for ADC and 0.645 for Kapp. Both thresholds were lower than those reported in the previous study [19]. A similar trend was observed for differentiating inactive–mild from moderate–severe. These differences were possibly related to the imaging equipment, acquisition parameters, and post-processing software used in our study. In addition, unlike in a recent study, Kapp had accuracy similar to that of ADC for differentiating inactive from active or inactive–mild from moderate–severe. A possible reason is that all lesions in our study were grouped based on SES-CD rather than MaRIA [18].

Considering the physiological differences between the jejunum–ileum and the colorectum, jejunum–ileum and colorectal segments were also analyzed separately in this study. We found that the correlations of ADC, Dapp, and Kapp with SES-CD were higher in the jejunum–ileum than in the colorectum, but the accuracies for differentiating inactive from active or inactive–mild from moderate–severe in the colorectum were similar to those in the jejunum–ileum. Therefore, we decided to analyze jejunum–ileum and colorectal segments together.

In this study, five b values (200, 500, 1000, 1500, and 2000 s/mm²) were applied to acquire DKI images. Previous studies [11, 16] have demonstrated that perfusion has a prominent influence at low b values (usually < 200 s/mm²); conversely, very high b values increase the chance of image distortion and susceptibility artifacts, especially in the bowel. Therefore, also considering acquisition time, we selected the above five b values at our institution.

Regarding ROI placement on DKI, each ROI was manually drawn along the border of the brightest signal region of the bowel wall, instead of using fixed-size (i.e. 6–8 mm²) round ROIs [15]. We suggest that placing small fixed-size round ROIs might introduce selection bias. A recent study [20] suggested that analyzing a larger number of pixels may yield more reproducible results for the measured parameters. Furthermore, we drew ROIs on three consecutive slices (the slice showing the most prominent lesion section and the adjacent slices above and below), which could decrease sampling variation between cases and increase the reliability of the results [16].

There are some limitations in our study. Firstly, this was a retrospective study, so there were potential selection biases. Secondly, the number of bowel segments was relatively small, because the number of patients was limited and normal segments were excluded. In addition, there were considerably fewer inactive segments than mild or moderate–severe segments, because most inactive lesions and some mild lesions, such as erosion, erythema, edema, and small ulcers, can be detected by endoscopy but are hardly displayed on conventional MR images, especially on high b value images because of the decreased signal-to-noise ratio. Our results therefore may not fully and precisely reflect the differences in diffusion parameters among lesions of different activity. Thirdly, because of the decreased signal-to-noise ratio at high b values and the thin bowel wall, ROIs drawn along the lesion outline inevitably included some intestinal contents. Finally, although we found DKI to be comparable to conventional DWI for assessing disease activity in our study, the feasibility of DKI for evaluating disease activity in CD needs to be confirmed in large prospective multicenter studies.

In conclusion, this study suggests that DKI is clinically feasible and helpful for evaluating disease activity in CD. DKI is comparable to conventional DWI for grading disease activity and can provide more information about lesions.