Introduction

Ulcerative colitis (UC) and Crohn’s disease (CD) are two forms of inflammatory bowel disease (IBD). Unlike UC, in which the lesions involve only the mucosa and submucosa, CD lesions are penetrating, and patients are more prone to complications such as intestinal obstruction and inter-intestinal fistulas, leading to a higher risk of surgery [1]. Moreover, CD patients often require more systematic and complex drug therapy. Current guidelines emphasize differentiating IBD subtypes through a comprehensive assessment of clinical, endoscopic, pathological, and imaging data [2]. Nonetheless, the clinical symptoms of CD and UC patients often overlap. Although non-caseating granulomas are a characteristic feature of CD, their pathological detection rate is low [3]. In addition, endoscopy and radiology have limited ability to depict transmural inflammation in CD [4]. Consequently, distinguishing between CD and UC can be challenging, especially in patients with only colonic lesions [5].

Visceral adipose tissue (VAT) in IBD is recognized as an endocrine organ associated with disease progression and adverse outcomes, rather than an inert energy-storing tissue [6]. Particularly in CD, the creeping fat surrounding the affected intestinal segment is closely associated with a complex disease phenotype [7, 8]. Previous studies have observed higher VAT content in UC than in CD patients [9]. Researchers have also investigated the difference in visceral fat area (VFA) and the ratio of VFA to the subcutaneous fat area (VSR) at the third lumbar vertebra (L3) level on CT images in CD, UC, and control populations [10]. However, because nutritional status differs between individuals and the accumulation of adipose tissue in CD is usually related to the diseased intestinal segment, using indicators obtained from single-slice images for differential diagnosis can be challenging [5, 11]. Evaluation of whole abdominal CT is currently emerging as the most accurate method for measuring VAT volume [12], and abdominal CT scans are routinely performed in IBD populations for diagnosis and follow-up [13]. Automated VAT volume quantification on CT can be achieved using deep learning models, which facilitates use in large-scale populations [14]. Algorithms can also quantify the VAT distribution based on anatomical location, thus revealing the relationship between different cross-sectional areas and volumes. Studies have analyzed the distribution of VAT in overweight or obese patients [15], but no research has yet extended this method to the IBD population. Therefore, this study aimed to investigate the distribution characteristics of VAT in IBD subjects with the assistance of a deep learning model, in order to distinguish IBD subtypes.

Materials and methods

Study participants

This multicenter retrospective study, conducted at three institutions, was approved by the Research Ethics Committee of each participating center, and the requirement for informed consent was waived. Researchers searched the electronic medical record system to identify patients who underwent abdominal CT scans and were diagnosed with CD or UC between 2012 and 2021. The diagnosis was confirmed according to the World Gastroenterology Organization global guidelines [2]. Patients were excluded for the following reasons: (1) age < 18 years, (2) unavailable CT data, (3) comorbidities including cancer or renal failure, (4) immunotherapy for other reasons within 6 months, (5) prior intestinal surgery. In patients with multiple CT scans, the scan closest to the initial diagnosis was selected.

Following previous studies [10], the control group consisted of patients admitted between 2012 and 2021 with acute appendicitis who were in good health prior to the onset of acute abdominal pain and underwent abdominal CT scans before surgery. Each patient’s sex, age, height, weight, and smoking history were extracted from the electronic medical record, as well as Montreal classification [16], laboratory indicators (serum albumin (Alb); C-reactive protein (CRP); erythrocyte sedimentation rate (ESR)), and surgical records within the following 6 months. The body mass index (BMI) was obtained by dividing the weight by the square of the height, and 24 kg/m² was used as the cut-off point for overweight according to a prior study [17].
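The BMI calculation and cut-off described above can be sketched as follows. This is an illustrative sketch only, not the study's code: the function names are hypothetical, and treating a BMI exactly equal to 24 kg/m² as overweight (an inclusive comparison) is an assumption that the text does not specify.

```python
# Illustrative sketch; names and the inclusive comparison are assumptions.
OVERWEIGHT_CUTOFF = 24.0  # kg/m^2, the cut-off stated in the text


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by the square of height (m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def is_overweight(weight_kg: float, height_m: float,
                  cutoff: float = OVERWEIGHT_CUTOFF) -> bool:
    """True when BMI reaches the cut-off (inclusive, by assumption)."""
    return bmi(weight_kg, height_m) >= cutoff
```

For example, `is_overweight(80, 1.75)` evaluates a BMI of about 26.1 kg/m² against the cut-off.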

Development of an automatic VAT segmentation model

Semi-automatic segmentation of VAT in CT images

Abdominal CT protocols varied across time periods and geographic regions, yet all met the essential technical requirements [18]. IBD patients undergoing CT scans in the outpatient department were required to fast for 4 to 6 h before the scan and to avoid gas-producing liquids. Additionally, they were instructed to ingest 1000 to 1500 mL of aqueous 2.5% mannitol within 45 to 60 min before the scan. For patients with IBD or appendicitis undergoing CT scans in the emergency department, the above preparation was omitted and they proceeded directly to enhanced CT scanning. All patients started with a pre-contrast scan; contrast-enhanced CT was then performed after a rapid bolus of iopromide (Ultravist 370, 370 mg/mL, Bayer Schering Pharma, Berlin, Germany) (1.5 mL/kg) at a rate of 3–5 mL/s, followed by a 20-mL saline flush using a power injector. Images were routinely obtained in the arterial, intestinal, or portal venous phases. All CT scans covered the whole abdomen and pelvic cavity, and the maximal slice thickness was 3 mm. Arterial phase images were selected for further analysis. A semi-automated method was used to quantify the VAT between the dome of the diaphragm and the pubic symphysis on each patient’s CT images [19]; the procedure is described in the supplementary methods.

Construction of an automatic segmentation model

We constructed an automatic VAT segmentation algorithm based on a 3D U-shaped convolutional neural network (CNN); Tables S1 and S2 show detailed information. We trained and validated the segmentation model using fivefold cross-validation on data from center 1. Data from the other two centers were used to validate the generalizability of the model. To assess the model’s reliability and efficiency, we conducted a test in which 30 patients were randomly selected from the 3 centers. VAT regions were delineated with both the semi-automatic and automatic methods, and the time taken by each approach was recorded. The automatic segmentation results were compared with the previous semi-automatic results, and the Dice and Jaccard coefficients were calculated. Furthermore, the elapsed time was compared among semi-automated segmentation, the U-net model, and the U-net model plus manual adjustment.
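The Dice and Jaccard coefficients mentioned above measure the overlap between two binary masks. A minimal sketch is given here, assuming that the masks have been flattened into equal-length sequences of 0/1 values; the function name and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the study.

```python
# Illustrative sketch; not the study's implementation.
from typing import Sequence, Tuple


def dice_and_jaccard(reference: Sequence[int],
                     predicted: Sequence[int]) -> Tuple[float, float]:
    """Return (Dice, Jaccard) for two flattened binary masks."""
    if len(reference) != len(predicted):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as VAT in both masks.
    intersection = sum(1 for r, p in zip(reference, predicted) if r and p)
    ref_size = sum(1 for r in reference if r)
    pred_size = sum(1 for p in predicted if p)
    union = ref_size + pred_size - intersection
    if union == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0, 1.0
    dice = 2 * intersection / (ref_size + pred_size)
    jaccard = intersection / union
    return dice, jaccard
```

The two scores are monotonically related (Jaccard = Dice / (2 − Dice)), so they rank segmentations identically; reporting both mainly eases comparison with other work.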

Extraction of VAT indicators

The VAT volume was quantified using the masks generated by the semi-automated and automated methods. Multiple 2D axial slices from each patient’s CT images were selected for analysis, with detailed steps described in the supplementary methods; 35 representative slices were selected from the first to the fifth lumbar vertebra and the pelvic level. The VAT area on each selected slice was calculated automatically from the labeled mask. Then, the measured lumbar height (vertical height between L1-1 and L5-5) was used to standardize VFA as follows: standardized index (visceral adipose index, VAI) = VFA/(height L1-L5)² (cm²/m²) [20]. We assessed the VAT distribution by calculating the VAT ratio at each level as follows: VAT ratio (%) = (VFA × layer thickness)/VAT volume × 100%. The VFAs of all slices in the lumbar region (L1 to L5) were calculated, and the mean, standard deviation (SD), and coefficient of variation (CV) of the areas were extracted to reflect the distribution. The 316 IBD patients in this study were reported previously [19]. The prior report studied the value of radiomics in the identification of IBD subtypes. The current study included more cases, faster analysis methods, and more interpretable quantitative parameters to extend this finding.
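The three indicators defined above (VAI, VAT ratio, and the mean/SD/CV of lumbar VFAs) can be sketched as follows. This is a hedged illustration rather than the published code: the function names are hypothetical, the lumbar height is assumed to be expressed in metres so that VAI comes out in cm²/m², and the sample SD is used because the text does not state whether a sample or population SD was applied.

```python
# Illustrative sketch; names, units, and the SD variant are assumptions.
from statistics import mean, stdev
from typing import Sequence, Tuple


def visceral_adipose_index(vfa_cm2: float, lumbar_height_m: float) -> float:
    """VAI = VFA / (L1-L5 height)^2, in cm^2/m^2."""
    return vfa_cm2 / lumbar_height_m ** 2


def vat_ratio_percent(vfa_cm2: float, slice_thickness_cm: float,
                      vat_volume_cm3: float) -> float:
    """VAT ratio (%) = (VFA x layer thickness) / VAT volume x 100."""
    return vfa_cm2 * slice_thickness_cm / vat_volume_cm3 * 100.0


def distribution_summary(vfas_cm2: Sequence[float]) -> Tuple[float, float, float]:
    """Mean, SD, and CV (SD / mean) of the VFAs of the lumbar slices."""
    m = mean(vfas_cm2)
    sd = stdev(vfas_cm2)  # sample SD (n - 1 denominator), by assumption
    return m, sd, sd / m
```

A patient whose lumbar VFAs are nearly constant yields a CV near zero, whereas VAT concentrated at a few levels inflates the SD relative to the mean and therefore the CV.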

Statistical analysis

Quantification of VFA, volume, and distribution was performed using Python version 3.7, and the source code is available at https://github.com/CharelBIT/nnUNet-modify. Statistical analyses were performed using SPSS version 26.0 and MedCalc Statistical Software version 20.100. Continuous variables were compared between two groups using Student’s t-test or the Mann-Whitney U-test, and among the three groups using one-way ANOVA or the Kruskal-Wallis test, as appropriate. The χ² test was used to compare categorical variables. The Pearson or Spearman correlation coefficient was used to analyze the correlation between variables. Binary logistic regression and receiver operating characteristic (ROC) analysis were performed to evaluate the potential of indicators to distinguish between CD and UC patients, and the area under the ROC curve (AUC) was compared between the semi-automatic and automatic approaches using the DeLong method. Statistical significance levels were set at p < 0.05 for two-group comparisons, and at 0.0167 (corrected p = 0.05/3 = 0.0167) for three-group comparisons.
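Two of the quantities above lend themselves to a short illustration: the corrected significance level for three groups and the AUC itself. The sketch below is an assumption-laden example, not the analysis performed in SPSS or MedCalc. It computes the AUC through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half) and does not implement the DeLong comparison; the function names are hypothetical.

```python
# Illustrative sketch; the study used SPSS and MedCalc, not this code.
from typing import Sequence


def corrected_threshold(alpha: float, n_comparisons: int) -> float:
    """Significance level divided by the number of comparisons."""
    return alpha / n_comparisons


def auc_by_ranks(scores_pos: Sequence[float],
                 scores_neg: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(scores_pos) * len(scores_neg))
```

Here `corrected_threshold(0.05, 3)` reproduces the 0.0167 level quoted above after rounding, and `auc_by_ranks` could be applied, for instance, to per-patient CV values with CD as the positive class.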

Results

Clinical characteristics of patients

A total of 772 patients (365 CD patients, 241 UC patients, and 166 controls) were included. Figure 1 shows the detailed steps. Comparisons among CD, UC, and controls are shown in Table 1. Compared to the other two groups, the CD group had more males, was generally younger, and had a lower BMI (half were underweight). The prevalence of perianal disease was 37.5% (129/344) in the CD group compared to only 7.6% (18/237) in the UC group. In terms of laboratory indicators, Alb levels were generally low in IBD patients; CRP and ESR levels were significantly higher in CD than in UC (p < 0.001). In addition, 16.7% (61/365) of CD patients had lesions involving only the colon. At follow-up, a significantly higher proportion of CD patients than UC patients had undergone bowel resection within 6 months (24.7% vs. 5.9%).

Fig. 1 Inclusion flowchart of study subjects

Table 1 Comparison of clinical characteristics of all patients

Performance of automatic VAT segmentation models

The network topology of the automatic segmentation algorithm is shown in Fig. 2a. The Dice scores were above 0.90 for the training set and above 0.85 for all the testing and validation sets (Table S3). The VAT volumes obtained from the semi-automatic and automated processes were highly correlated (r = 0.99, p < 0.001). The results of the repeatability test are shown in Table S4. The automatic model took an average of 3.3% (1.14/34.50 min) of the time required by the semi-automatic method to complete the segmentation of a patient. Subsequent adjustments resulted in an improvement of the model’s Dice score, and the automatic method plus manual adjustment process took 15.1% (5.22/34.50 min) of the time of the semi-automatic methods (Fig. 2b, c).

Fig. 2 Construction and efficiency testing of the automatic segmentation algorithm. a The topology of the whole network consists of the encoding and decoding parts. There are five stages in the encoding and decoding subnetworks, so feature maps at five scales were generated for automatic feature extraction. b The time required by the automatic segmentation algorithm, including preprocessing, inference, and post-processing. c The time required by the semi-automated method, the U-net model, and the U-net model plus subsequent adjustments

Comparison of the VAT characteristics among the three groups

The volume of VAT was significantly lower in CD (1584.95 ± 1128.31 cm³) and UC patients (1855.30 ± 1326.12 cm³) than in controls (2470.91 ± 1646.42 cm³, p < 0.001). The intra-subject CVs across 35 slices were 34.08 ± 11.14% in CD, 31.08 ± 9.93% in UC, and 29.92 ± 8.97% in controls, and the CV in CD was significantly higher (p < 0.05). The CVs of VFA at each analyzed level in the three groups are shown in Fig. 3a; the CVs decreased with decreasing vertebral level (except for Pelvis-5), and the trend was consistent among the three groups. The correlation between VFA and VAT volume at each analyzed level in the three groups is shown in Fig. 3b. Correlation coefficients at all levels were greater than 0.80 except at Pelvis-5 (p < 0.01), with the strongest correlation at L3-L4 in CD (r = 0.954) and at the upper part of L3 in UC (r = 0.972) and controls (r = 0.950). Comparisons of VAI at different levels among the three groups are shown in Fig. 3c (Table S5), and the VAIs were generally lower in IBD cases. Comparisons of the VAT ratio are shown in Fig. 3d (Table S6). The trend in UC resembled that of the control group, displaying a relatively even pattern. Conversely, CD showed a concentration primarily in the lower lumbar region. We also described the trends according to sex, as shown in Figure S3b, where the trends of UC and controls were still similar and distinct from that of CD.

Fig. 3 Scatter plots describing various visceral adipose tissue indicators in CD, UC, and control patients. a Inter-subject CV of VFA. b Correlation between VFA and VAT volume. c VAI. d VAT ratio. VAT, visceral adipose tissue; CD, Crohn’s disease; UC, ulcerative colitis; CV, coefficient of variation; VFA, visceral fat area; VAI, visceral adipose index

Differences in VAT distribution among the three groups

The indicators reflecting VAT distribution are summarized in Table 2 (Fig. 4). The mean and SD were similar in the CD and UC groups and lower than in the controls, while the CD group had the largest CV, reflecting the more heterogeneous VAT distribution within the lumbar region of CD patients. The difference in CV between UC and controls was not significant, although UC had a lower mean and SD. The above comparison results were consistent whether semi-automated or automatic segmentation results were used. In addition, we compared the VAT distribution between UC patients with and without perianal fistula and found no significant difference in CV between them (0.27 (0.19, 0.37) vs. 0.24 (0.17, 0.31), p = 0.31). Figure 5 shows coronal CT images of three patients diagnosed with CD, UC, and acute appendicitis. The VAT of the CD patient was markedly more concentrated in the lower lumbar region, while the distribution of VAT in the UC and control patients was more uniform and similar. Furthermore, binary logistic regression showed that after adjusting for clinical indicators, CV was still a predictor of CD (OR = 6.05 (1.17, 31.12), p = 0.03) (Table S7). ROC analysis demonstrated that the diagnostic efficacy of the automatic model was comparable to that of the semi-automatic technique (AUC = 0.810 (0.773, 0.843) vs. AUC = 0.811 (0.774, 0.844), p = 0.38), and both were numerically higher than that of clinical indicators alone (AUC = 0.803 (0.766, 0.836), p = 0.10 and 0.08) (Table 3).

Table 2 Comparison of the mean, standard deviation, and coefficient of variation of visceral fat areas within the lumbar region
Fig. 4 Boxplots of VAT distribution indexes within the lumbar region in three groups. The differences among groups of each index calculated by semi-automatic and automatic segmentation results were consistent. VAT, visceral adipose tissue; CD, Crohn’s disease; UC, ulcerative colitis; SD, standard deviation; CV, coefficient of variation; *p < 0.05, **p < 0.01, ***p < 0.001

Fig. 5 Coronal CT images of three patients diagnosed with CD, UC, and acute appendicitis, with visceral adipose tissue marked in red. a A 24-year-old male CD patient with a BMI of 12.35 kg/m², presenting with ileocolonic CD. The average VFA within the lumbar region was 22.46 cm² with a CV of 66.39%. b A 30-year-old male UC patient with a BMI of 17.82 kg/m², presenting with mild UC. The average VFA was 58.84 cm², and the CV was 30.15%. c A 32-year-old male appendicitis patient with a BMI of 27.71 kg/m². The average VFA was 161.30 cm², and the CV was 28.86%. CD, Crohn’s disease; UC, ulcerative colitis; BMI, body mass index; VFA, visceral fat area; CV, coefficient of variation

Table 3 Comparison between areas under the receiver operating characteristic curve for differentiation between Crohn’s disease and ulcerative colitis groups

Discussion

In this study, a deep learning approach was used to achieve automated quantification of VAT, which was then used to investigate the VAT distribution in CD, UC, and controls. Unlike the relatively uniform VAT distribution in UC and controls, the VAT in CD was more concentrated at the lower lumbar vertebrae, and further analysis revealed a greater variability of VAT distribution in CD patients. Subsequent analyses also confirmed the ability of CV to distinguish between CD and UC patients and showed improved diagnostic performance when CV was combined with clinical indicators.

Previous body composition studies conducted in the IBD population mainly used semi-automated quantification methods based on single-slice images [10, 21, 22], and the frequent use of single-slice areas is possibly due to their simplicity. The development of automated segmentation of body composition has enabled fast and accurate quantification of multi-slice VFAs and VAT volumes [14, 23, 24], which was achieved in this study. Our model showed good accuracy not only in IBD patients but also in individuals diagnosed with acute appendicitis. Moreover, its performance persisted across different CT scanners and healthcare institutions, supporting its robustness and applicability. In addition, this method took much less time than the semi-automatic method.

Differences in VAT content among CD, UC, and control (acute appendicitis) populations have been compared previously. Zhang et al. found that, compared with UC and controls, the CD group had lower VFA but higher VSR in a single CT image at the L3 level, whereas the difference between UC and controls was not significant [10]. Jahnsen et al. quantified body composition through dual X-ray absorptiometry and found significantly lower VAT content in CD than in UC [9]. Clinical imaging analyses have also been performed in IBD patients using representative levels such as L3 or L4, revealing the impact of VAT on CD disease phenotype, activity, and prognosis [20, 21, 25]. Although our analysis showed a high correlation between VFA and VAT volume at all lumbar vertebra levels in the IBD groups, VFA alone is not a reliable indicator for identifying and assessing IBD, given that VAT content is closely related to individual nutritional status and disease duration. Therefore, in the current study, the vertebral level-VAT ratio curve was described to show the trend of VAT distribution, and the CV of VAT distribution in the lumbar region was also calculated. Our results show that the VAT distribution in CD is concentrated in the lower lumbar region compared to the relatively uniform distribution in non-CD groups. Furthermore, this trend was observed consistently in both male and female patients. Subsequent analysis showed that the CV of VAT distribution was notably higher in CD patients than in those with UC. Conversely, no discernible difference between UC and controls was found, which has not been reported previously. Based on these findings, we devised a novel approach for distinguishing CD from UC by integrating CV measurements with pertinent clinical indicators. Notably, CV emerged as a significant predictor of CD.

Abnormal hyperplastic mesenteric adipose tissue (MAT) can be seen around inflamed intestinal segments in CD patients and can occur early in the disease [26]. Under normal conditions, MAT participates in the balance of local and systemic immune microenvironments. It serves as a physical barrier to block the spread of inflammation, absorbs excess fat and sugars, and secretes various anti-inflammatory factors, thereby reducing the severity of intestinal inflammation [27]. However, in CD patients, MAT is a repository for dysfunctional immune cells; it secretes more fatty acids and increases immune cells and extracellular matrix, exacerbating intestinal inflammation [28, 29]. Interestingly, some studies have found that VAT may play a different role in UC. A study by Zulian et al. revealed differential inflammatory gene expression as well as bacterial load in the MAT of CD and UC [30], suggesting that VAT plays an important role in the pathophysiology of CD, but not in UC. Therefore, the function of VAT in UC may be closer to that in healthy individuals than to that in CD. Considering that the distribution of MAT in CD patients is related to the affected intestinal segment, and that CD mainly affects the ileum, this may explain why our results showed a similar VAT distribution between UC and controls, while VAT in CD was more concentrated in the lower abdomen.

This multicenter study has some limitations. First, the retrospective design has inherent flaws. Because the study spanned a long period and several sites, variations in CT scanning protocols, whether within the same or different centers, were practically unavoidable. To address this, we selected only arterial phase CT images for analysis. Furthermore, to mitigate potential heterogeneity, we normalized both the spatial resolution and the intensity of the images before feeding them into the neural network. Second, because of the importance of VAT in CD, our model currently identifies only VAT and does not cover other body composition components such as subcutaneous fat; subsequent studies will include more labels if necessary. Finally, considering the radiation exposure associated with enhanced CT scans, our control group was composed of patients with acute appendicitis rather than healthy volunteers. To mitigate potential confounding effects on the results, we focused only on VAT content without considering the attenuation value.

Using the deep learning algorithm, VAT in CT images can be readily identified, allowing automated extraction of content and distribution information. In CD patients, the distribution of VAT is concentrated at the lower lumbar level and exhibits greater heterogeneity than in non-CD patients. This feature may be a potential biomarker to distinguish CD from UC.