1 Introduction

About 50% of the patients consulting a gastroenterology clinic report symptoms without detectable cause. Colonic content is a potential mechanism involved in their symptoms. Research on colonic metabolic activity and its variations caused by digestive dysfunctions or diets requires non-invasive measurement of colonic volumes and contents based on medical imaging.

Although the diseases under study can be particularly disturbing, they are not life-threatening, and therefore irradiation, if used at all, is to be kept to a minimum. Hence, non-ionizing imaging techniques play an important role, as they allow acquiring data from patients with low-severity diseases or from healthy volunteers. Furthermore, some clinical studies expressly reject both preparation (fasting and/or enema) and contrast administration. For all these reasons, CT imaging, which has traditionally been used for colon analysis, is not an option.

The clinical analysis of the colon is typically performed on T2-weighted MRI, and its goal is to measure the specific volumes of the colon segments (ascending \(S_{ asc }\), transverse \(S_{ trv }\), descending \(S_{ dsc }\) and sigma-rectum \(S_{ \sigma }\)).

Experts use specific tools for colon segmentation on MRI, but it is a complex task due to the high variability of the colon anatomy (especially in the \(S_{ \sigma }\) segment) and the adjacency of regions with similar intensity levels, such as the small bowel, liver or muscular tissues. Techniques used for CT colon segmentation are not applicable to MRI, since there is no fixed correspondence between tissue types and intensities. Furthermore, MRI suffers from higher levels of noise and artifacts that have an impact on segmentation algorithms.

There are few references in the literature on T2-MRI colon segmentation of unprepared subjects; here we review the most relevant. In [1] the authors opted for the simplest segmentation strategy: slice-by-slice manual selection. To ease the selection within slices, [2] improved usability by providing a seeded region growing mechanism combined with the ability to add stop markers to prevent leaking. A different proposal is presented in [3], which requires the user to define a Region Of Interest (ROI) by outlining the colon manually with a polygon in all coronal slices. Their approach is based on k-means clustering in the intensity space within the ROI to separate colon from background. The authors do not provide comparison metrics against ground truth, only overlap measures between segmentations obtained by different users on the same image.

In another category of applications, Mahapatra et al. [4] describe a fully automatic segmentation of colon areas affected by Crohn’s disease on T1-FS images acquired under fasting conditions and with contrast administration. The authors build a Random Forest classifier that discriminates healthy from diseased colon regions. Finally, the authors in [5] segment only one 2D colon section along a temporal sequence of T2-HS (cine-MRI). Their strategy is based on a set of user marks placed inside and outside the colon in the first frame. Segmentation is driven by 2D graph-cuts.

As far as we know, there is a lack of algorithms for T2 colon segmentation on MRI images acquired without contrast or colon preparation. In summary, the contributions of this paper are outlined below:

  • A new approach for colon segmentation based on the detection of the colon medial line and on a colon probability model that feeds a 3D graph-cuts algorithm to produce the final result. Our dual probabilistic model uses training information for a preliminary segmentation and unsupervised clustering for the final segmentation.

  • A set of novel probabilistic tubularity filters that allow detecting generalized tubular structures with large radius and non-circular sections. Moreover, a set of fast algorithms has been developed to coarsely segment the areas adjacent to the colon (liver, psoas, spine and inner abdominal fat layer) as voxel models, and thereby reduce the search space.

  • Our segmentation algorithm is suitable for clinical use since it provides a low-effort, accurate colon segmentation in T2-weighted MRI images acquired with neither colon preparation nor contrast administration.

  • Our approach achieves a remarkable reduction in expert interaction. The full colon segmentation requires 5 min of user interaction (UI) and 5 min of CPU processing. In contrast, current manual solutions require from 20 up to 40 min of intensive work.

Fig. 1. Overview of the segmentation pipeline. Four stages are shown: (T) Tubularity filter aimed at detecting colon candidate areas. (A) Search space delimitation, which discards adjacent regions. (M) Medial line extraction, based on a set of points provided by the user and on information from (T) + (A). (S) Segmentation stage, which combines (M) + (A) with training data to obtain the final segmentation in a two-phase graph-cuts.

2 Method Overview

Our segmentation strategy consists of modeling the colon as a generalized tube with a characteristic probabilistic distribution of intensities and radius, differentiated for each colon segment. In addition, it is convenient to exclude some adjacent organs from the search space, since their similar intensities cause low-contrast boundaries. Our algorithm combines both approaches with a simple user input to obtain a precise colon segmentation. Note that the algorithm is fully 3D, working on a 3D voxel model built from the acquired images. Figure 1 shows the segmentation pipeline, which relies on four stages:

Fig. 2. Probability distribution \(p(D, \hat{I}|C_{S_{trv}})\) of the transverse colon.

  1. Tubularity filter: We propose a tubularity filter, evolved from ideas in [6, 7], aimed at detecting generalized non-uniform tubes. Because the filter must tolerate large deviations from perfect tubularity, its output is noisier than that of regular tubularity filters, and its result alone is not sufficient for colon segmentation. The tubularity feature of each voxel includes its direction, a tubularity measure and an estimated radius, which are used in subsequent stages for the selection of colon candidate areas and for spinal cord detection.

  2. Non-colonic area delimitation: In this stage, fast tailored algorithms coarsely segment voxel models of the four anatomic structures that lie closest to the colon: liver, psoas + pelvic muscles, spinal cord + spine and the inner abdominal fat capsule. Their location is valuable for preventing leaks that may be caused by low-contrast boundaries. The output of this stage is a set of binary masking volumes delimiting the segmented structures.

  3. Colon medial line extraction: The estimation of the colon medial line is the starting point of our segmentation strategy. The user has to provide a minimal set of 5 anatomical reference points along the colon path that are easily located by specialists on MRI: cecum, hepatic angle, splenic angle, descending-sigma interface and anus. Depending on the anatomical complexity of the case under study, further points can be added to guide the location of the medial path. On this basis, we build a graph whose nodes represent connected sets of voxels with similar intensities. The graph edge cost penalizes paths of low tubularity, tube direction changes and high intensities. The medial path is obtained as the union of the lowest-cost paths between pairs of consecutive points provided by the user.

  4. Colon graph-cuts based segmentation: The last stage performs the colon segmentation. It requires information from the previous stages (medial line (M), search space (A) and intensity (I)) and information from a training database, which is computed in a pre-process and updated after each new segmentation. In the training phase, we use the gold-standard segmentations (see Sect. 3) to estimate \(p(D, \hat{I})\), the joint probability function of the normalized intensity \(\hat{I}\) and the distance to the colon medial line, D. The intervals of \(\hat{I}\) and D are quantized, and the probability function is stored in our training database as a 2D histogram. The probability is analyzed within a ROI defined by those voxels having D below \(1.5 \cdot CMR \), where \( CMR \) denotes the Colon Maximum Radius (30 mm).

The intensity is normalized using the statistics of the values along the medial path, \(\hat{I}=(I-\mu _{medial}) /\sigma _{medial}\), which compensates for the effects of MRI intensity variability. \(p(D, \hat{I})\) is analyzed independently within each colon segment and outside the colon (\(\overline{C}\)). Hence, we obtain 5 probability distributions denoted by \(p(D, \hat{I}| region )\), where \( region \in \{\overline{C},C_{S_{asc}},C_{S_{trv}}, C_{S_{dsc}}, C_{S_\sigma }\}\). Figure 2 shows the distribution \(p(D, \hat{I}|C_{S_{trv}})\) in the transverse colon. The training also provides, per segment, statistics of D for the voxels on the colon boundaries, \(\{ DB _\mu ^{sg}, DB _\sigma ^{sg}\}\).
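The training statistics described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the number of bins and the clipping range of \(\hat{I}\) are assumptions chosen for illustration, since the text only states that both variables are quantized. The inputs are a 3D intensity volume, a boolean mask of the medial path, a per-voxel distance map D in mm and a boolean mask of one region.

```python
import numpy as np

CMR_MM = 30.0  # Colon Maximum Radius stated in the text


def normalize_intensity(volume, medial_mask):
    """Return (I - mu_medial) / sigma_medial, using the voxels on the medial path."""
    medial_values = volume[medial_mask]
    return (volume - medial_values.mean()) / medial_values.std()


def joint_histogram(dist_mm, norm_intensity, region_mask,
                    d_bins=30, i_bins=32, i_range=(-4.0, 4.0)):
    """Estimate p(D, I_hat | region) as a normalized 2D histogram.

    d_bins, i_bins and i_range are illustrative assumptions, not values
    taken from the paper. Only voxels with D < 1.5 * CMR are counted.
    """
    roi = region_mask & (dist_mm < 1.5 * CMR_MM)
    hist, d_edges, i_edges = np.histogram2d(
        dist_mm[roi], norm_intensity[roi],
        bins=(d_bins, i_bins),
        range=((0.0, 1.5 * CMR_MM), i_range),
    )
    total = hist.sum()
    if total > 0:
        hist = hist / total  # counts -> probabilities
    return hist, d_edges, i_edges
```

One such histogram would be built per region (the four colon segments and the non-colon class), accumulating the voxels of all training volumes.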

The last segmentation stage comprises the following three steps:

    (a) From \(p(D, \hat{I}| region )\) obtained in the training and the colon medial line (M), we derive the probability of a voxel being colon, \(p(C|D, \hat{I}, sg )\), and not-colon, \(p(\overline{C}|D, \hat{I},sg)\), depending on its normalized intensity, medial distance and nearest segment \( sg \in \{C_{S_{asc}},C_{S_{trv}}, C_{S_{dsc}}, C_{S_\sigma }\}\). At this point we perform the preliminary segmentation. A graph \(\mathcal {G}_1\) is created where nodes represent ROI voxels and edges represent voxel neighborhoods. In order to segment via graph-cuts we add two extra nodes, sink (colon) and source (not-colon), and use \(p(C|D, \hat{I}, sg )\) and \(p(\overline{C}|D, \hat{I}, sg )\) to build the sink and source costs in the regional term R of the graph-cuts algorithm. The result is the preliminary colon segmentation, \(S_1\).

    (b) \(S_1\) is a conservative colon segmentation, but it is not accurate enough. In the second step we cluster the intensities in the area outside \(S_1\) within the ROI, using an Expectation Maximization (EM) algorithm. We look for two modes, one corresponding to fat tissues (high intensity) and the other corresponding to other organs or misclassified colon (medium intensity). At the end of the process we obtain the probabilities of a voxel being C or \(\overline{C}\) as functions of the intensity: \(p(C|I)\) and \(p(\overline{C}|I)\). The model is a mixture of three Gaussians: the two obtained from EM and the Gaussian model of the \(S_1\) intensities.

    (c) Using the boundary distance statistics \(\{ DB _\mu ^{sg}, DB _\sigma ^{sg}\}\) from training, we derive \(p(C|D,sg)\) and \(p(\overline{C}|D,sg)\), which describe the probability of a voxel being colon/not-colon depending on its medial distance and the corresponding colon segment (Eq. 1).

      $$\begin{aligned} p(C|D,sg )=\dfrac{1}{2}(1- erf ( \dfrac{D- DB _{\mu }^{ sg }}{\sqrt{2} DB _{\sigma }^{ sg }})) \;\;\;\;;\;\;\;\;p(\overline{C}|D, sg )=1-p(C|D, sg ) \end{aligned}$$
      (1)

      Finally, we merge (Eq. 2) the colon/not-colon probabilities based on intensity (from clustering) with the probabilities based on the medial distance (from training) in order to build the probabilities of a voxel being colon/not-colon as functions of intensity, medial distance and segment: \(p(C|D,I,sg)\) and \(p(\overline{C}|D,I,sg)\) (Eq. 3). As in step (a), we build a graph \(\mathcal {G}_2\) using these new probabilities in the regional terms of graph-cuts. The new result, \(S_2\), is more accurate than \(S_1\). If the segmentation \(S_2\) is not satisfactory, the user can add markers (positive or negative) on misclassified areas. Marker information is used to update the regional term costs of the affected nodes of the graph, and graph-cuts is run again to obtain a corrected version of the segmentation. The refinement is accomplished in near real-time, which allows the user to add markers interactively.

      $$\begin{aligned} \mathcal {M}( p_D,p_I )&= p_D (1+2 (1-p_D ) (p_I-0.5)) \end{aligned}$$
      (2)
      $$\begin{aligned} p(c|D,I,sg )&= \mathcal {M}( p(c|D,sg ), p(c|I) )\, \text{ where } \,\, c \in \{C,\overline{C}\} \end{aligned}$$
      (3)
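Equations 1 to 3 can be evaluated per voxel with a few lines of Python. The sketch below is only an illustration under stated assumptions: the function names are ours, and the boundary statistics used in the example (20 mm and 5 mm) are made-up values, not results from the training database.

```python
import math


def p_colon_given_distance(d_mm, db_mu, db_sigma):
    """Eq. 1: p(C | D, sg) from the boundary-distance statistics of a segment."""
    z = (d_mm - db_mu) / (math.sqrt(2.0) * db_sigma)
    return 0.5 * (1.0 - math.erf(z))


def merge(p_d, p_i):
    """Eq. 2: M(p_D, p_I) = p_D * (1 + 2 * (1 - p_D) * (p_I - 0.5))."""
    return p_d * (1.0 + 2.0 * (1.0 - p_d) * (p_i - 0.5))


def colon_probabilities(d_mm, p_c_i, p_nc_i, db_mu, db_sigma):
    """Eq. 3: return (p(C | D, I, sg), p(not C | D, I, sg)) for one voxel."""
    p_c_d = p_colon_given_distance(d_mm, db_mu, db_sigma)
    p_nc_d = 1.0 - p_c_d
    return merge(p_c_d, p_c_i), merge(p_nc_d, p_nc_i)


# Illustrative values only (hypothetical boundary statistics).
p_c, p_nc = colon_probabilities(d_mm=18.0, p_c_i=0.7, p_nc_i=0.3,
                                db_mu=20.0, db_sigma=5.0)
```

Two properties of the merge function follow directly from Eq. 2: an uninformative intensity term (\(p_I = 0.5\)) leaves the distance-based probability unchanged, and when \(p(\overline{C}|I) = 1 - p(C|I)\) the two merged probabilities sum to one.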
Fig. 3. Ring filter.

2.1 Tubularity Detection Filter

We propose a new Tubularity Detection Filter (TDF) that is built as a combination of two filters: the ring filter (RF), which computes for each voxel a tubularity measure RF and its associated radius s, and the directional filter (DF), which estimates the tube direction \(P_\theta \). Applying TDF to a voxel \(\varvec{v}\) characterizes it by the triple (RF(\(\varvec{v}\)), s(\(\varvec{v}\)), \(P_\theta (\varvec{v})\)).

The Ring Filter (RF) works on a set of planar ring-shaped vectorial kernels \(k_{(s,\theta )}(\varvec{x})\) that lie in a plane orthogonal to \(P_\theta \) and have a scale s (Fig. 3). Multiple kernels are necessary to cover different tube sizes and 3D orientations. For this purpose, 13 scales s (\(s \in [7\,\text{mm}, 31\,\text{mm}]\) in steps of 2 mm) and 13 directions \(\theta \) (pointing to the 26 neighbour voxels) are used. Kernel values are 0 outside the \(P_\theta \) plane and follow a radial Gaussian distribution on the \(P_\theta \) plane, \(k_{(s,\theta )}(\varvec{x}) = \mathcal {N}_{(\mu =0,\,\sigma =\frac{2\gamma s}{3})}(||\varvec{x}||-s(1-\gamma )) \), where \(\gamma =0.25\).
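A minimal sketch of the kernel parameters may help to visualize the ring shape. It makes three assumptions that are not stated in the text: \(\mathcal {N}\) is taken to be the normalized Gaussian density, only the scalar radial magnitude of the (vectorial) kernel is computed, and the 13 directions are read as one direction per pair of opposite neighbours among the 26. All names are ours.

```python
import itertools

import numpy as np

GAMMA = 0.25
SCALES_MM = np.arange(7.0, 32.0, 2.0)  # 7, 9, ..., 31 mm


def ring_profile(r_mm, s_mm, gamma=GAMMA):
    """Radial kernel magnitude at in-plane distance r_mm for scale s_mm.

    Gaussian with sigma = 2 * gamma * s / 3 evaluated at r - s * (1 - gamma),
    so the ring peaks at r = s * (1 - gamma).
    """
    sigma = 2.0 * gamma * s_mm / 3.0
    offset = np.asarray(r_mm, dtype=float) - s_mm * (1.0 - gamma)
    return np.exp(-0.5 * (offset / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def half_neighbour_directions():
    """Unit vectors towards the 26 neighbours, one per antipodal pair."""
    directions = []
    for d in itertools.product((-1, 0, 1), repeat=3):
        # Lexicographic test: drops (0, 0, 0) and keeps one of each +/- pair.
        if d > (0, 0, 0):
            v = np.array(d, dtype=float)
            directions.append(v / np.linalg.norm(v))
    return directions
```

With \(\gamma = 0.25\), a kernel of scale s = 20 mm peaks at a radius of 15 mm with a standard deviation of about 3.3 mm.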

The tubularity measure \(RF(\varvec{v})\) of a voxel \(\varvec{v}\) is computed as \(RF(\varvec{v})=\max _{s,\theta } (M_{ ring }^{s,\theta } (\varvec{v})\cdot M_{ sym }^{s,\theta }(\varvec{v}))\), where the maximum is taken over all 13 scales and 13 directions. Its associated radius \(s(\varvec{v})\) is the scale at which this maximum is attained, i.e. the scale component of \(\mathop {\hbox {arg max}}\nolimits _{s,\theta } ( M_{ ring }^{s,\theta } (\varvec{v})\cdot M_{ sym }^{s,\theta }(\varvec{v}))\).

\(M_{ring}^{s,\theta }\) is the result of convolving the filter kernel \(k_{(s,\theta )}(\varvec{x})\) with the magnitude of the projection of the intensity gradient at \(\varvec{v}\) on the \(P_\theta \) plane. The symmetry measure \(M_{sym}\) weights \(M_{ring}\) in order to penalize partly open areas. We compute \(M_{sym}\) by analysing partial convolutions \(M_{ringsectors}\) on angular sections \(\phi \) of the \(P_\theta \) plane (see Fig. 3). We group the sections into pairs P, each pair including one section and its symmetric opposite. For each P, we calculate the symmetry pair value \(( SPV )\) as the square of the geometric mean divided by the arithmetic mean of P. \(M_{ sym }\) is the mean of all \( SPV \).
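The symmetry measure can be sketched as follows. The wording of the SPV definition admits two readings; this sketch assumes the normalized one, \(SPV = (GM/AM)^2\), which lies in [0, 1] and equals 1 only when both sectors of a pair respond equally. The other reading, \(GM^2/AM\), would give the harmonic mean of the pair instead. The function name, the ordering of the sectors by angle and the non-negativity of the responses are also assumptions.

```python
import numpy as np


def symmetry_measure(sector_responses, eps=1e-12):
    """Mean symmetry pair value over opposite angular sectors.

    sector_responses: non-negative partial ring responses, ordered by angle,
    with an even count so that sector k and sector k + n/2 are opposite.
    Assumed reading: SPV = (geometric mean / arithmetic mean) ** 2.
    """
    r = np.asarray(sector_responses, dtype=float)
    if r.size % 2 != 0:
        raise ValueError("an even number of sectors is required")
    half = r.size // 2
    a, b = r[:half], r[half:]          # each sector and its opposite
    geometric = np.sqrt(a * b)
    arithmetic = 0.5 * (a + b)
    spv = (geometric / np.maximum(arithmetic, eps)) ** 2
    return float(spv.mean())
```

Under this reading a closed ring with uniform responses scores 1, while a ring that is open on one side loses the contribution of every pair that contains an empty sector.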

Finally, the DF filter is used to improve the estimation of the tube direction. To this end, we trace 92 uniformly spaced directions from \(\varvec{v}\) plus 45 evenly spaced sample directions in the coronal plane. The tube direction \(D(\varvec{v})\) is estimated as the ray direction that has the minimum average squared difference to the central voxel \(\varvec{v}\).

3 Evaluation and Results

The evaluation experiments tested our segmentation approach from three different perspectives: its accuracy with respect to manual gold-standard segmentations, its computational and user interaction cost, and its usability.

The data set used for the accuracy evaluation includes 30 T2-weighted HASTE volumes (256\(\,\times \,\)256\(\,\times \,\)50). Images were obtained from 15 healthy volunteers before and after defecation as part of a clinical experiment to determine the effect of diet on colonic content volume. This data set was segmented accurately by experienced specialists using the BowelMetryRM (BMRM) software (described in [2]). This manual segmentation is considered the gold standard, and is used both for training and validation following a leave-one-out methodology.

The Dice Similarity Coefficient (\( DSC \)) and Sensitivity (\( SENS \)) measures were selected for comparing our segmentation results against the gold standard. Both measures are studied with respect to the number of points introduced by the specialist in stage M (see Fig. 1) of our approach. Figure 5 shows the evolution of \( DSC \) and \( SENS \) from 5 up to 16 path points. Segmentation with 5 points reaches a mean \( DSC \) above 0.73, but the variability of the results (\(\sigma _{ DSC }=0.17\)) advises using a larger number of points. The segmentation accuracy saturates near 12 points, reaching medians around 0.85 for \( DSC \) and 0.86 for \( SENS \), with standard deviation (SD) values \(\sigma _{ DSC }=0.05\) and \(\sigma _{ SENS }=0.07\) respectively.
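For reference, both measures can be computed from two binary masks as shown in the sketch below. It uses the standard definitions, \(DSC = 2|A \cap B| / (|A| + |B|)\) and \(SENS = |A \cap B| / |B|\), with A the evaluated segmentation and B the gold standard; the function name is ours and the text does not describe how the authors implemented the evaluation.

```python
import numpy as np


def dice_and_sensitivity(segmentation, gold_standard):
    """Return (DSC, SENS) for two binary masks of identical shape."""
    seg = np.asarray(segmentation, dtype=bool)
    ref = np.asarray(gold_standard, dtype=bool)
    if seg.shape != ref.shape:
        raise ValueError("masks must have the same shape")
    n_ref = np.count_nonzero(ref)
    if n_ref == 0:
        raise ValueError("gold standard mask is empty")
    true_positives = np.count_nonzero(seg & ref)
    dsc = 2.0 * true_positives / (np.count_nonzero(seg) + n_ref)
    sens = true_positives / n_ref
    return dsc, sens
```

An over-segmentation that fully covers the gold standard yields a sensitivity of 1 but a Dice coefficient below 1, which is why the two measures are reported together.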

Fig. 4. Details of the segmentation results for a case with a highly convoluted sigma colon. (a) Original image and two ROIs. (b.1) and (b.2) depict the ROIs, with the gold-standard segmentation (left) and our segmentation (right). (c) Volume rendering (front view, back view) of our segmentation.

Fig. 5. Tukey boxplot of the \( DSC \) and \( SENS \) measures for our segmentation against the gold standard, as a function of the number of path points used.

Fig. 6. \( DSC \) improvement with the addition of manual correction markers.

In the last stage of the approach (S), the addition of user markers for segmentation correction can improve the accuracy of the results, as shown in Fig. 6. After the addition of 5 markers, the overall median \( DSC \) reaches 0.913, with \(\sigma _{ DSC }=0.016\).

From a clinical point of view, we have compared the effort of manual segmentation using the BMRM software with that of our proposal. First, a BMRM expert segmented three new cases, not included in the validation experiments, using our algorithm after a ten-minute briefing on how to operate the software. Next, the specialist segmented the same volumes again using BMRM. Note that, for fairness, any learning effect from this ordering can only benefit the manual segmentation. Time and mouse usage were tracked throughout the execution.

The results show that manual segmentation takes on average 25 min of full user interaction, with 1230 mouse wheel turns and 122 cm of mouse drag. In contrast, our proposal averages 5 min of user interaction, of which 2 min correspond to point placement in stage (M) and 3 min to reviewing and correcting, if needed, the proposed segmentation; the CPU computational cost was 5 min on an i7-5820K processor. Mouse usage drops to 500 wheel turns and 36 cm of mouse drag. These results show that our proposal requires clearly less time and effort than BMRM. Qualitatively, the users reported feeling highly relieved by the simplicity of the new approach when compared with manual segmentation, emphasizing that visual fatigue is drastically reduced.

4 Conclusions

We have presented a quasi-automatic pipeline for colon segmentation on T2-weighted MRI images obtained from unprepared colon. Our proposal achieves an important reduction of the segmentation time with respect to state-of-the-art solutions, reducing the user interaction by up to 80% as well as the usage effort. The segmentation accuracy is comparable to that of manual segmentation by experts. Medical experts found that this new algorithm improves efficiency and is suitable for clinical practice because it is easy to use, requires little interaction and improves the objectivity of the segmentations.

The segmentation pipeline relies on a new tubularity filter that allows the detection of irregular tubular structures, such as the colon. The combination of tubularity, a reduced search space and a probabilistic model based on intensity and radius per segment has proven suitable for accurate colon segmentation.