Background

Ulcerative colitis (UC) is a chronic relapsing-remitting inflammatory disorder affecting the intestinal mucosa. The pathogenic mechanisms of UC are complex and involve interaction between genetic, host immune and environmental factors [1–3]. The diagnosis of UC is determined by standard clinical, endoscopic, radiological, and histological criteria [4]. Histological analysis is an important component in the diagnosis, classification, and evaluation of treatment effectiveness of UC [5]. Several scoring schemes for images of haematoxylin–eosin (HE) stained sections are currently used in inflammatory bowel diseases and for classification of colon inflammation activity. Some histological scoring schemes are designed particularly for UC cases [6–10]. Analysis of microscopy images and histological scoring is not only time consuming, but the results are also susceptible to inconsistency due to the human factor [11, 12]. The development of digital microscopic imaging technology and image processing techniques [13] has inspired research towards translational computational systems that can detect, analyze, classify, and quantify tissue sections. The use of digital imaging systems could make histological image assessment less time consuming and could also improve diagnostic quality through objective estimation of image features.

Histological features of chronic active UC listed in the guidelines for visual inspection and evaluation of histologic images include crypt distortion, crypt branching, and lymphoplasmacytic infiltration deep into the crypts [14]. Quantitative estimates of these and similar features can be derived from mathematical transforms used for pattern evaluation in images from various technical areas, e.g. defect detection in textiles [15], or in medical diagnostics, e.g. detection of early-stage cancer in human cervical tissues [16]. This approach is based on estimation of spatial frequency parameters of the images and provides quantitative estimates of periodic and/or random structures. The majority of known diagnostic features of UC could also be considered as periodic and/or random structures. Therefore, methods aimed at evaluation of spatial frequency parameters could provide promising results.

The aim of this study was to develop a method for automated evaluation of inflammation severity based on evaluation of spatial frequency features in histological images of inflamed mice and human colon tissue.

Methods

Animals and experimental colitis model

The BALB/c mice used in this study come from our previous research, which aimed to evaluate the role of NADPH oxidase in the pathogenesis of colon inflammation [17]. Acute and chronic colon inflammation in the animals was induced by oral administration of 3.5 % dextran sulphate sodium (DSS, TdB Consultancy, Uppsala, Sweden). Detailed methods of experimental colitis induction in mice and clinical data analysis have been published by Ramonaite et al. [17]; the current study included only histological samples of the colon. The Lithuanian Animal Ethics Committee approved the design of the experiments (Protocol no. 0201).

Histological specimen imaging

Images were taken with an OLYMPUS IX71 light microscope (×20 magnification) equipped with a Q IMAGING EXI aqua camera at a resolution of 1392 × 1040 pixels (0.6 μm/pixel).

Assessment of the histological score in mice

Colonic segments were washed with Mg²⁺- and Ca²⁺-free phosphate-buffered solution (PBS) and immediately fixed in neutral 10 % formalin for 4 h at room temperature for paraffin embedding. Serial 4-μm sections were cut for each tract and stained with HE. The experts approved the image resolution for further analysis, confirming that all tissue levels and structures were undistorted and clearly visible. Histological examination was performed according to the method of Hausmann et al. [18].

Patients

Fifteen subjects participated in the study: 6 patients with UC (mean age ± SD = 42 ± 20.85 years; men n = 4, women n = 2) and 9 control subjects (mean age ± SD = 64.44 ± 15.73 years; men n = 5, women n = 4). UC patients and control subjects were recruited at the Department of Gastroenterology, Hospital of Lithuanian University of Health Sciences, during the years 2011–2014. The diagnosis of UC was based on standard clinical, endoscopic, radiological, and histological criteria [19–21]. Patients with mild to severe disease activity were included in the study (Mayo UC Endoscopic Score 1 to 3). The control group consisted of patients with irritable bowel disease or functional constipation in whom routine colonoscopy was performed as part of their planned examination workup. Individuals were included in the control group if they had no endoscopic signs of inflammation during colonoscopy. The Kaunas Regional Biomedical Research Ethics Committee approved the inclusion of patients in the study (Protocol No. BE-2-10).

Assessment of histological score in humans

The colon biopsies were obtained from inflamed (UC patients) and non-inflamed mucosa (control subjects) during endoscopy. Biopsies were washed with PBS and immediately fixed in neutral 10 % formalin for 4 h at room temperature for paraffin embedding. Serial 4-μm sections were cut for each tract, stained with HE and examined using the Riley scoring technique [8].

Statistical analysis

All clinical and histological data were analyzed using SPSS version 16.0 software (SPSS Inc., Chicago, IL). Statistical analyses were performed using one-way ANOVA according to Ramonaite et al. [3, 17].

Image preprocessing

Image processing algorithms were implemented in the MATLAB computation environment and run on a personal computer with an Intel® Core™ 2 Duo 3.06 GHz processor and 2 GB of RAM.

Illumination intensity was normalized by image histogram alignment using an algorithm similar to that of Petrolis et al. [22]. All images contained empty white areas with no cells, whose pixel values formed a peak on the right side of the image histogram; this peak was used as the reference. Illumination was adjusted by adding a bias to the pixel values. The bias value was determined by maximizing the correlation between the histogram peak representing white areas in the analyzed picture and the corresponding peak in the reference image. All analyzed pictures were preprocessed with the same procedure.
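The bias search described above can be sketched as follows. This is a minimal illustration in Python/NumPy, not the MATLAB code used in the study; the function names, the 8-bit grey-level range, the threshold `lo` that isolates the bright background peak, and the search range `max_bias` are all assumptions made for the example.

```python
import numpy as np

def bright_histogram(image, lo=128):
    """Histogram of 8-bit values, keeping only the bright range [lo, 255]."""
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    hist = hist.astype(float)
    hist[:lo] = 0.0  # discard darker tissue pixels; keep the white-background peak
    return hist

def align_illumination(image, reference, max_bias=40, lo=128):
    """Return the bias-corrected image and the integer bias that maximizes the
    correlation between the bright histogram parts of image and reference."""
    ref_hist = bright_histogram(reference, lo)
    best_bias, best_corr = 0, -np.inf
    for bias in range(-max_bias, max_bias + 1):
        shifted = np.clip(image.astype(int) + bias, 0, 255)
        hist = bright_histogram(shifted, lo)
        if hist.std() == 0:  # no bright pixels at this bias, nothing to correlate
            continue
        corr = np.corrcoef(hist, ref_hist)[0, 1]
        if corr > best_corr:
            best_bias, best_corr = bias, corr
    corrected = np.clip(image.astype(int) + best_bias, 0, 255).astype(np.uint8)
    return corrected, best_bias
```

An exhaustive search over integer biases is used here only because it is easy to follow; the original procedure may have located the optimum differently.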

Automatic feature formation was performed on 512 × 512 pixel cutouts (samples) of mice and human colon images. The cutouts were selected by the experts to represent tissue patterns that were as homogeneous and typical as possible, without any gaps. For the mice specimens, 50 samples represented acute inflammation, 50 chronic inflammation and 50 healthy controls. For the human biopsy images, 156 samples represented UC and 96 came from controls.

Examples of typical images representing the whole range of tissue patterns, from healthy controls to acute inflammation, and their cutouts are presented in the top and middle rows of Fig. 1.

Fig. 1

Examples of typical analyzed images representing the whole range of tissue patterns: healthy control (a–mice; f–human) on the left; chronic inflammation (b–mice; g–human) in the middle; and acute inflammation (c–mice; h–human) on the right. Examples of image cutouts used for analysis are below the whole sample images (from A1 to H2). Graph (e) and image of a typical Gabor function (d) are in the middle row

Algorithm for feature extraction

The main diagnostic features characterizing UC in histologic images include crypt distortion, branching, and the appearance of lymphoplasmacytic infiltrate deep in the crypts [23]. In the digital image representation, crypts are elliptic white spots about 180–350 pixels long and 50–130 pixels wide, for both human and mice specimens. Eosinophils, which may also be present during inflammation, appear as rounded spots 7–25 pixels in diameter in all test samples. Therefore, the development of the inflammatory process could be described by the appearance or disappearance of contrasted spots of certain dimensions, changes in their density and some specific changes in tissue pattern structure. We used Gabor filters for detection and evaluation of such morphological changes. The procedure convolves the analyzed image with a function constructed from a cosine wave modulated by a two-dimensional Gaussian function [24]:

$$ {g}_{\uplambda, \uptheta, \upvarphi, \upsigma, \upgamma}\left(x,y\right)= \exp \left(-\frac{{x^{\prime}}^2+{\upgamma}^2{y^{\prime}}^2}{2{\upsigma}^2}\right) \cos \left(2\uppi \frac{x^{\prime }}{\uplambda}+\upvarphi \right), $$
(1)

where x ′ = x cos θ + y sin θ, and y ′ = − x sin θ + y cos θ.

In these equations, θ is the orientation of the Gabor function in degrees; λ is the wavelength of the cosine factor; φ is the phase offset in degrees; γ is the spatial aspect ratio of the elliptic Gabor function; and σ is the standard deviation of the Gaussian kernel. Gabor functions can be constructed to resemble in shape the objects or patterns sought in the images, so that the Gabor filter response is maximal when the filter is applied at the corresponding place in the image. For that, we need to define the following Gabor function parameters: the spatial frequency of the cosine factor, f = 1/λ, and the half-response spatial frequency bandwidth b (in octaves) of the Gabor filter. The latter is related to the ratio σ/λ as follows:

$$ b={ \log}_2\frac{\frac{\upsigma}{\uplambda}\uppi +\sqrt{\frac{ \ln 2}{2}}}{\frac{\upsigma}{\uplambda}\uppi -\sqrt{\frac{ \ln 2}{2}}}, $$
(2)

where the ratio σ/λ is expressed as:

$$ \frac{\upsigma}{\uplambda}=\frac{1}{\uppi}\sqrt{\frac{ \ln 2}{2}}\frac{2^b+1}{2^b-1}. $$
(3)
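Relations (1)–(3) can be sketched in code as follows. This is an illustrative Python/NumPy version, not the authors' MATLAB implementation. The constant is written as sqrt(ln 2 / 2), the value that follows from the half-magnitude bandwidth of a Gaussian envelope exp(−x²/2σ²). The function names and the kernel support of about three standard deviations are assumptions made for the example.

```python
import numpy as np

# Constant from the half-magnitude bandwidth of a Gaussian envelope.
HALF_RESPONSE = np.sqrt(np.log(2) / 2)

def sigma_from_bandwidth(wavelength, bandwidth):
    """Eq. (3): Gaussian sigma from wavelength and bandwidth b (octaves)."""
    ratio = HALF_RESPONSE / np.pi * (2.0**bandwidth + 1) / (2.0**bandwidth - 1)
    return ratio * wavelength

def bandwidth_from_sigma(wavelength, sigma):
    """Eq. (2): bandwidth b (octaves) from the ratio sigma / wavelength."""
    r = np.pi * sigma / wavelength
    return np.log2((r + HALF_RESPONSE) / (r - HALF_RESPONSE))

def gabor_kernel(wavelength, theta_deg, gamma, bandwidth, phase_deg=0.0):
    """Eq. (1) sampled on a square pixel grid centred at the origin."""
    sigma = sigma_from_bandwidth(wavelength, bandwidth)
    # Assumed support: about 3 standard deviations along the longer axis.
    half = int(np.ceil(3 * sigma * max(1.0, 1.0 / gamma)))
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    theta, phase = np.deg2rad(theta_deg), np.deg2rad(phase_deg)
    x_r = x * np.cos(theta) + y * np.sin(theta)    # x'
    y_r = -x * np.sin(theta) + y * np.cos(theta)   # y'
    envelope = np.exp(-(x_r**2 + gamma**2 * y_r**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * x_r / wavelength + phase)
```

Converting b to σ and back is a quick consistency check on the pair of relations: `bandwidth_from_sigma(20, sigma_from_bandwidth(20, 1.0))` should return 1.0 up to rounding.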

According to the recommendations given in [24], we used the following parameters to construct the Gabor filter bank:

φ: 0

θ: 0°, 30°, 60°, 90°, 120°, 150°

γ: 0.5, 2, 4

λ: 20, 30, 40

b: 5, 10, 15 and 20 octaves.

An example of a Gabor function is presented in Fig. 1 (panels d and e). The similarity between the shape of the Gabor function and certain objects of interest in the analyzed images is easy to recognize, e.g. crypts, neutrophils, abnormalities of the muscularis mucosae, an increase of cells in the transmucosal lamina propria, etc.
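The size of the filter bank implied by these parameter lists can be checked with a short enumeration. The sketch below is illustrative Python; the variable names and the dictionary layout are assumptions, and the wavelength unit is assumed to be pixels.

```python
from itertools import product

PHASE = 0                                   # phi
ORIENTATIONS = (0, 30, 60, 90, 120, 150)    # theta, degrees
ASPECT_RATIOS = (0.5, 2, 4)                 # gamma
WAVELENGTHS = (20, 30, 40)                  # lambda (assumed to be in pixels)
BANDWIDTHS = (5, 10, 15, 20)                # b, octaves

def filter_bank_parameters():
    """All (gamma, lambda, b, theta) combinations of the filter bank."""
    return [
        {"gamma": g, "wavelength": w, "bandwidth": b, "theta": t, "phase": PHASE}
        for g, w, b, t in product(ASPECT_RATIOS, WAVELENGTHS, BANDWIDTHS, ORIENTATIONS)
    ]

bank = filter_bank_parameters()
# Filters that differ only in orientation form one group.
groups = {(p["gamma"], p["wavelength"], p["bandwidth"]) for p in bank}
n_features = 3 * len(groups)  # mean, skewness and entropy per group
print(len(bank), len(groups), n_features)  # 216 36 108
```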

Assessment of inflammation in digital images cutouts

The constructed filter bank consisted of 216 filters in total (6 orientations × 3 spatial aspect ratios × 3 wavelengths × 4 frequency bandwidths). Applying each filter to a 512 × 512 pixel sample image produced a 512 × 512 array of responses for each combination of spatial aspect ratio, wavelength and frequency bandwidth at each of the 6 orientations. Only the maximal response over the orientations was taken for further analysis, which compensates for the arbitrary initial orientation of the tissue structure in the analyzed image. After this operation, each sample image was represented by 36 arrays of 512 × 512 filter responses. We summarized these arrays by calculating the mean, histogram skewness and entropy of every response array, finally obtaining an array of 108 features (36 triplets) for each analyzed sample image. The mean was calculated as:

$$ mean=\frac{1}{mn}\sum_{i=1}^m\sum_{j=1}^n{x}_{ij} $$
(4)

where x_ij is the value in the i-th row and j-th column of the analyzed array (the filter responses for an image cutout); m is the number of rows and n the number of columns of the array.

Histogram skewness was calculated:

$$ skewness=\frac{\frac{1}{mn}\sum_{i=1}^m\sum_{j=1}^n{\left({x}_{ij}-\overline{x}\right)}^3}{s^3} $$
(5)

where x_ij is the value in the i-th row and j-th column of the analyzed array; m is the number of rows and n the number of columns; x̅ is the mean and s the standard deviation of the values of the analyzed array.

Entropy was calculated:

$$ entropy=-{\displaystyle \sum_{i=1}^n\left({p}_i\cdot { \log}_2\left({p}_i\right)\right)}, $$
(6)

where p_i is the normalized value of the i-th bin of the histogram of the analyzed array, and n here denotes the number of histogram bins.
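The orientation maximum and the three summary statistics of Eqs. (4)–(6) can be sketched as below. This is an illustrative Python/NumPy version under stated assumptions: the function name is hypothetical, the number of histogram bins (256) is not given in the text, and s is taken as the population standard deviation to match the 1/(mn) normalization.

```python
import numpy as np

def response_features(responses, n_bins=256):
    """Mean, skewness and entropy of the orientation-maximal response.

    responses: array of shape (n_orientations, rows, cols) holding the filter
    responses for one (gamma, lambda, b) combination.
    """
    r = np.max(responses, axis=0)      # keep the maximal response per pixel
    mean = r.mean()                    # Eq. (4)
    s = r.std()                        # population SD (assumed definition of s)
    skewness = np.mean((r - mean) ** 3) / s**3 if s > 0 else 0.0  # Eq. (5)
    counts, _ = np.histogram(r, bins=n_bins)
    p = counts / counts.sum()          # normalized histogram
    p = p[p > 0]                       # empty bins contribute nothing
    entropy = -np.sum(p * np.log2(p))  # Eq. (6)
    return mean, skewness, entropy
```

Applying this function to each of the 36 orientation groups of one cutout and concatenating the results would give the 108-element feature vector described above.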

The pooled data contained several cutouts of images taken from several histological pictures of each subject. Therefore, one could expect the data array to be not homogeneous and independent, but rather a mixture of several clusters. Testing the null hypothesis of equal distributions of parameter values at all hierarchical levels (between histological pictures and between subjects) did not reject the homogeneity of these arrays (Kruskal-Wallis test, p > 0.1). The features representing all sample images formed a 150 × 108 matrix (108 features from 150 images) for mice and a 252 × 108 matrix (108 features from 252 images) for human specimens:

$$ X=\left[\begin{array}{cccc} {x}_{1,1} & {x}_{1,2} & \cdots & {x}_{1,m} \\ {x}_{2,1} & {x}_{2,2} & \cdots & {x}_{2,m} \\ \cdots & \cdots & {x}_{i,j} & {x}_{i,m} \\ {x}_{n,1} & {x}_{n,2} & \cdots & {x}_{n,m} \end{array}\right], $$
(7)

where x_{i,j} is the j-th feature of the i-th image cutout. Principal component analysis (PCA) transforms the original feature data set X into a new space of variables, maximizing variation and concentrating correlated original variables [25]. Our training sets of images were constructed so that variation in feature values related to inflammation intensity accounts for the largest part of the total variation. We therefore expect that the first, or at least one of the first, computed new variables (principal components) will give an optimal quantitative estimate of inflammation. The spatial correlation R_X of the original feature data set X for all images can be estimated as:

$$ {R}_X=\frac{1}{n\cdot m}X\cdot {X}^T $$
(8)

The eigenvector equation for R_X [25], representing the variation of the original feature data set X, is:

$$ {R}_X\cdot \Psi =\Psi \cdot \varLambda $$
(9)

where Λ denotes the eigenvalue matrix with the eigenvalues sorted in descending order, and Ψ is the corresponding eigenvector matrix. The matrix Ψ defines an orthonormal transform, which is applied to the original data X to compute the principal component matrix Y:

$$ Y={\Psi}^T\cdot X. $$
(10)

The first principal component (PC1) appears in the first row of matrix Y, and we use it as the quantitative estimate of inflammation.
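Equations (8)–(10) can be sketched numerically as follows. This illustrative Python/NumPy version assumes that X is arranged with features in rows and image cutouts in columns, so that the first row of Y holds one PC1 value per image; the function name is hypothetical. No centring or scaling of the features is applied because none is stated in the equations.

```python
import numpy as np

def first_principal_component(X):
    """PC1 scores, eigenvalues and eigenvectors for X (features x images)."""
    m, n = X.shape
    R = (X @ X.T) / (n * m)                 # Eq. (8)
    eigvals, eigvecs = np.linalg.eigh(R)    # R is symmetric; ascending order
    order = np.argsort(eigvals)[::-1]       # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Y = eigvecs.T @ X                       # Eq. (10)
    return Y[0], eigvals, eigvecs

# Share of total variation carried by each component:
# explained = eigvals / eigvals.sum()
```

The sign of an eigenvector is arbitrary, so PC1 may come out increasing or decreasing with inflammation severity and may need to be flipped before it is interpreted.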

To validate this newly constructed variable, estimated separately for human and mice specimens, experts participated in a double-blind experiment run with custom software written in the Java programming language. The program shows the expert two randomly selected images and asks them to select the one corresponding to more severe inflammation. The expert’s choice is stored together with the PC1 values of the two images. A screenshot of the program window is shown in Fig. 2.

Fig. 2

Screenshot of developed program for double blinded validation of computed inflammation severity measure by comparing it to expert’s opinion

Results

Histological assessment of colon inflammation in mice

Oral administration of 3.5 % DSS solution for 7 days induced severe acute colitis in mice with significant morphological alterations in the colon mucosa. We determined inflammatory cell infiltration of L. submucosa (3.8 ± 0.46) and major epithelium damage with loss of crypts in large areas (3.9 ± 0.38) in the colon tissue of mice with acute colitis. Administration of 3.5 % DSS for 44 days induced less severe damage of colon tissues; however, inflammatory cell infiltration of L. muscularis mucosae (2.3 ± 0.31) and loss of goblet cells in large areas (2.8 ± 0.16) were observed. Control mice showed no histological alterations in the colon tissues (Table 1).

Table 1 Histological characteristics of BALB/c mice colon tissue

Histological assessment of colon inflammation in humans

We determined crypt abscesses (1.50 ± 0.55), epithelial integrity (2.17 ± 0.75) and crypt architectural (2.00 ± 0.63) irregularities, together with inflammatory cell infiltration (acute inflammatory cell infiltrate was scored 1.83 ± 0.75; chronic inflammatory cell infiltrate, 2.33 ± 0.82) and mucin depletion (2.33 ± 0.82), in the colon mucosa of patients with UC. Control subjects had no or minor histological alterations in the colon tissues. Analysis of histological parameters showed statistically significant differences between control and UC groups (Table 2).

Table 2 Histological characteristics of human colon tissue

Assessment of PC1 ordered digital images cutouts for inflammation

Principal component analysis transformed the characterization of all sample images from the 108-feature space into a space of optimal variables (‘principal components’). PC1 represented the major part of the total variation (97 % in mice and 71 % in human specimens). The exact percentage contribution of each principal component is shown in Fig. 3.

Fig. 3

Contribution percentage of total feature variation of the first ten principal components

We normalized the obtained PC1 values for all images to the [0, 1] range and considered the result as the inflammation severity measure. The maximal value “1” corresponded to the most severe inflammation and “0” to no inflammation (control). Ordered values of PC1 are presented in Fig. 4 together with several sample images corresponding to particular PC1 values. The whole set of sample images, ordered according to their PC1 values, is shown in Fig. 5.
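A minimal sketch of this normalization is given below, assuming simple min-max scaling; the text does not state the exact formula. The function name and the optional `control_mask` argument are hypothetical. The mask is used only to orient the scale so that controls fall near 0, since the sign of a principal component is arbitrary.

```python
import numpy as np

def severity_from_pc1(pc1, control_mask=None):
    """Min-max scale PC1 to [0, 1]; optionally flip so controls sit near 0."""
    pc1 = np.asarray(pc1, dtype=float)
    if control_mask is not None:
        control = np.asarray(control_mask, dtype=bool)
        # Flip the sign if controls currently have the higher mean PC1.
        if pc1[control].mean() > pc1[~control].mean():
            pc1 = -pc1
    span = pc1.max() - pc1.min()
    if span == 0:
        raise ValueError("PC1 values are constant; cannot normalize")
    return (pc1 - pc1.min()) / span
```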

Fig. 4

Computed first principal component (PC1) values with analysed images, corresponding to certain values of it

Fig. 5

Set of analyzed images ordered according to their computed first principal component (PC1) values. The composed pictures start with healthy control cutouts from mice and human specimens at the top left and end with the most severe inflammation at the bottom right

Three histology experts participated in the double-blind validation of the proposed inflammation severity measure using custom-made software. The software showed randomly selected pairs of images corresponding to different PC1 values and recorded the expert’s opinion on which of them corresponded to more severe inflammation. The experts’ opinion matched the decision based on PC1 values in 79.9 % of 3402 mice image pairs and in 67 % of 5796 human image pairs, covering the whole range of PC1 values. Complete agreement was reached in cases where the difference in PC1 values was maximal. The dependency of the mismatch ratio on the difference in PC1 values is shown in Fig. 6. The highest yet acceptable mismatch ratio indicates the resolution of our method.
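The agreement figures and the mismatch curve can be computed from the stored pairs along the following lines. This is an illustrative Python/NumPy sketch; the function names, the input layout (two arrays of normalized PC1 values plus the expert's choice per pair) and the number of bins are assumptions, not details taken from the validation software.

```python
import numpy as np

def agreement_rate(pc1_a, pc1_b, expert_chose_a):
    """Fraction of pairs where the expert picked the image with higher PC1."""
    pc1_a, pc1_b = np.asarray(pc1_a, float), np.asarray(pc1_b, float)
    expert_chose_a = np.asarray(expert_chose_a, dtype=bool)
    return np.mean((pc1_a > pc1_b) == expert_chose_a)

def mismatch_by_difference(pc1_a, pc1_b, expert_chose_a, n_bins=10):
    """Mismatch ratio within bins of |PC1 difference| (PC1 scaled to [0, 1])."""
    pc1_a, pc1_b = np.asarray(pc1_a, float), np.asarray(pc1_b, float)
    expert_chose_a = np.asarray(expert_chose_a, dtype=bool)
    diff = np.abs(pc1_a - pc1_b)
    mismatch = (pc1_a > pc1_b) != expert_chose_a
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(diff, edges) - 1, 0, n_bins - 1)
    ratio = np.full(n_bins, np.nan)    # NaN marks bins with no pairs
    for k in range(n_bins):
        in_bin = idx == k
        if in_bin.any():
            ratio[k] = mismatch[in_bin].mean()
    return edges, ratio
```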

Fig. 6

Mismatch ratio between expert’s decision and first principal component (PC1) values

The proposed measure of inflammation severity, the first principal component, is constructed by multiplying the initial set of features by the first eigenvector (see formula (10)). Therefore, the values of the eigenvector elements reflect the contribution of particular features to the constructed optimal representation. Values of the first eigenvector corresponding to each particular feature (mean, histogram skewness or entropy at a certain spatial frequency or spatial aspect ratio) are shown in Fig. 7. The highest values were found at positions corresponding to features representing objects similar to eosinophils (rounded spots about 7–25 pixels in diameter). Interestingly, Gabor functions corresponding to objects similar to crypts (elliptic spots about 180–350 pixels long and 50–130 pixels wide) did not emerge as important.
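Ranking features by their first-eigenvector values can be sketched as below. The example is illustrative Python/NumPy; the ordering of the 108 features (aspect ratio, then wavelength, then bandwidth, with the three statistics adjacent) is an assumption, as the text does not specify how the feature vector was laid out.

```python
from itertools import product
import numpy as np

ASPECT_RATIOS = (0.5, 2, 4)        # gamma
WAVELENGTHS = (20, 30, 40)         # lambda
BANDWIDTHS = (5, 10, 15, 20)       # b, octaves
STATISTICS = ("mean", "skewness", "entropy")

def feature_labels():
    """Labels (gamma, lambda, b, statistic) in the assumed feature order."""
    return [
        (g, w, b, stat)
        for g, w, b in product(ASPECT_RATIOS, WAVELENGTHS, BANDWIDTHS)
        for stat in STATISTICS
    ]

def top_features(first_eigenvector, k=5):
    """The k features with the largest absolute eigenvector values."""
    vec = np.asarray(first_eigenvector, dtype=float)
    labels = feature_labels()
    order = np.argsort(np.abs(vec))[::-1][:k]
    return [(labels[i], float(vec[i])) for i in order]
```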

Fig. 7

Values of the first eigenvector corresponding to each particular feature: mean (marked with diamonds), histogram skewness (marked with circles) or entropy (marked with asterisks) of the analyzed image at a certain spatial frequency and aspect ratio

Discussion

Several studies have shown that image processing and analysis systems may be successfully used for the diagnosis and classification of various diseases, such as neuroblastoma, melanoma, and lung, prostate, and breast cancer. However, these computerized analysis systems are based mainly on color-space derived features of histological images and indicate only areas with a positive or negative diagnostic result [26–30], ignoring the morphological properties of the specimen. In this study, we presented a new method for automated evaluation of inflammation severity based on spatial frequency features extracted from histological images of mice and human colon tissue. The developed technique computes a quantitative estimate of inflammation severity on a continuous scale.

Currently published guidelines for visual evaluation of histological preparations of colon tissue describe expert scoring schemes that classify the severity of inflammation into several grades [8, 10, 31]. Our idea of developing a continuous-scale measure of inflammation severity was based on the presumption that even specimens from the same subject could represent a certain variety of inflammation severity. The same principle applies to image cutouts from the same histological preparation. This presumption was supported by expert pathologists, who observed a certain variety of visually evaluated features within cases classified into one or another class according to currently used scoring techniques. We therefore decided to pool all data, construct a continuous-scale measure and test it by posing a simplified question to the experts in a double-blind experiment: they were shown two randomly selected images and asked: “just use your experience and select image representing more severe inflammation”. That experiment confirmed the suitability of our measure. The pooled data contained several cutouts of images taken from several histological pictures of each subject, so one could expect the data array to be not homogeneous and independent, but rather a mixture of several clusters. We therefore tested, and retained, the null hypothesis of equal distributions of the data from these clusters (Kruskal-Wallis test).

At the moment we do not have any “gold standard” method for verification of our results, so the resolution achieved by our method could be determined from the maximal yet acceptable value of discordance between the expert’s opinion in the double-blind test and our principal component analysis based estimate. The maximal yet acceptable mismatch ratio could be estimated by evaluating the concordance between the opinions of different experts on the same image pairs. However, this requires recruitment of many experts into the experiment and would be an interesting task for further research in this field. Detailed analysis of the eigenvector values reveals the diagnostic value of particular features and could be used to optimize the initial feature set for processing. A particular disease is related to a unique tissue structure and to its changes as the disease progresses. Thus, our methodology could be used to develop disease progress measures for other diseases as well.

Currently, clinical, endoscopic, radiological, histological criteria and molecular markers are used to evaluate inflammation severity of colon in UC patients. Estimates obtained from standard clinically approved features could be also used for verification of our method. However, registration of such estimates “in vivo” is technically difficult and such combined experiments remain an interesting topic for future research. We show that complex evaluation of colon inflammation severity using computer-aided analysis could reveal new alternatives for evaluation of the degree of inflammation severity with higher precision and may provide new diagnostic possibilities.

Conclusions

Quantitative evaluation of inflammatory changes in histological preparations of colon tissues is feasible by estimation of spatial frequency parameters of histological images. Principal component analysis of the spatial frequency features improves efficacy of estimation of inflammation severity of colon tissue. The method may have potential clinical applications in patients with colon inflammation.