Introduction

A brain tumor is defined as the growth of abnormal and uncontrolled cells inside the brain or spinal canal. There are four main types of primary brain tumors: gliomas, meningiomas, pituitary adenomas and nerve sheath tumors. In biomedical analysis, segmentation of brain tumors from multi-modal magnetic resonance imaging (MRI) plays a vital role. Gliomas can appear anywhere in the brain with different shapes, and there is large variety and high complexity within a single tumor type in terms of intensities and textures [2]. The challenge, therefore, is to develop a method that produces precise segmentations and works for multiple tumor classes and different imaging equipment [1].

Early detection and localization of tumors can lead to changes in a patient's treatment plan that impact health outcomes. Several approaches have been suggested in the literature for the detection and segmentation of tumors in MRI modalities [3]. Some build models based on features extracted from the MR images [4]. Hand-crafted features have been used in various brain tumor segmentation techniques, where they are fed into a classifier such as a decision tree (DT) [5]. The DT classifier demonstrated the best results among conventional classifiers [4]. The limitation of approaches based on hand-designed features is that they require a large number of features to best represent the brain tumor tissues. As a result, they operate on high-dimensional data, which requires more computational time for processing and a large number of experiments to optimize the classifier parameters. To address these problems, many deep learning-based methods have been developed recently, providing better accuracy for brain tumor segmentation [6,7,8].

Ghaffari et al. [24] proposed a 3D CNN model based on a variant of the U-Net architecture with some modifications to capture local features. They utilised connected component analysis as a post-processing step to enhance performance. In [25], an efficient cascaded CNN model was implemented for extracting both local and global features in two different ways, with different sizes of extraction patches. Daimary et al. [26] demonstrated an algorithm comprising three hybrid CNN models, U-SegNet, Res-SegNet and Seg-UNet, designed for high-accuracy automatic segmentation of brain tumors from MRI images. These models inherit attributes from the most common CNN models for semantic segmentation: SegNet, U-Net and ResNet. However, a method based only on deep learning is insufficient for accurate brain tumor region segmentation. The limitation is that local features related to changes in tissue texture caused by tumor growth are not sufficiently considered in the SegNet-based approach [22]. At the same time, some hand-crafted feature extraction methods, such as grey-level co-occurrence matrix (GLCM)-based texture features, take into account the local dependencies of the pixel classes [9]. The GLCM has been reported as the most popular texture-based method for MR images [10].

This paper is motivated by the need for high-accuracy segmentation, which we address with a hybrid method. We propose a new learning-based method that combines machine-learned features with hand-crafted features for automated segmentation of brain tumor structures from region of interest (ROI) images generated from the MRI dataset. The machine-learned features are the score maps extracted from the de-convolution layer of the trained SegNet network, and the hand-crafted features are the GLCM-based texture features. The proposed method was applied and evaluated on the publicly available BRATS 2017 dataset [4, 11].

The main contributions of this paper are as follows:

  • An automatic method is proposed to generate an ROI segment that agrees with experts’ delineations across all grades of glioma, using a single commonly used MRI protocol, i.e. FLAIR, as input data.

  • A DT classifier is applied only to pixels considered to be ROI tissue, which greatly reduces the computational cost by reducing the amount of data to be classified.

  • A novel method is developed to overcome the limitations of the SegNet network and improve the detection of necrosis and enhanced brain tumor regions by combining hand-crafted features with machine-learned features.

Proposed Approach

The proposed segmentation method includes four main steps: pre-processing, ROI image generation, feature extraction, and pixel classification. Pre-processing is first performed by removing artefacts and normalizing the intensity ranges of the MR images. Then, a binary mask containing only tumor tissues is identified as the ROI with a SegNet model trained on a single MRI modality, and the mask is applied to all the MRI modalities to produce ROI images. Machine-learned features are extracted for each pixel using another SegNet model trained on the ROI images, and hand-crafted texture features are calculated based on the GLCM. Finally, the combined features are fed into a DT classifier that labels each pixel with its corresponding tissue. The whole pipeline of the proposed method is shown in Fig. 1. In the rest of the paper, we refer to our method as SegNet_GLCM_DT for short.

Fig. 1 Pipeline of the brain tumor segmentation

Data Pre-processing

Artefacts often exist in MRI data due to inhomogeneity in the magnetic field or small patient movements during the scan. As a result, a bias is produced across the scans, which degrades segmentation accuracy, especially when the segmentation is performed by a computer-based algorithm. To correct this, we applied N4ITK bias field correction to all MRI modalities to remove unwanted artefacts [20].

Since the intensity values vary greatly across MRI slices, an additional normalization step is applied by subtracting the mean and dividing by the standard deviation of the brain region. Additionally, removing the top and bottom 1% of intensity values during normalization brings the intensities within a coherent range across all images in the training phase. To remove a significant portion of unnecessary zeros in the MRI dataset, and to save training time with a large reduction in memory requirements for 3D data sets, we trimmed the black parts of the image background from all modalities to obtain input images of size \({192\times 192}\).
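As an illustration, a minimal sketch of this normalization and cropping step in Python/NumPy. The exact clipping order and the centre-crop offsets are our assumptions; the paper does not specify them:

```python
import numpy as np

def preprocess_slice(img, crop_size=192):
    """Normalize one MRI slice and crop the black background (illustrative sketch)."""
    mask = img > 0                               # non-zero pixels approximate the brain region
    lo, hi = np.percentile(img[mask], [1, 99])   # remove top and bottom 1% of intensities
    clipped = np.where(mask, np.clip(img, lo, hi), 0.0)
    brain = clipped[mask]
    out = np.where(mask, (clipped - brain.mean()) / (brain.std() + 1e-8), 0.0)
    h, w = out.shape                             # e.g. 240 x 240 for BRATS slices
    top, left = (h - crop_size) // 2, (w - crop_size) // 2
    return out[top:top + crop_size, left:left + crop_size]
```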

Region of Interest Image Generation

Accurate detection of the extent and structure of the tumor underpins radiation dose planning and treatment response monitoring. Additionally, delineation of the tumor region, which is considered the ROI, is important for assessing the growth of glioma grades as well as for extracting image features from abnormal regions for further tumor classification [12].

In this paper, an initial ROI was first identified using a semantic segmentation network, SegNet [21]. A pre-trained SegNet model was modified and trained with each MRI modality separately as input for binary segmentation (normal and abnormal tissues). This process involves two main steps: ROI mask detection and ROI MRI image generation. An essential step for this binary segmentation is preparing ground truth masks by converting the original four-label ground truth into a two-label one. See Fig. 2.
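A minimal sketch of this label conversion, assuming the BRATS label values listed in the Experimental Results section (the helper name is ours):

```python
import numpy as np

def to_binary_gt(gt):
    """Collapse the four BRATS labels (0, 1, 2, 4) into normal (0) vs. tumor (1)."""
    return (gt > 0).astype(np.uint8)
```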

Fig. 2 (Left to right) FLAIR, T1, T1ce and T2 MRI modalities, ground truth with four classes, and ground truth with two classes

To obtain an optimal ROI mask, the pre-trained SegNet network was fine-tuned separately for each MRI modality. The four trained SegNet models were then evaluated separately on the testing dataset of each MRI modality for binary image segmentation. The model that achieved the highest F-measure among the four was selected to detect the ROI in MRI images in the next step. The information for separating the different sub-tumor regions (edema, necrosis and enhanced tumor) resides in different MRI modalities. Therefore, three MRI modalities (FLAIR, T1ce and T2) were combined for segmentation. The ROI images were then generated from the combined MRI modalities based on the obtained ROI mask images: all pixels in the combined modalities that correspond to zero values in the ROI masks are set to zero, while the others are kept unchanged. The generated ROI images were used as the inputs to the next stage of our method, the segmentation of sub-tumor structures. See Fig. 3.
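As a sketch, this masking step can be written as follows; the channel stacking order is our assumption:

```python
import numpy as np

def generate_roi_images(flair, t1ce, t2, roi_mask):
    """Zero out all pixels outside the ROI in the combined modalities."""
    combined = np.stack([flair, t1ce, t2], axis=-1)   # H x W x 3 combined modalities
    return combined * (roi_mask[..., None] > 0)       # broadcast binary mask over channels
```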

Fig. 3 (a) Combined MRI modalities, (b) ROI mask images and (c) MRI ROI images

The semantic segmentation model in Fig. 4 takes full-size images as inputs for feature extraction in an end-to-end way. The pre-trained SegNet is used, and its parameters are tuned using images with manually annotated tumor regions. In the testing process, the final SegNet model is used to create predicted segmentation masks of tumor regions for unseen images. The motivation for using the SegNet network instead of other deep learning networks is that SegNet has a small number of parameters, so it does not need high computational resources like DeconvNet [13] and is easier to train end-to-end. Moreover, in the U-Net network [14], the entire feature maps in the encoders are transferred to the corresponding up-sampling decoders and concatenated with the decoder feature maps, which leads to a high memory requirement, whereas SegNet reuses only the pooling indices, requiring less memory.
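To illustrate the memory argument, a minimal PyTorch sketch (ours, not from the paper) of SegNet-style unpooling, where the decoder needs only the 2×2 pooling indices rather than full encoder feature maps:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True)   # encoder keeps only the indices
unpool = nn.MaxUnpool2d(2, stride=2)                    # decoder restores max locations

x = torch.randn(1, 64, 192, 192)       # a feature map at the network input resolution
pooled, indices = pool(x)              # 1 x 64 x 96 x 96, plus an int64 index tensor
upsampled = unpool(pooled, indices)    # sparse map: max positions restored, rest zero
print(upsampled.shape)                 # torch.Size([1, 64, 192, 192])
```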

Machine Learned Feature Extraction with SegNet

After the ROI MRI images are obtained, we first use them to fine-tune a modified pre-trained SegNet [21] for semantic pixel-wise segmentation of the brain tumor regions (edema, necrosis and enhanced) in the images. In the testing stage, the final segmentation is obtained by max-voting over the final score maps of the SegNet. From an experimental comparison of the output results with the ground truth, we found that the SegNet network can successfully segment some parts of the different brain tumor regions. However, some of the output segmentation results show label dissimilarity between similar pixels, which degrades SegNet’s performance in brain tumor region segmentation. The reason is that SegNet cannot capture all the changes in brain tissue caused by the tumor. Therefore, we incorporate information that reflects these tissue changes through texture features.

The features extracted from the SegNet are the score maps produced by the trained network for each output class. After the last decoder layer of the SegNet, the final predicted segmentation mask is obtained by assigning each pixel the label of the score map with the maximum value at that pixel. These score maps contain the full hierarchy of features present at both lower and higher resolutions. As can be seen in Fig. 4, the number of classification labels equals the number of score maps in the case of the BRATS 2017 dataset.

A four-dimensional feature vector is created for each pixel in the MRI images: each element of the vector takes the value of the corresponding score map at that pixel.
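A minimal sketch of this feature extraction, assuming the score maps are available as a C × H × W array (the layout is our assumption):

```python
import numpy as np

def segnet_features_and_mask(score_maps):
    """score_maps: C x H x W scores from the last SegNet decoder layer,
       C = 4 classes for BRATS 2017 (background, edema, necrosis, enhanced)."""
    c, h, w = score_maps.shape
    features = score_maps.reshape(c, -1).T       # (H*W) x 4 per-pixel feature vectors
    predicted = score_maps.argmax(axis=0)        # label = score map with maximum value
    return features, predicted
```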

Fig. 4 Score map feature extraction in the SegNet network

Hand-Crafted Feature Extraction with GLCM

GLCM-based texture features have the ability to describe different types of regions, because different tissue types in MR images present different textures. Consequently, the texture descriptors have enough discriminative power to distinguish among the region types [15]. Fusing GLCM-based texture features with SegNet features incorporates more powerful feature descriptors into the final segmentation, which helps to overcome the limitations of the SegNet network and improves brain tumor segmentation performance.

We extracted GLCM-based texture features and SegNet features only from the ROI regions of the MR images, because segmentation is performed only there. In this paper, the GLCMs are constructed in four directions, \(\theta = 0^\circ , 45^\circ , 90^\circ\) and \(135^\circ\), with pixel distance \(d=1\). The GLCM-based texture features are calculated using the built-in function in MATLAB. The GLCM approach captures the spatial interrelationships of grey tones, which are utilized to optimize the brain tumor segmentation method.
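The paper computes these features with MATLAB's built-in functions; as an illustrative equivalent, here is a sketch using scikit-image (spelled greycomatrix/greycoprops in versions before 0.19). The grey-level quantization and the averaging over directions are our assumptions:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(patch, levels=8):
    """ASM, contrast and correlation from one grey-level patch (illustrative)."""
    # quantize intensities to a small number of grey levels (assumed; not stated in the paper)
    edges = np.linspace(patch.min(), patch.max() + 1e-8, levels)
    q = (np.digitize(patch, edges) - 1).astype(np.uint8)
    glcm = graycomatrix(q, distances=[1],                           # d = 1
                        angles=[0, np.pi/4, np.pi/2, 3*np.pi/4],    # 0, 45, 90, 135 degrees
                        levels=levels, symmetric=True, normed=True)
    # average each property over the four directions (assumed aggregation)
    return np.array([graycoprops(glcm, p).mean()
                     for p in ('ASM', 'contrast', 'correlation')])
```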

Fig. 5 (Left to right) MRI modalities, ROI image, T1ce MRI modality, ground truth, SegNet score map of edema, and predicted mask

Combined Feature Extraction

This section describes the spatially combined features built from the SegNet and GLCM-based features. Figure 5 shows an example patient image, the SegNet score map of the edema class, and the feature representations of two pixels of different classes (edema and necrosis). In the edema score map of the SegNet network, there is no obvious separation between the necrosis and edema classes: the values at the two corresponding pixels are 0.4186 and 0.4119, respectively, which are very close. The local boundaries of the tumor regions are therefore not represented in sufficient detail in the edema score map. Consequently, the predicted mask based only on the SegNet method does not match the ground truth image. Considering the local neighbourhood dependencies captured by GLCM texture features makes the labels, i.e. edema and necrosis, more separable. See Fig. 6.

SegNet-based features are taken from each layer of the four score maps at the corresponding pixel, whereas GLCM-based texture features are extracted in a fixed-size \(8\times 8\) window centred at that pixel in the T1ce image. The T1ce MRI modality was selected because the tumor core has clear boundaries in this modality, which improves the segmentation performance for the tumor core. These texture features are based on statistics that represent how frequently one grey level co-occurs with another specific grey level in the image. Three texture features are extracted from each grey-level spatial dependency matrix for distance \(d=1\): the angular second moment (ASM), a measure of image homogeneity; contrast, a measure of the amount of local variation in an image; and correlation, a measure of grey-level linear dependencies [23]. See Fig. 6(F).

The combined feature vector for each pixel consists of seven elements: four SegNet scores (background, edema, necrosis and enhanced) and three GLCM features (ASM, contrast and correlation), as shown in Fig. 6(G). It is fed into the DT to classify the pixel.
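Putting the two parts together, a sketch of building the 7-element vector for one ROI pixel, reusing the glcm_features helper sketched above and the C × H × W score-map layout assumed earlier (the window placement around the pixel is also our assumption):

```python
import numpy as np

def combined_feature_vector(score_maps, t1ce, row, col, win=8):
    """4 SegNet scores + 3 GLCM texture features for the pixel at (row, col)."""
    segnet_part = score_maps[:, row, col]        # background, edema, necrosis, enhanced
    half = win // 2
    patch = t1ce[row - half:row + half, col - half:col + half]   # 8 x 8 T1ce block
    return np.concatenate([segnet_part, glcm_features(patch)])   # 7 elements in total
```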

Fig. 6 (A) ROI image, (B) SegNet score map features (background, edema, necrosis and enhanced), (C) T1ce MRI modality, (D) an \({8\times 8}\) T1ce image block with the pixel of interest in red, (E) scaled version of the image block for the same pixel, (F) texture features extracted from the GLCM spatial dependency matrices, and (G) output feature vector of the corresponding pixel

DT Parameters and Segmentation

The decision tree (DT) has a flowchart-like tree structure that is used to categorize each pixel as healthy or tumorous brain tissue. Each non-leaf node of the tree represents a test on an attribute, and each leaf node represents a class label. A pixel passing through the tree reaches a leaf node that represents its class, i.e. healthy tissue or some type of tumor tissue. The procedure is performed based on the feature representation of the pixel. In the training stage, the tree is grown to a specified depth \(D\_tree\). The reason for using a DT as the classifier in this study is that DTs have been shown to achieve high accuracy in the brain tumor segmentation field [4].

Taking the ROI images as input, we consider only the pixels in the tumor target area. A feature vector (i.e. 7 features per pixel) is extracted and then fed into the DT classifier for training. To select the optimal parameters of the DT classifier, different depths (\(D\_tree\)) were tested on the BRATS 2017 dataset. The best performance was obtained with a fine tree (\(D\_tree=100\)), which showed better classification accuracy than the medium and coarse trees with depths \(D\_tree=20\) and \(D\_tree=4\), respectively.
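A minimal scikit-learn sketch of this classification step; the random stand-in data and all hyperparameters other than the depth are our assumptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Illustrative stand-ins: N x 7 combined feature vectors with per-pixel labels.
rng = np.random.default_rng(0)
X_train, y_train = rng.random((1000, 7)), rng.integers(0, 4, 1000)

# max_depth mirrors the paper's fine-tree setting D_tree = 100;
# the remaining hyperparameters are scikit-learn defaults (an assumption).
clf = DecisionTreeClassifier(max_depth=100, random_state=0)
clf.fit(X_train, y_train)
labels = clf.predict(rng.random((10, 7)))   # one tissue label per ROI pixel
```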

Experimental Results

All 285 patient subjects with HGG (210) and LGG (75) in the BRATS 2017 dataset were involved in this study [4, 11]. Of these, 75% (158 HGG and 57 LGG) were selected to train the deep learning model and 25% (52 HGG and 18 LGG) were assigned as the testing set. There are four types of MRI sequences (FLAIR, T1, T1ce and T2) for each patient. All images have been segmented manually with four labels: 0 – background, 1 – necrotic and non-enhanced tumor, 2 – edema, 4 – enhanced tumor. The segmentation ground truth for each subject was approved by experienced neuro-radiologists.

The performance of the proposed model was evaluated on the test set. Following practical clinical application, the standard segmentation of the brain tumor structures is grouped into three different tumor regions, defined as:

  • Whole tumor (edema, necrosis and non-enhanced, enhanced tumor).

  • Tumor core (necrosis and non-enhanced, enhanced tumor).

  • Enhanced tumor.

For each tumor region, the segmentation results were evaluated quantitatively using the F-measure.
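A minimal sketch of this per-region evaluation, using the BRATS label groupings above; for binary masks the F-measure coincides with the Dice score:

```python
import numpy as np

REGIONS = {                      # BRATS label groupings for evaluation
    'whole':    (1, 2, 4),       # edema, necrosis and non-enhanced, enhanced
    'core':     (1, 4),          # necrosis and non-enhanced, enhanced
    'enhanced': (4,),
}

def f_measure(pred, gt, region):
    """F-measure (= Dice for binary masks) of one tumor region."""
    p = np.isin(pred, REGIONS[region])
    g = np.isin(gt, REGIONS[region])
    tp = np.logical_and(p, g).sum()
    return 2 * tp / (p.sum() + g.sum() + 1e-8)
```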

We conducted experiments to evaluate the ROI mask. We compared the performance of feeding each MRI modality into the SegNet network for binary segmentation, using four models (ROI_FLAIR_SegNet, ROI_T1_SegNet, ROI_T1ce_SegNet and ROI_T2_SegNet). From Table 1, it can be seen that the ROI_FLAIR_SegNet model achieves higher accuracy for ROI detection than the other models. The reason is that FLAIR is a highly effective sequence for separating the hyper-intense edema region from the cerebrospinal fluid (CSF), as the water-molecule signal is suppressed in this type of MRI modality [16].

Table 1 Results for the binary segmentation of brain tumor on BRATS 2017 dataset

We conducted another experiment combining the machine-learned features extracted from the trained SegNet network with the GLCM-based texture features, to verify whether the latter help. In Table 2 we can observe that adding the GLCM-based texture features to the model pipeline significantly improves the F-measure in all three brain tumor structures, i.e. whole, core and enhanced tumor. See Fig. 7 for a visual comparison of the segmentation results.

Table 2 Comparison of F-measure (mean and standard deviation) for our experiment results separated for whole tumour (WT), tumour core (TC) and enhanced tumour (ET) using BRATS2017 dataset
Fig. 7 (Left to right) FLAIR, T1ce, T2, ROI images, ground truth, predicted masks using only the SegNet method, and predicted masks using the SegNet_GLCM_DT method

The reason for the better performance of the combined features is that GLCM-based texture features supply additional textural properties that may not be captured by the SegNet network alone. Consequently, they provide the local dependencies and neighbourhood structure of each pixel, which is extremely helpful in improving the performance of brain tumor structure segmentation.

Table 3 shows the comparison of our method with some state-of-the-art (SOTA) methods. It can be seen that our method achieves significantly higher accuracy in whole tumor segmentation than the other SOTA methods. However, this comes at the expense of a reduction in core and enhanced tumour segmentation accuracy compared with the majority of the other SOTA methods. This is because the necrosis and enhanced regions have complex structures compared with the edema region, and our method has relatively low accuracy in detecting necrosis and enhanced tumor. Nevertheless, our method achieves more accurate results in all the sub-tumour regions than the method of [18].

Table 3 Performance of our proposed method compared with other methods on BRATS 2017 dataset

Conclusion

This paper proposed a novel method for brain tumor segmentation from MR images. To reduce the computational cost and increase the segmentation accuracy, we proposed to first generate ROI images, which contain only tumor tissues, and then segment the ROI into sub-tumor regions. Considering that machine-learned features cannot sufficiently represent the tumor tissues in MR images, we combined the machine-learned features generated from a SegNet model with GLCM-based hand-crafted features, and used a DT to classify each pixel into a sub-tumor region based on the combined features. Experimental results showed that FLAIR is the best MRI modality for generating the ROI. It was also shown through experiments that the proposed SegNet_GLCM_DT method achieved much better results in whole tumor segmentation than some SOTA methods. Specifically, our method achieved an F-measure of 0.98 in segmenting the whole tumor on the BRATS 2017 dataset. Although our method can achieve very high accuracy in segmenting the whole tumour (WT), the accuracy of segmenting the tumour core (TC) and enhanced tumour (ET) could be further improved. The reason for the relatively low accuracy in TC and ET segmentation is that the necrosis and enhanced tumor regions have more complicated structures than the edema region, and our technique has sacrificed accuracy in necrosis and enhanced tumor detection for better whole tumor segmentation. Our future work will investigate further modifications to improve TC and ET segmentation while maintaining the high WT segmentation accuracy.