Introduction

Segmenting brain tumors is a crucial step in neuro-oncology that supports accurate diagnosis, therapy planning, and treatment monitoring. In this study, we developed the IC-Net (Inverted-C) architecture to enhance the segmentation of brain tumors from medical imaging data. Our proposed work addresses the particular challenges posed by tumor heterogeneity, irregular shapes, and the demand for greater accuracy. Owing to the enormous quantity of information that Magnetic Resonance Imaging (MRI) provides, as well as the variability in tumor position and extent, computerized segmentation is a difficult process1, making it challenging to accurately separate the tumorous region from the healthy tissues of the human brain. By accurately predicting both the core and peritumoral regions of brain tumors, the IC-Net design seeks to achieve higher segmentation accuracy. This accuracy is essential for planning treatments and subsequent evaluations of treatment effectiveness, allowing doctors to customize therapies and improve patient care.

Brain tumor segmentation (BTS) is crucial for various reasons, including treatment planning, monitoring disease progression, conducting clinical trials, providing measurable data for medical research, extraction of meaningful features, and aiding in image-guided navigation2. It allows for precise definition of tumor boundaries, enabling targeted therapies like surgical resection, radiation therapy, or chemotherapy. It also aids in tracking changes in tumor size and shape over time, allowing for a better understanding of the tumor’s response to therapy and disease development. Additionally, it aids in extracting meaningful features from the segmented regions, such as physical traits, texture, and intensity, for a more comprehensive tumor investigation.

To perform image segmentation tasks in medical imaging, specifically to produce masks that segment brain tumors found in brain MRI scans, we have investigated a novel convolutional neural network (CNN)-based model named IC-Net. The developed model uses U-Net as its foundation. It makes use of the Merge Block concept to incorporate both global and local context, as well as the Attention Block concept to concentrate on the region of interest that contains a particular object.

Due to its effectiveness and efficiency, the U-Net architecture has been widely acknowledged and used in medical image segmentation applications, including BTS3. To adapt the conventional U-Net design to the subtleties and difficulties of BTS, we have made several changes to it. Our IC-Net model tackles the challenge of precise BTS by accounting for variation in tumor size and shape as well as overlapping structures; these adjustments improve the network’s capacity to precisely locate erratic tumor boundaries in brain MRI scans. Handling the enormous amount of data needed for training, deploying models in real-time settings, and coping with uneven boundary variation within brain tumors in MRI remain significant challenges.

The standard procedure for analyzing brain tumors at present is to use Computed Tomography (CT) or MRI to assess the pathological status of brain tissue. Different imaging methods offer different benefits for tumor diagnosis4. In contrast to CT imaging, Hussain, Shah, et al. reviewed and discussed techniques that employ MRI and other noninvasive imaging methods, which give the viewer high-quality images free of damage and skull artefacts, a clear depiction of the anatomical structure, and excellent soft tissue resolution5. By altering the pertinent settings, intracranial images in any direction can be produced concurrently. Moreover, MRI of the same tissue at various angles, or using various modalities, can be produced with different imaging sequences; this kind of imaging is commonly called multimodal MRI6. Multimodal MRI scans sample the same tissue at various contrast levels, acquired using several MR development techniques7. Eun Young Kim and Hans J. Johnson discussed how water molecules in brain tissue that is normally devoid of tumors and other lesions begin to experience lesion effects, such as tissue edema8. Water molecules in the bound state appear as high-signal regions in FLAIR and T2 data. Hence, it is potentially possible to segment the entire tumor using FLAIR MRI images as the primary foundation. Nonetheless, the tumor will also exhibit asymmetrical alterations in the FLAIR image in several particular situations9.

During training, attention mechanism-based U-Net models consistently highlight the pertinent activations, which results in better generalization of the networks. By doing this, we can cut down on the amount of compute needed to process irrelevant activations. Hard attention and soft attention are the two categories of attention mechanisms. The hard attention method performs the image segmentation operation by cropping the image to highlight the relevant parts. Because it uses a non-differentiable strategy and only works on one area at a time, this type of attention mechanism requires reinforcement learning in order to be back-propagated. In contrast, the soft attention mechanism employs a weighting strategy that gives higher weights to the pertinent elements containing the regions of interest and lower weights to the non-relevant parts, which primarily make up the background of the image. Section "Introduction" provides a thorough introduction to brain tumors, followed by a clear account of related work in section "Related works". Section "Dataset" gives a brief description of the datasets. Section "Pre-processing" explains the different pre-processing methods. Section "Methodology" states the detailed methodology of our IC-Net model. In section "Results and discussion", we present and explain our results. We conclude our work in section "Conclusion".

The following characteristics of the intended models are crucial:

  1. To reduce computational complexity, IC-Net, a novel end-to-end framework for brain tumor segmentation, uses fewer convolutional layering techniques for both up- and down-convolution operations.

  2. Using multiple convolutional processes, the MA-Block (Multi-Attention) function enhances feature representations and captures intricate patterns, whereas IC-Net combines global features and contextual data.

  3. To ensure that the network’s attention focuses on key image areas, the Attention block is employed during segmentation to concentrate on particular areas of interest, such as brain tumors.

  4. The suggested model works better than most DL-based models by producing better results with a reasonable amount of sample labels and the right kinds of data augmentation techniques.

Contributions of IC-Net architecture

Convolutional layers for multilevel feature learning, batch normalization along with activation stabilization, selective feature importance, extraction of heterogeneous data, adaptive feature integration, and improved model representation make up the IC-Net framework. These functionalities enhance the management of complex information representations, draw attention to significant characteristics, and boost productivity for operations like feature extraction and segmentation. The FCN makes it easier to extract different feature representations and data types, and the attention mechanism aids in keeping the model focused on pertinent areas.

Related works

This section discusses various BTS methods, including traditional and Deep Learning (DL) models such as U-Net, FCN, and attention-based networks, together with their limitations, emphasizing the need for the proposed IC-Net architecture.

Modern medical image segmentation methods like U-Net make use of the idea of an encoder-decoder framework10. These segmentation models combine the low-level, fine-grained features of the encoder, or contracting path, with the high-level, coarse-grained features of the decoder, or expansive path, using skip connections. In medical image segmentation tasks, this concatenation is useful for reconstructing fine-grained characteristics of the target segmented mask. A U-shaped design results from the symmetrical construction of the contracting and expansive paths. On the other hand, because the features captured by the down-sampling layers within the contracting path are weak and correspond to many lower-level features, the restoration of spatial information via these concatenations may fail to deliver precise information. We therefore highlight the pertinent segments of the segmentation masks, because such lower-level features could otherwise amplify irrelevant regions of the targeted object in the segmentation masks.

Over the years, numerous approaches have been proposed for BTS, and CNNs have emerged as a powerful tool in this domain. The U-Net architecture, introduced by Ronneberger et al., has attracted widespread attention for its exceptional performance in biomedical image segmentation tasks, including BTS11. One study, which tests networks of greater depth using small (3\(\times\)3) convolution filters, shows that deepening networks to 16-19 weight layers can yield a significant improvement12. Another study uses the Edge U-Net model, a deep CNN that can accurately locate tumors by combining boundary-related MRI data with brain MRIs and a suitable loss function13. A further study illustrated automatic BTS using hybrid filters and 3D medical images; the U-Net model is used for semantic segmentation, and 2D MRIs are used to view the tumor. The method is computationally and memory-efficient, proving to be an optimal choice for segmenting brain images14.

Baid et al. designed a method for glioma tumor segmentation and survival prediction using a Deep Learning Radiomics Algorithm for Gliomas (DRAG) model and a 3D patch-based U-Net model. The model achieved 57.1% accuracy on the BraTS 2018 validation dataset, performing well on the overall survival prediction task15. Another model uses MICCAI BraTS2018 to classify tumor types in MRI images based on their masses. It achieves significant results in segmenting tumors using DL approaches, with dice coefficient values of 0.9795 and 0.9950 for high-grade and low-grade glioma volumes respectively16. One study examined SPP-U-Net, a model that replaces residual connections with Spatial Pyramid Pooling and Attention blocks, enhancing reconstruction scope and context. It achieves comparable results without changing parameters over larger dimensions17. Sunita Roy et al. proposed two new CNN-based models, S-Net and SA-Net, for image segmentation in medical imaging, particularly for brain tumors in MRI scans. These models use U-Net as the base architecture and leverage the ’Merge Block’ and ’Attention Block’ concepts18.

Ruba et al. proposed a JGate-AttResU-Net network design for a reliable BTS system, enhancing tumor localization and generating competitive outcomes using the BRATS 2015 and 2019 datasets19,20. BrainNET is a network that uses DL to automate the detection and classification of brain tumors from MRI images, overcoming the complexity and variance of tumors and medical data in clinical routines21. Yanjun Peng and Jindong Sun suggested an automatic weighted dilated convolutional network (AD-Net) for learning multimodal brain tumor features through channel feature separation learning. The AD unit uses dual-scale convolutional feature maps, two learnable parameters, and deep supervision training to quickly fit the data using the BraTS2020 dataset22. TransBTS validates the efficiency of a Transformer-based framework for brain tumor segmentation, emphasizing attention mechanisms for robust feature learning23.

Rehan Raza et al. introduced a 3D BTS framework, a hybrid of deep residual networks and U-Net models, that significantly enhances the segmentation performance of brain tumor sub-regions compared to state-of-the-art techniques24. A 2D U-Net network, trained on the BraTS datasets, can identify four areas for BTS and can set up multiple encoder and decoder routes for various image usages. Image segmentation is used to reduce computational time, and experiments on the BraTS datasets demonstrate the model’s effectiveness25. Another study presented a fully automated 2D U-Net architecture on BraTS2020 for detecting tumor regions in healthy tissue. After experimenting with all MRI sequences, the model achieved an accuracy of 99.41%, demonstrating its effectiveness; it was further trained to assess its robustness and performance consistency26. N. Phani Bindu and P. Narahari Sastry showed that, to improve accuracy, cross- or skip-connections between network blocks can be introduced. This approach improves model accuracy and performance compared to traditional U-Net models, as it eliminates the need for frequent skip connections27.

Table 1 Comparative analysis with other pertinent strategies using the BRATS dataset.

One study employs a model for medical image semantic segmentation using CNNs. The model eliminates semantic ambiguity in skip connection operations by adding attention gates, combining local features with global dependencies, and using multi-scale predictive fusion36. A modified U-Net structure using residual networks and sub-pixel convolution was proposed, enhancing modelling capability and avoiding de-convolution overlap. The model was evaluated on the BTS dataset, achieving accuracies of 93.40% and 92.20% and outperforming existing approaches in tumor subregion classification37. The model proposed by Amin et al. uses seven layers for classification, including convolutional, ReLU, and softmax layers. It divides MR images into patches and assigns labels. Experiments were conducted on eight large-scale datasets, and results were validated for accuracy, sensitivity, specificity, precision, and similarity index38. Raut et al. classify new input images as tumorous or normal, using back propagation for accuracy and autoencoders to discard irrelevant features. The K-Means algorithm is used for further tumor region segmentation39.

One study uses the 3D U-Net model for volumetric segmentation of MRI images and tumor classification using CNNs. Validity is established through loss and precision diagrams; performance is measured and compared, finding the proposed work more efficacious than state-of-the-art techniques40. Agrawal, Katal, and Hooda introduce a fully automatic method for segmenting brain tumours using U-Net-based deep convolutional networks. The method was tested on the BRATS 2015 dataset with 220 high-grade and 54 low-grade tumour cases, and cross-validation showed that it segments tumours well41. Another paper presents a DL method for segmenting brain tumours into subregions using a multitask framework and a three-stage cascaded framework for simultaneous and sequential segmentation42.

Another study explores an attention-based U-Net for brain tumor MRI scans. The model uses U-Nets to segment glioma subregions using T1, T2, T1CE, and FLAIR modalities; it segmented WT, TC, and ET using the FLAIR modality, achieving scores of 95.56, 93.31, and 89.95 on BraTS 201828. Feng et al. proposed a patch-based 3D U-Net with an attention block, reporting mean Dice scores of 0.806 (ET), 0.863 (TC), and 0.918 (WT) on the validation dataset29. Na Li and Kai Ren developed DAU-Net, an attention-based nested segmentation network with a deeply supervised encoder-decoder architecture and redesigned dense skip connections to identify key feature regions and merge extracted features33. A large-kernel (LK) attention module that integrates convolution, self-attention, and channel adaptation was proposed by Li et al. for efficient multi-organ and tumor segmentation31.

SCAU-Net is a 3D U-Net model for brain tumor segmentation, employing external attention and self-calibrated convolution modules. It achieves competitive results on the BraTS 2018, 2019, and 2020 validation datasets32. Chinnam, Sistla, and Kolli developed a 3D U-Net model that uses different skip connections with a pretrained 3D MobileNetV2 and attention blocks on 3D brain imaging data30. Comparative analysis with other pertinent strategies using the BRATS dataset is given in Table 1. Advanced optimization techniques such as stochastic gradient descent and adaptive learning rate algorithms are being used to improve the performance of modified U-Net models for BTS. The integration of attention mechanisms, residual blocks, and advanced optimization techniques has led to significant improvements in segmentation accuracy and performance.

We recognize that our proposed blocks leverage combinations of existing methodologies, such as MA-Blocks, FCN blocks, and attention mechanisms. However, the true innovation of our work lies in the specific integration and optimization of these blocks within a unified framework tailored for the segmentation of brain tumors. Our IC-Net framework strategically combines these techniques to enhance feature extraction, improve segmentation accuracy, and address specific challenges in medical image analysis that individual blocks alone might not solve as effectively. This integrated approach results in a synergistic effect that significantly improves the performance metrics, as demonstrated in our comprehensive experimental results. We believe that this novel integration and its application to the specific domain of BTS contribute meaningful advancements to the field, providing a robust and effective tool for clinical use.

Dataset

This section covers the benchmark datasets used in this study. The BraTS2020 dataset, obtained from Kaggle, serves as the basis for our system’s analysis and training. Four MRI sequences are collected for each patient: fluid-attenuated inversion recovery (FLAIR), T1-contrast-enhanced (T1ce), T1-weighted (T1), and T2-weighted (T2); an associated data sample is shown in Fig. 1. The training brains come with ground truth for which five segmentation labels are provided, namely non-tumor, necrosis, edema, non-enhancing tumor, and enhancing tumor. Experts assigned the labels in the presented ground truths. Each 3D volume has 155 2D slices/images of brain MRIs gathered from different parts of the brain. All brains in the dataset have the same orientation. In NIfTI format, each slice has a size of \(240\times 240\) pixels and is composed of single-channel grayscale pixels. Table 2 provides a summary of the dataset.

Figure 1
figure 1

MRI sequences as 2D sections.

Table 2 Summary of the dataset.

Pre-processing

This section discusses the preprocessing methods that prepare the medical images for training a DL model on the segmentation task. For the model to learn from the data during training, the data must be in an appropriate format and contain relevant information. Data pre-processing is required to eliminate noisy regions and extract crucial segmentation properties before importing the dataset into the training model, ensuring clear labeling. The nib library is used for loading medical image data from NIfTI files, while data shuffling and resizing are used in model training. Class mapping and one-hot encoding are used for the segmentation task. Normalization ensures pixel values fall within a similar range, while data yielding generates batches of preprocessed data for training and evaluation. A minimal sketch of such a pipeline is given below.
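The following is a minimal sketch of the preprocessing steps described above, assuming hypothetical file paths and the common BraTS label remapping (label 4 remapped to 3 so the four classes are contiguous); it is illustrative, not the exact training pipeline.

```python
import nibabel as nib
import numpy as np
import tensorflow as tf

def load_and_preprocess(flair_path, seg_path, target_size=(128, 128)):
    """Load one modality and its mask from NIfTI, then normalize, resize, one-hot encode."""
    volume = nib.load(flair_path).get_fdata()               # shape (240, 240, 155)
    mask = nib.load(seg_path).get_fdata().astype(np.int32)

    # Min-max normalization so pixel values fall within a similar range.
    volume = (volume - volume.min()) / (volume.max() - volume.min() + 1e-8)

    # Class mapping: BraTS labels are {0, 1, 2, 4}; remap 4 -> 3 for contiguous classes.
    mask[mask == 4] = 3

    # Treat the 155 axial slices as a batch of 2D images and resize them.
    volume = np.transpose(volume, (2, 0, 1))[..., np.newaxis]  # (155, 240, 240, 1)
    mask = np.transpose(mask, (2, 0, 1))[..., np.newaxis]
    volume = tf.image.resize(volume, target_size).numpy()
    mask = tf.image.resize(mask, target_size, method="nearest").numpy()

    # One-hot encode the 4 segmentation classes for categorical cross-entropy.
    mask = tf.keras.utils.to_categorical(mask[..., 0], num_classes=4)
    return volume, mask
```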

Methodology

In this manuscript, we introduce the IC-Net model, which incorporates MA-blocks, attention blocks, and an FCN block. Our contributions can be summarized as follows:

Figure 2
figure 2

The workflow block diagram of the IC-Net architecture model.

The image segmentation operation of the IC-Net architecture is discussed in detail here (Fig. 2). A sample image of seven dots is taken for illustration. The image proceeds through a well-planned series of steps. The architecture starts with an input layer that accepts an image of specific dimensions, followed by a five-block encoder component. An encoder block contains convolutional layers and max-pooling operations, so that encoder blocks 1 to 4 are capable of extracting hierarchical features from the input images. These convolutional layers capture increasingly complex features as we go deeper into the network, while the fifth encoder block represents the highest level of feature refinement. The operational representation of the MA Block is shown in Fig. 3. The cap-shaped structure above the block diagram, consisting of dotted lines connecting the input, MA Block, and output images, provides an overview of the flow diagram’s processes. The block processes the input image with seven colored circles through a sequence of operations, starting with a convolutional layer for spatial features, followed by batch normalization for stable training and ReLU activation. A second convolutional layer refines these features; combined, these operations enhance the image’s representation for segmentation. While the block doesn’t directly alter the output image, it primes it for segmentation. Subsequent layers in the network use these features to distinguish circle boundaries and characteristics, making the convolutional layers vital for edge and pattern detection, while batch normalization and activation functions aid efficient feature learning. A Keras-style sketch of this layer sequence is given below.
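The following is a minimal Keras sketch of the MA Block’s layer sequence as just described (convolution, batch normalization, ReLU, convolution); the \(3\times 3\) kernel size follows the description later in this section, and the function name is our own.

```python
from tensorflow.keras import layers

def ma_block(x, filters):
    """MA Block: conv -> batch normalization -> ReLU -> conv, as in Fig. 3."""
    y = layers.Conv2D(filters, 3, padding="same")(x)   # spatial feature extraction
    y = layers.BatchNormalization()(y)                 # stabilizes training
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)   # refines the features
    return y
```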

The improvement in feature representation is achieved by the Fully Convolutional Network (FCN) block, whose operational mechanism is shown in Fig. 4. The cap-shaped structure above the block diagram, with dotted lines connecting the input, FCN Block, and output images, provides a comprehensive overview of the flow diagram’s operations. The block systematically processes an input image. It starts with a convolutional layer using ReLU activation for spatial feature enhancement. Two additional convolutional layers follow, each employing a distinct activation: tanh emphasizes intensity variations, and sigmoid highlights pixel relationships. Their outputs are concatenated to form a three-channel feature map capturing various image aspects, and a final convolutional layer with ReLU further consolidates these features. The output image remains unaltered, but this block creates a rich and diverse feature representation, ideal for segmentation. The combination of activations and concatenation allows the network to capture spatial structures, intensity variations, and pixel relationships, aiding segmentation by providing an informative and discriminative feature map. A sketch of this block follows.
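Below is a minimal sketch of the FCN Block under the same assumptions as the MA Block sketch; the kernel sizes and the chaining of the tanh and sigmoid branches follow our reading of the description and the mathematical formulation later in this section, so treat it as illustrative.

```python
from tensorflow.keras import layers

def fcn_block(x, filters):
    """FCN Block: ReLU, tanh, and sigmoid responses concatenated, then fused (Fig. 4)."""
    y1 = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)      # spatial features
    y2 = layers.Conv2D(filters, 3, padding="same", activation="tanh")(y1)     # intensity variations
    y3 = layers.Conv2D(filters, 3, padding="same", activation="sigmoid")(y2)  # pixel relationships
    y4 = layers.Concatenate()([y1, y2, y3])           # Y4 = Y1 || Y2 || Y3
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(y4)   # consolidation
```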

Following the operation of the FCN block, the cap-shaped structure above Fig. 5, with dotted lines connecting the input from the convolution block (which is passed to upsampling, then to attention) and the output images, provides a concise summary of the processes involved. The decoder part, which consists of four blocks, each with a transpose convolutional layer, an attention mechanism, and a concatenation operation, then comes into play. The transpose convolutional layers are essential for boosting the spatial resolution of the feature maps, and the attention mechanism ensures that upsampling is targeted and context-sensitive, meaning the model adapts its processing based on the context or surroundings of the input data. This process involves computing “theta” and “phi” features, combining them through ReLU activation to emphasize positive relationships, calculating attention weights with a sigmoid activation (ranging from 0 to 1), and multiplying these weights with the original feature map. This method enhances feature representation by emphasizing regions of interest, vital for accurate segmentation. It allows the network to focus on critical areas, aiding in distinguishing object boundaries from the background and ultimately enhancing segmentation accuracy. The fifth decoder block produces the segmentation map and provides class probabilities for each pixel with the help of a \(1\times 1\) convolutional layer and SoftMax activation. The operation of the Attention Block is shown in Fig. 5, and a sketch is given below. This analysis supports a granular understanding of IC-Net’s image segmentation functionality, workflow, and performance. Overall, the methodology facilitates image segmentation through feature extraction and refinement to produce pixel-wise class predictions in the output segmentation map. The entire operation is illustrated in Algorithm 1.
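A minimal sketch of the Attention Block’s theta/phi additive attention as just described; it assumes the gating signal has already been brought to the same spatial size as the skip features (e.g., by the preceding upsampling step), and the intermediate channel count is a free parameter of our own.

```python
from tensorflow.keras import layers

def attention_block(skip, gate, inter_filters):
    """Attention Block (Fig. 5): theta/phi features, ReLU, sigmoid weights, re-weighting."""
    theta = layers.Conv2D(inter_filters, 1)(skip)   # "theta" features from the skip connection
    phi = layers.Conv2D(inter_filters, 1)(gate)     # "phi" features from the gating signal
    a = layers.Activation("relu")(layers.Add()([theta, phi]))  # emphasize positive relations
    weights = layers.Conv2D(1, 1, activation="sigmoid")(a)     # attention weights in [0, 1]
    return layers.Multiply()([skip, weights])       # emphasize regions of interest
```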

Figure 3
figure 3

This detailed image provides an operational representation of the MA Block. The input image features a pattern composed of seven distinct colors.

Figure 4
figure 4

This detailed image provides an operational representation of the FCN Block. The input image features a pattern composed of seven distinct colors.

Figure 5
figure 5

This detailed image provides an operational representation of the Attention Block.

Algorithm 1
figure a

Algorithm for IC-Net Model.

The proposed methodology aims to enhance BTS accuracy by integrating MA-Blocks, an FCN Block, and an attention mechanism within the IC-Net architecture, as shown in Fig. 2.

We revisit the operations of the MA block, FCN block, and attention block with their complete mathematical representations, as shown below.

Convolution operation

The convolution operation given an input image \({\textbf {X}}(i)\) is represented as

$$\begin{aligned} \quad \textbf{Y}_1(i) = \textbf{X}(i) \cdot \textbf{W} \;\; ;\; \forall i = 1,2,\cdots , v_1\cdot v_2 \end{aligned}$$
(1)

where \(v_1 = v_2 = n+2p-f+1\), \({\textbf {W}}\) is a kernel, n is the size of the input image, p is the padding size, f is the filter size, and \(v_1\), \(v_2\) are the height and width of the output feature matrix respectively. The functionality of each block is explained below with the corresponding mathematical description. For ease of illustration, we drop the subscripts in the notation; the input image to each block is denoted \({\textbf {X}}\), the output \({\textbf {Y}}\), and the outputs of intermediate stages \({\textbf {Y}}_1\), \({\textbf {Y}}_2\), \({\textbf {Y}}_3\), and so on. A quick numerical check of this output-size formula is given below.
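As a quick sanity check of Eq. (1)’s output-size rule (which assumes stride 1), the sketch below evaluates \(v = n + 2p - f + 1\) for a \(240\times 240\) BraTS slice; the example values are illustrative.

```python
def conv_output_size(n, p, f):
    """Spatial output size of a stride-1 convolution: v = n + 2p - f + 1."""
    return n + 2 * p - f + 1

# A 240x240 slice with a 3x3 filter and no padding shrinks to 238x238;
# with p = 1 ("same" padding for f = 3) the 240x240 size is preserved.
assert conv_output_size(240, 0, 3) == 238
assert conv_output_size(240, 1, 3) == 240
```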

  • MA Block The operation of the MA block includes convolution between \({\textbf {X}}(i)\) and \({\textbf {W}}\) to give \({\textbf {Y}}_1\) as mentioned in (1), followed by batch normalization, \(\textbf{Y}_{2}(i) = \frac{\textbf{Y}_{1}(i) - {\mu }_{{\textbf {Y}}_1}}{{\sqrt{{\sigma }^2 _{{\textbf {Y}}_1} + \epsilon }}}\), where \(\epsilon\) is a very small positive number. Following this operation, a ReLU activation \({\textbf {Y}}_3(i) = \max \left( 0, {\textbf {Y}}_2 (i) \right)\) and a convolution \({\textbf {Y}}_4(i) = {\textbf {Y}}_3(i) \cdot {\textbf {W}}\) are applied in sequence. In these expressions, \({\textbf {X}}\) denotes the input feature matrix, \({\textbf {W}}\) represents the filter used in the convolution, \({\textbf {Y}}_4\) denotes the output image, and \({\textbf {Y}}_1\), \({\textbf {Y}}_2\), and \({\textbf {Y}}_3\) denote intermediate output features.

  • FCN Block The operation of the FCN block comprises several key steps. It starts with a convolution operation as mentioned in (1) followed by a \(\text {ReLU}(x) = \max \left( 0,x\right)\) activation function to give an intermediate output \({\textbf {Y}}_1\). The output \({\textbf {Y}}_1\) undergoes one more convolution as in (1) followed by a tanh activation to give \({\textbf {Y}}_2\), where \(\tanh (x) = \frac{1 - e^{-2x}}{1 + e^{-2x}}\). Another convolution between \({\textbf {Y}}_2\) and \({\textbf {W}}\) as in (1), followed by a sigmoid activation \(\sigma (z) = \frac{1}{1 + e^{-z}}\), is then applied to get \({\textbf {Y}}_3\). These responses are concatenated as \(\textbf{Y}_4 = \mathbf {Y_1} || \mathbf {Y_2} || \mathbf {Y_3}\), where || denotes column-wise concatenation. The final response \({\textbf {Y}}\) is obtained by one more convolution between \(\mathbf {Y_4}\) and \({\textbf {W}}\). In these equations, \(\textbf{X}\) and \(\textbf{Y}\) represent the input and output feature matrices respectively; \(\textbf{Y}_1\), \(\textbf{Y}_2\), and \(\textbf{Y}_3\) are the outputs after convolution with ReLU, tanh, and \(\sigma\) respectively, and \(\textbf{Y}_4\) denotes the concatenated feature matrix.

  • Attention Block It encompasses several integral components and operations. We have the input feature matrix \(\textbf{T}{(i)}\), along with an additional input feature matrix \(\textbf{P}{(i)}\). The feature matrix \(\textbf{B}\) results from a convolution operation, while applying the attention weights yields the output feature matrix \(\textbf{Y}\). In these operations, first the theta convolution \(\textbf{T}' = \textbf{T}(i) \cdot \textbf{W}\) takes place as in (1); a similar phi convolution operates on \(\textbf{P}\). These results are added together to yield \(\textbf{A}\), with a subsequent application of the ReLU activation function. The output of the addition is processed further through \(\textbf{B} = \textbf{A}(i) \cdot \textbf{W}\) as in (1). Finally, the output with attention is obtained as \(\textbf{Y} = \textbf{B} \cdot \textbf{T}\).

Firstly, MA-Blocks, inspired by recent advances in attention mechanisms, are integrated into IC-Net to focus on pertinent tumor regions within the input, incorporating both global and local context information. The MA-Block function, a vital component of IC-Net, encompasses specific operations: a \(3\times 3\) convolutional layer, batch normalization, ReLU activation, and another convolutional layer. The architecture of these MA-blocks allows for adaptive feature map weighting, effectively capturing intricate patterns within the data, improving spatial integrity, and preserving delicate features for subsequent layers. These blocks play a pivotal role in the encoder section, aiding in the extraction of hierarchical features from the input image across various scales through progressive down-sampling via max-pooling layers. The number of filters employed in each convolutional block is indicated by the numerals (32, 64, 128, 256, 512). Furthermore, the FCN block is integrated within IC-Net to capture multi-scale information and enhance the network’s capability to handle tumors of varying sizes and shapes. The FCN block concatenates features derived from various activation functions (ReLU, tanh, sigmoid), significantly enriching feature representations by effectively combining information from multiple activations and encouraging multi-scale feature extraction. It transforms unstructured data into a multimodal feature map with various semantic characteristics, achieving convergence while maintaining spatial coherence through convolutional operations and concatenation.

Table 3 Hyperparameters used for training the network.

An attention mechanism is used to emphasize critical spatial information between two feature maps, focusing on relevant parts of the input image while suppressing irrelevant information, and consequently enhancing segmentation accuracy. This mechanism, strategically placed within the decoder part of the model, enhances the network’s ability to focus on critical regions. It involves specific steps: \(1\times 1\) convolutions, element-wise addition, ReLU activation, dimensionality reduction, and the creation of an attention mask. By incorporating input feature maps and contextual data, it directs the network’s attention, improving its interpretive abilities and giving the network a focused capability: attention is continually allocated to the most informative regions of the input space, enhancing interpretative power and predictive accuracy. The integration of these components, MA-Blocks, FCN blocks, and the attention mechanism, forms a robust methodology for BTS. The model’s segmentation accuracy is enhanced by utilizing attention mechanisms, diverse activation functions, and adaptive feature weighting to enrich feature representation.

Impact of IC-Net - Our IC-Net framework presents a number of novel developments, each of which contributes in a different way to its improved usefulness and speed. Our architecture’s MA block uses a simple yet efficient series of convolutional, batch normalization, and activation layers to facilitate deeper network representation while streamlining feature extraction. Our FCN block, on the other hand, is unique in that it uses parallel convolutional layers with several activation functions (ReLU, tanh, and sigmoid) to enable the combination of multiple feature representations in a single block. Meanwhile, the attention block integrates a channel attention mechanism that enables the model to selectively emphasize important characteristics across channels; it builds attention maps by applying gating signals and convolutions to the input feature maps. The superiority of our architecture has been shown through extensive testing and comparative analyses, which reveal its higher performance in tasks like classification and segmentation on benchmark datasets. Feature map and attention mechanism visualizations demonstrate how well our architecture captures complex patterns and highlights important regions, while thoughtful design decisions in each block address particular shortcomings of traditional systems. These developments, in conjunction with use cases in highly specialized areas such as remote sensing and medical image segmentation, establish our IC-Net as a cutting-edge and very efficient solution for image analysis work.

Results and discussion

For the implementation, we utilized PyCharm as our development environment. Our system configuration was centered around a PowerEdge R740 server equipped with an Intel Xeon Silver 4208 2.1 GHz processor (8 cores, 16 threads), a Tesla V100 GPU delivering 9.6 gigapixels per second (GP/s) of graphics processing power, and 128 GB of DDR4 RAM. We import TensorFlow, a deep learning framework, and Keras, its high-level API for model construction, which includes layers such as dense, convolutional, and recurrent, and the Model class for defining and configuring deep learning models. This serves as the foundational step for building and training neural networks. According to the experimental findings, IC-Net performs better at tumor segmentation than conventional U-Net models. The model’s performance is assessed using criteria such as accuracy, sensitivity, specificity, and precision, and it is trained on the brain MRI dataset BraTS2020 to establish the optimal segmentation performance. During our experiments, we encountered constraints in choosing an optimizer and in computational resources; switching to GPU-based training was necessary for increased effectiveness, and the Adam optimizer was chosen for its flexibility. Addressing hyperparameter tuning, data constraints, and model complexity is crucial. A sketch of the model construction and compilation step is shown below.
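To make the setup concrete, the following is a hypothetical Keras construction and compilation step reflecting the training configuration reported below (Adam at a 0.001 learning rate, categorical cross-entropy); the tiny stand-in model is ours, not the full IC-Net, whose blocks are sketched in the Methodology section.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Toy stand-in model: one conv layer plus a per-pixel softmax over 4 classes.
inputs = tf.keras.Input(shape=(128, 128, 1))
x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
outputs = layers.Conv2D(4, 1, activation="softmax")(x)  # per-pixel class probabilities
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="categorical_crossentropy",
    metrics=["accuracy", tf.keras.metrics.Precision(), tf.keras.metrics.Recall()],
)
model.summary()  # reports trainable/non-trainable parameters used for memory estimates
```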

Figure 6
figure 6

Accuracy and loss when trained for 50 epochs.

Figure 7
figure 7

Precision, sensitivity and specificity when trained for 50 epochs.

Figure 8
figure 8

Accuracy and loss when trained for 100 epochs. Our model achieved 99.6% accuracy and a loss value of 0.0159; the plots appear cluttered because the values vary at a fine level of detail beyond 99%.

Figure 9
figure 9

Precision, sensitivity and specificity when trained for 100 epochs. Our IC-Net achieved high precision, sensitivity, and specificity values, resulting in a detailed presentation with performance metrics beyond typical ranges.

Model training parameters and training approach

Using categorized training data, we trained our model with the Adam optimizer at a learning rate of 0.001 and Categorical Crossentropy as the loss function. Measures used to evaluate the model include accuracy, precision, sensitivity, specificity, DSC (core, whole, and enhancing tumors), and Tversky, focal, and boundary losses. Important details of the dataset are discussed above in section "Dataset", and the basics of our preprocessing technique in section "Pre-processing". These details are vital for guaranteeing the repeatability of our findings and enabling a thorough evaluation of our IC-Net methodology. With four different MRI sequences, the model suggested in this research is separately trained and validated four times. Hyperparameters used for training the network are described in Table 3. The number of floating-point operations (FLOPs) for a convolutional layer is calculated as twice the product of the kernel height, kernel width, input channels, output channels, output height, and output width; the FLOPs for a fully connected layer are twice the product of the input size and the output size. Using these rules, the total FLOPs for the MA Block are obtained by multiplying twice the squared kernel size by the input channels, output channels, height, and width. For the Attention Block, the total FLOPs sum two sets of operations: the convolutional layers and the subsequent addition, activation, and multiplication operations. The total FLOPs for the FCN Block triple the single-convolution term, since the block contains three convolutional layers. With these assumptions, the FLOPs for the MA Block, Attention Block, and FCN Block were calculated to be about 2,147,483,648, 3,221,225,472, and 4,294,967,296 respectively, giving total FLOPs for the model of around 9,663,676,416. Our system requirements are mentioned in section "Results and discussion". The throughput of our Tesla V100 GPU is rounded to \(9.66 \, \text {GFLOPs/s}\) (gigaflops per second). Training time is estimated by multiplying the number of training samples by the number of epochs and the time per epoch, then dividing by the number of GPU cores; the assessed training period is about 3112.5 s. Memory consumption is calculated as the sum of trainable and non-trainable parameters multiplied by the size of each parameter in bytes; our IC-Net memory consumption approximately sums to 63.23 megabytes. These estimation rules are sketched below.
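A small sketch of these back-of-the-envelope estimation rules follows; the channel, spatial, sample, and timing values below are placeholders of our own (the exact values assumed in the text are not stated), so the printed numbers match the reported ones only where noted.

```python
def conv_flops(k, c_in, c_out, h, w):
    """FLOPs of one conv layer: 2 * k^2 * C_in * C_out * H_out * W_out."""
    return 2 * k * k * c_in * c_out * h * w

def dense_flops(n_in, n_out):
    """FLOPs of one fully connected layer: 2 * inputs * outputs."""
    return 2 * n_in * n_out

def training_time_s(samples, epochs, time_per_epoch_s, gpu_cores):
    """Training-time rule used above: samples * epochs * time/epoch / GPU cores."""
    return samples * epochs * time_per_epoch_s / gpu_cores

def memory_mb(trainable, non_trainable, bytes_per_param=4):
    """Memory: (trainable + non-trainable parameters) * bytes per parameter (float32)."""
    return (trainable + non_trainable) * bytes_per_param / 1024 ** 2

# Placeholder values for illustration only.
ma = conv_flops(k=3, c_in=64, c_out=64, h=128, w=128)
fcn = 3 * ma                      # three convolutional layers in the FCN Block
print(f"MA Block ~{ma:,} FLOPs, FCN Block ~{fcn:,} FLOPs")
print(f"~{training_time_s(1245, 100, 0.2, 8):.1f} s training (placeholder timings)")
print(f"~{memory_mb(16_500_000, 75_000):.2f} MB parameters (placeholder counts)")
```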

Table 4 On the BraTS2020 dataset, a comparison of IC-Net with state-of-the-art techniques.
Table 5 Performance of our IC-Net.
Figure 10
figure 10

This image displays existing results from43, serving as a point of comparison with the visualization of qualitative results on BraTS2020 MRI sequences.

Figure 11
figure 11

This image displays existing results from44, serving as a point of comparison with the visualization of qualitative results on BraTS2020 MRI sequences.

Figure 12
figure 12

This image displays visualization of qualitative results on BraTS2021.

Figure 13
figure 13

After training the model, the following visualization showcases the qualitative results of the IC-Net model on BraTS2020 MRI sequences.

Figure 14
figure 14

After training the model, the following visualization showcases the qualitative results of the IC-Net model on BraTS2019 MRI sequences.

Table 6 Performance comparisons of subclass tumor using Brats2020 data.
Table 7 Performance comparisons of subclass tumor using Brats2021 data.
Table 8 Performance comparisons of subclass tumor using Brats2019 data.
Figure 15
figure 15

Following visualization shows the subclass results of the IC-Net model on BraTS2020 MRI Data.

Figure 16
figure 16

Comparison results of Our Model compared with existing methods.

Evaluation metrics

Accuracy, sensitivity, precision, and specificity are used to evaluate the model against the provided segmented ground truth of the tumor region in the MRI, calculated using Eqs. (2), (3), (4), (6), and (7). Accuracy measures how exact the predictions are by dividing the number of correct predictions by the total number of predictions. Precision highlights exactness by calculating the percentage of true positive predictions among all positive predictions. Recall measures the percentage of all real positive cases that are correctly predicted as positive, with an emphasis on completeness.

$$\begin{aligned} \text {Accuracy} = \frac{TP+TN}{TP+TN+FP+FN} \end{aligned}$$
(2)
$$\begin{aligned} \text {Precision} = \frac{TP}{TP+FP} \end{aligned}$$
(3)
$$\begin{aligned} \text {Recall} = \frac{TP}{TP+FN} \end{aligned}$$
(4)

The DSC performance metric computes the percentage similarity between the ground truth and the output of a model. Suppose C and D are two sets; the Dice similarity of these two sets is then calculated with Eq. (5)

$$\begin{aligned} \text {DSC} = \frac{2\,|C \cap D|}{|C|+|D|} \end{aligned}$$
(5)

Sensitivity is calculated with Eq. (6), where the cardinalities of sets C and D are denoted by |C| and |D| respectively, \(G_1\) represents the tumor regions of the ground truth images, and \(C_1\) represents the tumor regions predicted by the model.

$$\begin{aligned} \text {Sensitivity}(C,G)=\frac{|C_1 \wedge G_1|}{|G_1|} \end{aligned}$$
(6)

Specificity is calculated with Eq. (7) where \(G_0\) represents non-tumor tissue regions of the ground truth and \(C_0\) represents the non-tumor tissue regions predicted by the model.

$$\begin{aligned} \text {Specificity}(C,G)=\frac{|C_0 \wedge G_0|}{|G_0|} \end{aligned}$$
(7)
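As an illustration, the sketch below evaluates these metrics for binary masks with NumPy; the small epsilon that guards against empty masks is our addition.

```python
import numpy as np

def accuracy(pred, gt):
    """Eq. (2): fraction of voxels labeled correctly (pred, gt are boolean masks)."""
    return np.mean(pred == gt)

def precision(pred, gt, eps=1e-8):
    """Eq. (3): TP / (TP + FP)."""
    return np.logical_and(pred, gt).sum() / (pred.sum() + eps)

def dsc(c, d, eps=1e-8):
    """Eq. (5): 2|C intersect D| / (|C| + |D|)."""
    return 2.0 * np.logical_and(c, d).sum() / (c.sum() + d.sum() + eps)

def sensitivity(pred, gt, eps=1e-8):
    """Eq. (6): |C1 and G1| / |G1| over tumor (positive) voxels; equals recall, Eq. (4)."""
    return np.logical_and(pred, gt).sum() / (gt.sum() + eps)

def specificity(pred, gt, eps=1e-8):
    """Eq. (7): |C0 and G0| / |G0| over non-tumor (negative) voxels."""
    return np.logical_and(~pred, ~gt).sum() / ((~gt).sum() + eps)
```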

Model evaluation

In this section, we report exceptionally high performance metrics obtained after training our model for 50 and 100 epochs, underscoring the effectiveness of our approach. For the 50-epoch experiment, we achieved accuracy rates consistently exceeding 99.39%, coupled with a low loss value of 0.0246, indicating the model’s strong predictive capabilities and the minimization of errors. Precision scores consistently above 99.43% demonstrate the model’s proficiency in correctly identifying positive cases while keeping false positives to a minimum. Sensitivity consistently exceeded 99.26%, indicating the model’s ability to capture a significant proportion of true positive cases, and specificity stayed above 99.79%, reflecting its capability to effectively distinguish between negative and positive instances. Results for 50 epochs are displayed in Figs. 6 and 7. For the 100-epoch experiment, our model consistently achieved accuracy rates exceeding 99.65%, maintained a low loss value of 0.0159, and demonstrated precision scores of 99.58%, minimizing false positives. Moreover, it exhibited high sensitivity exceeding 99.44%, indicating the model’s ability to capture true positive cases, and specificity consistently at 99.86%, highlighting its proficiency in distinguishing negative instances. Remarkably, these high performance metrics remained stable and even improved marginally in the 100-epoch experiment, demonstrating the robustness and reliability of our methodology over extended training periods; the results are given in Figs. 8 and 9. Our model consistently outperforms advanced methods on these metrics, demonstrating its superiority in various aspects. Key findings are summarized in Table 4, and comparisons with existing methods are shown graphically in Fig. 16. To make sure that our results were reliable and consistent, we ran the model several times. These runs yielded the following accuracy values: 0.99636, 0.99635, 0.99664, 0.99667, 0.99635, 0.99664, 0.99667, and 0.99635. We computed the standard deviation of these accuracy values to further verify the stability of our framework. The performance of our model displays a high degree of consistency and reliability, as evidenced by the standard deviation of 0.00014; this low standard deviation supports the effectiveness of our method by showing that the accuracy of our model is both high and consistent across runs.

Our research demonstrated the effectiveness of our approach, with high accuracy, low loss, and notable precision, sensitivity, and specificity scores, indicating its reliability and potential contributions. The performance of our IC-Net is presented in Table 5. These results provide a solid foundation for our paper’s conclusions, demonstrating the effectiveness of our methodology in achieving our intended goals.

Our analysis reveals that our approach outperforms43,44 in the segmentation of brain tumors using BraTS2020. Figures 10 and 11 show predictions from existing models, which are outperformed by our IC-Net. Figures 12 and 14 show the segmentation results of brain tumors using BraTS 2021 and BraTS 2019 data respectively. The superior results are evident in the accuracy of delineating tumor classes and individual core, whole, and enhancing tumors, showcased in columns four, five, and six of the images. Tables 6, 7, and 8 compare the DSC values on the 2020, 2021, and 2019 BraTS data for Whole Tumor (WT), Tumor Core (TC), and Enhancing Tumor (ET) with the most recent algorithms. In addition, the graphical depiction of the outcomes for WT, TC, and ET is shown in Fig. 15. The results of our IC-Net show promising visuals, presented in Fig. 13. Our model effectively defines data boundaries, improving segmentation quality and precision. It yields more accurate, visually appealing results, highlighting its practical value in real-world applications and in decision-making processes for clinical applications and medical image analysis (Fig. 16).

Conclusion

In summary, IC-Net is a novel semantic segmentation architecture that combines MA-blocks, an FCN block, attention blocks, and an encoder-decoder structure. It achieves state-of-the-art results in segmentation tasks by effectively capturing complex features and context information. Our manuscript provides a comprehensive explanation of the IC-Net framework and its constituent components, along with experimental results demonstrating its effectiveness. Many variables influence tumor segmentation results in different ways; this research suggests a multimodal BTS method based on IC-Net to make better use of multimodal brain tumor image data. Through extensive experimentation and quantitative evaluation, we validate the superiority of our IC-Net model compared to other state-of-the-art segmentation methods. Our proposed modifications have proven highly accurate and reliable in achieving BTS, even in challenging scenarios with irregular tumor shapes and variable appearances.

Future directions

We outline several paths for future research, including model refinement, domain adaptation, and real-time clinical deployment. This work outlined the development and evaluation of IC-Net, a modified neural network model for BTS; through rigorous experiments, we demonstrated that IC-Net offers superior performance compared to traditional U-Net and other existing models. Subsequent investigations will concentrate on incorporating IC-Net into clinical procedures, guaranteeing its usability and efficacy in practical contexts. The model will be designed for easy integration into current systems, enabling real-time tumor segmentation and supporting diagnosis and treatment decisions. Performance evaluations will take place in actual clinical settings, and input from medical professionals will be vital to improving the model. IC-Net will also be extended to other medical imaging tasks, which could enhance patient outcomes and diagnostic precision. To improve IC-Net’s usefulness in medical imaging, future studies should examine how it can be integrated into clinical workflows, evaluate how well it works in practical situations, and extend its use to additional medical imaging modalities such as PET, CT, and ultrasonography. Our research contributes to the ongoing efforts to enhance medical image analysis, aiming to improve patient care and treatment outcomes in neuro-oncology.