PolySeg Plus: Polyp Segmentation Using Deep Learning with Cost Effective Active Learning

Saad, Abdelrahman I.; Maghraby, Fahima A.; Badawy, Osama

doi:10.1007/s44196-023-00330-6

Research Article
Open access
Published: 14 September 2023

Volume 16, article number 148, (2023)

International Journal of Computational Intelligence Systems

Abstract

A deep convolutional neural network image segmentation model based on a cost-effective active learning mechanism is proposed and named PolySeg Plus. It is intended to address two problems in polyp segmentation: the lack of labeled data and the high false-positive rate of polyp discovery. In addition to applying active learning, which assisted in labeling more image samples, a comprehensive polyp dataset formed of five benchmark datasets was generated to increase the number of images. To enhance the captured image features, the locally shared feature method is used, which combines neighboring features to improve the quality of image features and overcome the drawbacks of the Conditional Random Fields method. Medical image segmentation was performed using ResUNet++, ResUNet, UNet++, and UNet models. Gaussian noise was removed from the images using a Gaussian filter, and the images were then augmented before being fed into the models. Hyperparameters were tuned using grid search to select the parameters that maximize model performance. The results demonstrated a significant improvement and the applicability of the proposed method in polyp segmentation when compared to state-of-the-art methods on the datasets CVC-ClinicDB, CVC-ColonDB, ETIS Larib Polyp DB, KVASIR-SEG, and Kvasir-Sessile, with Dice coefficients of 0.9558, 0.8947, 0.7547, 0.9476, and 0.6023, respectively. Not only did the suggested method improve the Dice coefficients on the individual datasets, but it also produced better results on the comprehensive dataset, which will contribute to the development of computer-aided diagnosis systems.

1 Introduction

Colorectal cancer is recognized as a dangerous disease causing deaths worldwide, with nearly two million new cases and one million cancer deaths in the last 2 years [1]. As with any type of cancer, healthy cells in the human body can turn into harmful cells in the form of lesions [2]. Colorectal cancer commonly arises from polyps of the colon or rectal epithelium, which are non-cancerous neoplasms. Some polyps can develop into precancerous lesions, which can lead to colorectal cancer. Detecting and removing adenomas early (early screening) reduces the severity of colorectal cancer. In the USA, colorectal cancer is the third most common cancer among men and women and the second leading cause of cancer death for both genders [3].

Colorectal cancer, which is also called bowel cancer, has several risk factors that are recognized by the American Cancer Society. The most important risk factors are lifestyle-related and changeable, such as being overweight or obese, not being physically active, certain types of diet, and alcohol consumption. In contrast, other factors cannot be changed, such as age [4], a personal or family history of polyps, cancer, or inflammatory bowel disease [5], and inherited syndromes.

The Adenoma Detection Rate (ADR) measures the frequency with which a practitioner detects precancerous adenomas. Each 1% increase in ADR is associated with a 3% reduction in the risk of colorectal cancer [6, 7]. This rate is thought to be influenced by two aspects: blind spots and human error. The first aspect could be addressed using a wide-angle scope, while the second is more challenging, and researchers are very interested in using artificial intelligence to reduce human error.

A medical endoscopy decision support system follows a standard procedure. The first step is often to prepare the tissue region to be studied. After an image has been acquired, preprocessing may be required to improve the quality of degraded images. Depending on the application's goal [8], the appropriate features must then be located and extracted to detect polyps or cancer. Some methods, like classification, are intended for Content-Based Image Retrieval (CBIR) [9] or Content-Based Video Retrieval (CBVR). The primary distinction between automated decision support systems and CBIR/CBVR systems is that the output of an automated decision support system [10] can be a suggestion for the final diagnosis phase or additional information for a diagnosis.

Medical image segmentation is the process of extracting Regions of Interest (ROIs) from 3D image data such as Magnetic Resonance Imaging (MRI) or Computed Tomography (CT) scans [11]. The primary goal of the segmentation task is to highlight the areas of the anatomy needed for a specific study. Segmenting images is time-consuming, but recent advances in Artificial Intelligence (AI) tools aim to make such repetitive tasks faster and more efficient.

The problem addressed here is the detection and removal of precancerous adenomas, which significantly reduces the severity of colorectal cancer. Factors such as lifestyle risks and genetic syndromes contribute to the development of the disease. Therefore, more efficient and accurate detection methods are needed to reduce human error and eliminate screening blind spots. A medical endoscopy decision support system using AI tools and image segmentation techniques may improve adenoma detection rates and reduce the impact of colorectal cancer.

The novelty and main work of this paper are as follows:

1. Reducing the high false-positive rates of polyp discovery in SOTA algorithms.
2. Enhancing and improving the image quality in the preprocessing phase using Gaussian filters.
3. Addressing the shortage of labeled data (normal images without polyps) by applying a cost-effective active learning technique.
4. Creating a comprehensive polyp dataset by combining six different datasets.
5. Applying the Locally Shared Features technique and integrating it with deep learning models to improve their performance and reduce computational time.
6. Hyperparameter tuning using grid search to enhance the performance of the models.
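
The grid search in item 6 can be illustrated with a minimal sketch. The search space (optimizer names, batch sizes, filter sizes) and the `toy_evaluate` function are hypothetical placeholders chosen for illustration, not the settings used in this study; in practice the evaluation function would train a model and return a validation score such as the Dice coefficient.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and return the best one.

    param_grid: dict mapping a parameter name to a list of candidate values.
    evaluate:   callable taking a dict of parameters and returning a score
                (higher is better), e.g. a validation Dice coefficient.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space, for illustration only.
PARAM_GRID = {
    "optimizer": ["adam", "sgd"],
    "batch_size": [8, 16],
    "filter_size": [3, 5],
}


def toy_evaluate(params):
    """Stand-in for 'train the model and measure validation Dice'."""
    score = 0.80
    score += 0.05 if params["optimizer"] == "adam" else 0.0
    score += 0.02 if params["batch_size"] == 16 else 0.0
    score += 0.01 if params["filter_size"] == 3 else 0.0
    return score


best_params, best_score = grid_search(PARAM_GRID, toy_evaluate)
```

The cost of an exhaustive grid grows with the product of the list lengths (here 2 x 2 x 2 = 8 evaluations), which is why the candidate lists are usually kept short.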

The proposed study aims to develop an automated system to help gastroenterologists segment polyps of various sizes and decide whether to remove or leave the polyp after examination. A Gaussian filter is used in the preprocessing stage to improve the image quality. We combine six different datasets to create a comprehensive polyp dataset and use active learning techniques to address the lack of normal labeled data (images without polyps). Grid search hyperparameter tuning is performed to select the best parameters and optimize the model. The ultimate goal is to improve the accuracy and efficiency of polyp segmentation, giving gastroenterologists better information to make informed polypectomy decisions.
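
The active learning step can be sketched as follows, assuming a common cost-effective active learning scheme in which the least certain unlabeled images are sent to an annotator and the most certain ones keep the model's own prediction as a pseudo-label. The entropy criterion, the threshold value, and all names below are assumptions made for illustration; the selection rule used in this study is not specified at this point in the text.

```python
import math


def mean_binary_entropy(probs):
    """Average per-pixel entropy (in bits) of foreground probabilities."""
    total = 0.0
    for p in probs:
        for q in (p, 1.0 - p):
            if q > 0.0:
                total -= q * math.log2(q)
    return total / len(probs)


def split_unlabeled(predictions, k_query, pseudo_threshold):
    """Split unlabeled images into 'ask an expert' and 'pseudo-label' sets.

    predictions:      dict image_id -> flat list of foreground probabilities.
    k_query:          number of most uncertain images sent for manual labeling.
    pseudo_threshold: images whose entropy is below this keep the model's mask.
    """
    entropy = {i: mean_binary_entropy(p) for i, p in predictions.items()}
    ranked = sorted(entropy, key=entropy.get, reverse=True)
    to_annotate = ranked[:k_query]
    pseudo_labeled = [i for i in ranked[k_query:] if entropy[i] < pseudo_threshold]
    return to_annotate, pseudo_labeled


# Hypothetical model outputs for three unlabeled images.
preds = {
    "img_a": [0.5, 0.5, 0.6, 0.4],      # very uncertain
    "img_b": [0.99, 0.01, 0.98, 0.02],  # confident
    "img_c": [0.7, 0.3, 0.8, 0.2],      # in between
}
to_annotate, pseudo_labeled = split_unlabeled(preds, k_query=1, pseudo_threshold=0.2)
```

Only the images in `to_annotate` cost expert time, which is where the cost saving of such a scheme comes from; images that are neither queried nor confident enough simply stay in the unlabeled pool for the next iteration.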

The rest of the paper is organized as follows: Sect. 2 gives a brief introduction to medical image segmentation techniques. Section 3 describes the related work done in polyp segmentation. Section 4 illustrates the datasets and various methods, while Sect. 5 shows the different variations of the proposed model architecture used in this study. Section 6 presents the results and experiments of the proposed model and state-of-the-art (SOTA) models, in addition to ablation studies. Section 7 discusses the experiment results in detail compared to previous work. Section 8 discusses the hypothesis and the limitations of the proposed model. Finally, Sect. 9 concludes the study.

2 Background

The purpose of this section is to set the context and provide the foundation for understanding medical image segmentation, computer-aided diagnosis, and their relation to the deep learning field, and to introduce the important terms used in the existing literature.

2.1 Medical Image Segmentation

One of the key benefits of medical image segmentation is that it facilitates much more specific anatomical analysis of data by separating only the areas that are required [12]. Segmentation works with CT, MRI, and other types of scans by producing a mask from the background image data. Depending on the task, users can work on their scans in two or three dimensions. Colorectal polyp segmentation is a difficult task because of variations in polyp shape and color intensity in colonoscopic frames [13]. Researchers divide polyp segmentation methods into three main groups. The first is image processing-based segmentation, which does not employ any learning methods. The second extracts features first and then segments them using classifiers, as shown in Fig. 1, where the left side is the raw image, which is the input to the model, and the right side is the output, which is the segmented image or ground truth (mask). The third groups together approaches that perform segmentation using convolutional neural networks.

2.2 Computer-Aided Diagnosis (CAD)

Computer-aided diagnosis is a computer-based framework that assists physicians in making quick decisions in the field of medical imaging [14]. Medical imaging is concerned with information in images that medical practitioners, such as gastroenterologists, must assess and analyze in a short span of time, for example discovering polyps and deciding whether to leave or resect them. Image evaluation is an important task in the medical sector because imaging is a basic method for identifying a disease in its early phases. However, image acquisition should not endanger the human body, for example during endoscopic operations, X-ray, and MRI scans [15]. Images taken with high energy intensity provide superior quality but endanger the body; images are therefore captured with much less energy and, in turn, have poor quality and low contrast, which makes them a valuable area for researchers to investigate.

2.3 Deep Learning

Deep learning and machine learning are the foundations of any CAD or medical decision support system. Deep learning combines low-level features into higher-level abstract representations, which are then used to classify objects. Deep learning methodology grew out of studies and experiments on artificial neural networks. The most used deep learning models for processing and analyzing images are convolutional neural networks (CNNs) or deep convolutional neural networks (DCNNs). Recurrent neural network (RNN) models are also widely used alongside CNNs, with different network frameworks such as long short-term memory (LSTM) networks, a form of recurrent neural network that is good at learning order dependence in sequence prediction [16].

3 Literature Review

The purpose of this section is to highlight the progress made, identify current problems, and pave the way for new approaches to the accurate segmentation and localization of polyps by conducting a comprehensive review of earlier works.

In 2020, Mandal et al. [17] developed a reliable and effective method for segmenting polyp regions. In their research, fuzzy clustering was used to separate polyp areas from healthy areas in colonoscopic image frames by producing a distinctive threshold level from the V channel of the hue, saturation, and value (HSV) color space. Clustering methods can be hard or soft. Hard clustering divides the data into distinct clusters, with each data object assigned to exactly one of them. Soft (fuzzy) clustering, on the other hand, assigns each data object to one or more clusters, with membership levels assigned during the process. Their model achieved an accuracy of 98.80%, compared with 77.77%, 94.87%, and 97.14% for three other studies by Hwang et al., Alexandre et al., and Kodogiannis et al., respectively.
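
The difference between hard and soft clustering can be made concrete with a small sketch. It uses the standard fuzzy c-means membership formula on a single intensity value; the cluster centers are hypothetical V-channel values, and this is not a reconstruction of the exact procedure of Mandal et al.

```python
def fuzzy_memberships(value, centers, m=2.0):
    """Fuzzy c-means membership of one intensity value in each cluster.

    Uses the standard update u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)),
    where d_i is the distance to center i and m > 1 is the fuzzifier.
    """
    distances = [abs(value - c) for c in centers]
    # A value sitting exactly on a center belongs fully to that cluster.
    if 0.0 in distances:
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


def hard_assignment(value, centers):
    """Hard clustering: index of the single nearest center."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


# Hypothetical V-channel cluster centers in [0, 1]: background vs. polyp.
centers = [0.2, 0.8]
soft = fuzzy_memberships(0.6, centers)  # approximately [0.2, 0.8]
hard = hard_assignment(0.6, centers)    # index 1 (the nearer center)
```

The soft memberships always sum to one, so a pixel lying between two clusters keeps partial membership in both instead of being forced into the nearer one.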

In 2021, Jha et al. [18] conducted a thorough study of colorectal polyp segmentation. They used ResUNet++, ResUNet, and UNet as the major models. These models were tested on six datasets, namely CVC-ClinicDB, CVC-ColonDB, ETIS Larib Polyp DB, Kvasir-SEG, ASU-Mayo Clinic Colonoscopy Video Database, and CVC-VideoClinicDB, with a total of 33,119 images. Data augmentation was applied to increase the number of polyp images, and the complexity was reduced by resizing the images to 256 $\times$ 256. They improved the results of the experimental model by applying test-time augmentation and using a Conditional Random Field (CRF) as a post-processing technique. After testing the proposed model on different datasets, they concluded that ResUNet++ is better at segmenting all types of polyps (large, small, and regular), especially smaller and sessile polyps. In addition, combining ResUNet++ with the CRF improved precision and recall.
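
Test-time augmentation can be sketched in a few lines: the model predicts on several transformed copies of the input, each prediction is mapped back to the original orientation, and the results are averaged. The choice of flips and the `toy_predict` stand-in model are assumptions for illustration; the transforms used by Jha et al. are not listed here.

```python
def identity(image):
    """Return the image unchanged."""
    return image


def hflip(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def vflip(image):
    """Mirror a 2D image top to bottom."""
    return image[::-1]


def predict_with_tta(predict, image):
    """Average a model's probability maps over flipped copies of the input.

    predict: callable mapping a 2D image to a 2D probability map of the
             same shape (a stand-in for a trained segmentation network).
    Each flip is its own inverse, so the same function maps a prediction
    back to the original orientation before averaging.
    """
    transforms = [identity, hflip, vflip]
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    for t in transforms:
        restored = t(predict(t(image)))
        for r in range(height):
            for c in range(width):
                total[r][c] += restored[r][c]
    return [[v / len(transforms) for v in row] for row in total]


def toy_predict(image):
    """Hypothetical model that is biased toward the left column it sees."""
    return [
        [min(1.0, px + (0.3 if c == 0 else 0.0)) for c, px in enumerate(row)]
        for row in image
    ]


image = [[0.2, 0.6],
         [0.4, 0.8]]
averaged = predict_with_tta(toy_predict, image)
```

Averaging over flips dilutes the position-dependent bias of the toy model, which is the intuition behind using test-time augmentation to stabilize segmentation masks.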

In 2021, Banik et al. [19] developed a fusion-based polyp segmentation network called Polyp-Net. They enhanced the CNN with a network called a Binary Tree Wavelet. The dataset used in this study is from a polyp segmentation challenge called Endoscopic Vision held in Singapore. They used 300 frames for training and 612 frames for testing. In the preprocessing phase, they addressed noise in the frames, such as blood vessels and endoluminal folds, by applying the Mumford-Shah-Euler in-painting method. Since the resulting segmented image did not delineate the region of interest accurately enough, they used Local Gradient Weighting, a type of Level-Set Method (LSM), to overcome this problem. Their proposed model outperformed the plain CNN and, in comparison with UNet and ResNet-50, achieved a precision and recall of 0.836 and 0.811, respectively.

In 2022, Qiu et al. [20] designed the Boundary Distribution Guided Network (BDG-Net) to segment polyps accurately. The research focused on enhancing segmentation by integrating multi-scale features, since polyps have various sizes and undefined boundaries. The suggested model consists of two units. The first unit generates the boundary distribution: it aggregates high-level features and produces a boundary distribution map (BDM). The second unit is the Boundary Distribution Guided Decoder (BDGD), which enhances polyp segmentation by integrating the previously generated BDM with multi-scale features. The training set contained a total of 1450 images from CVC-ClinicDB and Kvasir, while three different datasets were used for testing: CVC300, ETIS, and CVC-ColonDB. They compared their proposal with state-of-the-art algorithms such as SFA, PraNet, UNet, UNet++, ResUNet-mod, and ResUNet++. The proposed method achieved a mean Dice of 0.915, which outperformed the previously mentioned algorithms.

In 2022, Mohapatra et al. [21] proposed a segmentation architecture called U-PolySeg that concatenates features using dilated convolution. Because the images had different sizes, they were resized to 416 $\times$ 416 pixels during the preprocessing step. A comprehensible transport module was applied to remove specular reflections in the image, and the contrast of the images was enhanced using contrast limited adaptive histogram equalization. The architecture of the UNet model was modified to add more advanced blocks. Many experiments were done to select the best parameters of the proposed model to ensure its effectiveness. The dataset used was the Kvasir-SEG dataset, which has 1000 images and masks. Finally, they compared their proposed model to ColonSegNet. The proposed model achieved 0.9677, 0.9686, 0.8791, 0.9557, and 0.9229 in terms of global accuracy, Dice coefficient, intersection over union, recall, and precision, respectively.
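
The overlap metrics quoted throughout this review can be computed from a predicted and a ground-truth binary mask as in the minimal sketch below. The function name and the flat-list mask representation are illustrative choices, and returning 0.0 when a denominator is zero is one convention among several.

```python
def segmentation_metrics(pred, truth):
    """Dice, IoU, recall and precision for two flat binary masks (0/1)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    def safe_div(num, den):
        # Convention chosen here: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    return {
        "dice": safe_div(2 * tp, 2 * tp + fp + fn),
        "iou": safe_div(tp, tp + fp + fn),
        "recall": safe_div(tp, tp + fn),
        "precision": safe_div(tp, tp + fp),
    }


# Toy masks: 2 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 1, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0]
metrics = segmentation_metrics(pred, truth)
```

For this toy pair the Dice coefficient is 4/6 and the IoU is 2/4, which shows that Dice is never lower than IoU on the same pair of masks.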

In 2022, Gautam et al. [22] constructed an encoder-decoder structure and focused on multi-scale features by applying squeeze and excitation modules. They modified the skip connection by using Fusion Attention Blocks (FAB) to minimize the semantic gap between the encoder and the decoder. To enrich and extract more features, a Multi-Scale Information (MSI) block is applied, which helps in the representation of relevant features. The Kvasir-SEG dataset was used for training and testing, and due to the lack of labeled data, the data were augmented using the albumentations library. The proposed model succeeded in segmenting different sizes of polyps and achieved a Dice score of 85.15% compared to the other four models.

Table 1 Summary of literature review
