ColonGen: an efficient polyp segmentation system for generalization improvement using a new comprehensive dataset

Mozaffari, Javad; Amirkhani, Abdollah; Shokouhi, Shahriar B.

doi:10.1007/s13246-023-01368-8

Scientific Paper
Published: 15 January 2024

Volume 47, pages 309–325, (2024)

Physical and Engineering Sciences in Medicine

Javad Mozaffari¹,
Abdollah Amirkhani² (ORCID: 0000-0001-6891-4528) &
Shahriar B. Shokouhi¹


Abstract

Colorectal cancer (CRC) is one of the most common causes of cancer-related death. Polyp detection is important for diagnosing CRC, yet high polyp miss rates have been reported during colonoscopy. Most deep learning methods extract image features with convolutional neural networks (CNNs). In recent years, vision transformer (ViT) models have also been applied to image processing and have proved successful in image segmentation. Image processing can be improved by combining transformer models, which can extract spatial location information, with CNNs, which aggregate local information. Even so, recent research has had limited success in increasing data diversity and generalization accuracy. This paper investigates the generalization ability of transformer-based polyp image segmentation and proposes a novel approach that uses two different ViT architectures. Using two architectures lets the model learn representations from different perspectives, which are then combined into a richer feature representation. In addition, a more universal and comprehensive dataset has been derived from the datasets presented in related research and can be used to improve generalization. We first evaluated the generalization of the proposed model in three distinct training-testing scenarios. The experimental results show that ColonGen-V1 outperforms other state-of-the-art methods in all scenarios. We then used the comprehensive dataset to improve the model's performance on in-domain and out-of-domain data. The results show that ColonGen-V2 outperforms state-of-the-art studies by 5.1%, 1.3%, and 1.1% on the ETIS-Larib, Kvasir-Seg, and CVC-ColonDB datasets, respectively. The comprehensive dataset and the model introduced in this paper are publicly available at https://github.com/javadmozaffari/Polyp_segmentation.
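
The idea of combining representations from two encoders can be illustrated with a minimal NumPy sketch. It is not the ColonGen architecture: the abstract does not specify how the two ViT feature maps are merged, so the fusion used here (nearest-neighbour resizing, channel concatenation, and a 1x1 linear projection) is an assumption, and the names `resize_nearest` and `fuse_features`, the tensor shapes, and the random weights are all hypothetical.

```python
import numpy as np


def resize_nearest(feat, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) feature map to (C, out_h, out_w)."""
    _, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return feat[:, rows[:, None], cols[None, :]]


def fuse_features(feat_a, feat_b, weight):
    """Fuse two encoder feature maps (assumed scheme, not the paper's).

    feat_a: (Ca, H, W) features from the first encoder.
    feat_b: (Cb, h, w) features from the second encoder, any resolution.
    weight: (Cout, Ca + Cb) matrix acting as a 1x1 convolution.
    """
    _, h, w = feat_a.shape
    feat_b = resize_nearest(feat_b, h, w)               # align spatial size
    stacked = np.concatenate([feat_a, feat_b], axis=0)  # (Ca + Cb, H, W)
    return np.einsum("oc,chw->ohw", weight, stacked)    # mix channels per pixel


rng = np.random.default_rng(0)
feat_a = rng.standard_normal((64, 44, 44))  # hypothetical output of encoder A
feat_b = rng.standard_normal((96, 22, 22))  # hypothetical output of encoder B
weight = rng.standard_normal((32, 64 + 96)) * 0.01

fused = fuse_features(feat_a, feat_b, weight)
print(fused.shape)
```

In a trained model the projection weights would be learned and the fused map would feed a segmentation decoder; concatenation is only one option, and element-wise addition or attention-based fusion are common alternatives.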


Funding

The authors have not disclosed any funding.

Author information

Authors and Affiliations

School of Electrical Engineering, Iran University of Science and Technology, Tehran, 16846-13114, Iran
Javad Mozaffari & Shahriar B. Shokouhi
School of Automotive Engineering, Iran University of Science and Technology, Tehran, 16846-13114, Iran
Abdollah Amirkhani

Corresponding author

Correspondence to Abdollah Amirkhani.

Ethics declarations

Conflict of interest

The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Mozaffari, J., Amirkhani, A. & Shokouhi, S.B. ColonGen: an efficient polyp segmentation system for generalization improvement using a new comprehensive dataset. Phys Eng Sci Med 47, 309–325 (2024). https://doi.org/10.1007/s13246-023-01368-8


Received: 09 June 2023
Accepted: 06 December 2023
Published: 15 January 2024
Issue Date: March 2024
DOI: https://doi.org/10.1007/s13246-023-01368-8
