
Polyp Segmentation Using a Hybrid Vision Transformer and a Hybrid Loss Function

Published in: Journal of Imaging Informatics in Medicine

Abstract

Accurate early detection of precursor adenomatous polyps and their removal at an early stage can significantly decrease mortality and disease incidence, since most colorectal cancers evolve from adenomatous polyps. However, accurate detection and segmentation of polyps by physicians are difficult, mainly due to the following factors: (i) the quality of colonoscopic screening depends on imaging quality and the experience of the physician; (ii) visual inspection is time-consuming, burdensome, and tiring; (iii) prolonged visual inspection can lead to polyps being missed even when the physician is experienced. Computer-aided methods have been proposed to overcome these problems, but they still have disadvantages or limitations. Therefore, in this work, a new architecture based on residual transformer layers has been designed and applied to polyp segmentation. The proposed segmentation network exploits both high-level semantic features and low-level spatial features. In addition, a novel hybrid loss function has been proposed. The loss function, composed of focal Tversky loss, binary cross-entropy, and the Jaccard index, reduces image-wise and pixel-wise differences and improves regional consistency. Experimental results indicate the effectiveness of the proposed approach in terms of Dice similarity (0.9048), recall (0.9041), precision (0.9057), and F2 score (0.8993). Comparisons with state-of-the-art methods show its superior performance.
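To illustrate how a loss of this kind can be assembled, the sketch below is a minimal PyTorch example combining a focal Tversky term, pixel-wise binary cross-entropy, and a soft Jaccard (IoU) term for a sigmoid-activated binary segmentation output. The weighting coefficients and the Tversky/focal parameters (alpha, beta, gamma, w_ftl, w_bce, w_jac) are illustrative assumptions, not the values or the exact formulation used in the paper.

# Minimal sketch of a hybrid segmentation loss of the kind described in the
# abstract: focal Tversky + binary cross-entropy + soft Jaccard (IoU).
# All parameter values below are illustrative assumptions.
import torch
import torch.nn.functional as F


def hybrid_loss(logits, targets, alpha=0.7, beta=0.3, gamma=0.75,
                w_ftl=1.0, w_bce=1.0, w_jac=1.0, eps=1e-6):
    """logits: raw network outputs (N, 1, H, W); targets: binary masks of the same shape."""
    probs = torch.sigmoid(logits)
    p = probs.flatten(1)            # (N, H*W) predicted foreground probabilities
    t = targets.flatten(1)          # (N, H*W) ground-truth labels in {0, 1}

    # Tversky index: alpha and beta weight false negatives and false positives separately.
    tp = (p * t).sum(dim=1)
    fp = (p * (1 - t)).sum(dim=1)
    fn = ((1 - p) * t).sum(dim=1)
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)

    # Focal Tversky loss: the exponent gamma shifts emphasis toward harder examples.
    ftl = ((1 - tversky) ** gamma).mean()

    # Pixel-wise binary cross-entropy computed on raw logits (numerically stable form).
    bce = F.binary_cross_entropy_with_logits(logits, targets)

    # Soft Jaccard (IoU) loss for region-level consistency.
    inter = tp
    union = p.sum(dim=1) + t.sum(dim=1) - inter
    jaccard = 1.0 - ((inter + eps) / (union + eps)).mean()

    return w_ftl * ftl + w_bce * bce + w_jac * jaccard


if __name__ == "__main__":
    logits = torch.randn(2, 1, 128, 128)
    masks = (torch.rand(2, 1, 128, 128) > 0.5).float()
    print(hybrid_loss(logits, masks).item())

In such a composition, the cross-entropy term penalizes pixel-wise errors, the Jaccard term enforces region-level overlap, and the focal Tversky term lets training emphasize hard or under-segmented polyps, which matches the abstract's stated goal of reducing image-wise and pixel-wise differences while improving regional consistency.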


Data Availability

Data are available on request.


Author information

Contributions

This is a single-authored manuscript.

Corresponding author

Correspondence to Evgin Goceri.

Ethics declarations

Ethics Approval

Ethical approval was not needed since publicly available online data were used in this study.

Competing Interests

The author declares no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Goceri, E. Polyp Segmentation Using a Hybrid Vision Transformer and a Hybrid Loss Function. Journal of Imaging Informatics in Medicine 37, 851–863 (2024). https://doi.org/10.1007/s10278-023-00954-2
