Abstract
Automatic medical image segmentation plays a pivotal role in clinical diagnosis. Over the past decades, medical image segmentation has made remarkable progress with the aid of convolutional neural networks (CNNs). However, extracting context information and disease features for dense segmentation remains challenging because of the low contrast between lesions and the background in medical images. To address this issue, we propose a novel enhanced feature fusion scheme. First, we develop a global feature enhancement module, which captures long-range global dependencies in the spatial domain and enhances global feature learning. Second, we propose a channel fusion attention module to extract multi-scale context information and alleviate the incoherence of semantic information among features at different scales. We then combine these two schemes to produce richer context information and to enhance feature contrast. In addition, we remove the decoder with progressive deconvolution operations used in classical U-shaped networks and only utilize the features of the last three layers to generate predictions. We conduct extensive experiments on three public datasets: the polyp segmentation dataset, the ISIC-2018 dataset, and the Synapse multi-organ segmentation dataset. The experimental results demonstrate the superior performance and robustness of our method in comparison with state-of-the-art methods.
Data availability
In our work, the datasets used are publicly available: the ISIC-2018 dataset (https://challenge2018.isic-archive.com/), the Synapse multi-organ segmentation dataset (https://www.synapse.org/#!Synapse:syn3193805/wiki/217789), and the polyp segmentation dataset, which comprises five datasets, i.e. ETIS [58], Kvasir-SEG [59], EndoScene [60], CVC-ColonDB [61], and CVC-ClinicDB [62], as described in Sect. 4.1.
References
Richhariya B, Tanveer M, Rashid AH, Initiative ADN et al (2020) Diagnosis of Alzheimer’s disease using Universum support vector machine based recursive feature elimination (usvm-rfe). Biomed Signal Process Control 59:101903
Tanveer M, Rashid AH, Ganaie M, Reza M, Razzak I, Hua K-L (2021) Classification of Alzheimer’s disease using ensemble of deep neural networks trained through transfer learning. IEEE J Biomed Health Inform 26(4):1453–1463
Beheshti I, Ganaie M, Paliwal V, Rastogi A, Razzak I, Tanveer M (2021) Predicting brain age using machine learning algorithms: A comprehensive evaluation. IEEE J Biomed Health Inform 26(4):1432–1440
Ning Z, Zhong S, Feng Q, Chen W, Zhang Y (2021) Smu-net: saliency-guided morphology-aware u-net for breast lesion segmentation in ultrasound image. IEEE Trans Med Imaging 41(2):476–490
Wang G, Liu X, Li C, Xu Z, Ruan J, Zhu H, Meng T, Li K, Huang N, Zhang S (2020) A noise-robust framework for automatic segmentation of covid-19 pneumonia lesions from ct images. IEEE Trans Med Imaging 39(8):2653–2663
Chu X, Yang W, Ouyang W, Ma C, Yuille AL, Wang X (2017) Multi-context attention for human pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1831–1840
Huang Z, Zhong Z, Sun L, Huo Q (2019) Mask r-cnn with pyramid attention network for scene text detection. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 764–772. IEEE
Gupta A, Agrawal D, Chauhan H, Dolz J, Pedersoli M (2018) An attention model for group-level emotion recognition. In: Proceedings of the 20th ACM International Conference on Multimodal Interaction, pp. 611–615
Liu J, Zhou W, Cui Y, Yu L, Luo T (2022) Gcnet: Grid-like context-aware network for rgb-thermal semantic segmentation. Neurocomputing 506:60–67
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-assisted Intervention, pp. 234–241. Springer
Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O (2016) 3d u-net: learning dense volumetric segmentation from sparse annotation. In: International Conference on Medical Image Computing and Computer-assisted Intervention, pp. 424–432. Springer
Zhou Z, Rahman Siddiquee MM, Tajbakhsh N, Liang J (2018) Unet++: A nested u-net architecture for medical image segmentation. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, pp 3–11
Isensee F, Jaeger PF, Kohl SA, Petersen J, Maier-Hein KH (2021) nnu-net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods 18(2):203–211
Cheng Z, Li Y, Chen H, Zhang Z, Pan P, Cheng L (2022) Dsgmffn: Deepest semantically guided multi-scale feature fusion network for automated lesion segmentation in abus images. Comput Methods Programs Biomed, 106891
Cao F, Gao C, Ye H (2022) A novel method for image segmentation: two-stage decoding network with boundary attention. Int J Mach Learn Cybernet 13(5):1461–1473
Song K, Zhao Z, Wang J, Qiang Y, Zhao J, Zia MB (2022) Segmentation-based multi-scale attention model for kras mutation prediction in rectal cancer. Int J Mach Learn Cybernet 13(5):1283–1299
Huang H, Lin L, Tong R, Hu H, Zhang Q, Iwamoto Y, Han X, Chen Y-W, Wu J (2020) Unet 3+: A full-scale connected unet for medical image segmentation. In: ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1055–1059. IEEE
Xiao X, Lian S, Luo Z, Li S (2018) Weighted res-unet for high-quality retina vessel segmentation. In: 2018 9th International Conference on Information Technology in Medicine and Education (ITME), pp. 327–331. IEEE
Li X, Chen H, Qi X, Dou Q, Fu C-W, Heng P-A (2018) H-denseunet: hybrid densely connected unet for liver and tumor segmentation from ct volumes. IEEE Trans Med Imaging 37(12):2663–2674
Li S, Liu J, Song Z (2022) Brain tumor segmentation based on region of interest-aided localization and segmentation u-net. Int J Mach Learn Cybernet, 1–11
Wang X, Girshick R, Gupta A, He K (2018) Non-local neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7794–7803
Huang Z, Wang X, Huang L, Huang C, Wei Y, Liu W (2019) Ccnet: Criss-cross attention for semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 603–612
Hou Q, Zhou D, Feng J (2021) Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13713–13722
Sinha A, Dolz J (2020) Multi-scale self-guided attention for medical image segmentation. IEEE J Biomed Health Inform 25(1):121–130
Fan D-P, Ji G-P, Zhou T, Chen G, Fu H, Shen J, Shao L (2020) Pranet: Parallel reverse attention network for polyp segmentation. International Conference on Medical Image Computing and Computer-assisted Intervention. Springer, pp 263–273
Yao C, Tang J, Hu M, Wu Y, Guo W, Li Q, Zhang X-P (2021) Claw u-net: a unet variant network with deep feature concatenation for scleral blood vessel segmentation. In: Artificial Intelligence: First CAAI International Conference, CICAI 2021, Hangzhou, China, June 5–6, 2021, Proceedings, Part II 1, pp. 67–78. Springer
Gu Z, Cheng J, Fu H, Zhou K, Hao H, Zhao Y, Zhang T, Gao S, Liu J (2019) Ce-net: Context encoder network for 2d medical image segmentation. IEEE Trans Med Imaging 38(10):2281–2292
Feng S, Zhao H, Shi F, Cheng X, Wang M, Ma Y, Xiang D, Zhu W, Chen X (2020) Cpfnet: Context pyramid fusion network for medical image segmentation. IEEE Trans Med Imaging 39(10):3008–3018
Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890
Ni J, Wu J, Tong J, Chen Z, Zhao J (2020) Gc-net: Global context network for medical image segmentation. Comput Methods Programs Biomed 190:105121
Chen J, Lu Y, Yu Q, Luo X, Adeli E, Wang Y, Lu L, Yuille AL, Zhou Y (2021) Transunet: Transformers make strong encoders for medical image segmentation. arXiv preprint arXiv:2102.04306
Wang W, Chen C, Ding M, Yu H, Zha S, Li J (2021) Transbts: Multimodal brain tumor segmentation using transformer. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 109–119. Springer
Gao Y, Zhou M, Metaxas DN (2021) Utnet: a hybrid transformer architecture for medical image segmentation. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part III 24, pp. 61–71. Springer
Wang H, Cao P, Wang J, Zaiane OR (2022) Uctransnet: rethinking the skip connections in u-net from a channel-wise perspective with transformer. Proc. AAAI Conf Artif Intell 36:2441–2449
Wang J, Wei L, Wang L, Zhou Q, Zhu L, Qin J (2021) Boundary-aware transformers for skin lesion segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 206–216. Springer
Ji Y, Zhang R, Wang H, Li Z, Wu L, Zhang S, Luo P (2021) Multi-compound transformer for accurate biomedical image segmentation. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part I 24, pp. 326–336. Springer
Cao H, Wang Y, Chen J, Jiang D, Zhang X, Tian Q, Wang M (2021) Swin-unet: Unet-like pure transformer for medical image segmentation. arXiv preprint arXiv:2105.05537
Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022
Lin A, Chen B, Xu J, Zhang Z, Lu G (2021) Ds-transunet: Dual swin transformer u-net for medical image segmentation. arXiv preprint arXiv:2106.06716
Huang X, Deng Z, Li D, Yuan X (2021) Missformer: An effective medical image segmentation transformer. arXiv preprint arXiv:2109.07162
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440
Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2017) Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848
Milletari F, Navab N, Ahmadi S-A (2016) V-net: Fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. IEEE
Valanarasu JMJ, Sindagi VA, Hacihaliloglu I, Patel VM (2020) Kiu-net: Towards accurate segmentation of biomedical images using over-complete representations. In: International Conference on Medical Image Computing and Computer-assisted Intervention, pp. 363–373. Springer
Jha D, Riegler MA, Johansen D, Halvorsen P, Johansen HD (2020) Doubleu-net: A deep convolutional neural network for medical image segmentation. In: 2020 IEEE 33rd International Symposium on Computer-based Medical Systems (CBMS), pp. 558–564. IEEE
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, et al. (2020) An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929
Zheng S, Lu J, Zhao H, Zhu X, Luo Z, Wang Y, Fu Y, Feng J, Xiang T, Torr PH, et al. (2021) Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6881–6890
Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S (2020) End-to-end object detection with transformers. In: European Conference on Computer Vision, pp. 213–229. Springer
Touvron H, Cord M, Douze M, Massa F, Sablayrolles A, Jégou H (2021) Training data-efficient image transformers & distillation through attention. In: International Conference on Machine Learning, pp. 10347–10357. PMLR
Wu H, Chen S, Chen G, Wang W, Lei B, Wen Z (2022) Fat-net: feature adaptive transformers for automated skin lesion segmentation. Med Image Anal 76:102327
Xue Y, Xu T, Zhang H, Long LR, Huang X (2018) Segan: adversarial network with multi-scale l1 loss for medical image segmentation. Neuroinformatics 16(3):383–392
Wang R, Chen S, Ji C, Fan J, Li Y (2022) Boundary-aware context neural network for medical image segmentation. Med Image Anal 78:102395
Huang C-H, Wu H-Y, Lin Y-L (2021) Hardnet-mseg: a simple encoder-decoder polyp segmentation neural network that achieves over 0.9 mean dice and 86 fps. arXiv preprint arXiv:2101.07172
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778
Woo S, Park J, Lee J-Y, Kweon IS (2018) Cbam: Convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19
Chen C-FR, Fan Q, Panda R (2021) Crossvit: Cross-attention multi-scale vision transformer for image classification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 357–366
Vázquez D, Bernal J, Sánchez FJ, Fernández-Esparrach G, López AM, Romero A, Drozdzal M, Courville A (2017) A benchmark for endoluminal scene segmentation of colonoscopy images. J Healthc Eng 2017:1–9. https://doi.org/10.1155/2017/4037190
Jha D, Smedsrud PH, Riegler MA, Halvorsen P, Lange Td, Johansen D, Johansen HD (2020) Kvasir-seg: A segmented polyp dataset. In: International Conference on Multimedia Modeling, pp. 451–462. Springer
Tajbakhsh N, Gurudu SR, Liang J (2015) Automated polyp detection in colonoscopy videos using shape and context information. IEEE Trans Med Imaging 35(2):630–644
Bernal J, Sánchez FJ, Fernández-Esparrach G, Gil D, Rodríguez C, Vilariño F (2015) Wm-dova maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians. Comput Med Imaging Graph 43:99–111
Silva J, Histace A, Romain O, Dray X, Granado B (2014) Toward embedded detection of polyps in wce images for early diagnosis of colorectal cancer. Int J Comput Assist Radiol Surg 9(2):283–293
Margolin R, Zelnik-Manor L, Tal A (2014) How to evaluate foreground maps? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255
Fan D-P, Cheng M-M, Liu Y, Li T, Borji A (2017) Structure-measure: A new way to evaluate foreground maps. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4548–4557
Fan D-P, Gong C, Cao Y, Ren B, Cheng M-M, Borji A (2018) Enhanced-alignment measure for binary foreground map evaluation. arXiv preprint arXiv:1805.10421
Fu S, Lu Y, Wang Y, Zhou Y, Shen W, Fishman E, Yuille A (2020) Domain adaptive relational reasoning for 3d multi-organ segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 656–666. Springer
Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B, et al. (2018) Attention u-net: Learning where to look for the pancreas. arXiv preprint arXiv:1804.03999
Wang H, Xie S, Lin L, Iwamoto Y, Han X-H, Chen Y-W, Tong R (2022) Mixed transformer u-net for medical image segmentation. In: ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2390–2394. IEEE
Jha D, Smedsrud PH, Riegler MA, Johansen D, De Lange T, Halvorsen P, Johansen HD (2019) Resunet++: An advanced architecture for medical image segmentation. In: 2019 IEEE International Symposium on Multimedia (ISM), pp. 225–2255. IEEE
Badrinarayanan V, Kendall A, Cipolla R (2017) Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39(12):2481–2495
Jha D, Ali S, Tomar NK, Johansen HD, Johansen D, Rittscher J, Riegler MA, Halvorsen P (2021) Real-time polyp detection, localization and segmentation in colonoscopy using deep learning. IEEE Access 9:40496–40510
Zhang Z, Liu Q, Wang Y (2018) Road extraction by deep residual u-net. IEEE Geosci Remote Sens Lett 15(5):749–753
Fang Y, Chen C, Yuan Y, Tong K-y (2019) Selective feature aggregation network with area-boundary constraints for polyp segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 302–310. Springer
Acknowledgements
This work was supported in part by the Provincial Natural Science Foundation of Anhui under Grant 1908085MF217, the Natural Science Research Project of Anhui Provincial Education Department under Grant KJ2019A0022918005 and the National Natural Science Foundation of China under Grant 62276146.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Ablation studies on the ISIC2018 dataset: To further validate the effectiveness of our proposed modules, we conduct ablation experiments on the ISIC2018 dataset. As shown in Table 9, removing any component of the proposed module results in a degradation of performance.
Ablation on input resolution on the ISIC2018 dataset: To further evaluate the effectiveness of the proposed model at different resolutions, we carry out ablation studies with two input resolutions, \({512 \times 512}\) and \({256 \times 256}\), on the ISIC2018 dataset. Dermoscopic images are relatively clear RGB images, so high- and low-resolution inputs have little impact on the experimental results. The detailed experimental data are shown in Table 10.
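The two resolution settings amount to resizing the dermoscopic images before they enter the network. A minimal PyTorch sketch of preparing both input sizes from the same batch (bilinear interpolation is an assumption, as the paper does not state which resizing method was used):

```python
import torch
import torch.nn.functional as F

# Resize a batch of RGB dermoscopic images to the two resolutions
# used in the ablation. The interpolation mode is an assumption.
def make_inputs(batch: torch.Tensor):
    x512 = F.interpolate(batch, size=(512, 512), mode="bilinear", align_corners=False)
    x256 = F.interpolate(batch, size=(256, 256), mode="bilinear", align_corners=False)
    return x512, x256

images = torch.randn(4, 3, 600, 450)  # dummy batch at a native camera resolution
x512, x256 = make_inputs(images)      # both feed the same model in turn
```

Both tensors are then passed through the same architecture in separate training runs, so the comparison isolates the effect of input resolution alone.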
Ablation of the pre-trained model: To explore the effectiveness of pre-training in the proposed method, we conduct two comparative experiments: the ResNet-50 encoder with and without pre-trained weights. Figure 11 shows the Dice and loss curves during training. The results show that the network with pre-trained weights is easier to optimize and converges faster, which may be attributed to the ability of the pre-trained model to capture useful information quickly and efficiently.
Notation table: The variables and notation used in the paper are shown in Table 11.
Research on the influence of long-range dependencies on low-contrast image segmentation: Images with low contrast between target and background require full use of context information to model the relationship between them, so that the network can identify faint pixel-level differences between target and background and achieve more accurate segmentation. In the proposed method, the GFE module is designed to model long-range dependencies. To verify their influence on low-contrast images, we display the heatmaps of the model with and without the GFE module. As can be seen from Fig. 12, the model with the GFE module performs significantly better on low-contrast images than the model without it.
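The long-range modeling described above can be illustrated with a non-local-style attention block, in which every spatial position attends to every other position. This is a hedged sketch in the spirit of the GFE module; the paper's exact GFE design may differ, and all names and channel sizes here are illustrative:

```python
import torch
import torch.nn as nn

# Non-local-style spatial attention: a sketch of how long-range
# dependencies can be modeled, not the paper's exact GFE module.
class GlobalFeatureEnhance(nn.Module):
    def __init__(self, channels: int, reduced: int):
        super().__init__()
        self.query = nn.Conv2d(channels, reduced, kernel_size=1)
        self.key = nn.Conv2d(channels, reduced, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (B, HW, C')
        k = self.key(x).flatten(2)                    # (B, C', HW)
        v = self.value(x).flatten(2).transpose(1, 2)  # (B, HW, C)
        # Every position attends to every other position, so a
        # low-contrast pixel can borrow context from far away.
        attn = torch.softmax(q @ k / (q.shape[-1] ** 0.5), dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + out  # residual connection preserves local features

block = GlobalFeatureEnhance(channels=64, reduced=16)
y = block(torch.randn(2, 64, 16, 16))  # output keeps the input shape
```

The residual connection matters here: the attention branch injects global context while the identity path keeps the local features that dense segmentation still depends on.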
Detailed neural architecture: To enhance reproducibility, we present the detailed configuration of the model. Figure 13 shows the configuration of the pipeline architecture and Fig. 14 shows the configuration of the GFE module.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bao, H., Li, Q. & Zhu, Y. Segmentation-based context-aware enhancement network for medical images. Int. J. Mach. Learn. & Cyber. 15, 963–983 (2024). https://doi.org/10.1007/s13042-023-01950-2