Abstract
Precise delineation of the esophageal gross tumor volume (GTV) on medical images can improve the effectiveness of radiotherapy for esophageal cancer. This work explores learning-based methods for the challenging problem of automatic esophageal GTV segmentation. Using a progressive hierarchical reasoning mechanism (PHRM), we devised a simple yet effective two-stage deep framework, ConVMLP-ResU-Net. In this framework, the front-end ConVMLP integrates convolution (ConV) and multi-layer perceptrons (MLP) to capture both local and long-range spatial information, which makes it well suited to predicting the location and coarse shape of the esophageal GTV. Under the PHRM, the front-end ConVMLP must generalize well so that the back-end ResU-Net reasons from correct and valid input. We therefore propose a condition control training algorithm that governs the training process of ConVMLP to obtain a robust front end. The back-end ResU-Net then uses the mask produced by ConVMLP to perform a finer, expansive segmentation and output the final result. Extensive experiments were carried out on a clinical cohort comprising 1138 pairs of 18F-FDG positron emission tomography/computed tomography (PET/CT) images. The method achieved a Dice similarity coefficient, Hausdorff distance, and mean surface distance of 0.82 ± 0.13, 4.31 ± 7.91 mm, and 1.42 ± 3.69 mm, respectively. The predicted contours agree well visually with the ground truth. ConVMLP locates the esophageal GTV accurately with a correct initial shape prediction and thereby facilitates the finer segmentation by the back-end ResU-Net. Both the qualitative and quantitative results validate the effectiveness of the proposed method.
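For readers unfamiliar with the reported metrics, the following is a minimal sketch of how a Dice similarity coefficient and a Hausdorff distance can be computed for a pair of binary masks. It is not the authors' evaluation code: the function names, the `spacing` parameter, the empty-mask convention, and the choice to measure the Hausdorff distance over all foreground pixels (rather than over extracted surfaces, which can give a different value) are assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total


def hausdorff_distance(pred, truth, spacing=(1.0, 1.0)):
    """Symmetric Hausdorff distance between foreground pixel sets.

    Brute force over all pixel pairs, so only suitable for small masks.
    `spacing` is the physical pixel size per axis (hypothetical parameter).
    """
    p = np.argwhere(np.asarray(pred, dtype=bool)) * np.asarray(spacing)
    t = np.argwhere(np.asarray(truth, dtype=bool)) * np.asarray(spacing)
    if len(p) == 0 or len(t) == 0:
        raise ValueError("Hausdorff distance is undefined for an empty mask")
    # Pairwise Euclidean distances, shape (len(p), len(t)).
    d = np.linalg.norm(p[:, None, :] - t[None, :, :], axis=-1)
    # Largest nearest-neighbour distance in either direction.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


if __name__ == "__main__":
    pred = np.zeros((4, 4), dtype=bool)
    truth = np.zeros((4, 4), dtype=bool)
    pred[1:3, 1:3] = True   # 4 foreground pixels
    truth[1:3, 1:4] = True  # 6 foreground pixels, 4 shared with pred
    print(dice_coefficient(pred, truth))
    print(hausdorff_distance(pred, truth))
```

Passing the pixel size in millimetres as `spacing` yields a distance in millimetres, the unit used for the values reported above.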
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 61871263), the Program of Shanghai Academic Research Leader (Grant No. 21XD1431300), and the Medical-Industrial Integration Program of Fudan University (Grant No. XM03211178).
Author information
Contributions
YY: Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing—original draft. NL: Investigation, Data curation, Writing—review & editing. WX: Writing—review & editing. GZ: Writing—review & editing. XL: Supervision, Writing—review & editing. ZZ: Conceptualization, Methodology, Software, Validation, Writing—review & editing. SS: Supervision, Data curation. DT: Supervision, Writing—review & editing.
Ethics declarations
Competing interests
The authors have no relevant conflicts of interest to disclose.
Ethical approval
This retrospective study was approved by the Ethics Committee of Fudan University Shanghai Cancer Center (No. 1909207-14-1910). Informed consent was obtained from all individual participants included in the study.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yue, Y., Li, N., Xing, W. et al. Condition control training-based ConVMLP-ResU-Net for semantic segmentation of esophageal cancer in 18F-FDG PET/CT images. Phys Eng Sci Med 46, 1643–1658 (2023). https://doi.org/10.1007/s13246-023-01327-3
DOI: https://doi.org/10.1007/s13246-023-01327-3