Abstract
Deep learning methods have recently been widely used for tumor segmentation of multimodal medical images, with promising results. However, most existing methods are limited by insufficient representational ability, support for only a fixed number of input modalities, and high computational complexity. In this paper, we propose H-DenseFormer, a hybrid densely connected network for tumor segmentation that combines the representational power of Convolutional Neural Networks (CNNs) and Transformers. Specifically, H-DenseFormer integrates a Transformer-based Multi-path Parallel Embedding (MPE) module that can take an arbitrary number of modalities as input and extracts fused features across them. The multimodal fusion features are then delivered to different levels of the encoder to enhance the multimodal learning representation. In addition, we design a lightweight Densely Connected Transformer (DCT) block to replace the standard Transformer block, significantly reducing computational complexity. We conduct extensive experiments on two public multimodal datasets, HECKTOR21 and PI-CAI22. The experimental results show that our proposed method outperforms existing state-of-the-art methods while having lower computational complexity. The source code is available at https://github.com/shijun18/H-DenseFormer.
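The abstract describes the Densely Connected Transformer (DCT) block only at a high level. As a rough illustration of the idea, the following PyTorch sketch applies DenseNet-style connectivity to standard Transformer encoder layers: each layer receives the concatenation of all previous layer outputs, projected back to the embedding width. This is not the authors' implementation (which is in the linked repository); the class name, dimensions, and hyperparameters are assumptions for illustration only.

# Illustrative sketch only, not the authors' DCT block: DenseNet-style
# connectivity over standard Transformer encoder layers. All names and
# dimensions are assumptions; the official code is in the linked repository.
import torch
import torch.nn as nn

class DenselyConnectedTransformerSketch(nn.Module):
    """Each layer sees the concatenation of all earlier outputs, projected
    back to the embedding width before self-attention (the DenseNet idea
    applied to token features)."""

    def __init__(self, dim=256, depth=4, heads=4, mlp_ratio=2.0):
        super().__init__()
        self.projections = nn.ModuleList()
        self.layers = nn.ModuleList()
        for i in range(depth):
            # Reduce the growing concatenated feature back to `dim` channels.
            self.projections.append(nn.Linear(dim * (i + 1), dim))
            self.layers.append(
                nn.TransformerEncoderLayer(
                    d_model=dim,
                    nhead=heads,
                    dim_feedforward=int(dim * mlp_ratio),
                    batch_first=True,
                )
            )

    def forward(self, x):  # x: (batch, tokens, dim), e.g. fused multimodal patch embeddings
        features = [x]
        for proj, layer in zip(self.projections, self.layers):
            fused = proj(torch.cat(features, dim=-1))  # dense connection across all earlier layers
            features.append(layer(fused))
        return features[-1]

if __name__ == "__main__":
    tokens = torch.randn(2, 196, 256)
    print(DenselyConnectedTransformerSketch()(tokens).shape)  # torch.Size([2, 196, 256])

The intuition behind such a design is that dense connectivity lets later layers reuse earlier features directly, so the per-layer width (and thus the attention cost) can stay small; whether this matches the exact DCT block used in H-DenseFormer should be checked against the released code.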
J. Shi and H. Kan contributed equally. This study was supported by the Fundamental Research Funds for the Central Universities (No. YD2150002001).