TU-Net: U-shaped Structure Based on Transformers for Medical Image Segmentation

Zhao, Jiamei; Wu, Dikang; Wang, Zhifang

doi:10.1007/978-981-19-5194-7_28

Jiamei Zhao¹¹,
Dikang Wu¹¹ &
Zhifang Wang¹¹

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1628))

Included in the following conference series:

International Conference of Pioneering Computer Scientists, Engineers and Educators

678 Accesses
1 Citations

Abstract

Recently, the development of deep learning technology in medical image segmentation has become increasingly mature, and the symmetric U-Net has made breakthrough progress. However, because of the inherent limitations of convolution operations, U-Net has some shortcomings in the interaction of global context information. For this reason, this paper proposes TU-Net based on transformers. TU-Net can strengthen the modeling of global context information, enhance the extraction of detailed information and reduce the computational complexity of the algorithm. In patch embedding, successive convolutional layers with small convolutional kernels are proposed to extract features. Cross Attention-Skip is proposed to complete the fusion of shallow and deep features during the skip connection process. TU-Net is performed on the Synapse dataset to segment eight abdominal organs. The experimental results show that TU-Net is superior to ViT, V-Net, U-Net and Swin-Unet.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Swin-TransUper: Swin Transformer-based UperNet for medical image segmentation

Article 02 April 2024

GDTNet: A Synergistic Dilated Transformer and CNN by Gate Attention for Abdominal Multi-organ Segmentation

LiteTrans: Reconstruct Transformer with Convolution for Medical Image Segmentation

References

Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Chapter Google Scholar
He, K., Zhang, X., Ren, S.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Google Scholar
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
Google Scholar
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., et al.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
Google Scholar
Vaswani, A., Shazeer, N., Parmar, N., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Google Scholar
Yu, Q., Xie, L., Wang, Y., Zhou, Y., Fishman, E.K., Yuille, A.L.: Recurrent saliency transformation network: incorporating multi-stage visual cues for small organ segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8280–8289 (2018)
Google Scholar
Zhou, Y., Xie, L., Shen, W., Wang, Y., Fishman, E.K., Yuille, A.L.: A fixed-point model for pancreas segmentation in abdominal CT scans. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (eds.) MICCAI 2017. LNCS, vol. 10433, pp. 693–701. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66182-7_79
Chapter Google Scholar
Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (eds.) MICCAI 2016. LNCS, vol. 9901, pp. 424–432. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46723-8_49
Chapter Google Scholar
Milletari, F., Navab, N., Ahmadi, S.: V-net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV), pp. 565–571. IEEE (2016)
Google Scholar
Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., et al.: TransUNet: transformers make strong encoders for medical image segmentation, arXiv preprint arXiv:2102.04306 (2021)
Schlemper, J., et al.: Attention gated networks: learning to leverage salient regions in medical images. Med. Image Anal. 53, 197–207 (2019)
Article Google Scholar
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890 (2017)
Google Scholar
Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.: Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2017)
Article Google Scholar
Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 213–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_13
Chapter Google Scholar
Devlin, J., Chang, M.W., Lee, K., et al.: Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., et al.: Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6881–6890 (2021)
Google Scholar
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., et al.: An image is worth 16x16 words: transformers for image recognition at scale. In: ICLR (2021)
Google Scholar
Zhang, Y., Liu, H., Hu, Q.: TransFuse: fusing transformers and CNNs for medical image segmentation. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12901, pp. 14–24. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87193-2_2
Chapter Google Scholar
Valanarasu, J.M.J., Oza, P., Hacihaliloglu, I., Patel, V.M.: Medical transformer: gated axial-attention for medical image segmentation. In: de Bruijne, M., et al. (eds.) MICCAI 2021. LNCS, vol. 12901, pp. 36–46. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87193-2_4
Chapter Google Scholar
Chen, B., Liu, Y., Zhang, Z.: TransAttUnet: multi-level attention-guided u-net with transformer for medical image segmentation. arXiv preprint arXiv:2107.05274 (2021)
Cao, H., Wang, Y., Chen, J., Jiang, D., et al.: Swin-Unet: unet-like pure transformer for medical image segmentation, arXiv preprint arXiv:2105.05537 (2021)
Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 10012–10022 (2021)
Google Scholar
Hendrycks, D., Gimpel, K.: Gaussian error linear units (gelus). arXiv:1606.08415 (2016)
Ba, J.L., Kiros, J.R., Hinton, G.: Layer normalization. arXiv preprint arXiv:1607.06450 (2016)
Fu, S., et al.: Domain adaptive relational reasoning for 3D multi-organ segmentation. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12261, pp. 656–666. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59710-8_64
Chapter Google Scholar

Download references

Author information

Authors and Affiliations

Department of Electronic Engineering, Heilongjiang University, Harbin, 150080, China
Jiamei Zhao, Dikang Wu & Zhifang Wang

Authors

Jiamei Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Dikang Wu
View author publications
You can also search for this author in PubMed Google Scholar
Zhifang Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Zhifang Wang .

Editor information

Editors and Affiliations

Southwest Petroleum University, Chengdu, China
Yang Wang
University of Electronic Science and Technology of China, Chengdu, China
Guobin Zhu
Harbin Engineering University, Harbin, China
Qilong Han
Harbin Institute of Technology, Harbin, China
Hongzhi Wang
Harbin University of Science and Technology, Harbin, China
Xianhua Song
National Academy of Guo Ding Institute of Data Sciences, Beijing, China
Zeguang Lu

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zhao, J., Wu, D., Wang, Z. (2022). TU-Net: U-shaped Structure Based on Transformers for Medical Image Segmentation. In: Wang, Y., Zhu, G., Han, Q., Wang, H., Song, X., Lu, Z. (eds) Data Science. ICPCSEE 2022. Communications in Computer and Information Science, vol 1628. Springer, Singapore. https://doi.org/10.1007/978-981-19-5194-7_28

Download citation

DOI: https://doi.org/10.1007/978-981-19-5194-7_28
Published: 10 August 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-5193-0
Online ISBN: 978-981-19-5194-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

TU-Net: U-shaped Structure Based on Transformers for Medical Image Segmentation

Abstract

Access this chapter

Similar content being viewed by others

Swin-TransUper: Swin Transformer-based UperNet for medical image segmentation

GDTNet: A Synergistic Dilated Transformer and CNN by Gate Attention for Abdominal Multi-organ Segmentation

LiteTrans: Reconstruct Transformer with Convolution for Medical Image Segmentation

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

TU-Net: U-shaped Structure Based on Transformers for Medical Image Segmentation

Abstract

Access this chapter

Similar content being viewed by others

Swin-TransUper: Swin Transformer-based UperNet for medical image segmentation

GDTNet: A Synergistic Dilated Transformer and CNN by Gate Attention for Abdominal Multi-organ Segmentation

LiteTrans: Reconstruct Transformer with Convolution for Medical Image Segmentation

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation