Efficient brain tumor segmentation using Swin transformer and enhanced local self-attention

  • Review Article
International Journal of Computer Assisted Radiology and Surgery

Abstract

Purpose

Fully convolutional neural network architectures have proven useful for brain tumor segmentation. However, their ability to learn long-range dependencies is limited by their localized receptive fields. Vision transformers (ViTs), by contrast, are built on a multi-head self-attention mechanism that generates attention maps to aggregate spatial information dynamically, and they have outperformed convolutional neural networks (CNNs). Inspired by the recent success of ViT models in medical image segmentation, we propose in this paper a new network based on the Swin transformer for semantic brain tumor segmentation.
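For readers unfamiliar with the mechanism, the attention map mentioned above can be written in the standard scaled dot-product form from the transformer literature. The symbols below are not taken from this abstract and are introduced only for illustration: X is the matrix of input tokens (image patches), W_Q, W_K and W_V are learned projection matrices, and d_k is the key dimension. In a Swin-style transformer the same computation is restricted to local windows that are shifted between layers, rather than applied to all tokens at once.

```latex
\[
Q = X W_Q, \qquad K = X W_K, \qquad V = X W_V
\]
\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
\]
```

Each row of the softmax output is one token's attention map: a set of non-negative weights summing to one that determines how much every other token contributes to that token's updated representation. A multi-head layer runs several such maps in parallel with separate projections and concatenates the results.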

Methods

The proposed method for brain tumor segmentation combines Transformer and CNN modules in an encoder–decoder structure. The encoder incorporates ELSA transformer blocks to enhance the extraction of local detailed features. The extracted feature representations are fed to the decoder part via skip connections. The encoder part includes channel squeeze and spatial excitation blocks, which enable the extracted features to be more informative both spatially and channel-wise.
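To make the channel squeeze and spatial excitation step concrete, here is a minimal sketch of the general idea, not the authors' implementation. It assumes a feature map flattened to plain Python lists, and the names `spatial_excitation`, `weights` and `bias` are hypothetical. In a real network the squeeze weights would be a learned 1×1×1 convolution applied to 3D tensors.

```python
import math


def spatial_excitation(features, weights, bias=0.0):
    """Channel squeeze and spatial excitation on a flattened feature map.

    features: list of C channels, each a list of N voxel values.
    weights:  list of C floats standing in for a learned 1x1x1 convolution
              (an assumption of this sketch).
    Returns (rescaled_features, gate), where gate holds one value per voxel.
    """
    n_voxels = len(features[0])

    # Channel squeeze: collapse the C channels into one scalar per voxel.
    squeezed = [
        sum(w * channel[i] for w, channel in zip(weights, features)) + bias
        for i in range(n_voxels)
    ]

    # Spatial excitation: a sigmoid turns each scalar into a gate in (0, 1).
    gate = [1.0 / (1.0 + math.exp(-s)) for s in squeezed]

    # Rescale every channel at each voxel by that voxel's gate.
    rescaled = [[v * g for v, g in zip(channel, gate)] for channel in features]
    return rescaled, gate


# Two channels, two voxels; zero weights give a neutral gate of 0.5 everywhere.
features = [[1.0, 2.0], [3.0, 4.0]]
rescaled, gate = spatial_excitation(features, weights=[0.0, 0.0])
print(gate)
print(rescaled)
```

The gate is shared across channels, so the block emphasizes or suppresses whole spatial locations rather than individual feature channels.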

Results

The method is evaluated on the public BraTS 2021 dataset, which contains 1251 cases of brain images, each with four 3D MRI modalities. Our proposed approach achieved an average Dice score of 89.77% and an average Hausdorff distance of 8.90 mm.
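The two reported metrics can be illustrated with a small sketch, which is not the evaluation code used in the paper. It assumes each segmentation mask is a non-empty set of integer voxel coordinates, and the `spacing` parameter (voxel size in mm) is hypothetical. The abstract does not say whether the maximum or a percentile variant of the Hausdorff distance is reported, so this sketch uses the plain maximum.

```python
import math


def dice_score(pred, truth):
    """Dice = 2|P ∩ T| / (|P| + |T|) for two sets of voxel coordinates."""
    if not pred and not truth:
        return 1.0  # convention assumed here: two empty masks agree fully
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def hausdorff_distance(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance in mm between two non-empty voxel sets.

    Brute force over all point pairs, so only suitable for small examples.
    """
    def to_mm(point):
        return tuple(c * s for c, s in zip(point, spacing))

    a = [to_mm(p) for p in pred]
    b = [to_mm(p) for p in truth]

    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Two small masks that overlap in one voxel.
pred = {(0, 0, 0), (0, 0, 1)}
truth = {(0, 0, 1), (0, 0, 2)}
print(dice_score(pred, truth))
print(hausdorff_distance(pred, truth))
```

Dice rewards volumetric overlap, while the Hausdorff distance penalizes the worst boundary disagreement, which is why the two are usually reported together.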

Conclusion

We developed an automated framework for brain tumor segmentation using the Swin transformer and enhanced local self-attention. Experimental results show that our method outperforms state-of-the-art 3D algorithms for brain tumor segmentation.

Notes

  1. https://monai.io/.

Funding

This study was funded by LITIS-QuantIF Laboratory.

Author information

Corresponding author

Correspondence to Fethi Ghazouani.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethics approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

This article does not contain patient data.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ghazouani, F., Vera, P. & Ruan, S. Efficient brain tumor segmentation using Swin transformer and enhanced local self-attention. Int J CARS 19, 273–281 (2024). https://doi.org/10.1007/s11548-023-03024-8

  • DOI: https://doi.org/10.1007/s11548-023-03024-8
