Abstract
Purpose
Fully convolutional neural network architectures have proven useful for brain tumor segmentation. However, their ability to learn long-range dependencies is limited by their localized receptive fields. Vision transformers (ViTs), by contrast, are built on a multi-head self-attention mechanism that generates attention maps to aggregate spatial information dynamically, and they have outperformed convolutional neural networks (CNNs). Inspired by the recent success of ViT models in medical image segmentation, we propose a new network based on the Swin transformer for semantic brain tumor segmentation.
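To make the attention-map idea concrete, the following is a minimal sketch of single-head scaled dot-product self-attention. It is illustrative only and is not the Swin or ELSA block used in the paper: the function names are hypothetical, and it assumes identity projections (queries, keys and values are the input tokens themselves) rather than learned projection matrices.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity projections (assumption).

    tokens: list of N feature vectors, one per spatial position.
    Returns (outputs, attention_map); attention_map[i][j] is the weight
    that position i assigns to position j.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, attention_map = [], []
    for query in tokens:
        # Similarity of this position to every position, near or far.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        attention_map.append(weights)
        # Each output is a weighted sum of all value vectors.
        outputs.append(
            [sum(w * value[c] for w, value in zip(weights, tokens)) for c in range(dim)]
        )
    return outputs, attention_map


# Three positions; the first and last carry the same feature vector.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, attention_map = self_attention(tokens)
```

In this toy input, position 0 attends to the distant position 2 as strongly as to itself, because the weights depend on feature similarity rather than spatial proximity. That is the long-range behaviour a fixed local receptive field cannot provide directly.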
Methods
The proposed method for brain tumor segmentation combines Transformer and CNN modules in an encoder–decoder structure. The encoder incorporates ELSA (enhanced local self-attention) transformer blocks to enhance local detailed feature extraction. The extracted feature representations are fed to the decoder part via skip connections. The encoder part includes channel squeeze and spatial excitation blocks, which enable the extracted features to be more informative both spatially and channel-wise.
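As an illustration of the channel squeeze and spatial excitation idea, the sketch below follows its common formulation: a 1×1 convolution collapses the channels at each voxel into one value, a sigmoid turns that value into a gate in (0, 1), and every channel at that voxel is rescaled by the gate. This is a sketch under assumptions, not the authors' implementation: the function name, the flattened `features[c][v]` layout and the hand-picked weights are hypothetical, and in a real network the weights would be learned.

```python
import math


def spatial_excitation(features, channel_weights, bias=0.0):
    """Channel squeeze followed by spatial excitation.

    features[c][v]: value of channel c at flattened voxel index v.
    channel_weights: one weight per channel (stands in for a learned 1x1 conv).
    Returns (recalibrated_features, gate) with gate[v] in (0, 1).
    """
    num_voxels = len(features[0])
    gate = []
    for v in range(num_voxels):
        # Squeeze: weighted sum over channels at this voxel.
        squeezed = bias + sum(w * channel[v] for w, channel in zip(channel_weights, features))
        # Excitation: sigmoid maps the squeezed value to a spatial gate.
        gate.append(1.0 / (1.0 + math.exp(-squeezed)))
    # Rescale every channel at each voxel by that voxel's gate.
    recalibrated = [[channel[v] * gate[v] for v in range(num_voxels)] for channel in features]
    return recalibrated, gate


# Two channels, two voxels: the first voxel has strong responses, the second none.
features = [[1.0, 0.0], [1.0, 0.0]]
recalibrated, gate = spatial_excitation(features, channel_weights=[1.0, 1.0])
```

With these toy weights the gate is larger at the first voxel than at the second, so spatial locations with strong multi-channel responses are emphasized relative to the rest.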
Results
The method is evaluated on the public BraTS 2021 dataset, which contains 1251 cases, each with four 3D MRI modalities. Our proposed approach achieved an average Dice score of 89.77% and an average Hausdorff distance of 8.90 mm.
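For readers unfamiliar with the two reported metrics, here is a minimal sketch of how they can be computed for a pair of binary masks. Several things are assumptions made for illustration: masks are represented as sets of `(x, y, z)` voxel coordinates, the voxel spacing defaults to 1 mm isotropic, and the Hausdorff distance is the plain maximum variant computed by brute force. The abstract does not state which variant was used; challenge evaluations often report a 95th-percentile version instead.

```python
import math


def dice_score(pred, truth):
    """Dice overlap between two sets of voxel coordinates, in [0, 1]."""
    if not pred and not truth:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def hausdorff_distance(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Symmetric maximum Hausdorff distance in mm between two non-empty voxel sets.

    Brute force, O(len(pred) * len(truth)); fine for toy inputs only.
    """
    if not pred or not truth:
        raise ValueError("Hausdorff distance is undefined for an empty mask")

    def to_mm(point):
        # Convert voxel indices to physical coordinates.
        return tuple(c * s for c, s in zip(point, spacing))

    def directed(source, target):
        # Largest distance from any source point to its nearest target point.
        return max(
            min(math.dist(to_mm(p), to_mm(q)) for q in target) for p in source
        )

    return max(directed(pred, truth), directed(truth, pred))


# Toy masks that overlap in exactly one voxel.
pred = {(0, 0, 0), (0, 0, 1)}
truth = {(0, 0, 1), (0, 0, 2)}
dice = dice_score(pred, truth)
hd = hausdorff_distance(pred, truth)
```

Dice rewards volumetric overlap, while the Hausdorff distance penalizes the worst boundary disagreement, so the two are usually reported together.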
Conclusion
We developed an automated framework for brain tumor segmentation using the Swin transformer and enhanced local self-attention. Experimental results show that our method outperforms state-of-the-art 3D algorithms for brain tumor segmentation.
Funding
This study was funded by LITIS-QuantIF Laboratory.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethics approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
This article does not contain patient data.
About this article
Cite this article
Ghazouani, F., Vera, P. & Ruan, S. Efficient brain tumor segmentation using Swin transformer and enhanced local self-attention. Int J CARS 19, 273–281 (2024). https://doi.org/10.1007/s11548-023-03024-8
DOI: https://doi.org/10.1007/s11548-023-03024-8