TransFusion: Multi-view Divergent Fusion for Medical Image Segmentation with Transformers

Liu, Di; Gao, Yunhe; Zhangli, Qilong; Han, Ligong; He, Xiaoxiao; Xia, Zhaoyang; Wen, Song; Chang, Qi; Yan, Zhennan; Zhou, Mu; Metaxas, Dimitris

doi:10.1007/978-3-031-16443-9_47

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13435)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

Combining information from multi-view images is crucial for improving the performance and robustness of automated methods for disease diagnosis. However, because multi-view images are not aligned with one another, building correlations and fusing data across views remain largely open problems. In this study, we present TransFusion, a Transformer-based architecture that merges divergent multi-view imaging information using convolutional layers and powerful attention mechanisms. In particular, the Divergent Fusion Attention (DiFA) module is proposed for rich cross-view context modeling and semantic dependency mining, addressing the critical issue of capturing long-range correlations between unaligned data from different image views. We further propose the Multi-Scale Attention (MSA) to collect global correspondence of multi-scale feature representations. We evaluate TransFusion on the Multi-Disease, Multi-View & Multi-Center Right Ventricular Segmentation in Cardiac MRI (M&Ms-2) challenge cohort. TransFusion demonstrates leading performance against state-of-the-art methods and opens up new perspectives for multi-view imaging integration towards robust medical image segmentation.
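
To make the idea of attention across unaligned views concrete, the following is a minimal sketch of generic scaled dot-product cross-attention, in which tokens from one view attend to tokens from another. It is not the DiFA module itself, whose details are not given in the abstract. The function name `cross_view_attention`, the toy token values, and the omission of learned query, key, and value projections are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_view_attention(queries, keys, values):
    """Generic scaled dot-product attention (illustrative, not DiFA).

    queries come from one view; keys and values come from another view.
    The two views may have different token counts, so no spatial
    alignment between them is required.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query token to every token of the other view.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the other view's value vectors.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Hypothetical toy features: view A has 3 tokens, view B has 2 tokens.
view_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
view_b = [[1.0, 0.0], [0.0, 2.0]]

fused, attn = cross_view_attention(view_a, view_b, view_b)
```

Each row of `attn` is a probability distribution over the tokens of the second view, so every token in the first view can draw on any location in the other view regardless of how the two images are oriented. A practical module would typically add learned projections, multiple heads, and positional information.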

Author information

Authors and Affiliations

Department of Computer Science, Rutgers University, New Jersey, USA
Di Liu, Yunhe Gao, Qilong Zhangli, Ligong Han, Xiaoxiao He, Zhaoyang Xia, Song Wen, Qi Chang & Dimitris Metaxas
SenseBrain Research, California, USA
Zhennan Yan & Mu Zhou

Corresponding author

Correspondence to Dimitris Metaxas.

Editor information

Editors and Affiliations

Rochester Institute of Technology, Rochester, NY, USA
Linwei Wang
Chinese University of Hong Kong, Hong Kong, Hong Kong
Qi Dou
University of Virginia, Charlottesville, VA, USA
P. Thomas Fletcher
National Center for Tumor Diseases (NCT/UCC), Dresden, Germany
Stefanie Speidel
Case Western Reserve University, Cleveland, OH, USA
Shuo Li

Cite this paper

Liu, D. et al. (2022). TransFusion: Multi-view Divergent Fusion for Medical Image Segmentation with Transformers. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13435. Springer, Cham. https://doi.org/10.1007/978-3-031-16443-9_47

DOI: https://doi.org/10.1007/978-3-031-16443-9_47
Published: 16 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16442-2
Online ISBN: 978-3-031-16443-9
eBook Packages: Computer Science, Computer Science (R0)
