Abstract
Automatic segmentation of multi-sequence (multi-modal) cardiac MR (CMR) images plays a significant role in the diagnosis and management of a variety of cardiac diseases. However, the performance of relevant algorithms depends critically on the proper fusion of the multi-modal information. Furthermore, particular pathologies, such as myocardial infarction, display irregular shapes on images and occupy small regions at random locations. These facts make pathology segmentation of multi-modal CMR images a challenging task. In this paper, we present the Max-Fusion U-Net, which achieves improved pathology segmentation performance given aligned multi-modal images of LGE, T2-weighted, and bSSFP modalities. Specifically, modality-specific features are extracted by dedicated encoders. They are then fused with the pixel-wise maximum operator. Together with the corresponding encoder features, these representations are propagated to the decoding layers via U-Net skip-connections. Furthermore, a spatial-attention module is applied in the last decoding layer to encourage the network to focus on small, semantically meaningful pathological regions that trigger relatively high responses in the network neurons. We also use a simple image patch extraction strategy to dynamically resample training examples with varying spatial and batch sizes. With limited GPU memory, this strategy reduces class imbalance and forces the model to focus on regions around the pathologies of interest. It further improves segmentation accuracy and reduces the misclassification of pathology. We evaluate our methods on the Myocardial Pathology Segmentation combining multi-sequence CMR (MyoPS) dataset, which involves three modalities. Extensive experiments demonstrate the effectiveness of the proposed model, which outperforms the related baselines. The code is available at https://github.com/falconjhc/MFU-Net.
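The pixel-wise maximum fusion described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the array names and shapes are hypothetical, standing in for feature maps produced by the three modality-specific encoders.

```python
import numpy as np

# Hypothetical encoder outputs for the three modalities,
# shaped (batch, channels, height, width).
rng = np.random.default_rng(0)
feat_lge = rng.standard_normal((1, 8, 16, 16))
feat_t2 = rng.standard_normal((1, 8, 16, 16))
feat_bssfp = rng.standard_normal((1, 8, 16, 16))

# Pixel-wise maximum fusion: at every spatial location and channel,
# keep the strongest response across the modality-specific features.
fused = np.maximum.reduce([feat_lge, feat_t2, feat_bssfp])

# The fused map has the same shape as each input and dominates all of them.
assert fused.shape == feat_lge.shape
```

In a U-Net-style architecture, `fused` would then be concatenated with the corresponding encoder features and passed to the decoder through skip-connections.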
Notes
1. We restrict to the case where \(N=3\) (myocardium, left ventricle, and right ventricle) and \(K=2\) (infarction and edema).
2. For simplicity, we denote it as \(Enc^k\) in the following sections.
3. Since we do not have the ground truth of the testing data, the performance reported in Table 1 and Table 2 is obtained by five-fold cross-validation across the training set. The splits follow the description in Sect. 3.2. In addition, we also report the Dice score averaged over both pathologies to assess the overall pathology segmentation performance.
4. Although the anatomy segmentation performance decreases, we still consider SideConv and Dilation the better choices, since this work is primarily concerned with pathology prediction.
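The averaged pathology Dice score mentioned in note 3 can be sketched as below. This is an illustrative computation on toy binary masks, not the paper's evaluation code; the function name and toy arrays are hypothetical.

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient for binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# Toy predictions and ground truth for the two pathology classes.
infarct_pred = np.array([[1, 1], [0, 0]])
infarct_gt   = np.array([[1, 0], [0, 0]])
edema_pred   = np.array([[0, 0], [1, 1]])
edema_gt     = np.array([[0, 0], [1, 1]])

# Average the per-class Dice scores over both pathologies.
avg_dice = (dice_score(infarct_pred, infarct_gt)
            + dice_score(edema_pred, edema_gt)) / 2
```

Here the infarction mask overlaps its ground truth in one of three labelled pixels (Dice 2/3), and the edema mask matches exactly (Dice 1), so the averaged pathology Dice is 5/6.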
Acknowledgement
This work was supported by US National Institutes of Health (1R01HL136578-01). This work used resources provided by the Edinburgh Compute and Data Facility (http://www.ecdf.ed.ac.uk/). S.A. Tsaftaris acknowledges the Royal Academy of Engineering and the Research Chairs and Senior Research Fellowships scheme.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Jiang, H., Wang, C., Chartsias, A., Tsaftaris, S.A. (2020). Max-Fusion U-Net for Multi-modal Pathology Segmentation with Attention and Dynamic Resampling. In: Zhuang, X., Li, L. (eds) Myocardial Pathology Segmentation Combining Multi-Sequence Cardiac Magnetic Resonance Images. MyoPS 2020. Lecture Notes in Computer Science(), vol 12554. Springer, Cham. https://doi.org/10.1007/978-3-030-65651-5_7
Print ISBN: 978-3-030-65650-8
Online ISBN: 978-3-030-65651-5