
MSAA-Net: a multi-scale attention-aware U-Net is used to segment the liver

  • Original Paper
Signal, Image and Video Processing

Abstract

Automatic segmentation of the liver from CT images is challenging because the shape of the liver in the abdominal cavity varies from person to person and the organ often lies close to its neighbors. In recent years, with the continuous development of deep learning and the advent of convolutional neural networks (CNNs), neural network-based models have shown strong performance in image segmentation, and among them U-Net stands out in medical image segmentation. In this paper, we propose MSAA-Net, a segmentation network that combines multi-scale features with an improved attention-aware U-Net. We extract features at different scales from a single feature layer and apply attention in the channel dimension. We demonstrate that this architecture improves the performance of U-Net while significantly reducing computational cost. To address the difficulty of optimizing U-Net's skip connections for merging objects of different sizes, we design a multi-scale attention gate (MAG) that lets the model automatically learn to focus on targets of different sizes. Moreover, MAG can be added to any architecture with skip connections, such as U-Net and FCN variants. Evaluated extensively on the 3Dircadb dataset, our method achieves a Dice similarity coefficient of 94.42% on the liver segmentation task with far fewer parameters than other attention models. The experimental results show that MSAA-Net achieves very competitive performance in liver segmentation.
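The attention gate that MAG builds on can be illustrated with a minimal NumPy sketch of the additive gate popularized by Attention U-Net: encoder skip features are reweighted by a per-pixel coefficient computed jointly from the skip and the decoder gating signal. This is an illustrative sketch only, not the authors' implementation; all weights and shapes here are hypothetical, and MAG additionally computes such attention over multiple scales.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, w_x, w_g, psi):
    """Additive attention gate over a skip connection (Attention U-Net style).

    x:   encoder skip features, shape (C, H, W)
    g:   decoder gating features, shape (C, H, W), same resolution here
    w_x, w_g: (F, C) channel-mixing weights (equivalent to 1x1 convolutions)
    psi: (F,) weights collapsing the joint features to one attention map
    Returns x reweighted by a per-pixel attention coefficient in (0, 1).
    """
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    joint = np.einsum('fc,chw->fhw', w_x, x) + np.einsum('fc,chw->fhw', w_g, g)
    joint = np.maximum(joint, 0.0)                       # ReLU
    alpha = sigmoid(np.einsum('f,fhw->hw', psi, joint))  # attention map (H, W)
    return x * alpha[None, :, :]

# Toy usage with random features and weights (shapes are illustrative).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))
g = rng.standard_normal((8, 16, 16))
w_x = rng.standard_normal((4, 8)) * 0.1
w_g = rng.standard_normal((4, 8)) * 0.1
psi = rng.standard_normal(4) * 0.1
gated = attention_gate(x, g, w_x, w_g, psi)
```

Because the attention coefficient lies strictly between 0 and 1, the gate can only attenuate skip features, never amplify or flip them; the multi-scale variant described in the abstract applies this selection to targets of different sizes.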


Availability of data and materials

The source code supporting the study will be available from the corresponding author upon reasonable request.


Acknowledgements

We thank the creators of the 3Dircadb dataset.

Funding

This research was supported by the Jilin Department of Ecology and Environment Research Project (Grant No. sd10185454oh) and the Jilin Province Science and Technology Development Plan Key R&D Projects (Grant No. 20210204050YY).

Author information

Authors and Affiliations

Authors

Contributions

Lijuan Zhang and Jiajun Liu prepared the main manuscript text, Jinyuan Liu and Xiangkun Liu prepared most of the charts, and Jiajun Liu completed most of the experiments and evaluations.

Corresponding author

Correspondence to Dongming Li.

Ethics declarations

Ethics approval and consent to participate

The data for our experiments were obtained from the public 3Dircadb dataset, and ethical standards were observed.

Consent for publication

All authors gave their consent for publication.

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


Reprints and permissions

About this article


Cite this article

Zhang, L., Liu, J., Li, D. et al. MSAA-Net: a multi-scale attention-aware U-Net is used to segment the liver. SIViP 17, 1001–1009 (2023). https://doi.org/10.1007/s11760-022-02305-0

