
CoalUMLP: Slice and Dice! A Fast, MLP-Like 3D Medical Image Segmentation Network

  • Conference paper
PRICAI 2023: Trends in Artificial Intelligence (PRICAI 2023)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 14327))


Abstract

3D medical image segmentation plays a crucial role in clinical diagnosis. However, handling large data volumes and intricate anatomical structures on Point-of-Care (POC) devices is challenging. Current methods rely on CNN and Transformer models, whose high computational demands and limited real-time capabilities restrict their use in POC settings. Recent studies have applied Multilayer Perceptrons (MLPs) to medical image segmentation, but they overlook the significance of local and global image features and of multi-scale contextual information. To overcome these limitations, we propose CoalUMLP, an efficient vision MLP architecture designed specifically for 3D medical image segmentation. CoalUMLP combines the strengths of CNNs, Transformers, and MLPs and incorporates three key components: the Multi-Scale Axial Permute Encoder (MSAP), the Masked Axial Permute Decoder (MAP), and the Semantic Bridging Connection (SBC). We reframe medical image segmentation as a sequence-to-sequence prediction problem and evaluate our approach on the Medical Segmentation Decathlon (MSD) dataset. CoalUMLP achieves state-of-the-art performance while reducing the parameter count by 32.8% and the computational complexity by 48.5%, and it maintains a compact structure. It achieves a superior trade-off between accuracy and efficiency compared with previous Transformer-based and CNN-based models. Our results highlight the potential of CoalUMLP as a promising backbone for real-time medical image applications.
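The abstract does not describe how the axial permute components are implemented. As a rough illustration of why axis-wise MLP mixing can be far cheaper than dense token mixing on a 3D volume, the following is a minimal sketch that assumes a generic scheme: for each spatial axis, permute that axis to the last position, apply a linear map along it, and permute back. The function name `axial_mlp`, the branch averaging, the residual connection, and the toy sizes are all assumptions made for this example and should not be read as the paper's MSAP or MAP design.

```python
import numpy as np


def axial_mlp(x, weights):
    """Mix a (C, D, H, W) volume along each spatial axis in turn.

    Hypothetical sketch: one square weight matrix per spatial axis,
    applied after permuting that axis to the last position.
    """
    out = np.zeros_like(x)
    for axis, w in zip((1, 2, 3), weights):
        moved = np.moveaxis(x, axis, -1)      # bring the axis to the end
        mixed = moved @ w                     # linear mixing along that axis
        out += np.moveaxis(mixed, -1, axis)   # restore the (C, D, H, W) layout
    return out / 3.0 + x                      # average the branches, add residual


rng = np.random.default_rng(0)
C, D, H, W = 4, 8, 8, 8                       # toy sizes, chosen for illustration
x = rng.standard_normal((C, D, H, W))
weights = [rng.standard_normal((n, n)) * 0.1 for n in (D, H, W)]
y = axial_mlp(x, weights)

# Weights needed for axis-wise mixing versus one dense map over all voxels.
axial_params = D * D + H * H + W * W
dense_params = (D * H * W) ** 2
print(y.shape, axial_params, dense_params)
```

In this toy setting the axis-wise scheme needs one small matrix per axis, whereas a single dense mixing layer over all voxel positions would need a matrix whose side is the total number of voxels. The figures reported in the abstract (32.8% and 48.5%) refer to the full CoalUMLP model and are not reproduced by this sketch.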




Acknowledgments.

This work was supported by the National Natural Science Foundation of China under Grant No. 52074299.

Author information

Corresponding author

Correspondence to Lei Zhang.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wu, R., Wu, Z., Hu, X., Zhang, L. (2024). CoalUMLP: Slice and Dice! A Fast, MLP-Like 3D Medical Image Segmentation Network. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol 14327. Springer, Singapore. https://doi.org/10.1007/978-981-99-7025-4_7

  • DOI: https://doi.org/10.1007/978-981-99-7025-4_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7024-7

  • Online ISBN: 978-981-99-7025-4

  • eBook Packages: Computer Science, Computer Science (R0)
