Regression of Instance Boundary by Aggregated CNN and GCN

  • Yanda Meng
  • Wei Meng
  • Dongxu Gao
  • Yitian Zhao
  • Xiaoyun Yang
  • Xiaowei Huang
  • Yalin Zheng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12353)

Abstract

This paper proposes a straightforward, intuitive deep learning approach for (biomedical) image segmentation tasks. Unlike existing dense pixel classification methods, we develop a novel multi-level aggregation network that directly regresses the coordinates of instance boundaries in an end-to-end manner. The network combines a standard convolutional neural network (CNN) with an Attention Refinement Module (ARM) and a Graph Convolutional Network (GCN). By iteratively and hierarchically fusing features across different layers of the CNN, our approach gains sufficient semantic information from the input image and, with the help of the ARM and GCN, pays special attention to local boundaries. In particular, thanks to the proposed aggregation GCN, our network benefits from direct feature learning of the instances' boundary locations and from the propagation of spatial information across the image. Experiments on several challenging datasets demonstrate that, for the segmentation of the fetal head in ultrasound images and of the optic disc and optic cup in color fundus images, our method achieves results comparable to state-of-the-art approaches while requiring less inference time.
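
To make the idea of regressing boundary coordinates on a graph concrete, the following is a minimal sketch, not the network described in the paper. It assumes the boundary is represented as a closed ring of vertices, each carrying a feature vector (for example, sampled from CNN feature maps), and it uses a simple first-order propagation rule with symmetric normalization as a stand-in, since the abstract does not specify the form of graph convolution. All function names, layer sizes, and weights here are hypothetical.

```python
import numpy as np

def ring_adjacency(n):
    """Adjacency matrix of a closed contour: vertex i is linked to i-1 and i+1 (mod n)."""
    a = np.zeros((n, n))
    idx = np.arange(n)
    a[idx, (idx + 1) % n] = 1.0
    a[idx, (idx - 1) % n] = 1.0
    return a

def normalize(a):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}; an assumed, commonly used choice."""
    a_hat = a + np.eye(a.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, a_norm, w):
    """One graph convolution: mix each vertex with its neighbours, then project features."""
    return a_norm @ x @ w

def regress_boundary(features, w_hidden, w_out):
    """Map per-vertex features of shape (n, f) to per-vertex (x, y) coordinates of shape (n, 2)."""
    a_norm = normalize(ring_adjacency(features.shape[0]))
    hidden = np.maximum(graph_conv(features, a_norm, w_hidden), 0.0)  # ReLU
    return graph_conv(hidden, a_norm, w_out)

# Usage with random, untrained weights (illustrative only).
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))            # 8 boundary vertices, 16 features each
coords = regress_boundary(feats, rng.normal(size=(16, 32)), rng.normal(size=(32, 2)))
print(coords.shape)
```

The point of the sketch is the output shape: one coordinate pair per boundary vertex rather than one class label per pixel, with neighbouring vertices exchanging information through the graph structure.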

Keywords

Regression, Semantic segmentation, CNN, GCN, Attention, Aggregation

Notes

Acknowledgement

Y. Meng thanks the China Science IntelliCloud Technology Co., Ltd. for the studentship. D. Gao is supported by EPSRC Grant (EP/R014094/1). We thank NVIDIA for the donation of GPU cards. This work was undertaken on Barkla, part of the High Performance Computing facilities at the University of Liverpool, UK.

Supplementary material

Supplementary material 1: 504445_1_En_12_MOESM1_ESM.pdf (PDF, 11324 KB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Yanda Meng
    • 1
  • Wei Meng
    • 1
  • Dongxu Gao
    • 1
  • Yitian Zhao
    • 2
  • Xiaoyun Yang
    • 3
  • Xiaowei Huang
    • 4
  • Yalin Zheng
    • 1
    Email author
  1. Department of Eye and Vision Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool, UK
  2. Cixi Institute of Biomedical Engineering, Ningbo Institute of Industrial Technology, Chinese Academy of Sciences, Ningbo, China
  3. China Science IntelliCloud Technology Co., Ltd., Shanghai, China
  4. Department of Computer Science, University of Liverpool, Liverpool, UK
