Neighborhood Encoding Network for Semantic Segmentation

  • Xiaotian Lou
  • Xiaoyu Chen
  • Lianfa Bai
  • Jing Han (corresponding author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11903)

Abstract

With recent advances in deep neural networks, semantic segmentation algorithms are developing rapidly. However, pixel-level semantic segmentation is often treated as a pixel-wise classification task in which the correlation between neighboring pixels is ignored during inference, so the integrity of the segmentation results is inevitably impaired. To strengthen the correlation among pixels in neural networks, we propose the neighborhood encoding network (NENet), which extracts the semantics and encodes the pixel-level correlation of the inputs in a backbone network. In NENet, a neighborhood prediction module (NPM) decodes the pixel-level correlation and produces the result. The NPM also helps the backbone network encode the correlation during the training phase. We also design a stage-wise training strategy with the NPM for correlation transmission, which eases the training process and effectively improves performance. The structure of NENet can be extended to other encoder-decoder networks. We evaluate the proposed NENet on the CamVid and Cityscapes datasets, where it achieves impressive results.
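To make the distinction in the abstract concrete, the following minimal sketch contrasts independent per-pixel classification with a simple record of how each pixel relates to its neighbors. It is an illustration only and not the NENet or NPM implementation: the 4-connected neighborhood, the function names `pixelwise_argmax` and `neighbor_agreement`, and the toy scores are all assumptions introduced here, since the abstract does not specify how the correlation is encoded.

```python
# Illustrative sketch only; NOT the NENet/NPM implementation.
# All names, the 4-connected neighborhood, and the toy data are assumptions.

def pixelwise_argmax(scores):
    """Label each pixel independently from its own class scores.

    scores[y][x] is a list of per-class scores. No neighbor information is used.
    """
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in scores]

# Assumed neighborhood: up, down, left, right as (dy, dx) offsets.
OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def neighbor_agreement(labels):
    """For each pixel, return 4 flags: 1 if that neighbor shares its label, else 0.

    Neighbors that fall outside the image count as 0.
    """
    h, w = len(labels), len(labels[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            flags = []
            for dy, dx in OFFSETS:
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                flags.append(1 if inside and labels[ny][nx] == labels[y][x] else 0)
            row.append(flags)
        out.append(row)
    return out

# Toy 2x3 image with two classes.
scores = [
    [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]],
    [[0.6, 0.4], [0.4, 0.6], [0.2, 0.8]],
]
labels = pixelwise_argmax(scores)       # independent per-pixel decisions
agreement = neighbor_agreement(labels)  # hypothetical pixel-level correlation map
```

A map like `agreement` is one plausible form a pixel-level correlation signal could take; whether NENet uses anything similar is not stated in the abstract.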

Keywords

Semantic segmentation; Neural network; Neighborhood encoding; Neighborhood prediction

Notes

Acknowledgement

This work was supported by the Natural Science Foundation of China (61727802) and the Key Research & Development Program of Jiangsu, China (BE2018126).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Xiaotian Lou (1)
  • Xiaoyu Chen (1)
  • Lianfa Bai (1)
  • Jing Han (1), corresponding author
  1. Jiangsu Key Laboratory of Spectral Imaging and Intelligent Sense, Nanjing University of Science and Technology, Nanjing, China
