Real-time semantic segmentation network based on parallel atrous convolution for short-term dense concatenate and attention feature fusion

  • Research
  • Journal of Real-Time Image Processing

Abstract

To address the incomplete segmentation of large objects and the missed segmentation of tiny objects that are common in semantic segmentation algorithms, PACAMNet is proposed: a real-time segmentation network based on the short-term dense concatenate of parallel atrous convolution and the fusion of attentional features. First, parallel atrous convolution is introduced to improve the short-term dense concatenate module. By adjusting the atrous (dilation) factor, multi-scale semantic information is obtained, so that the last layer of the module also receives rich input feature maps. Second, an attention feature fusion module is proposed that aligns the receptive fields of deep and shallow feature maps via depthwise separable convolutions of different sizes and uses a channel attention mechanism to generate weights that effectively fuse the deep and shallow feature maps. Finally, experiments on the Cityscapes and CamVid datasets show that the segmentation accuracy reaches 77.4% and 74.0% at inference speeds of 98.7 FPS and 134.6 FPS, respectively. Compared with other methods, PACAMNet improves inference speed while ensuring higher segmentation accuracy, and thus achieves a better balance between segmentation accuracy and inference speed.
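
The two mechanisms summarized in the abstract can be illustrated with a small sketch. It is not the PACAMNet implementation: the 1-D setting, the fixed kernels, the function names (`effective_kernel_size`, `atrous_conv1d`, `parallel_atrous_branches`, `channel_attention_fuse`) and the unlearned sigmoid gate are all assumptions chosen for readability. The sketch shows how a larger atrous factor widens the span of a kernel without adding taps, how parallel branches with different factors yield multi-scale responses, and how a per-channel weight could blend deep and shallow features.

```python
import math


def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel when (dilation - 1) gaps sit between its taps."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def atrous_conv1d(signal, kernel, dilation):
    """Same-length 1-D atrous convolution with zero padding.

    Uses the cross-correlation form common in deep learning frameworks
    and assumes an odd kernel length.
    """
    half = (effective_kernel_size(len(kernel), dilation) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i - half + j * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):     # out-of-range taps read zero padding
                acc += weight * signal[idx]
        out.append(acc)
    return out


def parallel_atrous_branches(signal, kernel, dilations):
    """Run the same kernel at several atrous factors (one output per branch)."""
    return [atrous_conv1d(signal, kernel, d) for d in dilations]


def channel_attention_fuse(deep, shallow):
    """Blend deep and shallow features with one weight per channel.

    Assumption: the weight is sigmoid(global average of deep + shallow);
    a real module would learn a projection before the sigmoid.
    """
    fused = []
    for deep_ch, shallow_ch in zip(deep, shallow):
        pooled = sum(d + s for d, s in zip(deep_ch, shallow_ch)) / len(deep_ch)
        w = 1.0 / (1.0 + math.exp(-pooled))
        fused.append([w * s + (1.0 - w) * d for d, s in zip(deep_ch, shallow_ch)])
    return fused


# A 3-tap kernel spans 3, 5 and 9 samples at atrous factors 1, 2 and 4.
spans = [effective_kernel_size(3, d) for d in (1, 2, 4)]

# Two parallel branches over the same input give responses at two scales.
branches = parallel_atrous_branches([1, 2, 3, 4, 5], [1, 1, 1], [1, 2])
```

In this toy setting the branch with atrous factor 2 sums samples two positions apart, so each output sees a wider context than the factor-1 branch at the same cost per output. Concatenating such branches is one plausible reading of how multi-scale information reaches later layers.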

Data access statement

The data that support the findings of this study are available at https://arxiv.org/abs/1604.01685. These data were derived from the following resources available in the public domain: https://www.cityscapes-dataset.com/.

Acknowledgements

This work is financially supported in part by the National Natural Science Foundation of China (Grant nos. 51508105 and 62271151), and the Foundation of Fujian Natural Science (Grant no. 2021J01580).

Author information

Contributions

Funding acquisition, LJW and ZCC. Methodology, LJW and SDQ. Project administration, LJW, SDQ and ZCC. Resources, LJW and ZCC. Visualization, SDQ. Writing—original draft, LJW and SDQ. Writing—review and editing, LJW, SDQ and ZCC.

Corresponding author

Correspondence to Zhicong Chen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Research involving human and animal participants

This paper does not contain any studies with human or animal subjects.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, L., Qiu, S. & Chen, Z. Real-time semantic segmentation network based on parallel atrous convolution for short-term dense concatenate and attention feature fusion. J Real-Time Image Proc 21, 74 (2024). https://doi.org/10.1007/s11554-024-01453-5

  • DOI: https://doi.org/10.1007/s11554-024-01453-5
