Light-Deeplabv3+: a lightweight real-time semantic segmentation method for complex environment perception

Ding, Peng; Qian, Huaming

doi:10.1007/s11554-023-01380-x

Research
Published: 17 November 2023

Volume 21, article number 1 (2024)

Journal of Real-Time Image Processing

Peng Ding¹ &
Huaming Qian¹

Abstract

Current semantic segmentation methods achieve high accuracy. However, they suffer from high computational complexity and long inference time, which makes it difficult for them to meet application requirements in complex environments. To achieve fast and accurate semantic segmentation of images, we propose a lightweight semantic segmentation method called Light-Deeplabv3+. First, a MobileNetV2Lite-SE architecture with an SE module is proposed as the backbone network of the model, which reduces the number of model parameters and improves segmentation speed. Second, we propose an ACsc-ASPP module based on an asymmetric dilated convolution block (ADCB) and an scSE module to address the loss of semantic information during feature extraction. These improvements capture more semantic features and improve segmentation accuracy. Finally, we propose a DSC-Blaze module to replace the original \(3\times 3\) standard convolution. It consists of a depthwise separable convolution (DSC) and a Blaze module, which improves segmentation speed while maintaining the receptive field. Experimental results show that the mean Intersection over Union (MIoU) of Light-Deeplabv3+ on the PASCAL VOC2012 dataset is 73.15%, with a parameter size of only 6.291 MB. Its computational cost is only 20.883G, and its speed on the 3060Ti platform is 37.66 frames per second (FPS). Compared with the traditional Deeplabv3+, Light-Deeplabv3+ achieves efficient and accurate image segmentation with less computational overhead, and its performance is comparable to state-of-the-art algorithms.
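To make the parameter saving behind the depthwise separable convolution concrete, the following is a minimal sketch that counts weights for a standard \(3\times 3\) convolution and for its depthwise separable counterpart. It illustrates only the generic DSC arithmetic, not the authors' DSC-Blaze module, whose internal layout is not given in the abstract. The function names and the channel widths (256 in, 256 out) are assumptions chosen for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Assumed channel widths, for illustration only (not taken from the paper).
    c_in, c_out = 256, 256
    std = standard_conv_params(c_in, c_out)
    dsc = depthwise_separable_params(c_in, c_out)
    print(f"standard 3x3 conv:        {std} weights")
    print(f"depthwise separable conv: {dsc} weights")
    print(f"reduction factor:         {std / dsc:.2f}x")
```

Under these assumed widths the standard layer holds 589,824 weights and the separable one 67,840, a reduction of roughly 8.7 times. The kernel's spatial extent is unchanged, which is consistent with the abstract's claim that the receptive field is maintained.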

Data availability

The data used to support the findings of this study are available from the corresponding author upon request.

Acknowledgements

This work is supported by the Key-Area Research and Development Program of Guangdong Province under Grant 2020B0909020001 and by the National Natural Science Foundation of China under Grant No. 61573113.

Author information

Authors and Affiliations

College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, 150001, China
Peng Ding & Huaming Qian

Authors

Peng Ding
Huaming Qian

Corresponding author

Correspondence to Huaming Qian.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ding, P., Qian, H. Light-Deeplabv3+: a lightweight real-time semantic segmentation method for complex environment perception. J Real-Time Image Proc 21, 1 (2024). https://doi.org/10.1007/s11554-023-01380-x

Received: 11 June 2023
Accepted: 26 October 2023
Published: 17 November 2023
DOI: https://doi.org/10.1007/s11554-023-01380-x
