ADSCNet: asymmetric depthwise separable convolution for semantic segmentation in real-time

Wang, Jiawei; Xiong, Hongyun; Wang, Haibo; Nian, Xiaohong

doi:10.1007/s10489-019-01587-1

Published: 28 November 2019

Volume 50, pages 1045–1056, (2020)

Applied Intelligence

Jiawei Wang¹, Hongyun Xiong² (ORCID: orcid.org/0000-0002-1855-2845), Haibo Wang² & Xiaohong Nian²

Abstract

Semantic segmentation can be viewed as a per-pixel localization and classification problem that assigns a meaningful label to each pixel of an input image. Deep convolutional neural networks have been extremely successful in semantic segmentation in recent years. However, some challenges remain. The first is that most current networks are complex, and their computational cost and memory requirements make them hard to deploy on mobile devices. The second is obtaining more contextual information from downsampled feature maps. To this end, we propose an asymmetric depthwise separable convolution network (ADSCNet), a lightweight neural network for real-time semantic segmentation. To facilitate information propagation, Dense Dilated Convolution Connections (DDCC), which connect a set of dilated convolutional layers in a dense way, are introduced in the network. A pooling operation is inserted before the ADSCNet unit to cover more contextual information in prediction. Extensive experimental results validate the superior performance of the proposed method compared with other network architectures. Our approach achieves a mean intersection over union (mIOU) of 67.5% on the Cityscapes dataset at 76.9 frames per second.
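
To give a rough sense of why a depthwise separable design can be called lightweight, the sketch below counts convolution weights for three layer types. It is only an illustration under stated assumptions: the abstract does not spell out the ADSCNet unit, so the asymmetric variant here assumes the common factorization of a k x k depthwise filter into a k x 1 and a 1 x k depthwise filter followed by a 1 x 1 pointwise convolution. The function names and the channel sizes are hypothetical and are not taken from the paper; biases and normalization parameters are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a dense k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """One k x k depthwise filter per input channel, then a 1 x 1 pointwise mix."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


def asymmetric_depthwise_separable_params(c_in, c_out, k):
    """Assumed layout: k x 1 depthwise, 1 x k depthwise, then 1 x 1 pointwise."""
    depthwise = 2 * k * c_in  # two 1-D depthwise filters per channel
    pointwise = c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative sizes only; they are not the paper's configuration.
    c_in, c_out, k = 64, 64, 3
    variants = [
        ("standard", standard_conv_params),
        ("depthwise separable", depthwise_separable_params),
        ("asymmetric depthwise separable", asymmetric_depthwise_separable_params),
    ]
    for name, count in variants:
        print(f"{name}: {count(c_in, c_out, k)} weights")
```

Under these assumptions the saving from the asymmetric factorization grows with the kernel size k, since the depthwise cost drops from k² to 2k weights per channel, while the pointwise term stays the same.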

Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities of Central South University under grant 2017zzts730. We thank Xiangyu Zhang for helpful discussions.

Author information

Authors and Affiliations

School of Software, Central South University, Changsha, Hunan, 410075, People’s Republic of China
Jiawei Wang
School of Information Science and Engineering, Central South University, Changsha, Hunan, 410075, People’s Republic of China
Hongyun Xiong, Haibo Wang & Xiaohong Nian

Corresponding author

Correspondence to Hongyun Xiong.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, J., Xiong, H., Wang, H. et al. ADSCNet: asymmetric depthwise separable convolution for semantic segmentation in real-time. Appl Intell 50, 1045–1056 (2020). https://doi.org/10.1007/s10489-019-01587-1

Published: 28 November 2019
Issue Date: April 2020
DOI: https://doi.org/10.1007/s10489-019-01587-1
