An enhancement model based on dense atrous and inception convolution for image semantic segmentation

Zhou, Erjing; Xu, Xiang; Xu, Baomin; Wu, Hongwei

doi:10.1007/s10489-022-03448-w

Published: 25 June 2022

Volume 53, pages 5519–5531, (2023)

Applied Intelligence

Erjing Zhou¹,
Xiang Xu²,
Baomin Xu¹ (ORCID: orcid.org/0000-0002-9087-9631) &
Hongwei Wu¹

Abstract

The goal of semantic segmentation is to classify every pixel in an image so that the exact contour of each target can be segmented. Most previous semantic segmentation models cannot generate enough semantic information per pixel to understand the content of complex scenes. In this paper, we propose a novel semantic segmentation model, Ince-DResAsppNet, based on dense convoluted separation convolution. Unlike previous models, our model focuses on reducing the loss of semantic information and enhancing detail. In the feature extraction part of the model, the ideas of Dense and Ince are introduced to expand the number of channels on the basis of feature reuse. In the feature fusion part, the ideas of Dense and Atrous are introduced in the form of dense dilated convolution based on coprime factors, combined with multi-scale feature information to expand the receptive field and sample pixels more densely. Experiments on the PASCAL VOC 2012 and CityScapes datasets show that our method performs better than existing semantic segmentation models. Our model achieves mIoU scores of 83.3% and 78.1%, surpassing many classical semantic segmentation models.
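The link between coprime dilation factors and denser pixel sampling can be illustrated with a small one-dimensional sketch. It is not the Ince-DResAsppNet implementation: the function names, the 3-tap kernel, and the dilation rates (1, 2, 3) and (2, 4, 8) are assumptions chosen for illustration, since the abstract does not state the rates the model uses. The sketch stacks dilated convolutions and counts which input offsets can influence a single output position.

```python
def covered_offsets(dilations, kernel_size=3):
    """Input offsets (1-D) that can reach one output position after
    stacking dilated convolutions with the given rates.
    Hypothetical helper, for illustration only."""
    half = kernel_size // 2
    offsets = {0}
    for d in dilations:
        # Each layer samples its input at multiples of its dilation rate.
        taps = [k * d for k in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return offsets


def receptive_field(dilations, kernel_size=3):
    """Width of the receptive field spanned by the stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def coverage(dilations, kernel_size=3):
    """Fraction of positions inside the receptive field that are sampled."""
    used = len(covered_offsets(dilations, kernel_size))
    return used / receptive_field(dilations, kernel_size)


if __name__ == "__main__":
    for rates in [(1, 2, 3), (2, 4, 8)]:
        print(rates, receptive_field(rates), round(coverage(rates), 3))
```

With the illustrative rates (1, 2, 3), which share no common factor, every position inside the receptive field is sampled. With (2, 4, 8), which share the factor 2, only even offsets are reached, so roughly half of the pixels in the receptive field never contribute.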

Acknowledgments

This research was funded by Cangzhou key research and development plan [204102013].

Author information

Authors and Affiliations

School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
Erjing Zhou, Baomin Xu & Hongwei Wu
School of Computing Science, Simon Fraser University, Vancouver, Canada
Xiang Xu

Corresponding authors

Correspondence to Xiang Xu or Baomin Xu.

About this article

Cite this article

Zhou, E., Xu, X., Xu, B. et al. An enhancement model based on dense atrous and inception convolution for image semantic segmentation. Appl Intell 53, 5519–5531 (2023). https://doi.org/10.1007/s10489-022-03448-w

Accepted: 25 February 2022
Published: 25 June 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s10489-022-03448-w
