An enhancement model based on dense atrous and inception convolution for image semantic segmentation

Applied Intelligence

Abstract

The goal of semantic segmentation is to classify every pixel in an image so that the precise contour of each target can be segmented out. Most previous semantic segmentation models cannot generate enough semantic information per pixel to understand the content of complex scenes. In this paper, we propose a novel semantic segmentation model, Ince-DResAsppNet, built on dense separable convolutions. Unlike previous models, ours is designed around reducing the loss of semantic information and enhancing detail. In the feature extraction part of the model, ideas from dense connections (Dense) and Inception modules (Ince) are introduced to expand the number of channels while reusing features. In the feature fusion part, dense connections are combined with atrous (dilated) convolutions whose dilation rates are based on coprime factors, together with multi-scale feature information, to expand the receptive field and sample pixels more densely. Experiments on the PASCAL VOC 2012 and CityScapes datasets show that our method performs better than existing semantic segmentation models. Our model achieves mean intersection-over-union (mIoU) scores of 83.3% and 78.1%, respectively, surpassing many classical semantic segmentation models.
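
The mIoU indicator quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `mean_iou`, the flattened-list input format, and the `ignore_index=255` default (the void label used in PASCAL VOC annotations) are assumptions made for illustration. Benchmark scores are normally accumulated over a whole dataset rather than a single image.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean intersection-over-union for flattened per-pixel class labels.

    Hypothetical helper for illustration. Pixels whose ground-truth label
    equals `ignore_index` are skipped, and classes that occur in neither
    the prediction nor the ground truth are left out of the mean.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious: List[float] = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Class 0: IoU = 1/2, class 1: IoU = 2/3, so the mean is 7/12.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(round(score, 4))  # 0.5833
```

Averaging per-class IoU rather than per-pixel accuracy keeps small classes from being swamped by large ones, which is why segmentation benchmarks commonly report this metric.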

Acknowledgments

This research was funded by Cangzhou key research and development plan [204102013].

Author information

Corresponding authors

Correspondence to Xiang Xu or Baomin Xu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhou, E., Xu, X., Xu, B. et al. An enhancement model based on dense atrous and inception convolution for image semantic segmentation. Appl Intell 53, 5519–5531 (2023). https://doi.org/10.1007/s10489-022-03448-w

  • DOI: https://doi.org/10.1007/s10489-022-03448-w
