
CNN-EFF: CNN Based Edge Feature Fusion in Semantic Image Labelling and Parsing

Neural Processing Letters

Abstract

Semantic segmentation and image parsing have rapidly become a prominent research area in computer vision and machine learning. Many applications, such as self-driving vehicles, augmented reality, and object recognition, require a robust segmentation mechanism. Motivated by this wide applicability, we introduce a two-step framework that parses an image into predefined labels by using a novel CNN architecture and then improving the likelihood of the labels. In step 1, we introduce a nine-layer CNN architecture that trains on minimal training samples and produces pixel-wise softmax probabilities. These probabilities are soft estimates derived from a hard classifier, i.e., an MLP. The data for step 1 is prepared as a patch-label set. In step 2, we introduce a Jacobian optimization-based label relaxation method that fuses the local extrema as an edge prior. We refer to the proposed framework as CNN-EFF. CNN-EFF is evaluated on two publicly available benchmark datasets, each arranged as images with their pixel-level ground-truth labels, and the experimental results are compared with previously proposed state-of-the-art methods. CNN-EFF improves semantic labeling accuracy over these techniques, reporting accuracies of 84.42%, 85.91%, 94.66%, 97.14%, and 98.27% for the Highway, House, Sheep, Horse-rider, and Horse-keeper images, respectively. In conclusion, the proposed framework outperforms the previously proposed state-of-the-art methods.
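The abstract does not specify how the patch-label set is built or how the softmax probabilities are computed, so the following is only a minimal NumPy sketch of what step 1's data preparation and output could look like. The function names (`extract_patch_label_set`, `softmax`, `pixel_accuracy`), the patch size of 5, the reflect padding, and the random stand-in scores are all assumptions made for illustration; they are not taken from the paper, and the nine-layer CNN itself is not reproduced here.

```python
import numpy as np


def extract_patch_label_set(image, labels, patch_size=5):
    """Pair every pixel's label with the square patch centered on that pixel.

    image:  (H, W, C) array; labels: (H, W) integer array.
    Hypothetical helper: patch size and padding mode are illustrative choices.
    """
    if patch_size % 2 == 0:
        raise ValueError("patch_size must be odd so each patch has a center pixel")
    half = patch_size // 2
    # Reflect-pad the two spatial axes so border pixels also get full patches.
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="reflect")
    height, width = labels.shape
    patches = np.empty(
        (height * width, patch_size, patch_size, image.shape[2]), dtype=image.dtype
    )
    for row in range(height):
        for col in range(width):
            patches[row * width + col] = padded[
                row:row + patch_size, col:col + patch_size
            ]
    return patches, labels.reshape(-1)


def softmax(scores):
    """Numerically stable softmax over the last axis (class scores)."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exponentials = np.exp(shifted)
    return exponentials / exponentials.sum(axis=-1, keepdims=True)


def pixel_accuracy(probabilities, true_labels):
    """Fraction of pixels whose most probable class equals the ground truth."""
    return float(np.mean(probabilities.argmax(axis=-1) == true_labels))


# Toy usage with random data standing in for an image and a trained classifier.
rng = np.random.default_rng(0)
image = rng.random((6, 8, 3))
labels = rng.integers(0, 4, size=(6, 8))
patches, patch_labels = extract_patch_label_set(image, labels, patch_size=5)
scores = rng.normal(size=(patches.shape[0], 4))  # placeholder for CNN outputs
probabilities = softmax(scores)
print(patches.shape, probabilities.shape, pixel_accuracy(probabilities, patch_labels))
```

Each row of `probabilities` is a per-pixel class distribution, which is the kind of soft estimate a label relaxation step such as step 2 would take as input; taking the `argmax` of each row gives the hard label used for the accuracy figure.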



Acknowledgements

We obtained the PASCAL VOC dataset from "http://host.robots.ox.ac.uk/pascal/VOC/" and the SIFT Flow dataset from "https://www.kaggle.com/quanbk/sift-flow-dataset". We performed the CNN training using TensorFlow with Python. We are thankful to Cao et al., Department of Mathematics and Statistics, Xi’an Jiaotong University, China, for providing the Python helper functions used to design the CNN model.

Author information

Corresponding author

Correspondence to Vishal Srivastava.

About this article

Cite this article

Srivastava, V., Biswas, B. CNN-EFF: CNN Based Edge Feature Fusion in Semantic Image Labelling and Parsing. Neural Process Lett 54, 1753–1781 (2022). https://doi.org/10.1007/s11063-021-10704-6

  • DOI: https://doi.org/10.1007/s11063-021-10704-6
