Abstract
Semantic segmentation is the task of classifying each pixel of an image. Current state-of-the-art semantic segmentation techniques use end-to-end trainable deep models. Generally, the training of these models is controlled by external hyper-parameters rather than by the variation in the data. In this paper, we investigate the impact of data smoothing on the training and generalization of deep semantic segmentation models. We propose a mechanism for selecting the level of smoothing that gives the best generalization of these models. Furthermore, a smoothing layer is included in the deep semantic segmentation models to adjust the level of smoothing automatically. Extensive experiments are performed to validate the effectiveness of the proposed smoothing strategies.
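The idea of choosing a smoothing level by its effect on generalization can be illustrated with a minimal sketch. The abstract does not specify the smoothing filter or the selection criterion, so everything here is an assumption made for illustration: a Gaussian filter stands in for the smoothing operation, and `validation_score` is a hypothetical callable that would train a segmentation model on data smoothed at a given level and return a held-out metric such as mean IoU.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Return a 1-D Gaussian kernel normalized to sum to 1."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def smooth(image, sigma):
    """Separable Gaussian blur of a 2-D array (assumed filter, not the paper's).

    A non-positive sigma means "no smoothing" and returns a float copy.
    """
    image = np.asarray(image, dtype=np.float64)
    if sigma <= 0:
        return image.copy()
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    # Edge padding keeps the output the same size as the input.
    padded = np.pad(image, radius, mode="edge")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows
    )


def select_smoothing_level(sigmas, validation_score):
    """Pick the candidate sigma with the highest validation score.

    `validation_score` is a hypothetical callable: sigma -> float, where a
    higher value means better generalization on held-out data.
    """
    scores = {sigma: validation_score(sigma) for sigma in sigmas}
    best = max(scores, key=scores.get)
    return best, scores
```

In practice `validation_score` would wrap a full training and evaluation run per candidate level, which is expensive; a layer that adjusts the smoothing level during training, as the abstract describes, avoids this outer search.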
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Ul Haq, N., ur Rehman, Z., Khan, A. et al. Impact of data smoothing on semantic segmentation. Neural Comput & Applic 34, 8345–8354 (2022). https://doi.org/10.1007/s00521-020-05341-4
DOI: https://doi.org/10.1007/s00521-020-05341-4
Keywords
- Semantic segmentation
- SegNet
- Smoothing
- Deep learning