Abstract
Little research has specifically addressed real-time semantic segmentation in rainy environments, although demand in this area is high and the task is challenging for lightweight networks. This paper therefore proposes a lightweight network designed specifically for foreground segmentation in rainy environments, named De-raining Semantic Segmentation Network (DRSNet). Based on an analysis of the characteristics of raindrops, the MultiScaleSE Block is designed to encode the input image: it uses multi-scale dilated convolutions to enlarge the receptive field and an SE (squeeze-and-excitation) attention mechanism to learn the weight of each channel. To combine semantic information across encoder and decoder layers, an Asymmetric Skip is proposed: the output of a higher-level semantic encoder layer is resized by bilinear interpolation, passed through a pointwise convolution, and then added element-wise to a lower-level semantic decoder layer. In controlled experiments, the MultiScaleSE Block and the Asymmetric Skip improve the Foreground Accuracy index over SEResNet18 and Symmetric Skip, respectively. DRSNet has only 0.54M parameters and requires only 0.20 GFLOPs (floating point operations). It achieves state-of-the-art results and real-time performance on both the UESTC all-day Scenery add rain (UAS-add-rain) and the Baidu People Segmentation add rain (BPS-add-rain) benchmarks at input sizes of 192×128, 384×256 and 768×512. DRSNet is faster than all compared networks within 1 GFLOPs, and its Foreground Accuracy is also the best among networks of similar magnitude on both benchmarks.
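The Asymmetric Skip described in the abstract can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation (the released code should be consulted for that). Several things are assumptions made for illustration: the function names, the nested-list layout `[channels][height][width]`, the align-corners interpolation convention, the absence of bias and activation, the example weights, and the reading that the encoder map is upsampled to the decoder resolution.

```python
# Framework-free sketch of an asymmetric skip connection (illustrative only).
# Feature maps are nested lists shaped [channels][height][width].

def bilinear_resize(plane, out_h, out_w):
    """Bilinearly resize one 2-D plane (align-corners convention, assumed)."""
    in_h, in_w = len(plane), len(plane[0])
    out = []
    for i in range(out_h):
        # Map the output row back to a fractional input row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            # Map the output column back to a fractional input column.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = plane[y0][x0] * (1 - wx) + plane[y0][x1] * wx
            bottom = plane[y1][x0] * (1 - wx) + plane[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def pointwise_conv(feature, weights):
    """1x1 convolution: weights[o][c] mixes input channel c into output o."""
    h, w = len(feature[0]), len(feature[0][0])
    return [
        [[sum(w_oc * feature[c][i][j] for c, w_oc in enumerate(row_w))
          for j in range(w)]
         for i in range(h)]
        for row_w in weights
    ]


def asymmetric_skip(encoder_feat, decoder_feat, weights):
    """Resize encoder_feat to the decoder size, project it, then add.

    Assumes len(weights) equals the number of decoder channels.
    """
    out_h, out_w = len(decoder_feat[0]), len(decoder_feat[0][0])
    resized = [bilinear_resize(p, out_h, out_w) for p in encoder_feat]
    projected = pointwise_conv(resized, weights)
    return [
        [[projected[c][i][j] + decoder_feat[c][i][j] for j in range(out_w)]
         for i in range(out_h)]
        for c in range(len(decoder_feat))
    ]


# Hypothetical toy inputs: a 2-channel 2x2 encoder map and a 1-channel 3x3 decoder map.
encoder_feat = [[[0.0, 2.0], [4.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]]
decoder_feat = [[[10.0] * 3 for _ in range(3)]]
weights = [[1.0, 0.5]]  # one output channel mixed from two input channels
fused = asymmetric_skip(encoder_feat, decoder_feat, weights)
```

The pointwise convolution is what lets the two sides of the skip differ in channel count: it projects the resized encoder features to the decoder's channel dimension before the element-wise addition.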
Availability of code and data
The source code is released at https://github.com/dandingbudanding/DRSNet.
Acknowledgements
The authors would like to thank the Associate Editor and the Reviewers for their constructive comments.
About this article
Cite this article
Wang, F., Zhang, Y. A De-raining semantic segmentation network for real-time foreground segmentation. J Real-Time Image Proc 18, 873–887 (2021). https://doi.org/10.1007/s11554-020-01042-2
DOI: https://doi.org/10.1007/s11554-020-01042-2