Abstract
Semantic segmentation plays a crucial role in various computer vision tasks, such as autonomous driving in urban scenes. Research in this area has made significant progress. However, because most studies focus on improving the performance of semantic segmentation models under normal conditions, the performance deterioration of these models in severe weather has received little attention. To address this issue, we study the robustness of multimodal semantic segmentation models in snowy environments, a representative subset of severe weather conditions. The proposed method generates realistically simulated snowy-environment images by combining unpaired image translation with adversarial snowflake generation, effectively misleading the segmentation model's predictions. These generated adversarial images are then used for robustness training, enabling the model to adapt to harsh snowy environments and, to some extent, enhancing its robustness to artificial adversarial perturbations. Experimental visualizations show that the proposed method generates approximately realistic snowy-environment images with satisfactory visual quality for both daytime and nighttime scenes. Moreover, quantitative results on the MFNet dataset indicate that, compared with the unenhanced model, the proposed method achieves average improvements of 4.82% in mAcc and 3.95% in mIoU. These improvements strengthen the adaptability of multimodal semantic segmentation models to snowy environments and contribute to road safety. Furthermore, the proposed method demonstrates excellent applicability, as it can be seamlessly integrated into various multimodal semantic segmentation models.
Code availability
The original codes of the current study are available from the corresponding author on reasonable request.
References
Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A.C., Lo, W.-Y., et al.: Segment anything. arXiv preprint arXiv:2304.02643 (2023)
Kim, E., Medioni, G.: Urban scene understanding from aerial and ground lidar data. Mach. Vis. Appl. 22(4), 691–703 (2011)
Gupta, S., Dileep, A.D., Thenkanidiyoor, V.: Recognition of varying size scene images using semantic analysis of deep activation maps. Mach. Vis. Appl. 32(2), 52 (2021)
Tan, X., Xu, K., Cao, Y., Zhang, Y., Ma, L., Lau, R.W.: Night-time scene parsing with a large real dataset. IEEE Trans. Image Process. 30, 9085–9098 (2021)
Xie, Z., Wang, S., Xu, K., Zhang, Z., Tan, X., Xie, Y., Ma, L.: Boosting night-time scene parsing with learnable frequency. IEEE Trans. Image Process. 32, 2386–2398 (2023)
Yin, H., Xie, W., Zhang, J., Zhang, Y., Zhu, W., Gao, J., Shao, Y., Li, Y.: Dual context network for real-time semantic segmentation. Mach. Vis. Appl. 34(2), 22 (2023)
Rizzoli, G., Barbato, F., Zanuttigh, P.: Multimodal semantic segmentation in autonomous driving: a review of current approaches and future perspectives. Technologies 10(4), 90 (2022)
Gao, J., Yi, J., Murphey, Y.L.: Attention-based global context network for driving maneuvers prediction. Mach. Vis. Appl. 33(4), 53 (2022)
Tan, X., Lin, J., Xu, K., Chen, P., Ma, L., Lau, R.W.: Mirror detection with the visual chirality cue. IEEE Trans. Pattern Anal. Mach. Intell. 45(3), 3492–3504 (2022)
Tan, X., Ma, Q., Gong, J., Xu, J., Zhang, Z., Song, H., Qu, Y., Xie, Y., Ma, L.: Positive-negative receptive field reasoning for omni-supervised 3d segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 45(12), 15328–15344 (2023)
Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., Cottrell, G.: Understanding convolution for semantic segmentation. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1451–1460 (2018)
Yang, Z., Wang, Q., Zeng, J., Qin, P., Chai, R., Sun, D.: RAU-Net: U-Net network based on residual multi-scale fusion and attention skip layer for overall spine segmentation. Mach. Vis. Appl. 34(1), 10 (2023)
Ha, Q., Watanabe, K., Karasawa, T., Ushiku, Y., Harada, T.: MFNet: Towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5108–5115 (2017)
Sun, Y., Zuo, W., Liu, M.: RTFNet: RGB-thermal fusion network for semantic segmentation of urban scenes. IEEE Robot. Autom. Lett. 4(3), 2576–2583 (2019)
Houben, T., Huisman, T., Pisarenco, M., Sommen, F., With, P.H.: Depth estimation from a single SEM image using pixel-wise fine-tuning with multimodal data. Mach. Vis. Appl. 33(4), 56 (2022)
McEnroe, P., Wang, S., Liyanage, M.: A survey on the convergence of edge computing and AI for UAVs: opportunities and challenges. IEEE Internet Things J. 9(17), 15435–15459 (2022)
Carrillo, H., Quiroga, J., Zapata, L., Maldonado, E.: Automatic football video production system with edge processing. Mach. Vis. Appl. 33(2), 32 (2022)
Asghar, K., Sun, X., Rosin, P.L., Saddique, M., Hussain, M., Habib, Z.: Edge-texture feature-based image forgery detection with cross-dataset evaluation. Mach. Vis. Appl. 30(7–8), 1243–1262 (2019)
Hu, C., Tiliwalidi, K.: Adversarial neon beam: Robust physical-world adversarial attack to DNNs. arXiv preprint arXiv:2204.00853 (2022)
Duan, R., Mao, X., Qin, A.K., Chen, Y., Ye, S., He, Y., Yang, Y.: Adversarial laser beam: Effective physical-world attack to DNNs in a blink. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 16062–16071 (2021)
Tremblay, M., Halder, S.S., De Charette, R., Lalonde, J.-F.: Rain rendering for evaluating and improving robustness to bad weather. Int. J. Comput. Vision 129(2), 341–360 (2020)
Caesar, H., Bankiti, V., Lang, A.H., Vora, S., Liong, V.E., Xu, Q., Krishnan, A., Pan, Y., Baldan, G., Beijbom, O.: nuScenes: A multimodal dataset for autonomous driving. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 11621–11631 (2020)
Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B.: The cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3213–3223 (2016)
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: 2012 IEEE conference on computer vision and pattern recognition, pp. 3354–3361 (2012)
Choi, S., Jung, S., Yun, H., Kim, J.T., Kim, S., Choo, J.: RobustNet: Improving domain generalization in urban-scene segmentation via instance selective whitening. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 11580–11590 (2021)
Pitropov, M., Garcia, D.E., Rebello, J., Smart, M., Wang, C., Czarnecki, K., Waslander, S.: Canadian adverse driving conditions dataset. Int. J. Robot. Res. 40(4–5), 681–690 (2021)
Liu, M.-Y., Tuzel, O.: Coupled generative adversarial networks. In: Advances in neural information processing systems, vol. 29 (2016)
Liu, M.-Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. In: Advances in neural information processing systems, vol. 30 (2017)
Bousmalis, K., Silberman, N., Dohan, D., Erhan, D., Krishnan, D.: Unsupervised pixel-level domain adaptation with generative adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3722–3731 (2017)
Shrivastava, A., Pfister, T., Tuzel, O., Susskind, J., Wang, W., Webb, R.: Learning from simulated and unsupervised images through adversarial training. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2107–2116 (2017)
Zhu, J.-Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp. 2223–2232 (2017)
Yi, Z., Zhang, H., Tan, P., Gong, M.: DualGAN: unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE international conference on computer vision, pp. 2849–2857 (2017)
Choi, Y., Uh, Y., Yoo, J., Ha, J.-W.: StarGAN v2: diverse image synthesis for multiple domains. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 8188–8197 (2020)
Liu, M.-Y., Huang, X., Mallya, A., Karras, T., Aila, T., Lehtinen, J., Kautz, J.: Few-shot unsupervised image-to-image translation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 10551–10560 (2019)
Pizzati, F., Charette, R.d., Zaccaria, M., Cerri, P.: Domain bridge for unpaired image-to-image translation and unsupervised domain adaptation. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp. 2990–2998 (2020)
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014)
Yi-de, M., Qing, L., Zhi-Bai, Q.: Automated image segmentation using improved PCNN model based on cross-entropy. In: Proceedings of 2004 international symposium on intelligent multimedia, video and speech processing, pp. 743–746 (2004)
Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision, pp. 2980–2988 (2017)
Sudre, C.H., Li, W., Vercauteren, T., Ourselin, S., Jorge Cardoso, M.: Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations. In: Deep learning in medical image analysis and multimodal learning for clinical decision support, pp. 240–248 (2017)
Yu, J., Jiang, Y., Wang, Z., Cao, Z., Huang, T.: UnitBox: an advanced object detection network. In: Proceedings of the 24th ACM international conference on multimedia, pp. 516–520 (2016)
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083 (2017)
Fan, X., Wang, Q., Ke, J., Yang, F., Gong, B., Zhou, M.: Adversarially adaptive normalization for single domain generalization. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 8208–8217 (2021)
Volpi, R., Namkoong, H., Sener, O., Duchi, J.C., Murino, V., Savarese, S.: Generalizing to unseen domains via adversarial data augmentation. In: Advances in neural information processing systems, vol. 31 (2018)
Qiao, F., Peng, X.: Uncertainty-guided model generalization to unseen domains. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 6790–6800 (2021)
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial networks. Commun. ACM 63(11), 139–144 (2020)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional networks for biomedical image segmentation. In: Medical image computing and computer-assisted intervention–MICCAI 2015, pp. 234–241 (2015)
Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
Hazirbas, C., Ma, L., Domokos, C., Cremers, D.: FuseNet: Incorporating depth into semantic segmentation via fusion-based CNN architecture. In: Computer Vision–ACCV 2016: 13th Asian conference on computer vision, Taipei, Taiwan, November 20-24, 2016, Revised Selected Papers, Part I 13, pp. 213–228 (2017)
Deng, F., Feng, H., Liang, M., Wang, H., Yang, Y., Gao, Y., Chen, J., Hu, J., Guo, X., Lam, T.L.: FEANet: Feature-enhanced attention network for rgb-thermal real-time semantic segmentation. In: 2021 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp. 4467–4473 (2021)
Zhou, W., Liu, J., Lei, J., Yu, L., Hwang, J.-N.: GMNet: graded-feature multilabel-learning network for RGB-thermal urban scene semantic segmentation. IEEE Trans. Image Process. 30, 7790–7802 (2021)
Zhou, W., Zhu, Y., Lei, J., Yang, R., Yu, L.: LSNet: lightweight spatial boosting network for detecting salient objects in RGB-thermal images. IEEE Trans. Image Process. 32, 1329–1340 (2023)
Acknowledgements
This work was supported in part by the National Key R&D Program of China under Grant No. 2021YFC3320302, in part by the National Natural Science Foundation of China under Grant No. 62303449, in part by the Fundamental Research Funds for the Central Universities under Grant No. 3072022TS0604, and in part by the Natural Science Foundation of Liaoning Province under Grant No. 2023-BS-029.
Author information
Authors and Affiliations
Contributions
HY: methodology, writing the original draft. LZ and YS: methodology, validation, review, and editing. Guisheng Yin and Ye Tian: methodology, validation, and review. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yin, H., Yin, G., Sun, Y. et al. Robust semantic segmentation method of urban scenes in snowy environment. Machine Vision and Applications 35, 59 (2024). https://doi.org/10.1007/s00138-024-01540-4