Generative Semantic Domain Adaptation for Perception in Autonomous Driving

  • Original Paper
Journal of Big Data Analytics in Transportation

Abstract

Autonomous driving systems depend on their ability to perceive and understand their environment in order to navigate. Neural networks are the building blocks of such perception systems, and training them requires vast amounts of diverse data covering a wide range of driving scenarios: different terrains, object categories, and adverse illumination and weather conditions. However, most publicly available traffic datasets were collected under clear weather and favorable illumination conditions. Data augmentation is often used to improve the diversity of training data for machine learning-based perception systems. Standard augmentation techniques (such as translation and flipping) help neural networks generalize only over simple spatial transformations; more nuanced techniques are required to handle semantic variations in novel test scenarios. We propose a new data augmentation method, called “semantic domain adaptation”, that relies on attribute-conditioned generative models. We show that such augmentation improves the generalization capability of deep networks by analyzing their performance on perception tasks such as classification and detection, using different datasets of traffic objects captured (i) at different times of day and (ii) under different weather conditions, and by comparing against models trained with traditional augmentation methods. We further show that classification models trained with GAN-based augmentation are more robust against parametric adversarial attacks than models trained with non-GAN-based augmentation.
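To make the contrast in the abstract concrete, the following is a minimal sketch of the two kinds of augmentation it mentions. The flip and translation functions are ordinary spatial transformations written with NumPy. The `semantic_augment` interface, the `attributes` dictionary, and `toy_generator` are hypothetical names introduced only for illustration: `toy_generator` is a hand-written stand-in that darkens an image, not the trained attribute-conditioned generative model used in the paper.

```python
import numpy as np


def horizontal_flip(image: np.ndarray) -> np.ndarray:
    """Mirror an H x W x C image left to right (standard spatial augmentation)."""
    return image[:, ::-1, :]


def translate(image: np.ndarray, dx: int, dy: int) -> np.ndarray:
    """Shift an H x W x C image by (dx, dy) pixels, zero-filling exposed borders."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    # Source window that stays inside the frame after the shift.
    src_x0, src_x1 = max(0, -dx), min(w, w - dx)
    src_y0, src_y1 = max(0, -dy), min(h, h - dy)
    # Top-left corner of the destination window.
    dst_x0, dst_y0 = max(0, dx), max(0, dy)
    if src_x1 > src_x0 and src_y1 > src_y0:
        out[dst_y0:dst_y0 + (src_y1 - src_y0),
            dst_x0:dst_x0 + (src_x1 - src_x0)] = image[src_y0:src_y1, src_x0:src_x1]
    return out


def semantic_augment(image, attributes, generator):
    """Apply a caller-supplied attribute-conditioned generator.

    Hypothetical interface: `generator(image, attributes)` is assumed to
    return an image of the same shape with the requested attributes applied.
    """
    return generator(image, attributes)


def toy_generator(image, attributes):
    """Toy stand-in for a trained generative model (assumption, not the paper's).

    Darkens the image in proportion to a 'night' attribute in [0, 1].
    """
    night = float(attributes.get("night", 0.0))
    scaled = image.astype(np.float32) * (1.0 - 0.7 * night)
    return scaled.astype(image.dtype)


if __name__ == "__main__":
    img = np.arange(2 * 4 * 3, dtype=np.uint8).reshape(2, 4, 3)
    spatial = translate(horizontal_flip(img), dx=1, dy=0)
    semantic = semantic_augment(img, {"night": 1.0}, toy_generator)
    print(spatial.shape, semantic.shape)
```

The spatial functions only rearrange pixels, so the scene keeps its original time of day and weather. The semantic path changes appearance according to an attribute value, which is the kind of variation the abstract argues standard augmentation cannot cover; in practice the generator would be a learned model rather than a fixed brightness scaling.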

Acknowledgements

This work was supported in part by NSF grants CCF-2005804, CNS-1845969, DARPA AIRA grant PA-18-02-02, AFOSR YIP grant FA9550-17-1-0220, an ERP grant from ISU, a GPU gift grant from NVIDIA Corporation, and faculty fellowships from the Black and Veatch Foundation.

Author information

Corresponding author

Correspondence to Amitangshu Mukherjee.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mukherjee, A., Joshi, A., Sharma, A. et al. Generative Semantic Domain Adaptation for Perception in Autonomous Driving. J. Big Data Anal. Transp. 4, 103–117 (2022). https://doi.org/10.1007/s42421-022-00057-4

  • DOI: https://doi.org/10.1007/s42421-022-00057-4
