FIT: Frequency-Based Image Translation for Domain Adaptive Object Detection

Zhang, Siqi; Zhang, Lu; Liu, Zhiyong; Feng, Hangtao

doi:10.1007/978-3-031-30111-7_21

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13625)

Included in the following conference series: International Conference on Neural Information Processing

Abstract

Domain adaptive object detection (DAOD) aims to adapt a detector from a labelled source domain to an unlabelled target domain. In recent years, DAOD has attracted considerable attention because it can alleviate the performance degradation caused by large shifts in data distribution in the wild. To align distributions between domains, existing DAOD methods widely use adversarial learning. However, the decision boundary of the adversarial domain discriminator may be inaccurate, which biases the model towards the source domain. To alleviate this bias, we propose a novel Frequency-based Image Translation (FIT) framework for DAOD. First, by keeping domain-invariant frequency components and swapping domain-specific ones, we conduct image translation to reduce domain shift at the input level. Second, hierarchical adversarial feature learning is used to further mitigate the domain gap at the feature level. Finally, we design a joint loss to train the entire network end to end, with no extra training stage needed to obtain translated images. Extensive experiments on three challenging DAOD benchmarks demonstrate the effectiveness of our method.
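The idea of keeping some frequency components and swapping others can be illustrated with a small NumPy sketch. This is not the FIT implementation: the abstract does not say how domain-specific components are identified, so the sketch assumes a fixed, hand-chosen low-frequency mask purely for illustration. The function names `swap_frequency_components` and `low_frequency_mask` and the `radius` parameter are hypothetical.

```python
import numpy as np


def low_frequency_mask(height, width, radius):
    """Boolean mask that is True for frequencies within `radius` of DC.

    Assumption for illustration only: low frequencies are treated as the
    'domain-specific' components. The paper may select them differently.
    """
    fy = np.fft.fftfreq(height) * height  # integer frequency indices per row
    fx = np.fft.fftfreq(width) * width    # integer frequency indices per column
    return (np.abs(fy)[:, None] <= radius) & (np.abs(fx)[None, :] <= radius)


def swap_frequency_components(source, target, mask):
    """Return `source` with its masked frequency components taken from `target`.

    `source` and `target` are real arrays of shape (H, W) or (H, W, C);
    `mask` is a boolean (H, W) array, True where components are swapped.
    """
    if source.ndim == 3:
        mask = mask[..., None]  # broadcast the mask over the channel axis
    spec_src = np.fft.fft2(source, axes=(0, 1))
    spec_tgt = np.fft.fft2(target, axes=(0, 1))
    # Keep the unmasked components of the source, swap in the masked ones.
    mixed = np.where(mask, spec_tgt, spec_src)
    # The mask is symmetric under frequency negation, so the result is real
    # up to floating-point error; drop the residual imaginary part.
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))


rng = np.random.default_rng(0)
source_image = rng.random((8, 8))
target_image = rng.random((8, 8))
translated = swap_frequency_components(
    source_image, target_image, low_frequency_mask(8, 8, radius=1)
)
```

With `radius=0` only the DC component is swapped, so the translated image keeps the source's structure but takes on the target's mean intensity. Larger radii transfer progressively more of the target's coarse appearance.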

This work was supported in part by the National Key Research and Development Plan of China under Grant 2020AAA0108902 and the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant XDB32050100.

Author information

Authors and Affiliations

State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Siqi Zhang, Lu Zhang, Zhiyong Liu & Hangtao Feng
School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100190, China
Siqi Zhang, Zhiyong Liu & Hangtao Feng

Corresponding author

Correspondence to Zhiyong Liu.

Editor information

Editors and Affiliations

Indian Institute of Technology Indore, Indore, India
Mohammad Tanveer
Indian Institute of Information Technology - Allahabad, Prayagraj, India
Sonali Agarwal
Kobe University, Kobe, Japan
Seiichi Ozawa
Indian Institute of Technology Patna, Patna, India
Asif Ekbal
University of Innsbruck, Innsbruck, Austria
Adam Jatowt

Cite this paper

Zhang, S., Zhang, L., Liu, Z., Feng, H. (2023). FIT: Frequency-Based Image Translation for Domain Adaptive Object Detection. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Lecture Notes in Computer Science, vol 13625. Springer, Cham. https://doi.org/10.1007/978-3-031-30111-7_21

DOI: https://doi.org/10.1007/978-3-031-30111-7_21
Published: 13 April 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30110-0
Online ISBN: 978-3-031-30111-7
eBook Packages: Computer Science, Computer Science (R0)
