Self-Supervised CycleGAN for Object-Preserving Image-to-Image Domain Adaptation

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12365)

Abstract

Recent generative adversarial network (GAN) based methods (e.g., CycleGAN) are prone to fail at preserving image objects in image-to-image translation, which reduces their practicality for tasks such as domain adaptation. Some frameworks adopt a segmentation network as an auxiliary regularizer to prevent content distortion. However, all of them require extra pixel-wise annotations, which are difficult to obtain in practical applications. In this paper, we propose a novel GAN (namely OP-GAN) to address the problem; it involves a self-supervised module that enforces image content consistency during image-to-image translation without any extra annotations. We evaluate the proposed OP-GAN on three publicly available datasets. The experimental results demonstrate that OP-GAN yields visually plausible translated images and significantly improves semantic segmentation accuracy in different domain adaptation scenarios with off-the-shelf deep learning networks such as PSPNet and U-Net.

X. Xie and J. Chen contributed equally to this work.

Notes

  1. The analysis of grid size can be found in the Appendix.

  2. The network architecture of the shared-weight encoder is presented in the Appendix.

  3. After the domain-specific information is excluded, the content features \(\tilde{p}\) from different domains are directly comparable, so we use a simple mean squared error to measure their difference.

  4. The detailed training process with the self-supervised loss can be found in the Appendix.

  5. The top-1 solution (without extra training data) on the leaderboard of semantic segmentation on CamVid: https://paperswithcode.com/sota/semantic-segmentation-on-camvid.
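The mean squared error mentioned in note 3 can be illustrated with a minimal sketch. It assumes the content features \(\tilde{p}\) have already been extracted and flattened into equal-length vectors; the function name `content_mse` and the sample values are hypothetical and do not come from the paper.

```python
from typing import Sequence


def content_mse(p_a: Sequence[float], p_b: Sequence[float]) -> float:
    """Mean squared error between two flattened content-feature vectors."""
    if len(p_a) != len(p_b):
        raise ValueError("feature vectors must have the same length")
    if not p_a:
        raise ValueError("feature vectors must be non-empty")
    # Average the squared element-wise differences.
    return sum((a - b) ** 2 for a, b in zip(p_a, p_b)) / len(p_a)


# Hypothetical content features for one image region before and after translation.
p_source = [0.5, 1.0, -2.0, 0.0]
p_translated = [0.5, 2.0, -2.0, 2.0]
print(content_mse(p_source, p_translated))  # prints 1.25
```

A value of zero means the two feature vectors are identical; larger values indicate that the translation changed the content features more.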

Acknowledgements

This work is supported by the Natural Science Foundation of China (Nos. 91959108 and 61702339), the Key Area Research and Development Program of Guangdong Province, China (No. 2018B010111001), the National Key Research and Development Project (No. 2018YFC2000702), and the Science and Technology Program of Shenzhen, China (No. ZDSYS201802021814180).

Author information

Corresponding author

Correspondence to Yuexiang Li.

Electronic supplementary material

Supplementary material 1 (PDF, 2726 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Xie, X., Chen, J., Li, Y., Shen, L., Ma, K., Zheng, Y. (2020). Self-Supervised CycleGAN for Object-Preserving Image-to-Image Domain Adaptation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12365. Springer, Cham. https://doi.org/10.1007/978-3-030-58565-5_30

  • DOI: https://doi.org/10.1007/978-3-030-58565-5_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58564-8

  • Online ISBN: 978-3-030-58565-5

  • eBook Packages: Computer Science, Computer Science (R0)
