Abstract
Image-to-image translation is affected by entanglement phenomena, which may occur when the target data contain occlusions such as raindrops or dirt. Our unsupervised, model-based learning disentangles the scene from its occlusions, while an adversarial pipeline regresses the physical parameters of the occlusion model. Experiments demonstrate that our method handles varying types of occlusions and generates highly realistic translations, qualitatively and quantitatively outperforming the state of the art on multiple datasets.
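To make the pipeline concrete, below is a minimal PyTorch sketch of such an adversarial disentanglement loop, assuming a differentiable occlusion model. Every module name (G, D, R, OcclusionRenderer), the toy renderer, and the white-occluder compositing rule are illustrative stand-ins, not the paper's implementation.

```python
# A minimal sketch, NOT the authors' implementation: module shapes, the
# toy renderer, and the white-occluder compositing are all assumptions.
import torch
import torch.nn as nn

SIZE = 64  # toy resolution

class OcclusionRenderer(nn.Module):
    """Hypothetical differentiable occlusion model: maps a few 'physical'
    parameters (e.g. drop size, density) to a per-pixel opacity map."""
    def __init__(self, num_params=4):
        super().__init__()
        self.render = nn.Sequential(nn.Linear(num_params, SIZE * SIZE),
                                    nn.Sigmoid())

    def forward(self, params):
        return self.render(params).view(-1, 1, SIZE, SIZE)

G = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Sigmoid())  # scene translator (toy)
D = nn.Sequential(nn.Conv2d(3, 1, 4, 2, 1))                     # patch discriminator (toy)
R = nn.Sequential(nn.Flatten(), nn.Linear(3 * SIZE * SIZE, 4))  # parameter regressor (toy)
occ = OcclusionRenderer()

opt_g = torch.optim.Adam([*G.parameters(), *R.parameters(), *occ.parameters()], lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
mse = nn.MSELoss()  # least-squares GAN objective

x = torch.rand(2, 3, SIZE, SIZE)  # source-domain batch (stand-in data)
y = torch.rand(2, 3, SIZE, SIZE)  # occluded target-domain batch

# Generator step: translate the scene occlusion-free, regress occlusion
# parameters from real target images, render and composite the occlusion,
# then try to fool the discriminator with the composited result.
scene = G(x)                        # disentangled, occlusion-free translation
alpha = occ(R(y))                   # occlusion layer from regressed parameters
fake = (1 - alpha) * scene + alpha  # toy compositing with a white occluder
pred = D(fake)
loss_g = mse(pred, torch.ones_like(pred))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# Discriminator step: real occluded targets vs. detached composited fakes.
pred_real, pred_fake = D(y), D(fake.detach())
loss_d = mse(pred_real, torch.ones_like(pred_real)) + \
         mse(pred_fake, torch.zeros_like(pred_fake))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()
```

The point of the sketch is the gradient path: the adversarial signal reaches the scene translator only through the occlusion compositing, so the translator is pushed toward occlusion-free outputs while the regressed physical parameters absorb the occlusions.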
Notes
1. Note that averaging over the dataset assumes images with similar aspect and viewpoint; image-wise guidance could be envisaged, at the cost of less reliable guidance (a small sketch of this averaging follows the notes).
2. Note that WoodScape provides soiling masks, which we do not use.
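As referenced in note 1, here is a minimal sketch of dataset-wise guidance averaging; `guidance_map` is a hypothetical stand-in for whatever per-image guidance is computed, not the paper's actual function.

```python
# Hypothetical sketch of the dataset-wise averaging in note 1; the
# guidance function itself is a stand-in, not the paper's.
import torch

def dataset_guidance(images, guidance_map):
    """Average per-image guidance maps over the whole dataset; only
    sensible when images share similar aspect and viewpoint (note 1)."""
    return torch.stack([guidance_map(im) for im in images]).mean(dim=0)

# Usage with dummy data; image-wise guidance would instead apply
# guidance_map(im) per image, trading reliability for adaptivity.
imgs = [torch.rand(3, 64, 64) for _ in range(8)]
shared = dataset_guidance(imgs, lambda im: im.mean(0, keepdim=True))
```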
About this paper
Cite this paper
Pizzati, F., Cerri, P., de Charette, R. (2020). Model-Based Occlusion Disentanglement for Image-to-Image Translation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12365. Springer, Cham. https://doi.org/10.1007/978-3-030-58565-5_27
Print ISBN: 978-3-030-58564-8
Online ISBN: 978-3-030-58565-5