Abstract
We study conditional image repainting where a model is trained to generate visual content conditioned on user inputs, and composite the generated content seamlessly onto a user provided image while preserving the semantics of users’ inputs. The content generation community has been pursuing to lower the skill barriers. The usage of human language is the rose among thorns for this purpose, because the language is friendly to users but poses great difficulties for the model in associating relevant words with the semantically ambiguous regions. To resolve this issue, we propose a delicate mechanism which bridges the semantic chasm between the language input and the generated visual content. The state-of-the-art image compositing techniques pose a latent ceiling of fidelity for the composited content during the adversarial training process. In this work, we improve the compositing by breaking through the latent ceiling using a novel piecewise value function. We demonstrate on two datasets that the proposed techniques can better assist tackling conditional image repainting compared to the existing ones.
Keywords
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
FaceApp. https://www.faceapp.com/
Caesar, H., Uijlings, J.R.R., Ferrari, V.: Coco-stuff: thing and stuff classes in context. In: CVPR (2018)
Chen, B., Kae, A.: Toward realistic image compositing with adversarial learning. In: CVPR (2019)
Chen, X., Qing, L., He, X., Luo, X., Xu, Y.: FTGAN: a fully-trained generative adversarial networks for text to face generation. CoRR abs/1904.05729 (2019)
Cong, W., et al.: Deep image harmonization via domain verification. CoRR abs/1911.13239 (2019)
Cun, X., Pun, C.: Improving the harmony of the composite image by spatial-separated attention module. CoRR abs/1907.06406 (2019)
Goodfellow, I.J., et al.: Generative adversarial nets. In: NIPS (2014)
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Klambauer, G., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a nash equilibrium. In: NIPS (2017)
Huang, H., Xu, S., Cai, J., Liu, W., Hu, S.: Temporally coherent video harmonization using adversarial networks. TIP 29, 214–224 (2020)
Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: CVPR (2019)
Li, W., et al.: Object-driven text-to-image synthesis via adversarial training. In: CVPR (2019)
Li, Y., Singh, K.K., Ojha, U., Lee, Y.J.: Mixnmatch: multifactor disentanglement and encoding for conditional image generation. CoRR abs/1911.11758 (2019)
Park, T., Liu, M., Wang, T., Zhu, J.: Semantic image synthesis with spatially-adaptive normalization. In: CVPR (2019)
Pennington, J., Socher, R., Manning, C.D.: Glove: global vectors for word representation. In: EMNLP (2014)
Qiao, T., Zhang, J., Xu, D., Tao, D.: Mirrorgan: learning text-to-image generation by redescription. In: CVPR (2019)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Tan, H., Liu, X., Li, X., Zhang, Y., Yin, B.: Semantics-enhanced adversarial nets for text-to-image synthesis. In: ICCV (2019)
Tripathi, S., Chandra, S., Agrawal, A., Tyagi, A., Rehg, J.M., Chari, V.: Learning to generate synthetic data via compositing. In: CVPR (2019)
Tsai, Y., Shen, X., Lin, Z., Sunkavalli, K., Lu, X., Yang, M.: Deep image harmonization. In: CVPR (2017)
Tsai, Y., Shen, X., Lin, Z., Sunkavalli, K., Yang, M.: Sky is not the limit: semantic-aware sky replacement. TOG 35, 149 (2016)
Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200-2011 Dataset. Technical report CNS-TR-2011-001, California Institute of Technology (2011)
Wang, T., Liu, M., Zhu, J., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional gans. In: CVPR (2018)
Weng, S., Li, W., Li, D., Jin, H., Shi, B.: Misc: multi-condition injection and spatially-adaptive compositing for conditional person image synthesis. In: CVPR (2020)
Wu, H., Zheng, S., Zhang, J., Huang, K.: GP-GAN: towards realistic high-resolution image blending. In: ACM MM (2019)
Xu, T., et al.: Attngan: fine-grained text to image generation with attentional generative adversarial networks. In: CVPR (2018)
Yin, G., Liu, B., Sheng, L., Yu, N., Wang, X., Shao, J.: Semantics disentangling for text-to-image generation. In: CVPR (2019)
Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Free-form image inpainting with gated convolution. In: ICCV (2019)
Zhou, P., Han, X., Morariu, V.I., Davis, L.S.: Learning rich features for image manipulation detection. In: CVPR (2018)
Zhu, M., Pan, P., Chen, W., Yang, Y.: DM-GAN: dynamic memory generative adversarial networks for text-to-image synthesis. In: CVPR (2019)
Acknowledgements
PKU affiliated authors are supported by National Natural Science Foundation of China under Grant No. 61872012, National Key R&D Program of China (2019YFF0302902), and Beijing Academy of Artificial Intelligence (BAAI).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Weng, S., Li, W., Li, D., Jin, H., Shi, B. (2020). Conditional Image Repainting via Semantic Bridge and Piecewise Value Function. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12354. Springer, Cham. https://doi.org/10.1007/978-3-030-58545-7_27
Download citation
DOI: https://doi.org/10.1007/978-3-030-58545-7_27
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58544-0
Online ISBN: 978-3-030-58545-7
eBook Packages: Computer ScienceComputer Science (R0)