Image Generation from Sketch Constraint Using Contextual GAN

  • Yongyi Lu
  • Shangzhe Wu
  • Yu-Wing Tai
  • Chi-Keung Tang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11220)

Abstract

In this paper we investigate image generation guided by a hand sketch. When the input sketch is badly drawn, the output of common image-to-image translation still follows the input edges, because of the hard condition imposed by the translation process. Instead, we propose to use the sketch as a weak constraint, where the output edges do not necessarily follow the input edges. We address this problem with a novel joint image completion approach, in which the sketch provides the image context for completing, i.e., generating, the output image. We train a generative adversarial network, i.e., a contextual GAN, to learn the joint distribution of a sketch and its corresponding image by using joint images. Our contextual GAN has several advantages. First, the simple joint image representation allows for simple and effective learning of the joint distribution in the same image-sketch space, which avoids the complications of cross-domain learning. Second, while the output is related to its input overall, the generated features exhibit more freedom in appearance and do not strictly align with the input features as in previous conditional GANs. Third, from the joint image's point of view, image and sketch are no different, so exactly the same deep joint image completion network can be used for image-to-sketch generation. Experiments on three different datasets show that our contextual GAN generates more realistic images than state-of-the-art conditional GANs on challenging inputs and generalizes well on common categories.
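The joint-image idea described above can be sketched in a few lines of NumPy. This is a hypothetical illustration, not the authors' implementation: the generator `G`, the image size, and the loss are placeholders, and only the data layout (sketch and photo concatenated into one joint image, with the photo half masked out at test time) follows the abstract.

```python
import numpy as np

# Stand-ins for a training pair: a photo and its edge sketch.
H, W, C = 64, 64, 3
photo = np.random.rand(H, W, C)
sketch = np.random.rand(H, W, C)

# Joint image: sketch on the left, photo on the right (64 x 128 x 3).
# A single GAN can then model the joint sketch-image distribution
# in one space, with no cross-domain mapping.
joint = np.concatenate([sketch, photo], axis=1)

# At test time only the sketch half is observed; generating the photo
# becomes completing the masked half of the joint image.
mask = np.zeros_like(joint)
mask[:, :W, :] = 1.0          # 1 = observed (sketch), 0 = to be generated
observed = joint * mask

def contextual_loss(generated_joint, observed_joint, mask):
    """L1 distance on the observed (sketch) half only.

    Completion would search for a latent z whose generated joint image
    G(z) matches the sketch half under this loss; the photo half of
    G(z) is then the output image. G itself is omitted here.
    """
    return np.abs((generated_joint - observed_joint) * mask).sum()
```

Note that the same layout works in reverse: masking the sketch half instead of the photo half turns the identical completion network into an image-to-sketch generator, which is the symmetry the abstract points out.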

Keywords

Image generation · Contextual completion

Notes

Acknowledgement

This work was supported in part by Tencent Youtu.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  2. Tencent Youtu, Shanghai, China
  3. University of Oxford, Oxford, UK