Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn Sketches

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12360)

Abstract

Sketch-based image editing aims to synthesize and modify photos based on the structural information provided by human-drawn sketches. Since sketches are difficult to collect, previous methods mainly use edge maps instead of sketches to train models (referred to as edge-based models). However, human-drawn sketches display great structural discrepancy with edge maps, causing edge-based models to fail on them. Moreover, sketches vary widely among different users, demanding even higher generalizability and robustness from the editing model. In this paper, we propose Deep Plastic Surgery, a novel, robust and controllable image editing framework that allows users to interactively edit images using hand-drawn sketch inputs. We present a sketch refinement strategy, inspired by the coarse-to-fine drawing process of artists, which we show helps our model adapt well to casual and varied sketches without the need for real sketch training data. Our model further provides a refinement level control parameter that enables users to flexibly define how “reliable” the input sketch should be considered for the final output, balancing between sketch faithfulness and output verisimilitude (as the two goals may conflict when the input sketch is drawn poorly). To achieve multi-level refinement, we introduce a style-based module for level conditioning, which allows adaptive feature representations for different levels within a single network. Extensive experimental results demonstrate the superiority of our approach over state-of-the-art methods in improving the visual quality and user controllability of image editing. Our project and code are available at https://github.com/TAMU-VITA/DeepPS.
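To make the level-conditioning idea concrete, the following minimal PyTorch sketch shows one plausible way a scalar refinement level can drive adaptive feature representations within a single network: a small MLP maps the level to per-channel scale and shift parameters that modulate normalized feature maps, in the spirit of adaptive instance normalization. The module name, MLP sizing, and modulation details are illustrative assumptions, not the authors' exact architecture; see the repository above for the real implementation.

import torch
import torch.nn as nn

class LevelConditionedModulation(nn.Module):
    # Hypothetical illustration of style-based level conditioning,
    # not the paper's exact module. A scalar refinement level
    # l in [0, 1] is mapped to per-channel scale/shift parameters
    # that modulate instance-normalized features (AdaIN-style),
    # letting one set of network weights adapt to different levels.
    def __init__(self, num_channels: int, hidden: int = 128):
        super().__init__()
        self.norm = nn.InstanceNorm2d(num_channels, affine=False)
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, 2 * num_channels),  # per-channel scale and shift
        )

    def forward(self, features: torch.Tensor, level: torch.Tensor) -> torch.Tensor:
        # features: (N, C, H, W); level: (N, 1) refinement level in [0, 1].
        scale, shift = self.mlp(level).chunk(2, dim=1)  # (N, C) each
        scale = scale.unsqueeze(-1).unsqueeze(-1)       # (N, C, 1, 1)
        shift = shift.unsqueeze(-1).unsqueeze(-1)
        return (1 + scale) * self.norm(features) + shift

# Example usage: modulate a feature map at a mid refinement level.
mod = LevelConditionedModulation(num_channels=256)
feat = torch.randn(4, 256, 64, 64)
level = torch.full((4, 1), 0.5)
out = mod(feat, level)  # same shape as feat: (4, 256, 64, 64)

In such a scheme, a level near 0 tells the network to trust the input sketch closely, while a level near 1 licenses heavier refinement toward output verisimilitude; a single set of weights serves all levels because only the modulation parameters change.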

Keywords

Image editing · Sketch-to-image translation · User control

Notes

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China under contract No. 61772043, in part by the Beijing Natural Science Foundation under contracts No. L182002 and No. 4192025, and in part by the China Scholarship Council. The research of Z. Wang was partially supported by NSF Award RI-1755701.

Supplementary material

Supplementary material 1: 504470_1_En_36_MOESM1_ESM.pdf (PDF, 7.5 MB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Wangxuan Institute of Computer Technology, Peking University, Beijing, China
  2. Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, USA
