Modeling Artistic Workflows for Image Generation and Editing

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12363)


People often create art by following an artistic workflow involving multiple stages that inform the overall design. If an artist wishes to modify an earlier decision, significant work may be required to propagate this new decision forward to the final artwork. Motivated by these observations, we propose a generative model that follows a given artistic workflow, enabling both multi-stage image generation and multi-stage editing of an existing piece of art. Furthermore, for the editing scenario, we introduce an optimization process along with learning-based regularization to ensure the edited image produced by the model closely aligns with the originally provided image. Qualitative and quantitative results on three different artistic datasets demonstrate the effectiveness of the proposed framework on both image generation and editing tasks.
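The editing scenario described above hinges on projecting an existing image back into the model: a latent code is optimized so that the generator's output reproduces the given artwork, with a regularizer keeping the code well-behaved. The following is a minimal, self-contained sketch of that idea only, not the paper's implementation: the generator here is a toy linear map, and the learning-based regularizer is replaced by a plain L2 penalty as a hypothetical stand-in.

```python
import numpy as np

# Toy "pretrained generator" G: a fixed linear map from a 4-d latent code
# to an 8-d "image". The real model is a learned multi-stage network.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
G = lambda z: W @ z

# A target image known to lie on the generator's output manifold.
x_target = G(rng.normal(size=4))

# Optimize the latent code z by gradient descent on
#   0.5 * ||G(z) - x_target||^2  +  0.5 * lam * ||z||^2
# where the L2 term stands in for the paper's learned regularizer.
z = np.zeros(4)
lam = 1e-3   # regularization weight (illustrative value)
lr = 0.02    # step size (illustrative value)
for _ in range(2000):
    residual = G(z) - x_target
    grad = W.T @ residual + lam * z   # exact gradient of the objective above
    z -= lr * grad

recon_error = float(np.linalg.norm(G(z) - x_target))
```

After optimization, `G(z)` closely matches the target, so subsequent stage-level edits can be applied to the recovered code rather than to the pixels directly.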



This work is supported in part by the NSF CAREER Grant #1149783.

Supplementary material

Supplementary material 1 (PDF, 6.2 MB)



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. University of California, Oakland, USA
  2. Adobe Research, San Jose, USA
