Interactively transforming Chinese ink paintings into realistic images using a border enhance generative adversarial network

Published in Multimedia Tools and Applications

Abstract

Traditional Chinese painting has a long history. When we appreciate such paintings today, we can obtain an overview of the landscape and environment of that time, but it is difficult to feel that we are interacting with the paintings. Alongside the rapid rise of deep learning, much research has been conducted on style transfer (for example, transforming photographs into the style of Chinese painting, sketches, or cartoons), but no research has considered transforming Chinese paintings into realistic images, let alone enriching such paintings through user interaction. To address this research gap, we employed a generative adversarial network (GAN), a generative model that learns to create new images resembling the training data through an adversarial process. Compared with general image-to-image translation, converting Chinese ink paintings into realistic images requires additional input because ink paintings contain texture and border features of relatively low quality. We therefore combined a cycle-consistent GAN (CycleGAN) with pix2pix and added a label function to establish a border enhance GAN, which sharpens border detail and produces more accurate realistic images. In this manner, traditional Chinese paintings can be invigorated. Finally, we compared the images generated by our model with those of benchmark methods. The results revealed that the images generated by our model were more similar to the actual photographs than were the benchmark images. Our model thus mitigates a major problem encountered in previous works and renders more realistic results. The resulting interactive images clearly and profoundly convey Chinese culture, offering users a novel art experience. Moreover, by interacting with the input image, for instance by selecting different geologic styles, viewers can derive a relatively profound immersive experience. Our study can serve as a reference for transforming other images with blurry borders, such as watercolor and oil paintings.
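To make the idea concrete, the following is a minimal PyTorch-style sketch of how a border-enhancement term might be added to a CycleGAN/pix2pix-style generator objective. This is an illustrative assumption, not the authors' implementation: the Sobel-based edge extractor, the helper names (sobel_edges, border_loss, generator_objective), and the loss weights (lambda_cyc, lambda_border) are placeholders, and the paper's exact networks, label function, and weighting may differ.

```python
# Illustrative sketch only: one plausible way to add a "border enhance" term
# to a CycleGAN/pix2pix-style objective. Names and weights are assumptions.
import torch
import torch.nn.functional as F

def sobel_edges(img: torch.Tensor) -> torch.Tensor:
    """Approximate border map via Sobel filtering on a grayscale version
    of a (B, 3, H, W) image batch."""
    gray = img.mean(dim=1, keepdim=True)
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def border_loss(fake: torch.Tensor, source: torch.Tensor) -> torch.Tensor:
    """L1 distance between edge maps: encourages the generated photo to
    preserve the (often faint) borders of the ink painting."""
    return F.l1_loss(sobel_edges(fake), sobel_edges(source))

def generator_objective(G, F_inv, D_photo, ink, lambda_cyc=10.0, lambda_border=5.0):
    """CycleGAN-style generator loss with an added border term.
    G: ink -> photo generator, F_inv: photo -> ink generator."""
    fake_photo = G(ink)
    pred = D_photo(fake_photo)
    adv = F.mse_loss(pred, torch.ones_like(pred))  # LSGAN-style adversarial loss
    cyc = F.l1_loss(F_inv(fake_photo), ink)        # cycle-consistency loss
    border = border_loss(fake_photo, ink)          # border-enhancement term
    return adv + lambda_cyc * cyc + lambda_border * border
```

In this formulation, the extra L1 term on the edge maps penalizes a generator that blurs or drops the faint borders of the ink painting, which is the intuition behind border enhancement.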

Acknowledgments

This work was supported in part by the Ministry of Science and Technology, Taiwan, under MOST 111-2622-8-A49-013-TM1 and MOST 111-2221-E-A49-125-MY3; and in part by the Financial Technology (FinTech) Innovation Research Center, National Yang Ming Chiao Tung University.

Author information

Corresponding author

Correspondence to Szu-Hao Huang.

Ethics declarations

Conflict of interest

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chung, CY., Huang, SH. Interactively transforming Chinese ink paintings into realistic images using a border enhance generative adversarial network. Multimed Tools Appl 82, 11663–11696 (2023). https://doi.org/10.1007/s11042-022-13684-4

