
A brief study of generative adversarial networks and their applications in image synthesis

Published in: Multimedia Tools and Applications

Abstract

Image Synthesis (IS), a subfield of Artificial Intelligence (AI) and Computer Vision, is the technique of artificially producing images that contain specific required content. An effective way to handle the IS problem is to tackle it with Deep Generative Models. Generative models are widely used in numerous subfields of AI and have enabled versatile modelling of complex scenarios involving images, text and music. In this paper, a particular class of deep generative model, the Generative Adversarial Network (GAN), is considered because it provides a way to learn deep representations from backpropagation signals alone, without requiring a large amount of annotated training data. The design of the GAN architecture plays a key role in image synthesis, and the motive of this paper is to analyse GAN architectures across different GAN variants with respect to image synthesis. Furthermore, a compact categorization of GANs, along with their key features, pros and cons, is presented to identify the research challenges in this field.
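The adversarial training principle the abstract refers to — a generator that improves purely from the discriminator's backpropagated signal, with no annotated data — can be sketched minimally. The following toy example is illustrative only (it is not from the paper): a one-parameter affine generator and a logistic discriminator on 1-D data, with hand-derived gradients for the standard GAN objectives; all names and constants are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "real" data: samples from N(4, 1.25). The generator maps uniform noise to 1-D samples.
def real_batch(n):
    return rng.normal(4.0, 1.25, size=(n, 1))

def noise_batch(n):
    return rng.uniform(-1.0, 1.0, size=(n, 1))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Generator G(z) = wg*z + bg; discriminator D(x) = sigmoid(wd*x + bd).
wg, bg = 1.0, 0.0
wd, bd = 0.1, 0.0
lr, n = 0.05, 64

for step in range(2000):
    x_real, z = real_batch(n), noise_batch(n)
    x_fake = wg * z + bg

    # Discriminator ascent on E[log D(x)] + E[log(1 - D(G(z)))].
    d_real = sigmoid(wd * x_real + bd)
    d_fake = sigmoid(wd * x_fake + bd)
    wd += lr * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    bd += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator ascent on E[log D(G(z))] (the non-saturating objective):
    # its only learning signal is the gradient backpropagated through D.
    x_fake = wg * z + bg
    d_fake = sigmoid(wd * x_fake + bd)
    wg += lr * np.mean((1 - d_fake) * wd * z)
    bg += lr * np.mean((1 - d_fake) * wd)
```

After training, the generator's output mean `bg` has drifted from 0 toward the real data mean of 4 without the generator ever observing a real sample or a label directly, which is the property the paper highlights.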



Data Availability

Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.


Author information


Corresponding author

Correspondence to Smita Das.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sharma, H., Das, S. A brief study of generative adversarial networks and their applications in image synthesis. Multimed Tools Appl 83, 21551–21581 (2024). https://doi.org/10.1007/s11042-023-16175-2

