Abstract
Image Synthesis (IS), a branch of Artificial Intelligence (AI) and Computer Vision, is the task of artificially producing images that contain specific desired content. A natural way to approach IS is through deep generative models, which are widely used across many subfields of AI and have enabled flexible modelling of complex data such as images, text and music. This paper focuses on a particular class of deep generative model, the Generative Adversarial Network (GAN), which learns deep representations from backpropagation signals without requiring large amounts of annotated training data. Since the design of the GAN architecture plays a key role in image synthesis, this paper analyses GAN architectures across different GAN variants with respect to image synthesis. Furthermore, a compact taxonomy of GANs, together with their key features, advantages and drawbacks, is presented in order to identify the open research challenges in this field.
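The adversarial training idea the abstract refers to can be illustrated with a deliberately tiny sketch (an illustration, not the paper's method): a one-parameter generator g(z) = θ + z tries to match a Gaussian centred at 3, while a logistic discriminator D(x) = σ(wx + b) learns to separate real from generated samples. Both are updated with hand-derived gradients of the standard GAN objective (non-saturating form for the generator, as in Goodfellow et al. 2014); a small weight decay on the discriminator is added here purely to damp the oscillations typical of adversarial dynamics.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    # clip to avoid overflow in exp for large |x|
    return 1.0 / (1.0 + math.exp(-max(min(x, 30.0), -30.0)))

REAL_MEAN, NOISE_STD = 3.0, 0.2   # real data: N(3, 0.2)

theta = 0.0        # generator parameter: g(z) = theta + z
w, b = 0.0, 0.0    # discriminator: D(x) = sigmoid(w*x + b)
lr_g, lr_d, batch, steps = 0.02, 0.1, 32, 3000

for _ in range(steps):
    real = [random.gauss(REAL_MEAN, NOISE_STD) for _ in range(batch)]
    fake = [theta + random.gauss(0.0, NOISE_STD) for _ in range(batch)]

    # Discriminator: gradient ascent on log D(real) + log(1 - D(fake)),
    # with weight decay on w to stabilise the two-player dynamics.
    gw = sum((1 - sigmoid(w * x + b)) * x for x in real) / batch \
       - sum(sigmoid(w * x + b) * x for x in fake) / batch
    gb = sum(1 - sigmoid(w * x + b) for x in real) / batch \
       - sum(sigmoid(w * x + b) for x in fake) / batch
    w += lr_d * (gw - 0.1 * w)
    b += lr_d * gb

    # Generator: ascent on the non-saturating objective log D(g(z)).
    gt = sum((1 - sigmoid(w * x + b)) * w for x in fake) / batch
    theta += lr_g * gt

print(f"theta = {theta:.2f}")  # should end near the real mean of 3.0
```

Note that no sample from the real distribution is ever labelled beyond "real", which is the sense in which GANs avoid annotated training data: the discriminator's backpropagated signal is the only supervision the generator receives.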
Data Availability
Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.
Cite this article
Sharma, H., Das, S. A brief study of generative adversarial networks and their applications in image synthesis. Multimed Tools Appl 83, 21551–21581 (2024). https://doi.org/10.1007/s11042-023-16175-2