Abstract
Generating an image from another image or from other types of data, including text, scene graphs, and object layouts, is one of the most challenging tasks in computer vision. In addition, manually capturing images of an object or product from different views can be exhausting and expensive. Deep learning and artificial intelligence techniques have now made it possible to generate new images from different types of data, and considerable recent effort has been devoted to developing image generation strategies, with notable success. To that end, this paper presents, to the best of the authors’ knowledge, the first comprehensive overview of existing image generation methods. Each image generation technique is described according to the nature of the adopted algorithms, the type of data used, and the main objective. Each image generation category is then discussed by presenting the proposed approaches, and existing image generation datasets are reviewed. The evaluation metrics suitable for each image generation category are discussed, and the performance of existing solutions is compared to give a clearer picture of the state of the art and to identify the limitations and strengths of these methods. Lastly, the current challenges facing this field are presented.
Cite this article
Elasri, M., Elharrouss, O., Al-Maadeed, S. et al. Image Generation: A Review. Neural Process Lett 54, 4609–4646 (2022). https://doi.org/10.1007/s11063-022-10777-x
DOI: https://doi.org/10.1007/s11063-022-10777-x