1 Introduction

Medical imaging is an essential part of the disease diagnosis process. Medical images are classified and segmented by experts to diagnose disorders and, in some situations, anticipate how diseases will progress. Medical image processing was previously done manually by specialists. However, with the rapid advancement of imaging equipment, a substantial volume of medical images is generated daily, which takes time to process and requires specific domain knowledge [1]. Therefore, there is an increasing need to automate this process. The most successful approach to automating medical image processing has been to treat it as a deep learning (DL) task and solve it by training a deep neural network (DNN) model. The performance of a DL model can be improved by training it on a sufficiently large and correctly labeled dataset. Furthermore, if the aim is to classify diseases, the model must be trained on correctly labeled classes of comparable size. In other words, diversity of the data is necessary to obtain better generalizability of the developed model; otherwise, the model will perform poorly [2,3,4].

There are many challenging obstacles to developing an efficient DL model for medical image processing. The first is obtaining sufficient data, which depends on many factors, such as sharing data between healthcare facilities and researchers while maintaining the confidentiality of patients’ data. Second, the lack of labeled data is a significant impediment to developing reliable image segmentation and classification models, because annotating medical images by hand is difficult, time-consuming, and inconsistent across imaging modalities. Third, most datasets are imbalanced because disorders usually occur with much lower frequency than non-diseased conditions, which makes developing an automated diagnosis system much more difficult. As a result, researchers began to consider enriching medical image datasets in order to create good classification or segmentation models. One technique that adds more information to the original dataset is data augmentation, which includes: (1) basic image augmentation (e.g., rotation, flipping, cropping, and color space transformations); and (2) deep learning approaches (e.g., Generative Adversarial Networks (GANs) and Neural Style Transfer) [5]. The basic augmentation methods are limited in the amount of data they can generate and depend entirely on the original dataset. In contrast, deep learning approaches, specifically GANs, are capable of generating a wide range of diversity in the data independently of the original dataset [5].
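
As a brief illustration of the basic (non-generative) augmentation operations listed above, the following is a minimal sketch using torchvision transforms; the crop size and jitter strengths are illustrative placeholders rather than values taken from the reviewed studies.

```python
import torchvision.transforms as T

# Minimal sketch of basic image augmentation: rotation, flipping, cropping,
# and a color-space perturbation. Parameter values are illustrative only.
basic_augmentation = T.Compose([
    T.RandomRotation(degrees=15),                   # small random rotation
    T.RandomHorizontalFlip(p=0.5),                  # random horizontal flip
    T.RandomResizedCrop(size=224),                  # random crop, resized back
    T.ColorJitter(brightness=0.2, contrast=0.2),    # color-space perturbation
    T.ToTensor(),
])

# Typically applied on the fly during training, e.g. with an ImageFolder dataset:
# dataset = torchvision.datasets.ImageFolder("path/to/images", transform=basic_augmentation)
```

Because each transformed image is derived from an existing one, such operations cannot introduce genuinely new anatomical variation, which motivates the generative approaches reviewed here.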

In this work, we systematically review the use of GANs in medical image augmentation and the extent to which they improve the performance of downstream DL models that use the augmented dataset to solve either classification or segmentation tasks. In our review strategy, we take into consideration several factors, including: (1) the type of GAN used for augmentation; (2) the medical image modality (e.g., MRI, CT scan) used as input to the GAN model; (3) the purpose of using the augmented dataset; and (4) the models used to evaluate the augmented dataset.

In this review, we aim to explore how GAN-based architectures are used in medical image augmentation. To achieve this aim, we defined the following research questions:

  • RQ1: What are the GAN-based architectures used in medical image augmentation?

  • RQ2: What were the different medical image modalities that used GAN in image augmentation?

  • RQ3: What were the body organs associated with the medical image modalities?

  • RQ4: What was the targeted task following GAN-based image augmentation?

  • RQ5: How was the performance of the GAN model for augmentation evaluated in the retrieved articles?

Following this introduction section, Sect. 2 provides the background and related work. Section 3 describes the research methodology. Then, Sect. 4 presents the results obtained in the order of the research questions. Section 5 then presents the discussion of our results. Finally, in Sect. 6, we cover the limitations and reliability of our work, followed by the conclusion.

2 Background and related work

We explain GANs and the different architectures in Sect. 2.1. In Sect. 2.2, we discuss the evaluation methods used by the primary articles to evaluate the GAN-based augmentation models. Then, in Sect. 2.3, related reviews are presented.

2.1 GAN architecture

GANs, proposed by Goodfellow et al. [6] in 2014, are deep neural networks that produce realistic samples. A GAN model consists of a generator network G and a discriminator network D. The generator is the generative model responsible for producing fake samples that follow the distribution of the real samples, while the discriminator learns to distinguish between real samples and fake samples produced by the generator. The dynamics between the generator and the discriminator are what is known as adversarial learning, where each network is trained to make the other fail; for example, the generator tries to fool the discriminator into judging a fake sample as real. Training is done using backpropagation: the generator maps a random distribution into one that matches the distribution of the real data, and the discriminator evaluates the differences between the fake and real data distributions, with the error backpropagated to the generator when the discriminator detects that a sample is fake. After some time, the two models reach a balance, with the generator producing images that look extremely realistic and are so close to the real data distribution that the discriminator can achieve only about 50% accuracy, i.e., no better than random guessing.

Several variations on the GAN architecture have been proposed for medical image synthesis, with the most popular and common being conditional GAN (cGAN), deep convolutional GAN (DCGAN), cycle-consistent GAN (CycleGAN), auxiliary classifier GAN (ACGAN), Wasserstein GAN (WGAN), information-maximizing GAN (InfoGAN), Pix2Pix, StyleGAN, and progressive growing GAN (Progressive GAN or PGGAN). cGAN [7] adds conditional information, such as class labels, to the generator and discriminator, which improves the generation of detailed features and of imbalanced classes. In DCGAN [8], the generator and discriminator both use a deep convolutional network architecture. CycleGAN [9] uses two GANs to transform images from one domain into another and back to the original domain, solving the problem of image-to-image translation. ACGAN [10] extends cGAN in that, rather than adding class labels to the discriminator, the discriminator is made to predict the class label of an image in addition to the probability of that image being real or fake. WGAN [11] uses the Wasserstein loss function to improve the learning stability of the GAN. InfoGAN [12] allows the learning of meaningful and interpretable representations of images using latent codes (e.g., producing images with different rotations or widths). Pix2Pix [13] is another variation of cGAN that solves image-to-image translation by generating an image conditioned on the input image, where the discriminator determines whether the produced image is a plausible transformation of the input image. StyleGAN [14] allows control of smaller-scale features of the image without altering other levels of detail. Progressive GAN [15] progressively increases the depth and number of layers in the generator and discriminator throughout the training process, allowing the model to learn finer details and increasing the quality and stability of the GAN.
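
To make the adversarial training dynamic described above concrete, the following is a minimal PyTorch-style sketch of a single vanilla-GAN training step; the tiny fully connected networks, latent dimension, and learning rates are assumptions for illustration and do not correspond to any model in the reviewed articles.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: tiny fully connected networks stand in for any
# generator/discriminator pair (e.g., the convolutional ones used by DCGAN).
latent_dim = 100
img_dim = 28 * 28  # assumed flattened image size

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, img_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_images):
    """One adversarial update; real_images has shape (batch, img_dim)."""
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1) Discriminator: learn to label real samples 1 and generated samples 0.
    z = torch.randn(batch, latent_dim)
    fake_images = G(z).detach()                     # block gradients into G
    d_loss = bce(D(real_images), real_labels) + bce(D(fake_images), fake_labels)
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # 2) Generator: produce samples the discriminator labels as real.
    z = torch.randn(batch, latent_dim)
    g_loss = bce(D(G(z)), real_labels)
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
    return d_loss.item(), g_loss.item()
```

At equilibrium the discriminator outputs roughly 0.5 for both real and generated samples, which corresponds to the 50% accuracy mentioned above.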

Table 1 compares the main commonly used GAN architectures. A number of tools are available for building a given GAN architecture; these are summarized in Table 2.

Table 1 Commonly used GAN architectures
Table 2 A summary of GAN building tools
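
To illustrate the conditioning mechanism that distinguishes cGAN (the architecture most frequently reported in the articles reviewed in Sect. 4.1) from the vanilla GAN sketched above, the snippet below shows one common way of injecting a class label into the generator; the embedding size, layer widths, and two-class setting are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Sketch of a cGAN generator: the class label is embedded and concatenated
    with the noise vector, so samples of a requested class can be generated."""

    def __init__(self, latent_dim=100, n_classes=2, img_dim=28 * 28):
        super().__init__()
        self.label_embed = nn.Embedding(n_classes, n_classes)  # label -> dense vector
        self.net = nn.Sequential(
            nn.Linear(latent_dim + n_classes, 256),
            nn.ReLU(),
            nn.Linear(256, img_dim),
            nn.Tanh(),
        )

    def forward(self, z, labels):
        return self.net(torch.cat([z, self.label_embed(labels)], dim=1))

# Usage: synthesize extra samples for an under-represented class (here label 1).
g = ConditionalGenerator()
z = torch.randn(16, 100)
fake_minority = g(z, torch.ones(16, dtype=torch.long))
```

Conditioning in this way lets the practitioner request samples of a specific, possibly under-represented class, which is one reason cGAN is attractive for rebalancing medical image datasets.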

2.2 Evaluation methods

Since the losses used to train GAN models do not directly measure sample quality, there is no way to objectively assess training progress and model quality from the losses alone [16]. Various methodologies have therefore been developed to assess the quality and diversity of the synthetic images created by GANs, and they fall into two broad categories: qualitative and quantitative. In qualitative evaluation, practitioners visually assess the images created by the GAN with respect to the target domain. Although manual assessment is the most basic form of model evaluation, it has several drawbacks: it is subjective and influenced by the reviewer’s expectations about the model, its configuration, and its goal; it requires an understanding of what is and is not plausible in the target domain; and the number of images that can be reviewed in an acceptable amount of time is limited [17]. As a result, many quantitative methods have been developed that evaluate the GAN output either directly or indirectly. In direct evaluation, the augmentation model itself is evaluated. One method is to quantify experts’ opinions on synthetic images using a numerical score. A second method is to compare the generated images to the original images at the pixel level. Another method is to compare the data distribution of the original dataset with the data distribution of the new dataset after adding the synthetic images. The other type of quantitative method evaluates GANs indirectly, by comparing the performance of downstream models (e.g., classification or segmentation models that use the augmented dataset) with and without the synthetic images, which reflects the quality and diversity of the GAN output. Table 3 lists all qualitative and quantitative methods.
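
As an example of a direct quantitative metric that compares the distribution of the original data with that of the synthetic data, the sketch below computes the Fréchet distance between two sets of feature vectors (the Fréchet Inception Distance, when the features come from a pretrained Inception network); feature extraction is assumed to have been done beforehand, and the function is illustrative rather than a reference implementation.

```python
import numpy as np
from scipy import linalg

def frechet_distance(real_feats, fake_feats):
    """Fréchet distance between two sets of feature vectors, shape (n_samples, n_dims).

    Lower values mean the generated distribution is closer to the real one.
    Extracting the features (e.g., Inception pooling features) is assumed to
    have been done beforehand.
    """
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_f = np.cov(fake_feats, rowvar=False)

    covmean = linalg.sqrtm(cov_r @ cov_f)   # matrix square root of the product
    if np.iscomplexobj(covmean):            # discard tiny imaginary parts
        covmean = covmean.real

    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))
```

Indirect evaluation, by contrast, requires no image-level metric at all: the downstream classification or segmentation model is trained once with and once without the synthetic images, and its usual task metrics (e.g., accuracy or Dice score) are compared.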

Table 3 Classification of different performance evaluation metrics

2.3 Related work

Several reviews address the use of GANs in medical images for different objectives, including synthesis [18], augmentation [19], segmentation [20], and classification [21]. Singh et al. [18] discussed the different GAN architectures used in medical image generation, in addition to the applications of these GANs in the reconstruction of medical images to reduce noise as well as the synthesis of medical images. For image reconstruction, 22 articles were summarized in terms of modalities, methods, losses, and additional remarks; another 22 articles were summarized for image synthesis in terms of modalities, methods, and additional remarks for unconditional synthesis, as well as the conditional information for conditional synthesis. The paper concludes with some future research directions for GANs in medical image generation. However, this review was not conducted systematically, and its focus was not on augmentation. Chen et al. [19] presented a review of GANs in medical image augmentation using 105 papers published from 2018 to 2021, mainly collected from Elsevier, IEEE Xplore, and Springer, including both published and pre-published papers. The papers were analyzed based on the organs involved, the datasets utilized for training and testing the models, the loss functions employed in training, and the metrics used to evaluate the performance of the proposed models. Based on this analysis, the advantages of each method, loss function, and evaluation metric are discussed. Nevertheless, this review was not conducted systematically, which may affect its repeatability, and it was not restricted to peer-reviewed published articles. In addition, the searched databases did not include Science Direct, Scopus, or PubMed, and the review did not include articles beyond 2021. Xun et al. [20], on the other hand, based their review on GANs used in medical image segmentation, where 120 papers published before September 2021 were reviewed and analyzed based on the segmentation region (i.e., brain, chest, abdomen, etc.), image modality, and classification methods. The articles were collected from Google Scholar, PubMed, Semantic Scholar, Springer, arXiv, and some top conferences in computer science. Similarly, the advantages of the proposed methods are discussed, in addition to the limitations and future research directions for the use of GANs in segmentation. However, this review did not focus on augmentation, and it was not conducted systematically. Following PRISMA guidelines, Joeng et al. [21] conducted a systematic literature review on the use of GANs in medical image classification and segmentation. A meta-analysis was performed on 54 papers published from 2015 to 2020 based on image modality, task, and clinical domain, focusing on how GANs were utilized for classification and segmentation purposes. The articles were retrieved from PubMed, Science Direct, and Google Scholar. The 54 papers comprise 33 papers that address segmentation (only 12 of which address augmentation for segmentation), 13 papers that address generation, and 9 papers that address classification. Their primary focus was on the classification and segmentation objectives of GANs rather than on GANs for medical image augmentation, which was reflected in the search keys they used and the small number of augmentation papers retrieved. Additional manual searches from retrieved articles, such as backward and forward snowballing, were not reported in their review.
Finally, the performance evaluation metrics of the included articles were not retrieved, quantified, or grouped in detail.

The main contribution of this work is that we perform a systematic review following PRISMA guidelines to retrieve and synthesize data from peer-reviewed published articles in the IEEE, PubMed, Science Direct, and Scopus electronic databases, complemented by a manual search to identify additional relevant articles through backward and forward snowballing. We included articles published up to February 2022. From the included articles, our work identified, quantified, and grouped all performance evaluation metrics that were used to evaluate the developed GAN-based models.

3 Research methodology

In this article, we systematically reviewed the use of GANs to augment medical images, following PRISMA guidelines. In this section, we describe our research methodology, which comprises the review protocol, research questions (RQs), data sources, search criteria, inclusion and exclusion criteria, and quality assessment. To explore the literature related to GAN-based augmentation of medical images, we formulated the five research questions presented in the introduction and developed a search strategy defining the key search terms, electronic databases, and inclusion and exclusion criteria. Data from the finally included articles, which met the selection criteria and quality assessment, were collected and synthesized to answer our RQs. By searching the IEEE, PubMed, Scopus, and Science Direct scientific databases, we identified peer-reviewed, published articles that met our inclusion criteria and quality assessment. Our search string combined several keywords as follows: (“GAN” OR “Generative Adversarial Network”) AND “medical image” AND “augmentation”. In addition, the reference lists of included articles and the literature citing those articles were tracked manually to find additional relevant articles for inclusion. We defined inclusion and exclusion criteria to govern the selection process and retrieve relevant articles; they are presented in Tables 4 and 5, respectively.

Table 4 Inclusion criteria
Table 5 Exclusion criteria

After applying the above-mentioned selection criteria, 93 articles from the different data sources were considered for further screening. After eliminating duplicates, three independent researchers screened the titles and abstracts of the remaining 72 articles. The resulting 49 articles were subjected to full-text reading, and none were discarded. An additional manual search of the reference lists of included articles (backward snowballing) and of published articles citing them (forward snowballing) resulted in the addition of three further articles. Finally, 52 articles were assessed on the basis of their quality, and all of them were retained.

The criteria utilized to assess the quality of the included articles are presented in Table 6.

Table 6 Quality assessment criteria

A cumulative score was calculated for each article based on the above criteria. It is important to highlight that these quality criteria were not used to exclude articles but to evaluate the degree of relevance of the literature included in our research. For each article deemed to have met or exceeded the quality assessment, the following facets were explored in depth: motivations, contributions, and results achieved. The number of articles selected at the end of this process was 52. Figure 1 shows the PRISMA flow diagram of the study screening and selection process.

Fig. 1
figure 1

The PRISMA flow diagram

4 Results

In this section, we present the results synthesized from our collected data, organized in the order of the research questions stated in the introduction. Out of the 93 initially identified articles (from searching the PubMed, Scopus, Science Direct, and IEEE electronic databases), we finally included 52 articles [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73] as shown in Fig. 1. The publication years of these articles are shown in Fig. 2. The numbers show a growing trend in the number of publications that use GAN-based augmentation models for medical images. Since our search period ended in February 2022, we could not cover all the papers published in 2022, so the count for 2022 appears limited; however, we expect more papers to be published by the end of 2022. Considering the previous years, it is clear that interest in this topic is increasing dramatically.

Fig. 2
figure 2

The number of articles per publication year

4.1 RQ1: What are the GAN-based architectures that were used in medical image augmentation?

The included articles reported the use of a variety of basic GAN architectures. Although 8 articles did not specify exactly which GAN architecture was used, 9 types were reported, used either independently or in combination, as shown in Fig. 3. cGAN was the most frequently used architecture (\(n = 13\)), followed by DCGAN (\(n = 10\)), CycleGAN (\(n = 5\)), Pix2Pix (\(n = 4\)), and others. Only two articles combined two or three of these basic architectures.

Fig. 3
figure 3

Distribution of the GAN architectures included in the selected articles

4.2 RQ2: What were the different medical image modalities that used GAN in image augmentation?

The most frequently reported image modality in the included articles was Magnetic Resonance Imaging (MRI) (\(n = 17\)), followed by Computed Tomography (CT) (\(n = 13\)), X-ray (\(n = 9\)), ultrasound (\(n = 4\)), mammography (\(n = 4\)), and others. Figure 4 presents the count of articles per image modality.

Fig. 4
figure 4

The number of articles per image modality

4.3 RQ3: What were the body organs associated with the above medical imaging modalities?

The top reported body organ in the included articles was the brain (\(n = 15\)), followed by the chest (\(n = 8\)), the breast (\(n = 8\)), the lung (\(n = 7\)), and others. Figure 5 shows the number of articles per body organ.

Fig. 5
figure 5

The number of articles per body organ

4.4 RQ4: What was the targeted task following GAN-based image augmentation?

The majority of the articles (\(n = 37\)) implemented medical image classification tasks following the development of the GAN-based augmentation model. Segmentation tasks were implemented in 11 articles following augmentation, whereas augmentation alone was performed in 4 articles. The number of articles based on the tasks that follow augmentation is depicted in Fig. 6.

Fig. 6
figure 6

The number of articles per task that follows augmentation

4.5 RQ5: How was the performance of the GAN model for augmentation evaluated in the retrieved articles?

The most frequent single performance evaluation method used was indirect quantitative performance evaluation (\(n = 23\)), whereas direct quantitative methods were used alone in 7 articles and qualitative methods (expert opinion) were never used alone. Eleven articles combined both direct and indirect quantitative methods in their model evaluations. Qualitative methods were combined with direct methods (\(n = 4\)), indirect methods (\(n = 4\)), and both direct and indirect methods (i.e., the full set of performance evaluation methods) (\(n = 4\)). Figure 7 presents the number of articles based on the performance evaluation method of the developed GAN-based augmentation models.

Fig. 7
figure 7

The number of articles per performance evaluation method

5 Discussion

GANs have been increasingly used in a variety of applications related to medical images. Since the introduction of the GAN concept in 2014, the number of publications investigating GAN applications has grown notably. Some uses were related to classification tasks that may guide the decision to diagnose a disease, detect its presence, or even track its progression. In 2016, publications started to include segmentation tasks as another application of GANs. Our review revealed the inclusion of GANs in medical image augmentation starting in 2018. Since we stopped our review in February 2022, we expect this trend to continue through 2022 and beyond for publications that develop GAN-based models for the augmentation of medical images. Medical imaging is a very suitable area for GAN-based augmentation since it is usually associated with imbalanced datasets in favor of normal (i.e., healthy) images rather than diseased images. In addition, the volume of available datasets may be insufficient for training on other tasks such as classification or segmentation. In this section, we discuss the results related to each of our research questions.

Regarding the basic GAN architectures used for augmentation, we found that cGAN came at the top of the list. The popularity of cGAN is mainly attributed to its ability to learn specific features from the dataset via the use of conditional information [7, 74]. DCGAN is the second most popular architecture; DCGANs are known for their capacity to produce high-resolution medical images, which is also why they are commonly used in models intended for segmentation tasks. Most of the articles reported the use of a single basic GAN architecture for augmentation; only two articles reported combining two or three basic types to develop a more complex model. We found that eight articles did not specify the GAN architecture used in medical image augmentation. Although an in-depth explanation of the GAN architecture is needed to facilitate the replication of study results, most of the included studies also omitted other details, such as the hyperparameter values used in their models.

Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) appeared as the top two image modalities leveraging GAN-based augmentation models. This may be attributed to the fact that these two tests are considered less available and more expensive diagnostic tests, whereas X-ray and ultrasound come next because they are more available and less expensive general screening tests. Both kinds of tests share the characteristic of having imbalanced classes (healthy vs. diseased). The least used image modalities were laparoscopic images, Human Epithelial type 2 (HEp-2) cell images, functional near-infrared spectroscopy (fNIRS), and digital histopathological slides. These studies may be in the early phases of experimenting with GAN-based augmentation and may increase in the upcoming years.

The brain, chest, breast, and lung were the top reported body organs in the included articles. This may be related to the increasing demand for the availability of medical images of these organs in comparison to other organs. On the other hand, the cervix, heart, and pancreas were examples of the least reported body organs that used GAN-based augmentation. Medical images of these body organs might be more readily available, or the specialized physicians might be in the early stages of experimenting with GAN-based augmentation for these organs’ images.

Detection or diagnosis of a disease through a classification task was the most commonly reported use of the GAN-augmented datasets, being the most demanded application. GAN-augmented images were used for segmentation tasks in fewer than one-third as many articles as for classification tasks.

In Table 3, we presented the different types of performance evaluation metrics used for GAN-based medical image augmentation. The ability of the subsequent model (whether for classification or segmentation) to perform better in its task according to a set of performance evaluation metrics is considered a quantitative, indirect method of evaluating the performance of the GAN-based augmentation model. This is why we found indirect evaluation to be the most frequently reported single method in the included articles. Indirect metrics were also used in combination with direct metrics only or with both qualitative and direct methods. All reported indirect metrics suggest that GAN-based augmentation was effective in improving the performance of the subsequent task. Quantitative, direct performance evaluation metrics were used alone in a smaller number of articles; however, they were combined with indirect, or with both indirect and qualitative, methods in some articles. Finally, qualitative performance evaluation metrics were not used as the sole evaluation method in any of the articles; they were always combined with either indirect metrics, direct metrics, or both. Qualitative metrics were included in a limited number of articles (12 articles). This may be because qualitative methods require the availability of experts to examine and evaluate the generated medical images. However, validating the performance of GAN-based augmented medical images by combining both qualitative and quantitative (direct and indirect) methods should be sought in a wider range of articles, because augmented images can generate features that do not exist or hide existing features, which necessitates expert opinion.

As mentioned above, many of the included articles did not explain the details of their augmentation architectures or the hyperparameters they used. Therefore, the replication of such studies may be limited without further details from the authors. In addition, none of the included articles evaluated the effectiveness of the proposed model in a practical clinical setting.

6 Conclusion and future work

GANs have a wide range of applications in a variety of domains; however, the scope of this review is limited to applications of GANs in medical image augmentation tasks only. We restricted our search to publications from January 2012 to February 2022, although GANs were first introduced in 2014, and we did not review papers released after February 2022. We focused on the augmentation of medical images using GANs; although GANs have been used for other tasks on medical images, such as segmentation, classification, super-resolution, image modality translation, and image denoising, segmentation and classification were covered only when performed as a second step following GAN-based augmentation. Furthermore, the included articles did not share full details about the augmentation models they used, such as the values of the hyperparameters, which limited the scope of our data synthesis. We did not restrict our search to a single type of medical image or a single usage of augmented medical images. The selection process we followed was compliant with PRISMA guidelines, and pre-defined search key terms, inclusion criteria, and exclusion criteria were used. The results of each selection stage were agreed upon by three independent reviewers, and a further quality assessment check was performed.

In conclusion, this systematic review covered GAN-based models that were used for medical image augmentation. We presented the basic GAN architectures used for building these augmentation models; cGAN and DCGAN were the most popular. Furthermore, we presented the most common medical image modalities (MRI, CT, X-ray, and ultrasound) and body organs (brain, chest, breast, and lung) reported in the included articles. In addition, we reported the second task that followed medical image augmentation (classification or segmentation). We also explained the different types of metrics used to evaluate the performance of GAN-based models: qualitative, quantitative direct, and quantitative indirect methods. Following that, we grouped the number of articles that used each of these performance evaluation metrics, whether individually or in combination. The most reported performance evaluation metrics were the quantitative indirect methods, whereas qualitative metrics were the least used.

Publications on the use of GANs for medical image augmentation are expected to continue to grow steadily, as this line of research is still in its early stages. Future publications should explain more details about the developed models and the hyperparameter values used, and should combine the three types of performance evaluation metrics, to facilitate the replication and validation of the published results. Further work to evaluate GAN-based augmentation models in real clinical settings would be another future direction to pursue.