
Augmenting data with GANs to segment melanoma skin lesions



This paper presents a novel strategy that employs Generative Adversarial Networks (GANs) to augment data in the skin lesion segmentation task, a fundamental first step in the automated melanoma detection process. The proposed framework generates both skin lesion images and their segmentation masks, making the data augmentation process extremely straightforward. In order to thoroughly analyze how the quality and diversity of synthetic images impact the effectiveness of the method, we remodel two different well-known GANs: a Deep Convolutional GAN (DCGAN) and a Laplacian GAN (LAPGAN). Experimental results reveal that introducing this kind of synthetic data into the training process increases the overall accuracy of a state-of-the-art Convolutional/Deconvolutional Neural Network for melanoma skin lesion segmentation.




  1. An epoch is the number of batches needed to feed the network with 1882 images.

  2. The official accuracy measure of the ISIC challenge [7].

  3. 2017 challenge scoreboard at https://bit.ly/2yUhs3O.


  1. Antoniou A, Storkey A, Edwards H (2017) Data augmentation generative adversarial networks, arXiv:1711.04340

  2. Baur C, Albarqouni S, Navab N (2018) MelanoGANs: high resolution skin lesion synthesis with GANs, arXiv:1804.04338

  3. Bengio Y, Courville A, Vincent P (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35(8):1798–1828

  4. Bolelli F, Baraldi L, Cancilla M, Grana C (2018) Connected components labeling on DRAGs. In: International conference on pattern recognition

  5. Bolelli F, Cancilla M, Grana C (2017) Two more strategies to speed up connected components labeling algorithms. In: International conference on image analysis and processing. Springer, pp 48–58

  6. Celebi ME, Wen Q, Iyatomi H, Shimizu K, Zhou H, Schaefer G (2015) A state-of-the-art survey on lesion border detection in dermoscopy images. Dermoscopy Image Analysis 10:97–129

  7. Codella NC, Gutman D, Celebi ME, Helba B, Marchetti MA, Dusza SW, Kalloo A, Liopyris K, Mishra N, Kittler H et al (2017) Skin lesion analysis toward melanoma detection: a challenge at the 2017 International symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC), arXiv:1710.05006

  8. Denton EL, Chintala S, Fergus R et al (2015) Deep generative image models using a Laplacian pyramid of adversarial networks. In: Advances in neural information processing systems, pp 1486–1494

  9. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Advances in neural information processing systems, pp 2672–2680

  10. Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift, arXiv:1502.03167

  11. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization, arXiv:1412.6980

  12. Kittler H, Pehamberger H, Wolff K, Binder M (2002) Diagnostic accuracy of dermoscopy. Lancet Oncol 3(3):159–165

  13. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105

  14. Lipkus AH (1999) A proof of the triangle inequality for the Tanimoto distance. J Math Chem 26(1):263–265. https://doi.org/10.1023/A:1019154432472

  15. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, Van Der Laak JA, Van Ginneken B, Sánchez CI (2017) A survey on deep learning in medical image analysis. Med Image Anal 42:60–88

  16. Liu Y, Nie L, Han L, Zhang L, Rosenblum DS (2015) Action2Activity: recognizing complex activities from sensor data. In: Twenty-fourth international joint conference on artificial intelligence

  17. Liu Y, Nie L, Liu L, Rosenblum DS (2016) From action to activity: sensor-based activity recognition. Neurocomputing 181:108–115

  18. Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440

  19. Maas AL, Hannun AY, Ng AY (2013) Rectifier nonlinearities improve neural network acoustic models. In: ICML Workshop on deep learning for audio, speech, and language processing (WDLASL 2013)

  20. Mishkin D, Matas J (2016) All you need is a good init. In: International conference on learning representations (ICLR) 2016

  21. Neff T, Payer C, Štern D, Urschler M (2017) Generative adversarial network based synthesis for supervised medical image segmentation. In: OAGM & ARW Joint workshop 2017 on “vision, automation & robotics”. Verlag der Technischen Universität Graz

  22. Pollastri F, Bolelli F, Grana C (2018) Improving skin lesion segmentation with generative adversarial networks. In: 31st international symposium on computer-based medical systems

  23. Radford A, Metz L, Chintala S (2015) Unsupervised representation learning with deep convolutional generative adversarial networks, arXiv:1511.06434

  24. Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 234–241

  25. Springenberg JT (2015) Unsupervised and semi-supervised learning with categorical generative adversarial networks, arXiv:1511.06390

  26. Yuan Y, Chao M, Lo YC (2017) Automatic skin lesion segmentation with fully convolutional-deconvolutional networks, arXiv:1703.05165

  27. Zeiler MD, Taylor GW, Fergus R (2011) Adaptive deconvolutional networks for mid and high level feature learning. In: 2011 IEEE international conference on computer vision (ICCV). IEEE, pp 2018–2025

  28. Zheng Z, Zheng L, Yang Y (2017) Unlabeled samples generated by GAN improve the person re-identification baseline in vitro, arXiv:1701.07717


Author information

Correspondence to Federico Bolelli.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



A large dataset is a crucial asset for any GAN training process: additional images allow the network to learn how to generate more realistic samples, with realistic details, and make the generated samples less similar to one another. Figures 10, 11, and 12 show how enlarging the training dataset improves the output of the DCGAN. Increasing the amount of samples from 500 to 1000, the DCGAN becomes able to provide a wider variety of skin lesions, with different sizes, shapes, and textures. After reaching 1500 training images, the framework delivers high-resolution hair in samples that present far fewer noisy artifacts.
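Because the framework generates a skin lesion image and its segmentation mask together, augmenting the training set amounts to splitting each generated sample into an image/mask pair. The sketch below illustrates this splitting step; the 4-channel layout (three RGB channels plus one mask channel, all in the generator's tanh range [-1, 1]) and the zero threshold are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def split_generated_sample(g_out):
    """Split an assumed (4, H, W) generator output into a dermoscopic
    image (3 RGB channels) and a binary segmentation mask (1 channel)."""
    image = (g_out[:3] + 1.0) / 2.0            # rescale tanh output to [0, 1]
    mask = (g_out[3] > 0.0).astype(np.uint8)   # threshold the mask channel
    return image, mask

# Example with a fake generator output at the 192 x 256 target resolution
fake = np.random.uniform(-1, 1, size=(4, 192, 256))
img, msk = split_generated_sample(fake)
print(img.shape, msk.shape)  # (3, 192, 256) (192, 256)
```

The resulting pairs can simply be appended to the real training set, which is what makes this augmentation strategy straightforward compared to generating images alone.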

Fig. 10

DCGAN-generated skin lesion samples (a) and their segmentation masks (b), generated by a GAN trained with 500 dermoscopic samples

Fig. 11

DCGAN-generated skin lesion samples (a) and their segmentation masks (b), generated by a GAN trained with 1000 dermoscopic samples

Fig. 12

DCGAN-generated skin lesion samples (a) and their segmentation masks (b), generated by a GAN trained with 1500 dermoscopic samples

The sampling process of the LAPGAN is further examined in Fig. 13. We merge five independent GANs to form one LAPGAN, which is divided into six different pyramid levels. In the first level, the GAN named G0 transforms a noise vector Z0 into a 6 × 8 pixel skin lesion sample coupled with its segmentation mask, employing fully-connected layers for both the generator and the discriminator.
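The first pyramid level can be sketched as a single fully-connected mapping from noise to the coarsest sample. The noise dimensionality, weight initialization, and tanh output below are illustrative assumptions; only the 6 × 8 output size and the fully-connected structure of G0 come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, C, H, W = 100, 4, 6, 8         # assumed noise size; 6x8 output, RGB + mask

# Randomly initialized weights standing in for a trained fully-connected layer
W0 = rng.normal(0, 0.02, size=(Z_DIM, C * H * W))
b0 = np.zeros(C * H * W)

def g0(z):
    """G0: fully-connected generator producing the coarsest sample."""
    out = np.tanh(z @ W0 + b0)        # tanh keeps outputs in [-1, 1]
    return out.reshape(C, H, W)

sample = g0(rng.normal(size=Z_DIM))
image, mask = sample[:3], sample[3]
print(image.shape, mask.shape)        # (3, 6, 8) (6, 8)
```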

Fig. 13

A visual representation of our LAPGAN sampling process. For each level but the first, images are upsampled (blue arrows) and fed, together with a new source of noise, to a Convolutional Generative Adversarial Network, which serves to generate residual images. We add an extra step employing G4 a second time at the end of the process, in order to obtain 192 × 256 pixel samples

In the next pyramid level, the two outputs of G0 are upsampled and fed, together with a new source of noise, to G1, a GAN that exploits convolutional layers in both of its subnetworks. The outputs of G1 are two residual images (skin lesion and segmentation mask) that are added to the upsampled low-resolution samples provided by the previous pyramid level. This approach makes it possible to enlarge the images generated by the previous level without lowering the resolution. Each following level has the same structure as the one employing G1; however, G4 is used to provide residual images for both of the two last pyramid levels.
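The per-level refinement and the resulting resolution schedule can be sketched as follows. Nearest-neighbour upsampling and zero residuals stand in for the trained generators; the schedule of five doublings from 6 × 8 to 192 × 256 (six levels built from five GANs, with G4 reused) follows the description above.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) tensor."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def lapgan_level(low_res, residual):
    """One pyramid level: enlarge the previous output and add the
    residual image produced by that level's generator."""
    return upsample2x(low_res) + residual

# Resolution schedule: 6x8 doubled five times reaches 192x256.
x = np.zeros((4, 6, 8))                  # coarsest image + mask from G0
for _ in range(5):
    residual = np.zeros_like(upsample2x(x))  # stand-in for a generator residual
    x = lapgan_level(x, residual)
print(x.shape)  # (4, 192, 256)
```

Because each generator only has to produce a residual at its own scale, the task at every level stays easier than synthesizing a full-resolution image from scratch.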

Figure 13 illustrates how G3 introduces noise into each sample and generates hair poorly, whereas G4 does a great job of improving the resolution and the realism of every image. As the image dimensions grow, the target residual images of adjacent pyramid levels become more similar to one another, allowing us to exploit the same GAN in more than one level of the pyramid.


About this article


Cite this article

Pollastri, F., Bolelli, F., Paredes, R. et al. Augmenting data with GANs to segment melanoma skin lesions. Multimed Tools Appl (2019). https://doi.org/10.1007/s11042-019-7717-y



  • Deep learning
  • Convolutional neural networks
  • Adversarial learning
  • Skin lesion segmentation