Data Generation Using Gene Expression Generator

Part of the Lecture Notes in Computer Science book series (LNISA, volume 12490)


Generative adversarial networks (GANs) can be used effectively for image and video generation when labeled training data is available in bulk. In general, building a good machine learning model requires a reasonable amount of labeled training data. However, in areas such as the biomedical field, creating such a dataset is time-consuming and requires expert knowledge. The aim is therefore to use data augmentation as an alternative to data collection in order to improve classification. This paper presents a modified version of a GAN, called Gene Expression Generator (GEG), that augments the available data samples. The proposed approach was used to generate synthetic data for binary biomedical datasets, which were then used to train existing supervised machine learning approaches. Experimental results show that using GEG for data augmentation, together with a modified version of leave-one-out cross-validation (LOOCV), improves classification accuracy.


  • Data generation
  • Generative adversarial networks
  • Gene expression data
  • Cancer classification
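The evaluation idea in the abstract, augmenting the training data and scoring with leave-one-out cross-validation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not say how LOOCV is modified, so the sketch assumes that synthetic samples are generated only from the training split of each fold and that the held-out sample is always a real one. The `augment` function is a hypothetical stand-in that jitters real samples with Gaussian noise where a trained GEG generator would be used, and the 1-nearest-neighbour classifier and toy data are likewise placeholders.

```python
import math
import random


def augment(samples, labels, n_new, noise_scale, rng):
    """Hypothetical stand-in for a trained generator (assumption).

    Copies randomly chosen real samples and adds Gaussian noise to each
    feature; the synthetic sample keeps the label of its source sample.
    """
    synthetic, synthetic_labels = [], []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        synthetic.append([v + rng.gauss(0.0, noise_scale) for v in samples[i]])
        synthetic_labels.append(labels[i])
    return synthetic, synthetic_labels


def predict_1nn(train_x, train_y, query):
    """Return the label of the training point closest to `query`."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def loocv_accuracy(samples, labels, n_new=0, noise_scale=0.1, seed=0):
    """Leave-one-out accuracy, optionally augmenting each training split.

    The held-out sample is never passed to `augment`, so no information
    about the test instance leaks into the synthetic data (assumed design).
    """
    rng = random.Random(seed)
    correct = 0
    for held_out in range(len(samples)):
        train_x = [s for i, s in enumerate(samples) if i != held_out]
        train_y = [y for i, y in enumerate(labels) if i != held_out]
        if n_new > 0:
            syn_x, syn_y = augment(train_x, train_y, n_new, noise_scale, rng)
            train_x, train_y = train_x + syn_x, train_y + syn_y
        if predict_1nn(train_x, train_y, samples[held_out]) == labels[held_out]:
            correct += 1
    return correct / len(samples)


# Toy binary dataset with two well-separated classes (illustrative only).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]]
y = [0, 0, 0, 1, 1, 1]

baseline = loocv_accuracy(X, y)
augmented = loocv_accuracy(X, y, n_new=20, noise_scale=0.05)
print("LOOCV accuracy without augmentation:", baseline)
print("LOOCV accuracy with augmentation:", augmented)
```

Generating synthetic samples inside the loop, rather than once for the whole dataset, keeps the held-out instance out of the generator's input. Whether the paper's modification works this way cannot be determined from the abstract alone.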

  • DOI: 10.1007/978-3-030-62365-4_6
  • Chapter length: 12 pages


  1. x denotes original (positive) instances, \(x'\) denotes synthetic (negative) instances, and z is random Gaussian noise. \(D(x)\) and \(D(x')\) are the discriminator’s outputs for original and synthetic instances respectively, and \(G(z)\) is the generator’s output.

  2. DNA microarray data:
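
For orientation, the notation in note 1 fits the standard, unmodified GAN minimax objective, written out below. This is given only as a reference point: GEG is described as a modified GAN, so its actual loss may differ. The symbol \(p_{\text{data}}\) (the distribution of original instances) is introduced here for illustration and does not appear in the notes.

\[
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim \mathcal{N}(0, I)}\big[\log\big(1 - D(x')\big)\big], \qquad x' = G(z).
\]

The discriminator maximizes \(V\) by pushing \(D(x)\) towards 1 on original instances and \(D(x')\) towards 0 on synthetic ones, while the generator minimizes it by producing \(x'\) that the discriminator scores close to 1.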




This work was supported by the Telekom Innovation Laboratories (T-Labs) and the Research and Development unit of Deutsche Telekom.

The authors would like to express their deepest gratitude towards Tsegaye Misikir Tashu for his advice, valuable feedback, proofreading and assistance in overcoming technical problems.

Project no. ED_18-1-2019-0030 (Application domain specific highly reliable IT solutions subprogramme) has been implemented with the support provided from the National Research, Development and Innovation Fund of Hungary, financed under the Thematic Excellence Programme funding scheme.


Corresponding author

Correspondence to Zakarya Farou.


Copyright information

© 2020 Springer Nature Switzerland AG


Cite this paper

Farou, Z., Mouhoub, N., Horváth, T. (2020). Data Generation Using Gene Expression Generator. In: Analide, C., Novais, P., Camacho, D., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2020. IDEAL 2020. Lecture Notes in Computer Science, vol 12490. Springer, Cham.


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62364-7

  • Online ISBN: 978-3-030-62365-4

  • eBook Packages: Computer Science, Computer Science (R0)