Abstract
Molecular biology studies of cancer that rely on gene expression datasets are constrained by a very small number of samples. Medical data are difficult and expensive to obtain because of privacy constraints, yet the accuracy of a classifier depends greatly on the quality and quantity of its input data. The small-sample-size problem has typically been addressed by data augmentation. Owing to the sensitivity of synthetic data samples in cancer classification on gene expression data, this paper investigates data augmentation using a generative adversarial network (GAN). A GAN consists of two blocks, a generator and a discriminator, that work in a collaborative yet adversarial way. This paper proposes a modified generator GAN (MG-GAN), in which the generator is fed with the original data and multivariate noise to generate data with a Gaussian distribution. Because the generated data lie within the latent space, the saddle point is reached faster. GANs have been widely used for data augmentation on image datasets. To the best of our knowledge, this is the first attempt to use a GAN for augmentation on a gene expression dataset. The performance of the proposed MG-GAN was compared with that of KNN and a basic GAN. Compared with KNN and GAN, MG-GAN improves classification accuracy by 18.8% and 11.9%, respectively. The loss value of the error function for MG-GAN is drastically reduced, from 0.6978 to 0.0082, ensuring sensitivity of the generated data. The improved classification accuracy and the reduced loss value make the proposed MG-GAN better suited for critical applications with sensitive data.
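As a rough illustration of the generator input described in the abstract (original data combined with multivariate noise), the following is a minimal sketch only and not the authors' implementation. The function name `make_generator_input`, the `noise_scale` parameter, and the diagonal covariance estimated from per-gene variance are all assumptions made for illustration; the abstract does not specify how the noise is parameterized or how it is combined with the data.

```python
import numpy as np


def make_generator_input(real_data, n_samples, noise_scale=0.1, seed=0):
    """Hypothetical helper: perturb real expression profiles with
    multivariate Gaussian noise to form candidate generator inputs.

    real_data : array of shape (n_real, n_genes)
    n_samples : number of noisy inputs to produce
    """
    rng = np.random.default_rng(seed)
    n_real, n_genes = real_data.shape

    # Assumption: a diagonal covariance built from per-gene variance.
    # A full covariance is poorly conditioned when n_real << n_genes.
    variances = real_data.var(axis=0) + 1e-8
    cov = np.diag(noise_scale * variances)

    # Zero-mean multivariate Gaussian noise, one row per output sample.
    noise = rng.multivariate_normal(np.zeros(n_genes), cov, size=n_samples)

    # Draw real profiles with replacement and add the noise to them.
    idx = rng.integers(0, n_real, size=n_samples)
    return real_data[idx] + noise
```

In a full GAN pipeline these rows would be passed through a trainable generator network and scored by a discriminator; that training loop is omitted here because the abstract gives no architectural details.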
Funding
There are no funding agencies involved in this research.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Human and animal rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
Informed consent was not applicable, as this article does not contain any studies with human participants.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Chaudhari, P., Agrawal, H. & Kotecha, K. Data augmentation using MG-GAN for improved cancer classification on gene expression data. Soft Comput 24, 11381–11391 (2020). https://doi.org/10.1007/s00500-019-04602-2
DOI: https://doi.org/10.1007/s00500-019-04602-2