
VARGAN: variance enforcing network enhanced GAN


Abstract

Generative adversarial networks (GANs) are among the most widely used generative models. GANs can learn complex multi-modal distributions and generate realistic samples. Despite their major success in generating synthetic data, GANs can suffer from an unstable training process and mode collapse. In this paper, we propose a new GAN architecture called variance enforcing GAN (VARGAN), which incorporates a third network to introduce diversity into the generated samples. The third network measures the diversity of the generated samples, and its output is used to penalize the generator’s loss for low-diversity samples. This network is trained on the available training data and on undesired distributions with limited modality. On a set of synthetic and real-world image datasets, VARGAN generates a more diverse set of samples than recent state-of-the-art models. High diversity, low computational complexity, and fast convergence make VARGAN a promising model for alleviating mode collapse.
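
To make the mechanism described above concrete, the following is a minimal sketch of a diversity-penalized generator update: an auxiliary "variance enforcing" network V scores the diversity of a generated batch, and the generator is penalized when the predicted diversity is low. All names, architectures, and the penalty weight lam are illustrative assumptions, not the authors' implementation (the paper's exact formulation is not reproduced in this preview); V is assumed to be pretrained to output high scores for diverse batches and low scores for limited-modality batches.

    import torch
    import torch.nn as nn

    BATCH, LATENT = 256, 64

    # Illustrative networks only; the paper's architectures are in Appendix B.
    G = nn.Sequential(nn.Linear(LATENT, 128), nn.ReLU(), nn.Linear(128, 2))  # generator
    D = nn.Sequential(nn.Linear(2, 128), nn.ReLU(), nn.Linear(128, 1))       # discriminator
    # V scores a whole batch (flattened) with a diversity value in [0, 1];
    # it would be trained on real data (diverse) vs. limited-modality data.
    V = nn.Sequential(nn.Linear(2 * BATCH, 128), nn.ReLU(),
                      nn.Linear(128, 1), nn.Sigmoid())

    bce = nn.BCEWithLogitsLoss()
    lam = 1.0  # assumed penalty weight

    z = torch.randn(BATCH, LATENT)
    fake = G(z)
    adv_loss = bce(D(fake), torch.ones(BATCH, 1))       # standard non-saturating loss
    diversity = V(fake.reshape(1, -1))                  # predicted batch diversity
    g_loss = adv_loss + lam * (1.0 - diversity.mean())  # penalize low diversity
    g_loss.backward()                                   # then update G as usual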


References

  1. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Advances in neural information processing systems, pp 2672–2680

  2. Karras T, Aila T, Laine S, Lehtinen J (2017) Progressive growing of GANs for improved quality, stability, and variation. CoRR, abs/1710.10196

  3. Brock A, Donahue J, Simonyan K (2019) Large scale GAN training for high fidelity natural image synthesis. In: 7th International conference on learning representations, ICLR 2019, New Orleans, LA, USA, May 6–9, 2019. https://openreview.net/forum?id=B1xsqj09Fm

  4. Huang B, Chen W, Wu X, Lin C-L, Suganthan P N (2018) High-quality face image generated with conditional boundary equilibrium generative adversarial networks. Pattern Recogn Lett 111:72–79


  5. Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z et al (2017) Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4681–4690

  6. Isola P, Zhu J-Y, Zhou T, Efros A A (2017) Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1125–1134

  7. Zhu J-Y, Park T, Isola P, Efros A A (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp 2223–2232

  8. Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X (2016) Improved techniques for training GANs. In: Advances in neural information processing systems, pp 2234–2242

  9. Mok TCW, Chung ACS (2018) Learning data augmentation for brain tumor segmentation with coarse-to-fine generative adversarial networks. In: International MICCAI brainlesion workshop. Springer, pp 70–80

  10. Mohammadjafari S, Ozyegen O, Cevik M, Kavurmacioglu E, Ethier J, Basar A (2021) Designing mm-wave electromagnetic engineered surfaces using generative adversarial networks. Neural Comput Appl:1–15

  11. Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. In: International conference on machine learning. PMLR, pp 214–223

  12. Miyato T, Kataoka T, Koyama M, Yoshida Y (2018) Spectral normalization for generative adversarial networks. In: International conference on learning representations

  13. Li W, Liang Z, Neuman J, Chen J, Cui X (2021) Multi-generator GAN learning disconnected manifolds with mutual information. Knowl-Based Syst 212:106513


  14. Lin Z, Khetan A, Fanti G, Oh S (2018) PacGAN: The power of two samples in generative adversarial networks. In: Advances in neural information processing systems, pp 1498–1507

  15. Metz L, Poole B, Pfau D, Sohl-Dickstein J (2017) Unrolled generative adversarial networks. In: 5th International conference on learning representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings. https://openreview.net/forum?id=BydrOIcle

  16. Gurumurthy S, Kiran Sarvadevabhatla R, Venkatesh Babu R (2017) DeLiGAN: Generative adversarial networks for diverse and limited data. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 166–174

  17. Li W, Xu L, Liang Z, Wang S, Cao J, Lam T C, Cui X (2021) JDGAN: Enhancing generator on extremely limited data via joint distribution. Neurocomputing 431:148–162


  18. Che T, Li Y, Jacob A P, Bengio Y, Li W (2016) Mode regularized generative adversarial networks. CoRR, abs/1612.02136

  19. Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville A C (2017) Improved training of Wasserstein GANs. In: Advances in neural information processing systems, pp 5767–5777

  20. Tolstikhin I O, Gelly S, Bousquet O, Simon-Gabriel C-J, Schölkopf B (2017) AdaGAN: Boosting generative models. In: Advances in neural information processing systems, pp 5424–5433

  21. Park D K, Yoo S, Bahng H, Choo J, Park N (2018) MEGAN: Mixture of experts of generative adversarial networks for multimodal image generation. In: International joint conference on artificial intelligence, pp 878–884

  22. Ghosh A, Kulharia V, Namboodiri V P, Torr PHS, Dokania P K (2018) Multi-agent diverse generative adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8513–8521

  23. Khayatkhoei M, Elgammal A, Singh M (2018) Disconnected manifold learning for generative adversarial networks. In: Proceedings of the 32nd international conference on neural information processing systems, pp 7354–7364

  24. Liu K, Tang W, Zhou F, Qiu G (2019) Spectral regularization for combating mode collapse in GANs. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6382–6390

  25. Elfeki M, Couprie C, Riviere M, Elhoseiny M (2019) GDPP: Learning diverse generations using determinantal point processes. In: International conference on machine learning, pp 1774–1783

  26. Mao Q, Lee H-Y, Tseng H-Y, Ma S, Yang M-H (2019) Mode seeking generative adversarial networks for diverse image synthesis. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 1429–1437

  27. Bang D, Shim H (2021) MGGAN: Solving mode collapse using manifold-guided training. In: International conference on computer vision, pp 2347–2356

  28. Li W, Fan L, Wang Z, Ma C, Cui X (2021) Tackling mode collapse in multi-generator GANs with orthogonal vectors. Pattern Recogn 110:107646


  29. Tran N-T, Bui T-A, Cheung N-M (2018) Dist-GAN: An improved GAN using distance constraints. In: Proceedings of the European conference on computer vision (ECCV), pp 370–385

  30. Tran N-T, Bui T-A, Cheung N-M (2019) Improving GAN with neighbors embedding and gradient matching. In: Proceedings of the AAAI conference on artificial intelligence, vol 33, pp 5191–5198

  31. Hoang Q, Nguyen T D, Le T, Phung D (2018) MGAN: Training generative adversarial nets with multiple generators. In: 6th International conference on learning representations, ICLR 2018, Vancouver, BC, Canada, April 30 – May 3, 2018, Conference Track Proceedings. https://openreview.net/forum?id=rkmu5b0a-

  32. Srivastava A, Valkov L, Russell C, Gutmann M U, Sutton C (2017) VEEGAN: Reducing mode collapse in GANs using implicit variational learning. In: Advances in neural information processing systems, pp 3308–3318

  33. Zhong P, Mo Y, Xiao C, Chen P, Zheng C (2019) Rethinking generative mode coverage: A pointwise guaranteed approach. Adv Neural Inf Process Syst 32:2088–2099


  34. Arjovsky M, Bottou L (2017) Towards principled methods for training generative adversarial networks. CoRR, abs/1701.04862

  35. Zhu J-Y, Zhang R, Pathak D, Darrell T, Efros A A, Wang O, Shechtman E (2017) Multimodal image-to-image translation by enforcing bi-cycle consistency. In: Advances in neural information processing systems, pp 465–476

  36. Mirza M, Osindero S (2014) Conditional generative adversarial nets. CoRR, abs/1411.1784

  37. Radford A, Metz L, Chintala S (2016) Unsupervised representation learning with deep convolutional generative adversarial networks. In: 4th International conference on learning representations, ICLR 2016, San Juan, Puerto Rico, May 2–4, 2016, Conference Track Proceedings. arXiv:1511.06434


Author information

Correspondence to Mucahit Cevik.


Appendices

Appendix A: Selection of MCR formulation and its parameters

We explored three different formulations for the MCR values. Equation (5) defines a linear relationship between the MCR value and the percentage of covered modes.

$$ MCR(n) = \begin{cases} \frac{n}{N} & n = 2, \dots, N-1, \enspace n \mid B \\ 0 & n = 1 \end{cases} $$
(5)

Equation (6) models a relationship that converges early, rewarding the model with a low generator error.

$$ MCR(n) = \begin{cases} \frac{L}{1 + S_{1} \cdot e^{-\frac{n}{N} \cdot S_{2}}} & n = 2, \dots, N-1, \enspace n \mid B \\ 0 & n = 1 \end{cases} $$
(6)

Equation (7) models a relationship that avoids rewarding the generator too early, to keep it motivated.

$$ MCR(n) = \begin{cases} \frac{L}{1 + S_{1} \cdot e^{-\frac{n-N}{N} \cdot S_{2}}} & n = 2, \dots, N-1, \enspace n \mid B \\ 0 & n = 1 \end{cases} $$
(7)
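
For concreteness, the three formulations can be implemented as short functions. This is a direct re-implementation of Eqs. (5)–(7) as written above, under the reading that n is the number of covered modes and N the number of target modes; the batch condition (n | B) is omitted for brevity, and the default L, S1, and S2 follow the values selected later in this appendix.

    import numpy as np

    def mcr_linear(n, N):
        """Eq. (5): MCR grows linearly with the fraction of covered modes."""
        return 0.0 if n <= 1 else n / N

    def mcr_early(n, N, L=1.0, S1=10.0, S2=5.0):
        """Eq. (6): sigmoid in n/N; rewards the generator relatively early."""
        return 0.0 if n <= 1 else L / (1.0 + S1 * np.exp(-(n / N) * S2))

    def mcr_late(n, N, L=1.0, S1=10.0, S2=5.0):
        """Eq. (7): sigmoid in (n-N)/N; withholds the reward until n nears N."""
        return 0.0 if n <= 1 else L / (1.0 + S1 * np.exp(-((n - N) / N) * S2))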

Figure 12 shows the trajectories of the different formulations as a function of the percentage of covered modes.

Fig. 12 MCR value as a function of the fraction of covered modes (number of generated modes divided by the number of target modes). The left, middle, and right panels correspond to (5), (6), and (7), respectively

First, we experimented with fixed values of $L$, $S_1$, and $S_2$ for the different formulations on synthetic data with 36 modes. The results, illustrated in Fig. 13, indicate that (6) provides a suitable trajectory for the MCR values over the percentage of covered modes.

Fig. 13 Comparison of different MCR formulations on synthetic 2D grid data with 36 modes, averaged over 5 repeats

Finally, we evaluated different values of $L$, $S_1$, and $S_2$ on the synthetic data with 36 modes. The hyperparameter optimization results are illustrated in Fig. 14.

Fig. 14 Comparison of different hyperparameters for (6) on synthetic 2D grid data with 36 modes, averaged over 5 repeats

Equation (6) with $L$, $S_1$, and $S_2$ set to 1, 10, and 5 converges earlier on all metrics than the other formulations and hyperparameter settings. As shown in Fig. 12, these values enforce a low initial MCR value and a moderate speed of convergence to an MCR value of one. In other words, VARGAN's performance does not improve when a large MCR value is assigned to low initial mode coverage or when the trajectory converges quickly.
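
As a quick numerical check of this behaviour (our arithmetic, applying (6) with these values):

$$ \frac{1}{1 + 10e^{-0.5}} \approx 0.14, \qquad \frac{1}{1 + 10e^{-2.5}} \approx 0.55, \qquad \frac{1}{1 + 10e^{-5}} \approx 0.94 $$

at $n/N = 0.1$, $0.5$, and $1$, respectively, so the reward indeed starts low and approaches one at a moderate rate.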

Appendix B: Architecture of models

This section presents a detailed summary of the GAN architectures. Table 7 shows the feed-forward model architecture used for all datasets.

Table 7 Structure of FF GAN model
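
Since Table 7 itself is not reproduced in this preview, the following is purely an illustration of what a feed-forward GAN generator/discriminator pair for the 2D synthetic data might look like; all layer sizes and activations here are assumptions, not the table's values.

    import torch.nn as nn

    # Hypothetical feed-forward pair; the actual layers are listed in Table 7.
    generator = nn.Sequential(
        nn.Linear(64, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, 2),                  # 2D synthetic sample
    )
    discriminator = nn.Sequential(
        nn.Linear(2, 128), nn.LeakyReLU(0.2),
        nn.Linear(128, 128), nn.LeakyReLU(0.2),
        nn.Linear(128, 1),                  # real/fake logit
    )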

Table 8 shows the convolutional model architecture used for the stacked MNIST dataset. We modified this architecture for the other GAN variants to implement the specific details of their methodologies.

Table 8 Structure of convolutional GAN model for stacked MNIST dataset

The convolutional model architecture used for the EES dataset is reported in Table 9. The number of convolutional layers is changed to implement both the 9 × 9 and 19 × 19 designs.

Table 9 Structure of convolutional GAN model for EES dataset

Appendix C: Detailed results

In this section, we compare the performance of the GAN models over the training epochs. Figures 15 and 16 illustrate the convergence of the models on the synthetic data with 8 and 25 modes, based on different performance metrics over the epochs. VARGAN and GDPP converge within the first few epochs on the synthetic data with 8 modes. VARGAN also converges early on the synthetic data with 25 modes, with the remaining models trailing it by a large gap.
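
For context, mode coverage on these 2D benchmarks is commonly computed by assigning each generated sample to its nearest mode centre and counting a mode as covered when enough samples land within a few standard deviations of it. The sketch below follows that common protocol; the thresholds and the ring construction are illustrative assumptions, not necessarily the paper's exact settings.

    import numpy as np

    def covered_modes(samples, centers, std, thresh=3.0, min_count=1):
        # Distance from every sample to every mode centre.
        d = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=-1)
        nearest = d.argmin(axis=1)
        within = d[np.arange(len(samples)), nearest] < thresh * std
        # A mode counts as covered if at least `min_count` nearby samples hit it.
        counts = np.bincount(nearest[within], minlength=len(centers))
        return int((counts >= min_count).sum())

    # Example: 2D ring with 8 modes on the unit circle (illustrative setup).
    angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
    centers = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    fake = centers[np.random.randint(8, size=1000)] + 0.05 * np.random.randn(1000, 2)
    print(covered_modes(fake, centers, std=0.05))  # prints 8 when every mode is hit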

Fig. 15 Comparison of different GAN models on synthetic 2D ring data with 8 modes, averaged over 5 repeats

Fig. 16 Comparison of different GAN models on synthetic 2D grid data with 25 modes, averaged over 5 repeats

Figure 17 presents the convergence of the performance metrics for the convolutional GAN models on the stacked MNIST data. MGGAN, VARGAN, and PacVARGAN show strong early convergence on all metrics. Moreover, VARGAN and PacVARGAN exhibit lower standard deviations than MGGAN.
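
For reference, mode coverage on stacked MNIST is conventionally measured by classifying each of the three colour channels with a pretrained MNIST classifier and treating the resulting digit triple as one of 1,000 possible modes. The sketch below assumes such a classifier exists with the stated API; it may differ from the paper's exact evaluation code.

    import numpy as np

    def stacked_mnist_modes(images, classify):
        # images: (B, 28, 28, 3), each channel an independent MNIST digit.
        # `classify` is an assumed pretrained MNIST classifier mapping a
        # batch of (B, 28, 28, 1) images to integer labels of shape (B,).
        labels = [classify(images[..., c:c + 1]) for c in range(3)]
        modes = labels[0] * 100 + labels[1] * 10 + labels[2]  # triple -> 0..999
        return len(np.unique(modes))  # number of covered modes (max 1000)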

Fig. 17 Performance comparison of different GAN models with the convolutional GAN structure on the stacked MNIST dataset, averaged over 5 repeats

Cite this article

Mohammadjafari, S., Cevik, M. & Basar, A. VARGAN: variance enforcing network enhanced GAN. Appl Intell 53, 69–95 (2023). https://doi.org/10.1007/s10489-022-03199-8
