VARGAN: variance enforcing network enhanced GAN

Mohammadjafari, Sanaz; Cevik, Mucahit; Basar, Ayse

doi:10.1007/s10489-022-03199-8

Published: 12 April 2022

Volume 53, pages 69–95, (2023)

Applied Intelligence

Abstract

Generative adversarial networks (GANs) are among the most widely used generative models. GANs can learn complex multi-modal distributions and generate realistic samples. Despite their success in generating synthetic data, GANs can suffer from unstable training and mode collapse. In this paper, we propose a new GAN architecture called variance enforcing GAN (VARGAN), which incorporates a third network to introduce diversity in the generated samples. The third network measures the diversity of the generated samples, and this measure is used to penalize the generator’s loss for low-diversity samples. The network is trained on the available training data and on undesired distributions with limited modality. On a set of synthetic and real-world image data, VARGAN generates a more diverse set of samples than recent state-of-the-art models. High diversity, low computational complexity, and fast convergence make VARGAN a promising model for alleviating mode collapse.

References

Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Advances in neural information processing systems, pp 2672–2680
Karras T, Aila T, Laine S, Lehtinen J (2017) Progressive growing of gans for improved quality, stability, and variation. CoRR, abs/1710.10196
Brock A, Donahue J, Simonyan K (2019) Large scale gan training for high fidelity natural image synthesis. In: 7th International conference on learning representations, ICLR 2019, New Orleans, LA, USA, May 6–9, 2019. https://openreview.net/forum?id=B1xsqj09Fm
Huang B, Chen W, Wu X, Lin C-L, Suganthan P N (2018) High-quality face image generated with conditional boundary equilibrium generative adversarial networks. Pattern Recogn Lett 111:72–79
Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z et al (2017) Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4681–4690
Isola P, Zhu J-Y, Zhou T, Efros A A (2017) Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1125–1134
Zhu J-Y, Park T, Isola P, Efros A A (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp 2223–2232
Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X (2016) Improved techniques for training gans. In: Advances in neural information processing systems, pp 2234–2242
Mok TCW, Chung ACS (2018) Learning data augmentation for brain tumor segmentation with coarse-to-fine generative adversarial networks. In: International MICCAI brainlesion workshop. Springer, pp 70–80
Mohammadjafari S, Ozyegen O, Cevik M, Kavurmacioglu E, Ethier J, Basar A (2021) Designing mm-wave electromagnetic engineered surfaces using generative adversarial networks. Neural Comput Appl:1–15
Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. In: International Conference on Machine Learning. PMLR, pp 214–223
Miyato T, Kataoka T, Koyama M, Yoshida Y (2018) Spectral normalization for generative adversarial networks. In: International conference on learning representations
Li W, Liang Z, Neuman J, Chen J, Cui X (2021) Multi-generator gan learning disconnected manifolds with mutual information. Knowl-Based Syst 212:106513
Lin Z, Khetan A, Fanti G, Oh S (2018) Pacgan: The power of two samples in generative adversarial networks. In: Advances in neural information processing systems, pp 1498–1507
Metz L, Poole B, Pfau D, Sohl-Dickstein J (2017) Unrolled generative adversarial networks. In: 5th International conference on learning representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings. https://openreview.net/forum?id=BydrOIcle
Gurumurthy S, Kiran Sarvadevabhatla R, Venkatesh Babu R (2017) Deligan: Generative adversarial networks for diverse and limited data. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 166–174
Li W, Xu L, Liang Z, Wang S, Cao J, Lam T C, Cui X (2021) JDGAN: Enhancing generator on extremely limited data via joint distribution. Neurocomputing 431:148–162
Che T, Li Y, Jacob A P, Bengio Y, Li W (2016) Mode regularized generative adversarial networks. CoRR, abs/1612.02136
Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville A C (2017) Improved training of wasserstein gans. In: Advances in neural information processing systems, pp 5767–5777
Tolstikhin I O, Gelly S, Bousquet O, Simon-Gabriel C-J, Schölkopf B (2017) Adagan: Boosting generative models. In: Advances in neural information processing systems, pp 5424–5433
Park D K, Yoo S, Bahng H, Choo J, Park N (2018) MEGAN: mixture of experts of generative adversarial networks for multimodal image generation. In: international joint conference on artificial intelligence, pp 878–884
Ghosh A, Kulharia V, Namboodiri V P, Torr PHS, Dokania P K (2018) Multi-agent diverse generative adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8513–8521
Khayatkhoei M, Elgammal A, Singh M (2018) Disconnected manifold learning for generative adversarial networks. In: Proceedings of the 32nd international conference on neural information processing systems, pp 7354–7364
Liu K, Tang W, Zhou F, Qiu G (2019) Spectral regularization for combating mode collapse in gans. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6382–6390
Elfeki M, Couprie C, Riviere M, Elhoseiny M (2019) GDPP: Learning diverse generations using determinantal point processes. In: International conference on machine learning, pp 1774–1783
Mao Q, Lee H-Y, Tseng H-Y, Ma S, Yang M-H (2019) Mode seeking generative adversarial networks for diverse image synthesis. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 1429–1437
Bang D, Shim H (2021) MGGAN: Solving mode collapse using manifold-guided training. In: international conference on computer vision, pp 2347–2356
Li W, Fan L, Wang Z, Ma C, Cui X (2021) Tackling mode collapse in multi-generator gans with orthogonal vectors. Pattern Recogn 110:107646
Tran N-T, Bui T-A, Cheung N-M (2018) Dist-gan: An improved gan using distance constraints. In: Proceedings of the european conference on computer vision (ECCV), pp 370–385
Tran N-T, Bui T-A, Cheung N-M (2019) Improving gan with neighbors embedding and gradient matching. In: Proceedings of the AAAI conference on artificial intelligence, vol 33, pp 5191–5198
Hoang Q, Nguyen T D, Le T, Phung D (2018) MGAN: Training generative adversarial nets with multiple generators. In: 6th International conference on learning representations, ICLR 2018, Vancouver, BC, Canada, April 30 – May 3, 2018, Conference Track Proceedings. https://openreview.net/forum?id=rkmu5b0a-
Srivastava A, Valkov L, Russell C, Gutmann M U, Sutton C (2017) Veegan: Reducing mode collapse in gans using implicit variational learning. In: Advances in neural information processing systems, pp 3308–3318
Zhong P, Mo Y, Xiao C, Chen P, Zheng C (2019) Rethinking generative mode coverage: A pointwise guaranteed approach. Adv Neural Inf Process Syst 32:2088–2099
Arjovsky M, Bottou L (2017) Towards principled methods for training generative adversarial networks. CoRR, abs/1701.04862
Zhu J-Y, Zhang R, Pathak D, Darrell T, Efros A A, Wang O, Shechtman E (2017) Multimodal image-to-image translation by enforcing bi-cycle consistency. In: Advances in neural information processing systems, pp 465–476
Mirza M, Osindero S (2014) Conditional generative adversarial nets. CoRR, abs/1411.1784
Radford A, Metz L, Chintala S (2016) Unsupervised representation learning with deep convolutional generative adversarial networks. In: 4th International conference on learning representations, ICLR 2016, San Juan, Puerto Rico, May 2–4, 2016, Conference Track Proceedings. arXiv:1511.06434

Author information

Authors and Affiliations

Data Science Lab, Ryerson University, Toronto, Canada
Sanaz Mohammadjafari, Mucahit Cevik & Ayse Basar

Corresponding author

Correspondence to Mucahit Cevik.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Selection of MCR formulation and its parameters

We explored three different formulations for the MCR values. Equation (5) presents a linear relationship between the MCR value and the percentage of covered modes.

$$ MCR(n) = \begin{cases} \frac{n}{N} & n = 2, \ldots, N-1, \enspace n \mid B \\ 0 & n = 1 \end{cases} \tag{5} $$
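
As a minimal sketch, the linear formulation of Eq. (5) can be written as a small Python function. The name `mcr_linear` is hypothetical. Two simplifications are assumptions made only for illustration: the formula is also applied at n = N, which the printed case list does not cover, and the condition n | B is not enforced.

```python
def mcr_linear(n: int, N: int) -> float:
    """Linear MCR of Eq. (5): the fraction of covered modes.

    n is the number of covered modes and N the total number of modes.
    Assumption: n = N is handled by the same n / N branch, and the
    divisibility condition from the equation is not checked.
    """
    if not 1 <= n <= N:
        raise ValueError("n must satisfy 1 <= n <= N")
    if n == 1:
        # Covering a single mode earns no reward.
        return 0.0
    return n / N


# Example on the 36-mode synthetic setting mentioned in this appendix.
for covered in (1, 9, 18, 36):
    print(covered, mcr_linear(covered, 36))
```

With 36 modes, covering half of them yields an MCR value of 0.5 under this formulation, so the reward grows at a constant rate with mode coverage.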

Equation (6) models a relationship that achieves early convergence to reward the model with low generator error.

$$ MCR(n) = \begin{cases} \frac{L}{1 + S_{1} \cdot e^{-\frac{n}{N} \cdot S_{2}}} & n = 2, \ldots, N-1, \enspace n \mid B \\ 0 & n = 1 \end{cases} \tag{6} $$

Equation (7) models a relationship that avoids rewarding the generator too early, to keep it motivated.

$$ MCR(n) = \begin{cases} \frac{L}{1 + S_{1} \cdot e^{-\frac{n-N}{N} \cdot S_{2}}} & n = 2, \ldots, N-1, \enspace n \mid B \\ 0 & n = 1 \end{cases} \tag{7} $$
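
The two logistic formulations differ only in the exponent, which the following Python sketch makes explicit. The function name `mcr_logistic` and the `shifted` flag are hypothetical. The default values L = 1, S₁ = 10 and S₂ = 5 are the ones reported later in this appendix. As an assumption, the formula is also evaluated at n = N, and the condition n | B is not enforced.

```python
import math


def mcr_logistic(n: int, N: int, L: float = 1.0, S1: float = 10.0,
                 S2: float = 5.0, shifted: bool = False) -> float:
    """Logistic MCR of Eq. (6), or of Eq. (7) when shifted=True.

    Eq. (6) uses n / N in the exponent; Eq. (7) uses (n - N) / N,
    which delays the reward until most modes are covered.
    Assumption: n = N is evaluated with the same formula.
    """
    if not 1 <= n <= N:
        raise ValueError("n must satisfy 1 <= n <= N")
    if n == 1:
        # Covering a single mode earns no reward.
        return 0.0
    x = (n - N) / N if shifted else n / N
    return L / (1.0 + S1 * math.exp(-x * S2))


# Compare both trajectories at a few coverage levels for N = 36 modes.
for covered in (2, 9, 18, 27, 36):
    eq6 = mcr_logistic(covered, 36)
    eq7 = mcr_logistic(covered, 36, shifted=True)
    print(f"n={covered:2d}  Eq.(6)={eq6:.3f}  Eq.(7)={eq7:.3f}")
```

For the same number of covered modes, the value from Eq. (7) is always below the value from Eq. (6), because its exponent is shifted by S₂.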

Figure 12 shows the trajectory of different formulations based on the percentage of covered modes.

First, we experimented with constant values of L, S₁ and S₂ for the different formulations on synthetic data with 36 modes. Our results, illustrated in Fig. 13, indicate that Eq. (6) presents a suitable trajectory for MCR values over the percentage of covered modes.

Finally, we have used different values of L, S₁ and S₂ on synthetic data with 36 modes. The hyperparameter optimization results are illustrated in Fig. 14.

Equation (6) with L = 1, S₁ = 10 and S₂ = 5 converges earlier on all metrics than the other formulations and hyperparameter settings. As shown in Fig. 12, these values enforce a low initial MCR value with a moderate speed of convergence to an MCR value of one. In other words, VARGAN performance does not improve when a large MCR value is assigned to low initial mode coverage or when the trajectory converges quickly.
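
To make the shape of this trajectory concrete, the values of Eq. (6) at L = 1, S₁ = 10 and S₂ = 5 can be evaluated at a few coverage levels. This is only an illustrative calculation: the coverage fractions 0, 0.5 and 1 are chosen here as examples, and the fraction n/N is treated as continuous.

$$ MCR = \frac{1}{1 + 10 \cdot e^{-5 \cdot \frac{n}{N}}}, \qquad \frac{n}{N} \to 0: \ \frac{1}{11} \approx 0.091, \qquad \frac{n}{N} = 0.5: \ \frac{1}{1 + 10 e^{-2.5}} \approx 0.549, \qquad \frac{n}{N} = 1: \ \frac{1}{1 + 10 e^{-5}} \approx 0.937 $$

The value starts near 0.09 for very low coverage, passes roughly 0.55 at half coverage, and approaches, without exactly reaching, one at full coverage.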

Appendix B: Architecture of models

This section gives a detailed summary of the GAN architectures. Table 7 presents the feed-forward (FF) model architecture used for all the datasets.

Table 7 Structure of FF GAN model

Table 8 shows the convolutional model architecture used for the stacked MNIST dataset. We modified this architecture for the other GAN variants to implement the specific details of their methodology.

Table 8 Structure of convolutional GAN model for stacked MNIST dataset

The convolutional model architecture used for the EES dataset is reported in Table 9. The number of convolutional layers is changed to implement both the 9 × 9 and 19 × 19 designs.

Table 9 Structure of convolutional GAN model for EES dataset

Appendix C: Detailed results

This section compares the performance of the GAN models over the training epochs. Figures 15 and 16 illustrate the convergence of the models on synthetic data with 8 and 25 modes, for each performance metric. On synthetic data with 8 modes, the VARGAN and GDPP models appear to converge within the first epochs. VARGAN also converges early on synthetic data with 25 modes, and the remaining models trail it by a large gap.

Figure 17 presents the convergence of the performance metrics for the convolutional GAN models on stacked MNIST data. MGGAN, VARGAN and PacVARGAN converge early on all the metrics. Moreover, VARGAN and PacVARGAN show lower standard deviations than MGGAN.

Mohammadjafari, S., Cevik, M. & Basar, A. VARGAN: variance enforcing network enhanced GAN. Appl Intell 53, 69–95 (2023). https://doi.org/10.1007/s10489-022-03199-8

Accepted: 05 January 2022
Published: 12 April 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s10489-022-03199-8
