Abstract
With the use of appropriate inductive biases, Counterfactual Generative Networks (CGNs) can generate novel images from random combinations of shape, texture, and background manifolds. These images can be used to train an invariant classifier, avoiding the widespread problem of deep architectures learning spurious correlations rather than meaningful ones. As a consequence, out-of-domain robustness is improved. However, the CGN architecture comprises multiple over-parameterized networks, namely BigGAN and U2-Net. Training these networks requires appropriate background knowledge and extensive computation. Since one does not always have access to the precise training details, nor the necessary knowledge of counterfactuals, our work addresses the following question: can we use the knowledge embedded in pre-trained CGNs to train a lower-capacity model, assuming black-box access (i.e., only access to the pretrained CGN model) to the components of the architecture? To this end, we propose SKDCGN, a novel approach that attempts knowledge transfer using Knowledge Distillation (KD). In our proposed architecture, each independent mechanism (shape, texture, background) is represented by a student ‘TinyGAN’ that learns from the pretrained teacher ‘BigGAN’. We demonstrate the efficacy of the proposed method on standard datasets such as ImageNet and MNIST, using KD and appropriate loss functions. Moreover, as an additional contribution, we conduct a thorough study of the composition mechanism of CGNs, to gain a better understanding of how each mechanism influences the classification accuracy of an invariant classifier. Code is available at: https://github.com/ambekarsameer96/SKDCGN.
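The teacher-student transfer described above can be sketched as a simple distillation objective: the student generator is trained to match the teacher's output for the same input. The sketch below is a minimal illustration of this idea, not the paper's exact formulation; the function name `distillation_loss`, the weights `alpha`/`beta`, and the plain MSE pixel- and feature-matching terms are all assumptions made for clarity.

```python
import numpy as np

def distillation_loss(student_out, teacher_out,
                      student_feat, teacher_feat,
                      alpha=1.0, beta=0.1):
    """Illustrative KD loss: the student mimics the teacher's generated
    image (pixel-level term) and an intermediate feature map
    (feature-level term). alpha/beta are hypothetical weights."""
    pixel = np.mean((student_out - teacher_out) ** 2)
    feat = np.mean((student_feat - teacher_feat) ** 2)
    return alpha * pixel + beta * feat

# Toy example: random stand-ins for teacher/student outputs and features.
rng = np.random.default_rng(0)
t_img = rng.normal(size=(3, 8, 8))   # teacher "image" (e.g. BigGAN output)
s_img = rng.normal(size=(3, 8, 8))   # student "image" (e.g. TinyGAN output)
t_feat = rng.normal(size=(16,))
s_feat = rng.normal(size=(16,))

loss = distillation_loss(s_img, t_img, s_feat, t_feat)
print(float(loss))
```

In SKDCGN, one such student would be trained per independent mechanism (shape, texture, background), each distilling from the corresponding pretrained teacher component under black-box access.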
S. Ambekar, M. Tafuro, A. Ankit, D. van der Mast, M. Alence—Equal contribution.
Notes
1. It is noteworthy that other techniques were tested in an attempt to improve the visual quality of the results. Although they did not prove as beneficial, they are described in Sect. 4 of the Appendix (Supplementary Material).
Acknowledgments
We would like to express our sincere gratitude to Prof. dr. Efstratios Gavves and Prof. Wilker Aziz for effectively organizing the Deep Learning II course at the University of Amsterdam, which is the main reason this paper exists. We are thankful to our supervisor, Christos Athanasiadis, for his invaluable guidance throughout the project. Finally, we also thank the former Program Director of the MSc. Artificial Intelligence, Prof. dr. Cees G.M. Snoek, and the current Program Manager, Prof. dr. Evangelos Kanoulas, for effectively conducting the Master's program in Artificial Intelligence at the University of Amsterdam.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Ambekar, S., Tafuro, M., Ankit, A., der Mast, D.v., Alence, M., Athanasiadis, C. (2023). SKDCGN: Source-free Knowledge Distillation of Counterfactual Generative Networks Using cGANs. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13804. Springer, Cham. https://doi.org/10.1007/978-3-031-25069-9_43
Print ISBN: 978-3-031-25068-2
Online ISBN: 978-3-031-25069-9