Abstract
With the use of appropriate inductive biases, Counterfactual Generative Networks (CGNs) can generate novel images from random combinations of shape, texture, and background manifolds. These images can be used to train an invariant classifier, avoiding the widespread problem of deep architectures learning spurious correlations rather than meaningful ones. As a consequence, out-of-domain robustness is improved. However, the CGN architecture comprises multiple over-parameterized networks, namely BigGAN and U2-Net. Training these networks requires appropriate background knowledge and extensive computation. Since one does not always have access to the precise training details, nor the necessary knowledge of counterfactuals, our work addresses the following question: can we use the knowledge embedded in pre-trained CGNs to train a lower-capacity model, assuming black-box access (i.e., only access to the pretrained CGN model) to the components of the architecture? To this end, we propose SKDCGN, a novel approach that attempts knowledge transfer using Knowledge Distillation (KD). In our proposed architecture, each independent mechanism (shape, texture, background) is represented by a student ‘TinyGAN’ that learns from the pretrained teacher ‘BigGAN’. We demonstrate the efficacy of the proposed method on standard datasets such as ImageNet and MNIST, using KD and appropriate loss functions. Moreover, as an additional contribution, we conduct a thorough study of the composition mechanism of CGNs, to gain a better understanding of how each mechanism influences the classification accuracy of an invariant classifier. Code is available at: https://github.com/ambekarsameer96/SKDCGN.
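The teacher-student transfer described above can be sketched as a simple distillation objective: the student generator is trained to match the teacher's output for the same input. The sketch below is a minimal illustration of this idea, not the paper's exact formulation; the function name `distillation_loss`, the weights `alpha`/`beta`, and the plain MSE pixel- and feature-matching terms are all assumptions made for clarity.

```python
import numpy as np

def distillation_loss(student_out, teacher_out,
                      student_feat, teacher_feat,
                      alpha=1.0, beta=0.1):
    """Illustrative KD loss: the student mimics the teacher's generated
    image (pixel-level term) and an intermediate feature map
    (feature-level term). alpha/beta are hypothetical weights."""
    pixel = np.mean((student_out - teacher_out) ** 2)
    feat = np.mean((student_feat - teacher_feat) ** 2)
    return alpha * pixel + beta * feat

# Toy example: random stand-ins for teacher/student outputs and features.
rng = np.random.default_rng(0)
t_img = rng.normal(size=(3, 8, 8))   # teacher "image" (e.g. BigGAN output)
s_img = rng.normal(size=(3, 8, 8))   # student "image" (e.g. TinyGAN output)
t_feat = rng.normal(size=(16,))
s_feat = rng.normal(size=(16,))

loss = distillation_loss(s_img, t_img, s_feat, t_feat)
print(float(loss))
```

In SKDCGN, one such student would be trained per independent mechanism (shape, texture, background), each distilling from the corresponding pretrained teacher component under black-box access.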
S. Ambekar, M. Tafuro, A. Ankit, D. van der Mast, M. Alence—Equal contribution.
Notes
1. It is noteworthy that other techniques were tested in an attempt to improve the visual quality of the results. Although they did not prove as beneficial, they are described in Sect. 4 of the Appendix (Supplementary Material).
Acknowledgments
We would like to express our sincere gratitude to Prof. dr. Efstratios Gavves and Prof. Wilker Aziz for effectively organizing the Deep Learning II course at the University of Amsterdam, which is the main reason this paper exists. We are thankful to our supervisor, Christos Athanasiadis, for his invaluable guidance throughout the project. Finally, we also thank the former Program Director of the MSc. Artificial Intelligence, Prof. dr. Cees G.M. Snoek, and the current Program Manager, Prof. dr. Evangelos Kanoulas, for effectively conducting the Master's program in Artificial Intelligence at the University of Amsterdam.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Ambekar, S., Tafuro, M., Ankit, A., der Mast, D.v., Alence, M., Athanasiadis, C. (2023). SKDCGN: Source-free Knowledge Distillation of Counterfactual Generative Networks Using cGANs. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13804. Springer, Cham. https://doi.org/10.1007/978-3-031-25069-9_43
Print ISBN: 978-3-031-25068-2
Online ISBN: 978-3-031-25069-9