Training Interpretable Convolutional Neural Networks by Differentiating Class-Specific Filters

Liang, Haoyu; Ouyang, Zhihao; Zeng, Yuyuan; Su, Hang; He, Zihao; Xia, Shu-Tao; Zhu, Jun; Zhang, Bo

doi:10.1007/978-3-030-58536-5_37

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12347)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Convolutional neural networks (CNNs) have been successfully used in a range of tasks. However, CNNs are often viewed as “black boxes” and lack interpretability. One main reason is filter-class entanglement, an intricate many-to-many correspondence between filters and classes. Most existing works attempt post-hoc interpretation of a pre-trained model, while neglecting to reduce the entanglement underlying the model. In contrast, we focus on alleviating filter-class entanglement during training. Inspired by cellular differentiation, we propose a novel strategy to train interpretable CNNs by encouraging class-specific filters, each of which responds to only one (or a few) classes. Concretely, we design a learnable sparse Class-Specific Gate (CSG) structure that flexibly assigns each filter to one (or a few) classes. The gate allows a filter’s activation to pass only when the input sample comes from the assigned class. Extensive experiments demonstrate that our method generates a sparse and highly class-related representation of the input, which leads to stronger interpretability. Moreover, compared with the standard training strategy, our model displays benefits in applications such as object localization and adversarial sample detection. Code link: https://github.com/hyliang96/CSGCNN.
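
To make the gating idea in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `apply_class_specific_gate`, the array shapes, and the hand-written binary gate matrix are all assumptions chosen for illustration. In the paper the gate is learned and encouraged to be sparse; here it is fixed so that each filter passes activations for exactly one class.

```python
import numpy as np

def apply_class_specific_gate(feature_maps, labels, gate):
    """Mask filter activations with a per-class gate (illustrative only).

    feature_maps: array of shape (N, K, H, W), one map per filter.
    labels:       integer array of shape (N,), the class of each sample.
    gate:         array of shape (C, K) with entries in [0, 1];
                  gate[c, k] says how much filter k may pass for class c.
    """
    per_sample_gate = gate[labels]  # shape (N, K): one gate row per sample
    # Broadcast the per-filter gate value over the spatial dimensions.
    return feature_maps * per_sample_gate[:, :, None, None]

# Hypothetical toy setup: 2 classes, 4 filters, each filter tied to one class.
gate = np.array([[1.0, 1.0, 0.0, 0.0],   # class 0 keeps filters 0 and 1
                 [0.0, 0.0, 1.0, 1.0]])  # class 1 keeps filters 2 and 3
features = np.ones((2, 4, 3, 3))
labels = np.array([0, 1])
gated = apply_class_specific_gate(features, labels, gate)
```

With this toy gate, the class-0 sample keeps only the activations of filters 0 and 1, and the class-1 sample keeps only those of filters 2 and 3; every other feature map is zeroed.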

H. Liang and Z. Ouyang contributed equally.

Notes

1. \(\mathrm{CE}(y \,\|\, \tilde{y}^G_\theta) = -\frac{1}{|D|}\sum_{(x,y)\in D} \log\big((\tilde{y}^G_\theta)_y\big)\), where \(\tilde{y}^G_\theta\) is a predicted probability vector.
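
The cross-entropy in this note is the average negative log-probability assigned to the true class over the dataset \(D\). A minimal NumPy sketch follows; the function name `cross_entropy` and the toy probabilities are assumptions for illustration, and the predicted probability vectors are simply passed in rather than produced by a network.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Average negative log-probability of the true class.

    probs:  array of shape (N, C); each row is a predicted probability vector.
    labels: integer array of shape (N,) with the true class of each sample.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    # Pick the predicted probability of the true class for every sample.
    true_class_probs = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(true_class_probs))

# Hypothetical toy predictions for two samples and two classes.
toy_probs = [[0.5, 0.5],
             [0.25, 0.75]]
toy_labels = [0, 1]
ce_value = cross_entropy(toy_probs, toy_labels)  # -(log 0.5 + log 0.75) / 2
```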

Acknowledgement

This work was supported by the National Key R&D Program of China (2017YFA0700904), NSFC Projects (61620106010, U19B2034, U1811461, U19A2081, 61673241, 61771273), Beijing NSF Project (L172037), PCL Future Greater-Bay Area Network Facilities for Large-scale Experiments and Applications (LZC0019), Beijing Academy of Artificial Intelligence (BAAI), Tsinghua-Huawei Joint Research Program, a grant from Tsinghua Institute for Guo Qiang, Tiangong Institute for Intelligent Computing, the JP Morgan Faculty Research Program, Microsoft Research Asia, Rejoice Sport Tech. co., LTD and the NVIDIA NVAIL Program with GPU/DGX Acceleration.

Author information

Authors and Affiliations

Department of Computer Science and Technology, BNRist Center, Institute for AI, THBI Laboratory, Tsinghua University, Beijing, 100084, China
Haoyu Liang, Hang Su, Jun Zhu & Bo Zhang
Tsinghua SIGS, Shenzhen, 518055, China
Zhihao Ouyang, Yuyuan Zeng & Shu-Tao Xia
Peng Cheng Laboratory, Shenzhen, China
Yuyuan Zeng & Shu-Tao Xia
ByteDance AI Lab
Zhihao Ouyang
Department of CS, University of Southern California, Los Angeles, USA
Zihao He

Corresponding authors

Correspondence to Hang Su or Jun Zhu.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

About this paper

Cite this paper

Liang, H. et al. (2020). Training Interpretable Convolutional Neural Networks by Differentiating Class-Specific Filters. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12347. Springer, Cham. https://doi.org/10.1007/978-3-030-58536-5_37

DOI: https://doi.org/10.1007/978-3-030-58536-5_37
Published: 03 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58535-8
Online ISBN: 978-3-030-58536-5
eBook Packages: Computer Science, Computer Science (R0)
