Abstract
Recent advances in deep learning have driven the adoption of neural network models for tasks ranging from resource utilization to autonomous driving. Most deep learning models are opaque black boxes that are not easily explained: unlike the weights of a linear model, the weights of a neural network are not inherently interpretable to humans. The need for explainable deep learning has led to the development of a variety of methods that help us better understand both the decisions and the decision-making process of neural network models. Many of the general post-hoc, model-agnostic methods presented in Chap. 5 also apply to deep learning models; this chapter presents a collection of explanation approaches developed specifically for neural networks by leveraging their architecture or learning method.
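The contrast between linear models and neural networks can be made concrete with a minimal NumPy sketch (all weights below are hypothetical, chosen only for illustration): in a linear model, each weight is exactly the change in the prediction per unit change in its feature, while in even a tiny one-hidden-layer network no single weight carries that meaning.

```python
import numpy as np

# Linear model: each weight is directly interpretable as the change in the
# output per unit change in the corresponding input feature.
w = np.array([2.0, -1.0, 0.5])   # hypothetical learned weights
b = 0.1
x = np.array([1.0, 1.0, 1.0])

linear_pred = w @ x + b
e0 = np.array([1.0, 0.0, 0.0])
# Increasing feature 0 by 1 changes the prediction by exactly w[0].
assert np.isclose((w @ (x + e0) + b) - linear_pred, w[0])

# Tiny neural network: the first-layer weights lose this one-to-one meaning,
# because each input is mixed with the others and passed through a
# nonlinearity before reaching the output.
W1 = np.array([[1.0, -2.0, 0.3],
               [0.5,  1.5, -1.0]])  # hypothetical hidden-layer weights
w2 = np.array([1.0, -1.0])          # hypothetical output weights

def mlp(x):
    h = np.maximum(0.0, W1 @ x)     # ReLU hidden layer
    return w2 @ h

delta = mlp(x + e0) - mlp(x)
# The effect of feature 0 on the output is not any single entry of W1 or w2;
# it depends on which hidden units are active at this particular input x.
```

This input-dependence of feature effects is precisely what the attribution methods surveyed in this chapter (e.g., gradient-based saliency or layer-wise relevance propagation) are designed to disentangle.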
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Kamath, U., Liu, J. (2021). Explainable Deep Learning. In: Explainable Artificial Intelligence: An Introduction to Interpretable Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-83356-5_6
Print ISBN: 978-3-030-83355-8
Online ISBN: 978-3-030-83356-5