
Abstract

Recent advances in deep learning have driven the adoption of neural network models for tasks ranging from resource utilization to autonomous driving. Most deep learning models, however, are opaque black boxes that are not easily explained: unlike the coefficients of a linear model, the weights of a neural network are not inherently interpretable to humans. The need for explainable deep learning has led to the development of a variety of methods that help us better understand both the decisions and the decision-making process of neural network models. We note that many of the general post-hoc, model-agnostic methods presented in Chap. 5 are also applicable to deep learning models. This chapter presents a collection of explanation approaches developed specifically for neural networks, leveraging their architecture or learning method.
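
As a concrete illustration of the kind of architecture-specific approach the chapter covers, below is a minimal sketch of vanilla gradient saliency, which attributes a prediction to input pixels via the gradient of the class score with respect to the input. The sketch assumes a PyTorch image classifier; the function name and tensor shapes are illustrative, not taken from the chapter.

    import torch

    def gradient_saliency(model, x, target_class):
        """Vanilla gradient saliency: |d(class score)/d(input)|, reduced over channels.

        Assumes x is an image batch of shape (1, C, H, W); the returned map has
        shape (1, H, W), with large values marking influential pixels.
        """
        model.eval()
        x = x.clone().detach().requires_grad_(True)  # track gradients w.r.t. the input
        score = model(x)[0, target_class]            # scalar score for the target class
        score.backward()                             # backpropagate from the score to the pixels
        return x.grad.detach().abs().amax(dim=1)     # channel-wise max of |gradient|

Gradient-based variants such as guided backpropagation, integrated gradients, and Grad-CAM refine this basic idea by changing how the signal is propagated back through the network's layers.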




Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Kamath, U., Liu, J. (2021). Explainable Deep Learning. In: Explainable Artificial Intelligence: An Introduction to Interpretable Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-83356-5_6


  • DOI: https://doi.org/10.1007/978-3-030-83356-5_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-83355-8

  • Online ISBN: 978-3-030-83356-5

  • eBook Packages: Computer Science, Computer Science (R0)
