Abstract
Interpretability is becoming an expected, and even essential, characteristic of machine learning systems, especially under regulations such as the European Union's GDPR. Most existing work on interpretability in natural language processing (NLP) has focused on producing explanatory responses to questions of the form "Why p?", i.e., identifying the causal attributes that support the prediction of "p". This type of local explainability explains a single prediction made by a model for a single input by quantifying the contribution of each feature to the predicted output class, and most such methods are post-hoc. In this paper, we propose a technique that learns centroid vectors concurrently while training the black-box model, in order to support answers to both "Why p?" and "Why p and not q?", where "q" is another class that is contrastive to "p". Across multiple datasets, our approach achieves better results than traditional post-hoc methods.
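To make the idea concrete, the following is a minimal sketch, not the authors' actual architecture: a fixed random embedding table stands in for the neural encoder, one centroid vector per class is updated jointly with training, classes are scored by distance to their centroid, and a contrastive "Why p and not q?" attribution projects each token's embedding onto the direction separating the two centroids. All names and sizes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the paper's encoder is a trained neural network,
# replaced here by a fixed random embedding table for illustration.
VOCAB, DIM, CLASSES = 50, 8, 3
embed = rng.normal(size=(VOCAB, DIM))
centroids = rng.normal(size=(CLASSES, DIM))  # learned jointly with the model

def doc_vector(token_ids):
    # Mean-pooled document representation.
    return embed[token_ids].mean(axis=0)

def logits(doc):
    # Score each class by negative squared distance to its centroid.
    return -((doc - centroids) ** 2).sum(axis=1)

def centroid_step(doc, label, lr=0.1):
    # One gradient step on the centroid loss ||doc - c_y||^2,
    # pulling the true-class centroid toward the document
    # (this is the "concurrent" learning, in toy form).
    centroids[label] -= lr * 2.0 * (centroids[label] - doc)

def contrastive_attribution(token_ids, p, q):
    # Token-level contribution to "Why p and not q?": projection of each
    # token embedding onto the direction separating the two centroids.
    direction = centroids[p] - centroids[q]
    return embed[token_ids] @ direction

tokens = rng.integers(0, VOCAB, size=10)
doc = doc_vector(tokens)
before = logits(doc)[1]
centroid_step(doc, label=1)
after = logits(doc)[1]
assert after > before            # centroid moved toward the document
scores = contrastive_attribution(tokens, p=1, q=0)  # one score per token
```

Because the centroids live in the same space as the document vectors, the same learned parameters serve both prediction (distance-based logits) and explanation (per-token projections), which is what distinguishes this from purely post-hoc attribution.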
Acknowledgements
We would like to acknowledge the support of the Alberta Machine Intelligence Institute (Amii), and the Natural Sciences and Engineering Research Council of Canada (NSERC).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Babiker, H.K.B., Kim, M.Y., Goebel, R. (2023). Neural Networks with Feature Attribution and Contrastive Explanations. In: Amini, M.R., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13713. Springer, Cham. https://doi.org/10.1007/978-3-031-26387-3_23
DOI: https://doi.org/10.1007/978-3-031-26387-3_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26386-6
Online ISBN: 978-3-031-26387-3