Abstract
In this paper I argue that the search for explainable models and interpretable decisions in AI must be reformulated in terms of the broader project of offering a pragmatic and naturalistic account of understanding in AI. Intuitively, the purpose of providing an explanation of a model or a decision is to make it understandable to its stakeholders. But without a previous grasp of what it means to say that an agent understands a model or a decision, the explanatory strategies will lack a well-defined goal. Aside from providing a clearer objective for XAI, focusing on understanding also allows us to relax the factivity condition on explanation, which is impossible to fulfill in many machine learning models, and to focus instead on the pragmatic conditions that determine the best fit between a model and the methods and devices deployed to understand it. After an examination of the different types of understanding discussed in the philosophical and psychological literature, I conclude that interpretative or approximation models not only provide the best way to achieve the objectual understanding of a machine learning model, but are also a necessary condition to achieve post hoc interpretability. This conclusion is partly based on the shortcomings of the purely functionalist approach to post hoc interpretability that seems to be predominant in most recent literature.
Notes
I will use decision as the general term to encompass outputs from AI models, such as predictions, categorizations, action selection, etc.
Needless to say, each of these differences has been the subject of great philosophical controversy. I am simply reporting some of the reasons that have been stated in the literature to motivate the analysis of understanding as an independent concept.
Here I will not evaluate the merits of this thesis in the philosophy of science. For a discussion, see the collection edited by De Regt et al. (2009).
More precisely, the explanans-statement and the explanandum-statement must be true. If one holds, following Lewis (1986) and Woodward (2003), that the relata of the explanation relation are particulars, i.e., things or events, the claim amounts to saying that the things or events occurring in both the explanans and the explanandum position exist or occur.
Salmon’s (1971) reference class rule, for example, requires the probabilistic (causal) context of a single event to be complete to avoid any epistemic relativity.
I am grateful to an anonymous reviewer for pointing this out.
The relaxation of the factivity condition is often defended in the context of objectual understanding, but it remains controversial in the case of understanding why. I return to this distinction in Sect. 4.
De Graaf and Malle (2017) have also emphasized the importance of these pragmatic factors: “The success of an explanation therefore depends on several critical audience factors—assumptions, knowledge, and interests that an audience has when decoding the explanation” (p. 19).
See Falcone and Castelfranchi (2001) for a critique of the use of decision theory to understand trust in virtual environments.
Philosophers of science are much more inclined to accept this view than epistemologists, who have fiercely resisted it. See, for example, Zagzebski (2001), Kvanvig (2003), Elgin (2004) and Pritchard (2014). I do not have space to discuss the issue here, but from the text it should be clear that I side with the epistemologists.
A direct understanding of a phenomenon would be factive, based on a literal description of the explanatory elements involved. It is in this sense that models offer an indirect path towards objectual understanding.
In Allahyari and Lavesson (2011), 100 non-expert users were asked to compare the understandability of decision trees and rule lists. Decision trees were deemed more understandable. Freitas (2014) examines the pros and cons of decision trees, classification rules, decision tables, nearest neighbors, and Bayesian network classifiers with respect to their interpretability, and discusses how to improve the comprehensibility of classification models in general. More recently, Fürnkranz et al. (2018) performed an experiment with 390 participants to question the idea that the simplicity of a logical model, such as a rule set, determines the likelihood that a user will accept it as an explanation for a decision. Lage et al. (2019) also explore the complexities of rule sets to find features that make them more interpretable, while Piltaver et al. (2016) undertake a similar analysis in the case of classification trees. Another important aspect of this empirical line of research is the study of cognitive biases in the understanding of interpretable models. Kliegr et al. (2018) study the possible effects of biases on symbolic machine learning models.
As noted in the Introduction, none of these methods is intrinsically interpretable.
A terminological clarification is in order. Mittelstadt et al. (2019) and other researchers in XAI use the phrase “contrastive explanations” to refer to counterfactuals. But these are two very different things. In philosophy, an explanation is contrastive if it answers the question “Why p rather than q?” instead of just “Why p?” In either case the explanation provided must be factual. To turn it into a counterfactual situation, the question must be changed to: “What changes in the world would have brought about q instead of p?” And the answer will be a hypothetical or counterfactual statement, not an explanation.
To be sure, there are many scenarios where both the owner and the user (but not the developer) of the model will be satisfied with its accurate decisions without feeling the need to have an objectual understanding of it. Think of the books recommended by Amazon or the movies suggested by Netflix using the simple rule: “If you liked x, you might like y.” As I argued in Sect. 2, the relation between understanding and trust is always mediated by the interests, goals, resources, and degree of risk aversion of stakeholders. In these cases, the cost–benefit relation makes it unnecessary to make the additional effort of looking for mechanisms.
References
Achinstein, P. (1983). The nature of explanation. New York: Oxford University Press.
Allahyari, H., & Lavesson, N. (2011). User-oriented assessment of classification model understandability. In Proceedings of the 11th Scandinavian conference on artificial intelligence. Amsterdam: IOS Press.
Carter, J. A., & Gordon, E. C. (2016). Objectual understanding, factivity and belief. In M. Grajner & P. Schmechtig (Eds.), Epistemic reasons, norms and goals (pp. 423–442). Berlin: De Gruyter.
Caruana, R., Kangarloo, H., Dionisio, J. D. N., Sinha, U., & Johnson, D. (1999). Case-based explanations of non-case-based learning methods. In Proceedings of the AMIA symposium (p. 212). American Medical Informatics Association.
Darwin, C. (1860/1903). Letter to Henslow, May 1860. In F. Darwin (Ed.), More letters of Charles Darwin (Vol. I). New York: D. Appleton.
De Graaf, M. M., & Malle, B. F. (2017). How people explain action (and autonomous intelligent systems should too). In AAAI fall symposium on artificial intelligence for human–robot interaction (pp. 19–26). Palo Alto: The AAAI Press.
de Regt, H. W., & Dieks, D. (2005). A contextual approach to scientific understanding. Synthese, 144, 137–170.
de Regt, H. W., Leonelli, S., & Eigner, K. (Eds.). (2009). Scientific understanding: Philosophical perspectives. Pittsburgh: University of Pittsburgh Press.
Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.
Ehsan, U., Harrison, B., Chan, L., & Riedl, M. O. (2018). Rationalization: A neural machine translation approach to generating natural language explanations. In Proceedings of the 2018 AAAI/ACM conference on AI, ethics, and society (pp. 81–87). New York: ACM.
Elgin, C. Z. (2004). True enough. Philosophical Issues, 14, 113–131.
Elgin, C. Z. (2007). Understanding and the facts. Philosophical Studies, 132, 33–42.
Elgin, C. Z. (2008). Exemplification, idealization, and scientific understanding. In M. Suárez (Ed.), Fictions in science: Philosophical essays on modelling and idealization (pp. 77–90). London: Routledge.
Elgin, C. Z. (2017). True enough. Cambridge: MIT Press.
Falcone, R., & Castelfranchi, C. (2001). Social trust: A cognitive approach. In C. Castelfranchi & Y.-H. Tan (Eds.), Trust and deception in virtual societies (pp. 55–90). Dordrecht: Springer.
Freitas, A. A. (2014). Comprehensible classification models: a position paper. ACM SIGKDD Explorations Newsletter, 15(1), 1–10.
Fürnkranz, J., Kliegr, T., & Paulheim, H. (2018). On cognitive preferences and the plausibility of rule-based models. arXiv preprint arXiv:1803.01316.
Gilpin, L. H., Bau, D., Yuan, B. Z., Bajwa, A., Specter, M., & Kagal, L. (2019). Explaining explanations: An overview of interpretability of machine learning. arXiv preprint arXiv:1806.00069v3.
Greco, J. (2010). Achieving knowledge. Cambridge: Cambridge University Press.
Greco, J. (2012). Intellectual virtues and their place in philosophy. In C. Jäger & W. Löffler (Eds.), Epistemology: Contexts, values, disagreement: Proceedings of the 34th international Wittgenstein symposium (pp. 117–130). Heusenstamm: Ontos.
Grimm, S. R. (2006). Is understanding a species of knowledge? British Journal for the Philosophy of Science, 57, 515–535.
Grimm, S. R. (2011). Understanding. In S. Bernecker & D. Pritchard (Eds.), The Routledge companion to epistemology (pp. 84–94). New York: Routledge.
Grimm, S. R. (2014). Understanding as knowledge of causes. In A. Fairweather (Ed.), Virtue epistemology naturalized: Bridges between virtue epistemology and philosophy of science. Dordrecht: Springer.
Grimm, S. R. (Ed.). (2018). Making sense of the world: New essays on the philosophy of understanding. New York: Oxford University Press.
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., & Pedreschi, D. (2018). A survey of methods for explaining black box models. ACM Computing Surveys (CSUR), 51(5), Article 93.
Hempel, C. G. (1965). Aspects of scientific explanation. New York: The Free Press.
Huysmans, J., Dejaeger, K., Mues, C., Vanthienen, J., & Baesens, B. (2011). An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models. Decision Support Systems, 51(1), 141–154.
Kelemen, D. (1999). Functions, goals, and intentions: Children’s teleological reasoning about objects. Trends in Cognitive Sciences, 3(12), 461–468.
Khalifa, K. (2012). Inaugurating understanding or repackaging explanation. Philosophy of Science, 79, 15–37.
Kim, B. (2015). Interactive and interpretable machine learning models for human machine collaboration. Ph.D. thesis, Massachusetts Institute of Technology.
Kliegr, T., Bahník, Š., & Fürnkranz, J. (2018). A review of possible effects of cognitive biases on interpretation of rule-based machine learning models. arXiv preprint arXiv:1804.02969.
Krening, S., Harrison, B., Feigh, K., Isbell, C., Riedl, M., & Thomaz, A. (2016). Learning from explanations using sentiment and advice in RL. IEEE Transactions on Cognitive and Developmental Systems, 9(1), 44–55.
Kvanvig, J. (2003). The value of knowledge and the pursuit of understanding. New York: Cambridge University Press.
Kvanvig, J. (2009). Response to critics. In A. Haddock, A. Millar, & D. Pritchard (Eds.), Epistemic value (pp. 339–351). New York: Oxford University Press.
Lage, I., Chen, E., He, J., Narayanan, M., Kim, B., Gershman, S., et al. (2019). An evaluation of the human-interpretability of explanation. arXiv preprint arXiv:1902.00006.
Lapuschkin, S., Wäldchen, S., Binder, A., Montavon, G., Samek, W., & Müller, K. R. (2019). Unmasking Clever Hans predictors and assessing what machines really learn. Nature Communications, 10(1), 1096.
Lepri, B., Oliver, N., Letouzé, E., Pentland, A., & Vinck, P. (2017). Fair, transparent, and accountable algorithmic decision-making processes: The premise, the proposed solutions, and the open challenges. Philosophy & Technology, 31, 611–627.
Lewis, D. K. (1986). Causal explanation. In D. K. Lewis (Ed.), Philosophical papers (Vol. II, pp. 214–240). New York: Oxford University Press.
Lipton, P. (2009). Understanding without explanation. In H. W. de Regt, S. Leonelli, & K. Eigner (Eds.), Scientific understanding: Philosophical perspectives (pp. 43–63). Pittsburgh: University of Pittsburgh Press.
Lipton, Z. C. (2016). The mythos of model interpretability. arXiv preprint arXiv:1606.03490.
Lombrozo, T., & Gwynne, N. Z. (2014). Explanation and inference: Mechanistic and functional explanations guide property generalization. Frontiers in Human Neuroscience, 8, 700.
Lombrozo, T., & Wilkenfeld, D. A. (forthcoming). Mechanistic vs. functional understanding. In S. R. Grimm (Ed.), Varieties of understanding: New perspectives from philosophy, psychology, and theology. New York: Oxford University Press.
McAuley, J., & Leskovec, J. (2013). Hidden factors and hidden topics: Understanding rating dimensions with review text. In Proceedings of the 7th ACM conference on recommender systems (pp. 165–172). New York: ACM.
Miller, T. (2019). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267, 1–18.
Miller, T., Howe, P., & Sonenberg, L. (2017). Explainable AI: Beware of inmates running the asylum. In Proceedings of the IJCAI-17 workshop on explainable AI (XAI) (pp. 36–42). http://www.intelligentrobots.org/files/IJCAI2017/IJCAI-17_XAI_WS_Proceedings.pdf. Accessed March 10, 2019.
Mittelstadt, B., Russell, C., & Wachter, S. (2019). Explaining explanations in AI. In Proceedings of the conference on fairness, accountability, and transparency (pp. 279–288). New York: ACM.
Mizrahi, M. (2012). Idealizations and scientific understanding. Philosophical Studies, 160, 237–252.
Páez, A. (2006). Explanations in K. An analysis of explanation as a belief revision operation. Oberhausen: Athena Verlag.
Páez, A. (2009). Artificial explanations: the epistemological interpretation of explanation in AI. Synthese, 170, 131–146.
Pazzani, M. (2000). Knowledge discovery from data? IEEE Intelligent Systems, 15(2), 10–13.
Piltaver, R., Luštrek, M., Gams, M., & Martinčić-Ipšić, S. (2016). What makes classification trees comprehensible? Expert Systems with Applications: An International Journal, 62(C), 333–346.
Potochnik, A. (2017). Idealization and the aims of science. Chicago: University of Chicago Press.
Pritchard, D. (2008). Knowing the answer, understanding and epistemic value. Grazer Philosophische Studien, 77, 325–339.
Pritchard, D. (2014). Knowledge and understanding. In A. Fairweather (Ed.), Virtue epistemology naturalized: Bridges between virtue epistemology and philosophy of science (pp. 315–328). Dordrecht: Springer.
Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A., & Lawrence, N. D. (Eds.). (2009). Dataset shift in machine learning. Cambridge: MIT Press.
Reiss, J. (2012). The explanation paradox. Journal of Economic Methodology, 19, 43–62.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?”: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1135–1144). New York: ACM.
Salmon, W. C. (1971). Statistical explanation. In W. C. Salmon (Ed.), Statistical explanation and statistical relevance. Pittsburgh: University of Pittsburgh Press.
Salmon, W. C. (1984). Scientific explanation and the causal structure of the world. Princeton: Princeton University Press.
Samek, W., Wiegand, T., & Müller, K. R. (2017). Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv preprint arXiv:1708.08296.
Strevens, M. (2013). No understanding without explanation. Studies in History and Philosophy of Science, 44, 510–515.
van Fraassen, B. (1980). The scientific image. Oxford: Clarendon Press.
Wilkenfeld, D. (2013). Understanding as representation manipulability. Synthese, 190, 997–1016.
Woodward, J. (2003). Making things happen. A theory of causal explanation. New York: Oxford University Press.
Zagzebski, L. (2001). Recovering understanding. In M. Steup (Ed.), Knowledge, truth, and duty: Essays on epistemic justification, responsibility, and virtue. New York: Oxford University Press.
Zagzebski, L. (2009). On epistemology. Belmont: Wadsworth.
Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In 13th European conference on computer vision ECCV 2014 (pp. 818–833). Cham: Springer.
Cite this article
Páez, A. The Pragmatic Turn in Explainable Artificial Intelligence (XAI). Minds & Machines 29, 441–459 (2019). https://doi.org/10.1007/s11023-019-09502-w