Abstract
Advances in AI technologies have resulted in superior levels of AI-based model performance. However, this has also led to a greater degree of model complexity, resulting in “black box” models. In response, the field of explainable AI (xAI) has emerged with the aim of providing explanations catered to human understanding, trust, and transparency. Yet, we still have a limited understanding of how xAI addresses the need for explainable AI in the context of healthcare. Our research explores the differing explanation needs amongst stakeholders during the development of an AI system for classifying COVID-19 patients for the ICU. We demonstrate that there is a constellation of stakeholders who have different explanation needs, not just the “user”. Further, the findings demonstrate how the need for xAI emerges through concerns associated with specific stakeholder groups, i.e., the development team, subject matter experts, decision makers, and the audience. Our findings contribute to the expansion of xAI by highlighting that different stakeholders have different explanation needs. From a practical perspective, the study provides insights into how AI systems can be adjusted to support different stakeholders’ needs, ensuring better implementation and operation in a healthcare context.
References
A.B. Arrieta et al., Explainable Artificial Intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 58, 82–115 (2020)
C.J. Cai, S. Winter, D. Steiner, L. Wilcox, M. Terry, ‘Hello AI’: uncovering the onboarding needs of medical practitioners for human–AI collaborative decision-making. Proc. ACM Hum.-Comput. Interact. 3(CSCW), 1–24 (2019)
A. Adadi, M. Berrada, Peeking inside the black-box: a survey on Explainable Artificial Intelligence (XAI). IEEE Access 6, 52138–52160 (2018)
F. Doshi-Velez, B. Kim, Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608 (2017), pp. 1–13
Z.C. Lipton, The mythos of model interpretability. arXiv preprint arXiv:1606.03490 (2016)
M.T. Ribeiro, S. Singh, C. Guestrin, ‘Why should I trust you?’ Explaining the predictions of any classifier, in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016), pp. 1135–1144
A. Holzinger, C. Biemann, C.S. Pattichis, D.B. Kell, What do we need to build explainable AI systems for the medical domain? (2017), pp. 1–28
U. Pawar, D. O’Shea, S. Rea, R. O’Reilly, Incorporating explainable artificial intelligence (XAI) to aid the understanding of machine learning in the healthcare domain. CEUR Workshop Proc. 2771, 169–180 (2020)
J. Phua et al., Intensive care management of coronavirus disease 2019 (COVID-19): challenges and recommendations. Lancet Respir. Med. 8(5), 506–517 (2020)
E. Tjoa, C. Guan, A survey on explainable artificial intelligence (XAI): towards medical XAI (2019)
S. Lebovitz, Diagnostic doubt and artificial intelligence: An inductive field study of radiology work, in 40th Int. Conf. Inf. Syst. ICIS 2019 (2020)
T. Davenport, R. Kalakota, The potential for artificial intelligence in healthcare. Futur. Healthc. J. 6(2), 94–98 (2019)
T. Panch, H. Mattie, L.A. Celi, The ‘inconvenient truth’ about AI in healthcare. npj Digit. Med. 2(1), 4–6 (2019)
A.L. Fogel, J.C. Kvedar, Artificial intelligence powers digital medicine. npj Digit. Med. 1(1), 3–6 (2018)
W. Hryniewska, P. Bombiński, P. Szatkowski, P. Tomaszewska, A. Przelaskowski, P. Biecek, Do not repeat these mistakes—a critical appraisal of applications of explainable artificial intelligence for image-based COVID-19 detection (2020)
C. O’Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, vol. 272 (2016)
A. Shaban-Nejad, M. Michalowski, D.L. Buckeridge, Health intelligence: how artificial intelligence transforms population and personalized health. npj Digit. Med. 1(1) (2018)
A. Sharma, S. Rani, D. Gupta, Artificial intelligence-based classification of chest X-ray images into COVID-19 and other infectious diseases. Int. J. Biomed. Imaging (2020)
E. Strickland, How IBM Watson Overpromised and Underdelivered on AI Health Care - IEEE Spectrum, 2019. [Online]. Available: https://spectrum.ieee.org/biomedical/diagnostics/how-ibm-watson-overpromised-and-underdelivered-on-ai-health-care. Accessed: 26 Jan 2021
Z. Obermeyer, B. Powers, C. Vogeli, S. Mullainathan, Dissecting racial bias in an algorithm used to manage the health of populations. Science 366(6464), 447–453 (2019)
W.D. Heaven, Google’s Medical AI was Super Accurate in a Lab. Real Life was a Different Story. MIT Technology Review (Online). Available: https://www.technologyreview.com/2020/04/27/1000658/google-medical-ai-accurate-lab-real-life-clinic-covid-diabetes-retina-disease/. Accessed 16 Mar 2021
J. Angwin, J. Larson, S. Mattu, L. Kirchner, Machine Bias—ProPublica. ProPublica (Online). Available: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing. Accessed: 03 Mar 2019
T.W. Kim, Explainable artificial intelligence (XAI), the goodness criteria and the grasp-ability test (2018), pp. 1–7
J. Gerlings, A. Shollo, I.D. Constantiou, Reviewing the need for Explainable Artificial Intelligence (xAI), in HICSS 54 (2021), pp. 1284–1293
J. Kemper, D. Kolkman, Transparent to whom? No algorithmic accountability without a critical audience. Inf. Commun. Soc. (2019)
D.D. Miller, The medical AI insurgency: what physicians must know about data to practice with intelligent machines. npj Digit. Med. 2(1) (2019)
J. Burrell, How the machine ‘thinks’: understanding opacity in machine learning algorithms. Big Data Soc. 3(1), 1–12 (2016)
A. Páez, The pragmatic turn in Explainable Artificial Intelligence (XAI). Minds Mach. (2019)
M. Bhandari, D. Jaswal, Decision making in medicine-an algorithmic approach. Med. J. Armed Forces India (2002)
J.-B. Lamy, B. Sekar, G. Guezennec, J. Bouaud, B. Séroussi, Explainable artificial intelligence for breast cancer: a visual case-based reasoning approach. Artif. Intell. Med. 94, 42–53 (2019)
A. Wodecki et al., Explainable Artificial Intelligence (XAI): the need for explainable AI. Philos. Trans. A. Math. Phys. Eng. Sci. (2017)
S.M. Lauritsen et al., Explainable Artificial Intelligence Model to Predict Acute Critical Illness from Electronic Health Records (2019)
N. Prentzas, A. Nicolaides, E. Kyriacou, A. Kakas, C. Pattichis, Integrating machine learning with symbolic reasoning to build an explainable ai model for stroke prediction, in Proceedings—2019 IEEE 19th International Conference on Bioinformatics and Bioengineering, BIBE 2019 (2019)
S. Lebovitz, H. Lifshitz-Assaf, N. Levina, To incorporate or not to incorporate AI for critical judgments: the importance of ambiguity in professionals’ judgment process (2020)
A. Rajkomar, J. Dean, I. Kohane, Machine learning in medicine. N. Engl. J. Med. 380(14), 1347–1358 (2019)
M.T. Keane, E.M. Kenny, How case-based reasoning explains neural networks: a theoretical analysis of XAI using post-hoc explanation-by-example from a survey of ANN-CBR twin-systems, in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019)
Stanford University, Artificial Intelligence and Life in 2030 (2016), p. 52
L. Reis, C. Maier, J. Mattke, M. Creutzenberg, T. Weitzel, Addressing user resistance would have prevented a healthcare AI project failure. MIS Q. Exec. 19(4), 279–296 (2020)
A. Vellido, The importance of interpretability and visualization in machine learning for applications in medicine and health care. Neural Comput. Appl. 32(24), 18069–18083 (2020)
A. Holzinger, G. Langs, H. Denk, K. Zatloukal, H. Müller, Causability and explainability of artificial intelligence in medicine. Wiley Interdisc. Rev.: Data Mining Knowl. Discovery 9(4), 1–13 (2019)
M. Goldstein, S. Uchida, A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data. PLoS ONE 11(4), 1–31 (2016)
C. Molnar, Interpretable Machine Learning: A Guide for Making Black Box Models Explainable (2019), p. 247
Z.C. Lipton, The mythos of model interpretability. Commun. ACM 61, 35–43 (2016)
G. Ciatto, M.I. Schumacher, A. Omicini, D. Calvaresi, Agent-based explanations in AI: towards an abstract framework, in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12175 LNAI (2020), pp. 3–20
L.H. Gilpin, D. Bau, B.Z. Yuan, A. Bajwa, M. Specter, L. Kagal, Explaining explanations: an overview of interpretability of machine learning, in Proceedings—2018 IEEE 5th Int. Conf. Data Sci. Adv. Anal., DSAA 2018 (2019), pp. 80–89
T. Miller, Explanation in artificial intelligence: insights from the social sciences (2018)
A. Goldstein, A. Kapelner, J. Bleich, E. Pitkin, Peeking inside the black box: visualizing statistical learning with plots of individual conditional expectation. J. Comput. Graph. Stat. 24(1), 44–65 (2015)
R. Brandão, J. Carbonera, C. de Souza, J. Ferreira, B.N. Gonçalves, C.F. Leitão, Mediation challenges and socio-technical gaps for explainable deep learning applications. arXiv Prepr. arXiv … (2019), pp. 1–39
O. Biran, C. Cotton, Explanation and justification in machine learning: a survey. IJCAI Work. Explain. AI 8–14 (2017)
S.T. Mueller, R.R. Hoffman, W. Clancey, A. Emrey, G. Klein, Explanation in human-AI systems: a literature meta-review. Def. Adv. Res. Proj. Agency 204 (2019)
R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, D. Pedreschi, A survey of methods for explaining black box models. ACM Comput. Surv. 51(5) (2018)
A. Asatiani, P. Malo, P.R. Nagbøl, E. Penttinen, T. Rinta-Kahila, A. Salovaara, Challenges of explaining the behavior of black-box AI systems. MIS Q. Exec. 19(4), 259–278 (2020)
T. Miller, P. Howe, L. Sonenberg, Explainable AI: beware of inmates running the asylum (2017)
P. Madumal, L. Sonenberg, T. Miller, F. Vetere, A grounded interaction protocol for explainable artificial intelligence. Proc. Int. Joint Conf. Autonomous Agents Multiagent Syst. AAMAS 2, 1033–1041 (2019)
W. Samek, T. Wiegand, K.-R. Müller, Explainable artificial intelligence: understanding, visualizing and interpreting deep learning models (2017)
R.R. Hoffman, S.T. Mueller, G. Klein, J. Litman, Metrics for Explainable AI: Challenges and Prospects (2018), pp. 1–50
Z. Che, S. Purushotham, R. Khemani, Y. Liu, Interpretable deep models for ICU outcome prediction, in AMIA Annu. Symp. Proc., vol. 2016 (2016), pp. 371–380
M. Ilyas, H. Rehman, A. Nait-Ali, Detection of COVID-19 from chest X-ray images using artificial intelligence: an early review. arXiv (2020), pp. 1–8
C.H. Sudre et al., Anosmia and other SARS-CoV-2 positive test-associated symptoms, across three national, digital surveillance platforms as the COVID-19 pandemic and response unfolded: an observation study, in medRxiv (2020)
E. Dong, H. Du, L. Gardner, An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 20(5), 533–534 (2020)
Y. Hswen, J.S. Brownstein, X. Xu, E. Yom-Tov, Early detection of COVID-19 in China and the USA: Summary of the implementation of a digital decision-support and disease surveillance tool. BMJ Open 10(12) (2020)
T. Macaulay, AI sent first coronavirus alert, but underestimated the danger, in The Next Web (2020) (Online). Available: https://thenextweb.com/neural/2020/02/21/ai-sent-first-coronavirus-alert-but-underestimated-the-danger/. Accessed: 09 Jan 2021
M.E.H. Chowdhury et al., Can AI help in screening viral and COVID-19 pneumonia? IEEE Access 8, 132665–132676 (2020)
J. Bullock, A. Luccioni, K. Hoffman Pham, C. Sin Nga Lam, M. Luengo-Oroz, Mapping the landscape of Artificial Intelligence applications against COVID-19. J. Artif. Intell. Res. 69, 807–845 (2020)
K. Murphy et al., COVID-19 on chest radiographs: a multireader evaluation of an artificial intelligence system. Radiology 296(3), E166–E172 (2020)
J. Zhang et al., Viral pneumonia screening on chest X-rays using confidence-aware anomaly detection, in IEEE Trans. Med. Imaging (2020), p. 1
X. Li, C. Li, D. Zhu, COVID-MobileXpert: On-Device COVID-19 Patient Triage and Follow-up using Chest X-rays (2020)
J.D. Arias-Londoño, J.A. Gomez-Garcia, L. Moro-Velazquez, J.I. Godino-Llorente, Artificial intelligence applied to chest X-ray images for the automatic detection of COVID-19: a thoughtful evaluation approach (2020), pp. 1–17
R.M. Wehbe et al., DeepCOVID-XR: an artificial intelligence algorithm to detect COVID-19 on chest radiographs trained and tested on a large US clinical dataset. Radiology, 203511 (2020)
R.R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-CAM: visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vis. 128(2), 336–359 (2020)
M.M. Ahsan, K.D. Gupta, M.M. Islam, S. Sen, M.L. Rahman, M.S. Hossain, Study of Different Deep Learning Approach With Explainable AI for Screening Patients With Covid-19 Symptoms: Using CT Scan and Chest X-ray Image Dataset, arXiv (2020)
K. Conboy, G. Fitzgerald, L. Mathiassen, Qualitative methods research in information systems: motivations, themes, and contributions. Eur. J. Information Syst. 21(2), 113–118 (2012)
P. Powell, G. Walsham, Interpreting information systems in organizations. J. Oper. Res. Soc. (1993)
A. George, A. Bennett, Case Studies and Theory Development in the Social Science (MIT Press, Cambridge, MA, 2005)
J.M. Bartunek, P.G. Foster-Fishman, C.B. Keys, Using collaborative advocacy to foster intergroup cooperation: a joint insider-outsider investigation. Hum. Relations 49(6), 701–733 (1996)
D.A. Gioia, K.G. Corley, A.L. Hamilton, Seeking qualitative rigor in inductive research: notes on the Gioia methodology. Organ. Res. Methods 16(1), 15–31 (2013)
K.G. Corley, D.A. Gioia, Identity ambiguity and change in the wake of a corporate spin-off. Admin. Sci. Q. 49(2), 173–208 (2004)
J. Corbin, A. Strauss, Basics of Qualitative Research (3rd edn.): Techniques and Procedures for Developing Grounded Theory. SAGE Publications Inc (2012)
A. Borghesi, R. Maroldi, COVID-19 outbreak in Italy: experimental chest X-ray scoring system for quantifying and monitoring disease progression. Radiol. Medica 125(5), 509–513 (2020)
A. Borghesi et al., Radiographic severity index in COVID-19 pneumonia: relationship to age and sex in 783 Italian patients. Radiol. Medica 125(5), 461–464 (2020)
S.M. Lundberg, S.I. Lee, A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30 (2017), pp. 4766–4775
M. Veale, M. Van Kleek, R. Binns, Fairness and accountability design needs for algorithmic support in high-stakes public sector decision-making, in Conference on Human Factors in Computing Systems—Proceedings (2018)
M. Kuzba, P. Biecek, What Would You Ask the Machine Learning Model? Identification of User Needs for Model Explanations Based on Human-Model Conversations, arXiv (2020)
E. Rader, K. Cotter, J. Cho, Explanations as mechanisms for supporting algorithmic transparency,” in Conference on Human Factors in Computing Systems—Proceedings (2018)
S. Chari, D.M. Gruen, O. Seneviratne, D.L. McGuinness, Directions for explainable knowledge-enabled systems (2020)
G. Huang, Z. Liu, L. van der Maaten, K.Q. Weinberger, Densely connected convolutional networks, in Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017 (2017), pp. 2261–2269
R. Selvan et al., Lung segmentation from chest X-rays using variational data imputation (2020)
K. He, G. Gkioxari, P. Dollár, R. Girshick, Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 42(2), 386–397 (2020)
J. Irvin et al., CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison, in 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 (2019), pp. 590–597
Kaggle, RSNA Pneumonia Detection Challenge. Kaggle (2020) (Online). Available: https://www.kaggle.com/c/rsna-pneumonia-detection-challenge. Accessed: 23 Jan 2021
Z. Yue, L. Ma, R. Zhang, Comparison and validation of deep learning models for the diagnosis of pneumonia. Comput. Intell. Neurosci. (2020)
Appendix 1—Technical Aspects of LungX
Turning to the more technical aspects of LungX, the solution is built on three different types of convolutional neural networks (CNNs). The first of the three models is a DenseNet-121 [86], which detects the presence of six different lung abnormalities as well as their location on a given X-ray.
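Since the chapter does not publish LungX’s implementation, the following is only a minimal sketch of how such a multi-label classifier could be assembled with PyTorch and torchvision. The six-finding output is taken from the description above; the pretrained weights, the decision threshold, and the helper predict_findings are illustrative assumptions, and the localization of findings would require an additional detection head not shown here.

```python
# Illustrative sketch only; LungX's actual implementation is not published.
import torch
import torch.nn as nn
from torchvision import models

NUM_FINDINGS = 6  # the six lung abnormalities described in the text

# DenseNet-121 backbone with its ImageNet head swapped for a
# six-way multi-label head (one sigmoid output per finding).
model = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
model.classifier = nn.Linear(model.classifier.in_features, NUM_FINDINGS)

def predict_findings(xrays: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Return a boolean (batch, NUM_FINDINGS) matrix of detected findings."""
    model.eval()
    with torch.no_grad():
        probs = torch.sigmoid(model(xrays))  # independent per-finding probabilities
    return probs > threshold
```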
The two additional models calculate the severity score for COVID-19 patients. Only one of the six findings is related to COVID-19; if this finding (edema/consolidation) is detected, the two additional models compute the severity score. A U-net [87] segments the lungs into 12 pre-defined anatomical zones, while a Mask R-CNN [88] segments the opacities in the lungs. By mapping the outputs of the two models onto each other, it is possible to calculate how many lung zones are affected by opacity. The number of affected zones, out of 12, is the severity score, indicating how badly the lungs are affected (Fig. 7.7).
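This zone-counting logic lends itself to a short illustration. The sketch below assumes the two segmentation models emit binary masks (12 zone masks and one opacity mask); the 5% overlap threshold and the function name severity_score are assumptions, as the chapter does not specify how an “affected” zone is operationalized.

```python
# Illustrative sketch; the overlap threshold is an assumption.
import numpy as np

def severity_score(zone_masks: np.ndarray, opacity_mask: np.ndarray,
                   overlap_threshold: float = 0.05) -> int:
    """Count how many of the 12 anatomical lung zones are affected by opacity.

    zone_masks:   (12, H, W) boolean masks, one per zone (from the U-net).
    opacity_mask: (H, W) boolean opacity mask (from the Mask R-CNN).
    """
    affected = 0
    for zone in zone_masks:
        zone_area = zone.sum()
        if zone_area == 0:
            continue  # zone not visible on this X-ray
        overlap_fraction = np.logical_and(zone, opacity_mask).sum() / zone_area
        if overlap_fraction >= overlap_threshold:
            affected += 1
    return affected  # severity score in the range 0-12
```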
The triage category indicates whether triage is low, medium, or high. It is based on the severity score and on any lung abnormalities the system detects. The category is configurable: the clinics using the system decide how the six abnormalities and the COVID-19 severity score each rank in relation to one another and to the triage levels. The highest triage category detected takes precedence.
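A rough sketch of this configurable ranking is given below. The finding names, the severity cut-offs, and the CLINIC_CONFIG structure are hypothetical; the chapter states only that clinics rank the six abnormalities and the severity score against the triage levels, and that the highest category wins.

```python
# Hypothetical configuration and triage logic; only the low/medium/high
# levels and the "highest category wins" rule come from the chapter.
TRIAGE_ORDER = ["low", "medium", "high"]

CLINIC_CONFIG = {
    # triage level assigned per detected finding (names are invented)
    "findings": {"pneumothorax": "high", "edema/consolidation": "medium"},
    # severity-score cut-offs for COVID-19 patients: (minimum score, level)
    "severity_cutoffs": [(8, "high"), (4, "medium"), (0, "low")],
}

def triage_category(detected_findings, severity=None):
    levels = ["low"]  # default when nothing is detected
    for finding in detected_findings:
        levels.append(CLINIC_CONFIG["findings"].get(finding, "low"))
    if severity is not None:  # only computed when the COVID-19 finding is present
        for cutoff, level in CLINIC_CONFIG["severity_cutoffs"]:
            if severity >= cutoff:
                levels.append(level)
                break
    # the highest triage category detected takes precedence
    return max(levels, key=TRIAGE_ORDER.index)

# e.g. triage_category(["edema/consolidation"], severity=9) returns "high"
```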
The data foundation for training the model consists of two open-access sources that were carefully joined: 112,120 frontal-view X-ray images of 30,805 unique patients from the Kaggle RSNA Pneumonia Detection Challenge, in PNG format, and 224,316 radiographs of 65,240 patients who underwent a radiographic examination at Stanford University Medical Center, from the CheXpert dataset [89,90,91]. However, neither dataset includes examples of COVID-19. COVID-19 examples were provided only by the hospital collaborating on the university project. Two hundred patients who tested positive for COVID-19 were run only through the models predicting disease progression (Fig. 7.8).
Cite this chapter
Gerlings, J., Jensen, M.S., Shollo, A. (2022). Explainable AI, But Explainable to Whom? An Exploratory Case Study of xAI in Healthcare. In: Lim, C.P., Chen, Y.W., Vaidya, A., Mahorkar, C., Jain, L.C. (eds) Handbook of Artificial Intelligence in Healthcare. Intelligent Systems Reference Library, vol. 212. Springer, Cham. https://doi.org/10.1007/978-3-030-83620-7_7