Interpretable AI in Healthcare: Enhancing Fairness, Safety, and Trust

  • Chapter
  • First Online:
Artificial Intelligence in Medicine

Abstract

The value and future potential of AI in healthcare are becoming increasingly evident, supported by a growing body of evidence. However, adoption into clinical practice is still held back by a lack of transparency, which stems from insufficient attention to human-comprehensible information from AI, i.e. interpretable AI. Interpretations of uncertainty, significance, and causality translate into AI that is fairer, safer, and more reliable. This is especially pertinent to safer clinical decision-making and to minimising risk to the patient. In this chapter we aim to elucidate what interpretability means and why most machine learning (i.e. AI) models fail to satisfy its definitions. We lay this out through what we believe to be its canonical components: predictions, uncertainty, significance, and causality, explaining how these different types of interpretation support different kinds of explanation and how overcoming this barrier can permit the adoption of AI into healthcare.


Notes

  1. We abuse terminology here, as the details are esoteric and somewhat beside the point. Please refer to MacDonald [24] for a more accessible account of uncertainty in deep learning.

  2. See Davis et al. [14] for a practical, prescriptive guide on how to estimate uncertainties in deep learning; a minimal ensemble-based sketch follows these notes.

  3. We refer the interested reader to Peters et al. [27] for a comprehensive review of causal inference, including methods for estimating causal relationships.
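As a flavour of the uncertainty estimation pointed to in note 2, the sketch below illustrates the deep-ensemble idea of Lakshminarayanan et al. [18]: train several identically specified models from different random initialisations and read the spread of their predictions as a rough measure of epistemic uncertainty. This is a minimal sketch on synthetic data, not the pipeline of Davis et al. [14]; the data, model sizes, and seeds are illustrative assumptions, not material from this chapter.

```python
# Minimal deep-ensemble sketch (illustrative assumption, not the authors' method).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for a clinical regression task, e.g. predicting a
# continuous biomarker from patient features.
X = rng.normal(size=(500, 10))
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

# Train an ensemble of M networks that differ only in their random seed.
M = 5
ensemble = [
    MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000, random_state=m).fit(X, y)
    for m in range(M)
]

# For a new patient, the ensemble mean is the prediction and the ensemble
# standard deviation is a rough measure of epistemic uncertainty.
x_new = rng.normal(size=(1, 10))
preds = np.array([model.predict(x_new)[0] for model in ensemble])
print(f"prediction: {preds.mean():.2f} ± {preds.std():.2f}")
```

A wide ensemble spread flags inputs on which the models disagree, which is exactly the kind of case a clinician would want surfaced rather than silently predicted.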

References

  1. Alaa AM, van der Schaar M (2017) Bayesian inference of individualized treatment effects using multi-task Gaussian processes. 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.

  2. Beaulieu-Jones BK, Finlayson SG, Yuan W, Altman RB, Kohane IS, Prasad V, Yu K-H (2020) Examining the use of real-world evidence in the regulatory process. Clin Pharmacol Ther 107(4):843–852. https://doi.org/10.1002/cpt.1658

  3. Begoli E, Bhattacharya T, Kusnezov D (2019) The need for uncertainty quantification in machine-assisted medical decision making. Nat Mach Intell 1(1):20–23. https://doi.org/10.1038/s42256-018-0004-1

  4. Bica I, Jordon J, van der Schaar M (2020) Estimating the effects of continuous-valued interventions using generative adversarial networks. ArXiv:2002.12326 [Cs, Stat]. http://arxiv.org/abs/2002.12326

  5. Bica I, Alaa AM, Lambert C, van der Schaar M (2021) From real-world patient data to individualized treatment effects using machine learning: current and future methods to address underlying challenges. Clin Pharmacol Ther 109(1):87–100. https://doi.org/10.1002/cpt.1907

  6. Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S, Herbert-Voss A, Krueger G, Henighan T, Child R, Ramesh A, Ziegler DM, Wu J, Winter C et al (2020) Language models are few-shot learners. ArXiv:2005.14165 [Cs]. http://arxiv.org/abs/2005.14165

  7. Ovadia Y, Fertig E, Ren J, Nado Z, Sculley D, Nowozin S, Dillon JV, Lakshminarayanan B, Snoek J (2019) Can you trust your model’s uncertainty? Evaluating predictive uncertainty under dataset shift. 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

  8. Chen J, Song L, Wainwright MJ, Jordan MI (2018) Learning to explain: an information-theoretic perspective on model interpretation. arXiv. https://arxiv.org/abs/1802.07814v2

  9. Chen P, Dong W, Lu X, Kaymak U, He K, Huang Z (2019) Deep representation learning for individualized treatment effect estimation using electronic health records. J Biomed Inform 100:103303. https://doi.org/10.1016/j.jbi.2019.103303

  10. Couzin-Frankel J (2019) Medicine contends with how to use artificial intelligence. Science 364(6446):1119–1120. https://doi.org/10.1126/science.2019.6446.364_1119

  11. Fort S, Hu H, Lakshminarayanan B (2019) Deep ensembles: a loss landscape perspective. arXiv. https://arxiv.org/abs/1912.02757v2

  12. Guo C, Pleiss G, Sun Y, Weinberger KQ (2017) On calibration of modern neural networks. arXiv. https://arxiv.org/abs/1706.04599v2

  13. Healthdirect (2021) Cancer immunotherapy. Healthdirect Australia, September 15. https://www.healthdirect.gov.au/cancer-immunotherapy

  14. Davis J, MacDonald S, Zhu J, Oldfather J, Trzaskowski M (2020) Quantifying uncertainty in deep learning systems. AWS Prescriptive Guidance. https://docs.aws.amazon.com/prescriptive-guidance/latest/ml-quantifying-uncertainty/welcome.html

  15. Kallus N, Puli AM, Shalit U (2018) Removing hidden confounding by experimental grounding. Adv Neural Inf Process Syst 31. https://papers.nips.cc/paper/2018/hash/566f0ea4f6c2e947f36795c8f58ba901-Abstract.html

  16. Khan S, Hayat M, Zamir SW, Shen J, Shao L (2019) Striking the right balance with uncertainty. In: Proceedings – 2019 IEEE/CVF conference on computer vision and pattern recognition, CVPR 2019, pp 103–112. https://doi.org/10.1109/CVPR.2019.00019

  17. Kristiadi A, Hein M, Hennig P (2020) Being Bayesian, even just a bit, fixes overconfidence in ReLU networks. https://arxiv.org/abs/2002.10118v2

  18. Lakshminarayanan B, Pritzel A, Blundell C (2017) Simple and scalable predictive uncertainty estimation using deep ensembles. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems 30. Curran Associates, Inc, pp 6402–6413. http://papers.nips.cc/paper/7219-simple-and-scalable-predictive-uncertainty-estimation-using-deep-ensembles.pdf

  19. Ledesma P (2020) How much does a clinical trial cost? Sofpromed, January 2. https://www.sofpromed.com/how-much-does-a-clinical-trial-cost

  20. Lee H-S, Shen C, Zame W, Lee J-W, van der Schaar M (2021) SDF-Bayes: cautious optimism in safe dose-finding clinical trials with drug combinations and heterogeneous patient groups. ArXiv:2101.10998 [Cs, Stat]. http://arxiv.org/abs/2101.10998

  21. Louizos C, Shalit U, Mooij J, Sontag D, Zemel R, Welling M (2017) Causal effect inference with deep latent-variable models. In: Proceedings of the 31st international conference on neural information processing systems, pp 6449–6459

  22. Lundberg S, Lee S-I (2017) A unified approach to interpreting model predictions. ArXiv:1705.07874 [Cs, Stat]. http://arxiv.org/abs/1705.07874

  23. MacDonald S (2019) Interpretations in Bayesian deep learning. University of Queensland. Master of Data Science Capstone Thesis Project

  24. MacDonald S (2020) Interpretations of learning. Medium, March 3. https://towardsdatascience.com/interpretations-in-learning-part-1-4342c5741a71

  25. Obermeyer Z, Emanuel EJ (2016) Predicting the future—big data, machine learning, and clinical medicine. N Engl J Med 375(13):1216–1219. https://doi.org/10.1056/NEJMp1606181

  26. Oberst M, Johansson FD, Wei D, Gao T, Brat G, Sontag D, Varshney KR (2020) Characterization of overlap in observational studies. ArXiv:1907.04138 [Cs, Stat]. http://arxiv.org/abs/1907.04138

  27. Peters J, Janzing D, Schölkopf B (2017) Elements of causal inference: foundations and learning algorithms. MIT Press

  28. Rasmussen CE (2004) Gaussian processes in machine learning. In: Bousquet O, von Luxburg U, Rätsch G (eds) Advanced lectures on machine learning: ML summer schools 2003, Canberra, Australia, February 2–14, 2003, Tübingen, Germany, August 4–16, 2003, Revised lectures. Springer, pp 63–71. https://doi.org/10.1007/978-3-540-28650-9_4

  29. Ribeiro MT, Singh S, Guestrin C (2016) “Why should I trust you?”: explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1135–1144. https://doi.org/10.1145/2939672.2939778

  30. Richens JG, Lee CM, Johri S (2020) Improving the accuracy of medical diagnosis with causal machine learning. Nat Commun 11(1):3923. https://doi.org/10.1038/s41467-020-17419-7

  31. Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26(1):139–140. https://doi.org/10.1093/bioinformatics/btp616

  32. Lee H, Zhang Y, Zame WR, Shen C, Lee J, van der Schaar M (2020) Robust recursive partitioning for heterogeneous treatment effects with uncertainty quantification. 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.

  33. Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70(1):41–55. https://doi.org/10.1093/biomet/70.1.41

  34. Rudin C (2019) Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell 1(5):206–215. https://doi.org/10.1038/s42256-019-0048-x

  35. Schwab P, Linhardt L, Bauer S, Buhmann JM, Karlen W (2020) Learning counterfactual representations for estimating individual dose-response curves. Proc AAAI Conf Artif Intell 34(04):5612–5619. https://doi.org/10.1609/aaai.v34i04.6014

  36. Shalit U, Johansson FD, Sontag D (2017) Estimating individual treatment effect: generalization bounds and algorithms. In: Proceedings of the 34th international conference on machine learning, pp 3076–3085. https://proceedings.mlr.press/v70/shalit17a.html

  37. Smilkov D, Thorat N, Kim B, Viégas FB, Wattenberg M (2017) SmoothGrad: removing noise by adding noise. CoRR:abs/1706.03825. http://arxiv.org/abs/1706.03825

  38. Sundararajan M, Taly A, Yan Q (2017) Axiomatic attribution for deep networks. In: Proceedings of the 34th international conference on machine learning – volume 70, pp 3319–3328

  39. van Amersfoort J, Smith L, Teh YW, Gal Y (2020) Uncertainty estimation using a single deep deterministic neural network. Proceedings of the 37th international conference on machine learning (ICML 2020), Vienna, Austria, PMLR 119.

  40. van Amersfoort J, Smith L, Jesson A, Key O, Gal Y (2022) On feature collapse and deep kernel learning for single forward pass uncertainty. https://arxiv.org/abs/2102.11409

  41. Wang Y, Blei DM (2019) The blessings of multiple causes. J Am Stat Assoc 114(528):1574–1596. https://doi.org/10.1080/01621459.2019.1686987

  42. Yap M, Johnston RL, Foley H, MacDonald S, Kondrashova O, Tran KA, Nones K, Koufariotis LT, Bean C, Pearson JV, Trzaskowski M, Waddell N (2021) Verifying explainability of a deep learning tissue classifier trained on RNA-seq data. Sci Rep 11(1):2641. https://doi.org/10.1038/s41598-021-81773-9

  43. Zhang L, Wang Y, Ostropolets A, Mulgrave JJ, Blei DM, Hripcsak G (2019) The medical Deconfounder: assessing treatment effects with electronic health records. In: Proceedings of the 4th machine learning for healthcare conference, pp 490–512. https://proceedings.mlr.press/v106/zhang19a.html


Author information

Correspondence to Maciej Trzaskowski.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

MacDonald, S., Steven, K., Trzaskowski, M. (2022). Interpretable AI in Healthcare: Enhancing Fairness, Safety, and Trust. In: Raz, M., Nguyen, T.C., Loh, E. (eds) Artificial Intelligence in Medicine. Springer, Singapore. https://doi.org/10.1007/978-981-19-1223-8_11

  • DOI: https://doi.org/10.1007/978-981-19-1223-8_11

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-1222-1

  • Online ISBN: 978-981-19-1223-8

  • eBook Packages: Medicine, Medicine (R0)
