To Err is (only) Human. Reflections on How to Move from Accuracy to Trust for Medical AI

  • Conference paper
  • In: Exploring Innovation in a Digital World

Abstract

In this paper, we contribute to the deconstruction of the concept of accuracy with respect to machine learning systems that are used in human decision making, and specifically in medicine. We argue that, by taking a socio-technical stance, it is necessary to move from the idea that these systems are “agents that can err” to the idea that they are just tools by which humans can interpret new cases in light of the technologically mediated interpretation of past cases, as if they were wearing a pair of tinted glasses. In this new narrative, accuracy is a meaningless construct, while it is important that beholders can “believe their eyes” (or spectacles), and therefore trust the tool enough to make sensible decisions.


Notes

  1. Available at https://covid19-blood-ml.herokuapp.com/.

  2. This is the acronym for reverse transcriptase-polymerase chain reaction, a laboratory technique for the quantification of viral RNA in research and clinical settings.


Author information

Correspondence to Federico Cabitza.


Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Cabitza, F., Campagner, A., Datteri, E. (2021). To Err is (only) Human. Reflections on How to Move from Accuracy to Trust for Medical AI. In: Ceci, F., Prencipe, A., Spagnoletti, P. (eds) Exploring Innovation in a Digital World. Lecture Notes in Information Systems and Organisation, vol 51. Springer, Cham. https://doi.org/10.1007/978-3-030-87842-9_4
