
Abstract

Fairness in artificial intelligence (AI) for medical image analysis is a key factor for preventing new or exacerbated healthcare disparities as the use of automated decision-making tools in medicine increases. However, bias mitigation strategies to achieve group fairness have appreciable shortcomings, which may pose ethical limitations in clinical settings. In this work, we study a well-defined case example of a deep learning-based medical image analysis model exhibiting unfairness between racial subgroups. Specifically, with the task of sex classification using tabulated data from 6,276 T1-weighted brain magnetic resonance imaging (MRI) scans of 9- to 10-year-old adolescents, we investigate how adversarial debiasing for equalized odds between White and Black subgroups affects the performance of other structured and intersectional subgroups. Although the debiasing process was successful in reducing classification performance disparities between White and Black subgroups, accuracies for the highest-performing subgroups were substantially degraded, and disproportionate impacts on performance were seen when considering intersections of sex, race, and socioeconomic status. These results highlight one of several challenges when attempting to define and achieve algorithmic fairness, particularly in medical imaging applications.
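The equalized-odds criterion targeted by the debiasing described above requires that true positive and false positive rates be equal across subgroups. A minimal sketch of how such a disparity can be quantified is shown below; the helper names (`rates`, `equalized_odds_gap`) and the toy labels are illustrative assumptions, not the authors' implementation or ABCD Study data.

```python
def rates(y_true, y_pred):
    """Return (TPR, FPR) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

def equalized_odds_gap(y_true, y_pred, group):
    """Largest absolute TPR or FPR difference between group 0 and group 1.

    A classifier satisfying equalized odds scores 0: both its true
    positive rate and false positive rate match across the groups.
    """
    per_group = {}
    for g in (0, 1):
        idx = [i for i, gi in enumerate(group) if gi == g]
        per_group[g] = rates([y_true[i] for i in idx],
                             [y_pred[i] for i in idx])
    (tpr0, fpr0), (tpr1, fpr1) = per_group[0], per_group[1]
    return max(abs(tpr0 - tpr1), abs(fpr0 - fpr1))

# Toy example: group 0 is classified perfectly, group 1 is not.
gap = equalized_odds_gap(
    y_true=[1, 1, 0, 0, 1, 1, 0, 0],
    y_pred=[1, 1, 0, 0, 1, 0, 1, 0],
    group=[0, 0, 0, 0, 1, 1, 1, 1],
)
print(gap)  # → 0.5
```

Adversarial debiasing approaches such as the one referenced in this work drive a gap like this toward zero during training, typically by penalizing an auxiliary network that tries to predict the protected attribute; as the abstract notes, closing the gap for one pair of subgroups does not guarantee equitable outcomes for intersectional subgroups.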



Acknowledgments

Data used in the preparation of this article were obtained from the Adolescent Brain Cognitive Development℠ (ABCD) Study (https://abcdstudy.org), held in the NIMH Data Archive (NDA). This is a multisite, longitudinal study designed to recruit more than 10,000 children aged 9–10 and follow them over 10 years into early adulthood. The ABCD Study® is supported by the National Institutes of Health and additional federal partners under award numbers U01DA041048, U01DA050989, U01DA051016, U01DA041022, U01DA051018, U01DA051037, U01DA050987, U01DA041174, U01DA041106, U01DA041117, U01DA041028, U01DA041134, U01DA050988, U01DA051039, U01DA041156, U01DA041025, U01DA041120, U01DA051038, U01DA041148, U01DA041093, U01DA041089, U24DA041123, U24DA041147. A full list of supporters is available at https://abcdstudy.org/federal-partners.html. A listing of participating sites and a complete listing of the study investigators can be found at https://abcdstudy.org/consortium_members/. ABCD consortium investigators designed and implemented the study and/or provided data but did not necessarily participate in the analysis or writing of this report. This manuscript reflects the views of the authors and may not reflect the opinions or views of the NIH or ABCD consortium investigators. The ABCD data used in this report came from https://doi.org/10.15154/1527782.

This work was supported by the River Fund at Calgary Foundation, Alberta Innovates, Canada Research Chairs Program, and the Canadian Institutes of Health Research.


Corresponding author

Correspondence to Emma A. M. Stanley .


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Stanley, E.A.M., Wilms, M., Forkert, N.D. (2022). Disproportionate Subgroup Impacts and Other Challenges of Fairness in Artificial Intelligence for Medical Image Analysis. In: Baxter, J.S.H., et al. (eds.) Ethical and Philosophical Issues in Medical Imaging, Multimodal Learning and Fusion Across Scales for Clinical Decision Support, and Topological Data Analysis for Biomedical Imaging. EPIMI ML-CDS TDA4BiomedicalImaging 2022. Lecture Notes in Computer Science, vol. 13755. Springer, Cham. https://doi.org/10.1007/978-3-031-23223-7_2

  • DOI: https://doi.org/10.1007/978-3-031-23223-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23222-0

  • Online ISBN: 978-3-031-23223-7
