Bias in medical AI: Implications for clinical decision-making

James L Cross¹, Michael A Choma², John A Onofrey²,³,⁴

Affiliations

¹ Yale School of Medicine, New Haven, Connecticut, United States of America.
² Department of Radiology & Biomedical Imaging, Yale University, New Haven, Connecticut, United States of America.
³ Department of Urology, Yale University, New Haven, Connecticut, United States of America.
⁴ Department of Biomedical Engineering, Yale University, New Haven, Connecticut, United States of America.

PMID: 39509461
PMCID: PMC11542778
DOI: 10.1371/journal.pdig.0000651

Review

James L Cross et al. PLOS Digit Health. 2024 Nov 7;3(11):e0000651. doi: 10.1371/journal.pdig.0000651. eCollection 2024 Nov.

Abstract

Biases in medical artificial intelligence (AI) arise and compound throughout the AI lifecycle. These biases can have significant clinical consequences, especially in applications that involve clinical decision-making. Left unaddressed, biased medical AI can lead to substandard clinical decisions and the perpetuation and exacerbation of longstanding healthcare disparities. We discuss potential biases that can arise at different stages in the AI development pipeline and how they can affect AI algorithms and clinical decision-making. Bias can occur in data features and labels, model development and evaluation, deployment, and publication. Insufficient sample sizes for certain patient groups can result in suboptimal performance, algorithm underestimation, and clinically unmeaningful predictions. Missing patient findings can also produce biased model behavior, including capturable but nonrandomly missing data, such as diagnosis codes, and data that is not usually or not easily captured, such as social determinants of health. Expertly annotated labels used to train supervised learning models may reflect implicit cognitive biases or substandard care practices. Overreliance on performance metrics during model development may obscure bias and diminish a model's clinical utility. When applied to data outside the training cohort, model performance can deteriorate relative to previous validation and can do so differentially across subgroups. How end users interact with deployed solutions can introduce bias. Finally, where models are developed and published, and by whom, impacts the trajectories and priorities of future medical AI development. Solutions to mitigate bias must be implemented with care and include the collection of large and diverse data sets, statistical debiasing methods, thorough model evaluation, emphasis on model interpretability, and standardized bias reporting and transparency requirements. Prior to real-world implementation in clinical settings, rigorous validation through clinical trials is critical to demonstrate unbiased application. Addressing biases across model development stages is crucial for ensuring all patients benefit equitably from the future of medical AI.
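
The point that aggregate performance metrics may obscure bias, and that performance can differ across subgroups, can be illustrated by stratifying a metric by patient group. The following is a minimal sketch and not the authors' method: the toy labels, the group names "A" and "B", and the function names are all hypothetical, chosen only to show how a reasonable overall sensitivity can hide poor sensitivity in an underrepresented group.

```python
from collections import defaultdict


def overall_sensitivity(y_true, y_pred):
    """Sensitivity (true positive rate) over all positive cases."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(preds_for_positives) / len(preds_for_positives)


def sensitivity_by_group(y_true, y_pred, groups):
    """Return {group: sensitivity}, computed only over each group's positives."""
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            if pred == 1:
                true_pos[group] += 1
            else:
                false_neg[group] += 1
    seen = set(true_pos) | set(false_neg)
    return {g: true_pos[g] / (true_pos[g] + false_neg[g]) for g in seen}


# Hypothetical toy data: group "A" is well represented, group "B" is not.
y_true = [1] * 16 + [0] * 16 + [1] * 4 + [0] * 4
y_pred = [1] * 15 + [0] * 1 + [0] * 16 + [1] * 1 + [0] * 3 + [0] * 4
groups = ["A"] * 32 + ["B"] * 8

overall = overall_sensitivity(y_true, y_pred)
per_group = sensitivity_by_group(y_true, y_pred, groups)

print(f"overall sensitivity: {overall:.2f}")
for name in sorted(per_group):
    print(f"group {name} sensitivity: {per_group[name]:.2f}")
```

In this toy example the overall figure is dominated by the larger group, so reporting it alone would conceal the much lower sensitivity in the smaller one. Real evaluations would also need confidence intervals, since small subgroups give noisy estimates.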

Copyright: © 2024 Cross et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

The authors have declared that no competing interests exist.
