Large language models propagate race-based medicine

Jesutofunmi A Omiye^#^{1

2}, Jenna C Lester^#³, Simon Spichak⁴, Veronica Rotemberg⁵, Roxana Daneshjou^{6

7}

Affiliations

¹ Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA.
² Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA.
³ Department of Dermatology, University of California San Francisco, San Francisco, CA, USA.
⁴ Independent Researcher, Toronto, Ontario, Canada.
⁵ Dermatology Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
⁶ Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA. roxanad@stanford.edu.
⁷ Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA. roxanad@stanford.edu.

^# Contributed equally.

PMID: 37864012
PMCID: PMC10589311
DOI: 10.1038/s41746-023-00939-z

Large language models propagate race-based medicine

Jesutofunmi A Omiye et al. NPJ Digit Med. 2023.

. 2023 Oct 20;6(1):195.

doi: 10.1038/s41746-023-00939-z.

Authors

Jesutofunmi A Omiye^#^{1

2}, Jenna C Lester^#³, Simon Spichak⁴, Veronica Rotemberg⁵, Roxana Daneshjou^{6

7}

Affiliations

¹ Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA.
² Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA.
³ Department of Dermatology, University of California San Francisco, San Francisco, CA, USA.
⁴ Independent Researcher, Toronto, Ontario, Canada.
⁵ Dermatology Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
⁶ Department of Dermatology, Stanford School of Medicine, Stanford, CA, USA. roxanad@stanford.edu.
⁷ Department of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA. roxanad@stanford.edu.

^# Contributed equally.

PMID: 37864012
PMCID: PMC10589311
DOI: 10.1038/s41746-023-00939-z

Abstract

Large language models (LLMs) are being integrated into healthcare systems; but these models may recapitulate harmful, race-based medicine. The objective of this study is to assess whether four commercially available large language models (LLMs) propagate harmful, inaccurate, race-based content when responding to eight different scenarios that check for race-based medicine or widespread misconceptions around race. Questions were derived from discussions among four physician experts and prior work on race-based medical misconceptions believed by medical trainees. We assessed four large language models with nine different questions that were interrogated five times each with a total of 45 responses per model. All models had examples of perpetuating race-based medicine in their responses. Models were not always consistent in their responses when asked the same question repeatedly. LLMs are being proposed for use in the healthcare setting, with some models already connecting to electronic health record systems. However, this study shows that based on our findings, these LLMs could potentially cause harm by perpetuating debunked, racist ideas.

PubMed Disclaimer

Conflict of interest statement

R.D. has served as an advisor to MDAlgorithms and Revea and received consulting fees from Pfizer, L’Oreal, Frazier Healthcare Partners, and DWA, and research funding from UCB. V.R. is an expert advisor for Inhabit Brands. The remaining authors declare no competing interests.

Figures

**Fig. 1. LLM Outputs.**
For each question and each model, the rating represents the number of runs (out of 5 total runs) that had concerning race-based responses. Red correlates with a higher number of concerning race-based responses.

See this image and copyright information in PMC

References

1. Harskamp, R. E. & Clercq, L. D. Performance of ChatGPT as an AI-assisted decision support tool in medicine: a proof-of-concept study for interpreting symptoms and management of common cardiac conditions (AMSTELHEART-2). 2023.03.25.23285475. Preprint at 10.1101/2023.03.25.23285475 (2023). - PubMed
1. Aldridge MJ, Penders R. Artificial intelligence and anaesthesia examinations: exploring ChatGPT as a prelude to the future. Br. J. Anaesth. 2023;131:E36–E37. doi: 10.1016/j.bja.2023.04.033. - DOI - PubMed
1. Haver HL, et al. Appropriateness of breast cancer prevention and screening recommendations provided by ChatGPT. Radiology. 2023;307:e230424. doi: 10.1148/radiol.230424. - DOI - PubMed
1. Brown, T. et al. Language models are few-shot learners. in Advances in Neural Information Processing Systems 33 1877–1901 (Curran Associates, Inc., 2020).
1. Pichai, S. Google AI updates: Bard and new AI features in Search. https://blog.google/technology/ai/bard-google-ai-search-updates/ (2023).

Grants and funding

LinkOut - more resources

Full Text Sources

Save citation to file

Email citation

Add to Collections

Add to My Bibliography

Your saved search

Create a file for external citation management software

Your RSS Feed

Large language models propagate race-based medicine

Affiliations

Large language models propagate race-based medicine

Authors

Affiliations

Abstract

Conflict of interest statement

Figures

References

Grants and funding

LinkOut - more resources

Full Text Sources