This study assesses the diagnostic accuracy of the Generative Pre-trained Transformer 4 (GPT-4) artificial intelligence (AI) model in a series of challenging cases.
Conflict of Interest Disclosures: Dr Kanjee reported receipt of royalties for books edited and membership on a paid advisory board for medical education products not related to AI from Wolters Kluwer, as well as honoraria for continuing medical education delivered from Oakstone Publishing. Dr Crowe reported employment with Solera Health. No other disclosures were reported.
Figure. Performance of Generative Pre-trained Transformer 4 (GPT-4)
Histogram of GPT-4’s performance. Performance scale scores (Bond et al): 5 = the actual diagnosis was suggested in the differential; 4 = the suggestions included something very close, but not exact; 3 = the suggestions included something closely related that might have been helpful; 2 = the suggestions included something related, but unlikely to be helpful; 0 = no suggestions close to the target diagnosis. (The scale does not contain a score of 1.)
Kung TH, Cheatham M, Medenilla A, et al. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLOS Digit Health. 2023;2(2):e0000198. doi:10.1371/journal.pdig.0000198
Bond WF, Schwartz LM, Weaver KR, Levick D, Giuliano M, Graber ML. Differential diagnosis generators: an evaluation of currently available computer programs. J Gen Intern Med. 2012;27(2):213-219. doi:10.1007/s11606-011-1804-8
Fritz P, Kleinhans A, Raoufi R, et al. Evaluation of medical decision support systems (DDX generators) using real medical cases of varying complexity and origin. BMC Med Inform Decis Mak. 2022;22(1):254. doi:10.1186/s12911-022-01988-2
Ledley RS, Lusted LB. Reasoning foundations of medical diagnosis; symbolic logic, probability, and value theory aid our understanding of how physicians reason. Science. 1959;130(3366):9-21. doi:10.1126/science.130.3366.9
Dorr DA, Adams L, Embí P. Harnessing the promise of artificial intelligence responsibly. JAMA. 2023;329(16):1347-1348. doi:10.1001/jama.2023.2771