Eur J Hum Genet. 2024 Apr;32(4):466-468.
doi: 10.1038/s41431-023-01396-8. Epub 2023 May 29.

Analysis of large-language model versus human performance for genetics questions

Dat Duong et al. Eur J Hum Genet. 2024 Apr.

Abstract

Large-language models like ChatGPT have recently received a great deal of attention. One area of interest is how these models could be used in biomedical contexts, including those related to human genetics. To assess one facet of this, we compared the performance of ChatGPT versus human respondents (13,642 human responses) in answering 85 multiple-choice questions about aspects of human genetics. Overall, ChatGPT did not perform significantly differently (p = 0.8327) than human respondents; ChatGPT was 68.2% accurate, compared to 66.6% accuracy for human respondents. Both ChatGPT and humans performed better on memorization-type questions than on critical-thinking questions (p < 0.0001). When asked the same question multiple times, ChatGPT frequently provided different answers (16% of initial responses), including for both initially correct and incorrect answers, and gave plausible explanations for both correct and incorrect answers. ChatGPT's performance was impressive, but the model currently demonstrates significant shortcomings for clinical or other high-stakes use. Addressing these limitations will be important to guide adoption in real-life situations.
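
To give a rough sense of how much uncertainty an accuracy measured on 85 questions carries, the following minimal sketch converts the reported 68.2% into an implied count of correct answers and computes a 95% Wilson score interval around it. This is an illustration only and not the statistical test used in the study, which the abstract does not specify. The implied count and the helper names `implied_correct` and `wilson_interval` are assumptions introduced here.

```python
import math

# Figures taken from the abstract.
N_QUESTIONS = 85
CHATGPT_ACCURACY = 0.682
HUMAN_ACCURACY = 0.666


def implied_correct(accuracy: float, n: int) -> int:
    """Nearest whole number of correct answers implied by a reported accuracy.

    Assumption: the abstract reports only the percentage, so this count is
    inferred by rounding, not quoted from the paper.
    """
    return round(accuracy * n)


def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion k/n (z=1.96 for ~95%)."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


k = implied_correct(CHATGPT_ACCURACY, N_QUESTIONS)
low, high = wilson_interval(k, N_QUESTIONS)

print(f"Implied correct answers: {k} of {N_QUESTIONS}")
print(f"95% Wilson interval for ChatGPT accuracy: {low:.3f} to {high:.3f}")
print(f"Human accuracy inside interval: {low < HUMAN_ACCURACY < high}")
```

With only 85 questions the interval is wide, so the reported human accuracy falling inside it is in line with the abstract's statement that the two did not differ significantly.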

Conflict of interest statement

The authors receive salary and research support from the intramural program of the National Human Genome Research Institute. BDS is the co-Editor-in-Chief of the American Journal of Medical Genetics, and has published some of the questions mentioned in this study in a book, as well as other questions [12]. Both editing/publishing activities are conducted as an approved outside activity, separate from his US Government role.

Figures

Fig. 1. Summary of ChatGPT’s responses.
The Sankey plot (constructed via Flourish, https://app.flourish.studio/projects) shows ChatGPT’s initial and second responses to the 85 questions used in the study.

Comment in

  • Can ChatGPT understand genetics?
    Emmert-Streib F. Eur J Hum Genet. 2024 Apr;32(4):371-372. doi: 10.1038/s41431-023-01419-4. Epub 2023 Jul 5. PMID: 37407734.

References

    1. Ledgister Hanchard SE, Dwyer MC, Liu S, Hu P, Tekendo-Ngongang C, Waikel RL, et al. Scoping review and classification of deep learning in medical genetics. Genet Med. 2022;24:1593–603. doi: 10.1016/j.gim.2022.04.025.
    2. Schaefer J, Lehne M, Schepers J, Prasser F, Thun S. The use of machine learning in rare diseases: a scoping review. Orphanet J Rare Dis. 2020;15:145. doi: 10.1186/s13023-020-01424-6.
    3. Dias R, Torkamani A. Artificial intelligence in clinical and genomic diagnostics. Genome Med. 2019;11:70. doi: 10.1186/s13073-019-0689-8.
    4. Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, et al. Large Language Models Encode Clinical Knowledge. arXiv preprint arXiv:2212.13138. 2022.
    5. Shelmerdine SC, Martin H, Shirodkar K, Shamshuddin S, Weir-McCall JR, Collaborators F-AS. Can artificial intelligence pass the Fellowship of the Royal College of Radiologists examination? Multi-reader diagnostic accuracy study. BMJ. 2022;379:e072826. doi: 10.1136/bmj-2022-072826.