Brief Bioinform. 2023 Nov 22;25(1):bbad493.
doi: 10.1093/bib/bbad493.

Opportunities and challenges for ChatGPT and large language models in biomedicine and health

Shubo Tian et al. Brief Bioinform.

Abstract

ChatGPT has drawn considerable attention from both the general public and domain experts with its remarkable text generation capabilities. This has subsequently led to the emergence of diverse applications in the field of biomedicine and health. In this work, we examine the diverse applications of large language models (LLMs), such as ChatGPT, in biomedicine and health. Specifically, we explore the areas of biomedical information retrieval, question answering, medical text summarization, information extraction and medical education, and investigate whether LLMs possess the transformative power to revolutionize these tasks or whether the distinct complexities of the biomedical domain present unique challenges. Following an extensive literature survey, we find that significant advances have been made in text generation tasks, surpassing the previous state-of-the-art methods. For other applications, the advances have been modest. Overall, LLMs have not yet revolutionized biomedicine, but recent rapid progress indicates that such methods hold great potential to provide valuable means for accelerating discovery and improving health. We also find that the use of LLMs, like ChatGPT, in the fields of biomedicine and health entails various risks and challenges, including fabricated information in their generated responses, as well as legal and privacy concerns associated with sensitive patient data. We believe this survey can provide a comprehensive and timely overview to biomedical researchers and healthcare practitioners on the opportunities and challenges associated with using ChatGPT and other LLMs for transforming biomedicine and health.

Keywords: ChatGPT; biomedicine and health; generative AI; large language model; opportunities and challenges.

Figures

Figure 1
The paradigm of LLMs. Pre-training: LLMs are trained on a large-scale corpus using an autoregressive LM objective; Instruction Fine-tuning: pre-trained LLMs are fine-tuned with supervised learning on a dataset of human-written demonstrations of the desired output behavior on prompts; RLHF Fine-tuning: a reward model is trained using collected comparison data, then the supervised model is further fine-tuned against the reward model using a reinforcement learning algorithm. Prompts: the instruction and/or example text added to guide LLMs to generate expected outputs. Generative outputs: the outputs produced by the LLMs in response to the users’ prompts and inputs.
Figure 2
Performance of LLMs versus humans on the MedQA (USMLE) dataset in terms of accuracy. LLM accuracy on the MedQA (USMLE) dataset rose from the human passing level (GPT-3.5) to close to the human expert level (Med-PaLM 2) in less than half a year.
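
Two of the training stages named in the Figure 1 caption can be made concrete with a minimal Python sketch. The caption does not state the loss functions, so the forms below are assumptions: pre-training is shown as the mean negative log-likelihood of each true next token, and reward-model training is shown as the widely used pairwise logistic loss over a preferred and a rejected response. The function names `next_token_loss` and `pairwise_reward_loss` are hypothetical and chosen for illustration only.

```python
import math


def next_token_loss(probs_of_true_tokens):
    """Assumed autoregressive LM objective: mean negative log-likelihood.

    probs_of_true_tokens holds the probability the model assigned to the
    token that actually came next at each position in the sequence.
    """
    total = sum(math.log(p) for p in probs_of_true_tokens)
    return -total / len(probs_of_true_tokens)


def pairwise_reward_loss(reward_preferred, reward_rejected):
    """Assumed reward-model objective on one comparison pair.

    Computes -log(sigmoid(r_preferred - r_rejected)), written as
    log(1 + exp(-diff)). The loss shrinks as the reward model scores
    the human-preferred response higher than the rejected one.
    """
    diff = reward_preferred - reward_rejected
    return math.log1p(math.exp(-diff))


if __name__ == "__main__":
    # A confident, correct language model has a lower next-token loss.
    print(next_token_loss([0.9, 0.8, 0.95]))
    print(next_token_loss([0.2, 0.1, 0.3]))
    # A reward model that ranks the pair correctly has a lower loss.
    print(pairwise_reward_loss(2.0, -1.0))
    print(pairwise_reward_loss(-1.0, 2.0))
```

The instruction fine-tuning stage would reuse the same next-token loss on human-written demonstrations, and the final RLHF stage would optimize the model against the trained reward model; neither is shown here because the caption does not name a specific reinforcement learning algorithm.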
