The Development Landscape of Large Language Models for Biomedical Applications

Zhiyuan Cao et al. Review. Annu Rev Biomed Data Sci. 2025 Aug;8(1):251-274. doi: 10.1146/annurev-biodatasci-102224-074736. Epub 2025 Apr 1.

Abstract

Large language models (LLMs) have become powerful tools for biomedical applications, offering the potential to transform healthcare and medical research. Since the release of ChatGPT in 2022, there has been a surge of LLMs for diverse biomedical applications. This review examines the landscape of text-based biomedical LLM development, analyzing model characteristics (e.g., architecture), development processes (e.g., training strategy), and applications (e.g., chatbots). Following PRISMA guidelines, we selected 82 of 5,512 articles published since 2022 that met our rigorous criteria, including the requirement that biomedical data be used when training the LLM. Our findings highlight the predominant use of decoder-only architectures such as Llama 7B, the prevalence of task-specific fine-tuning, and a reliance on biomedical literature for training. Challenges persist in balancing data openness with privacy concerns and in documenting model development, including the computational resources used. Future efforts would benefit from multimodal integration, LLMs for specialized medical applications, and improved data sharing and model accessibility.
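To make the dominant pattern the review identifies concrete, below is a minimal sketch of task-specific fine-tuning of a decoder-only model on biomedical text using LoRA adapters. This is illustrative only, not code from any reviewed article: the checkpoint name, the two-sentence toy corpus, and all hyperparameters are assumptions.

# A minimal sketch, assuming the common HuggingFace transformers + peft stack:
# parameter-efficient fine-tuning of a decoder-only LLM on biomedical text.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint; access is gated
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA trains a few million low-rank adapter parameters instead of all 7B,
# which is why parameter-efficient strategies fit on modest hardware.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Stand-in corpus; the reviewed studies use PubMed literature, EHR notes, etc.
corpus = Dataset.from_dict({"text": [
    "Metformin is a first-line agent for type 2 diabetes mellitus.",
    "Aspirin irreversibly inhibits cyclooxygenase-1 in platelets.",
]})

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, max_length=512,
                    padding="max_length")
    enc["labels"] = [ids.copy() for ids in enc["input_ids"]]  # next-token loss
    return enc

trainer = Trainer(
    model=model,
    train_dataset=corpus.map(tokenize, batched=True, remove_columns=["text"]),
    args=TrainingArguments(output_dir="biomed-llama-lora",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8,
                           num_train_epochs=1, learning_rate=2e-4,
                           logging_steps=10),
)
trainer.train()
model.save_pretrained("biomed-llama-lora")  # saves only the adapter weights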

Keywords: biomedical applications; clinical NLP; healthcare AI; large language models.

Figures

Figure 1. PRISMA flowchart showing the review process and reasons for exclusion. Abbreviations: B, billion; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses. Figure adapted from a flowchart created with PRISMA 2020 (https://www.prisma-statement.org) (CC BY 4.0).
Figure 2. Analysis of training corpora and domains. (a) Number of articles using each type of training data. Percentages are calculated over all 82 articles; because a single paper may use multiple corpora (e.g., both literature and EHR notes), the totals can exceed 100%. (b) Subcategorization of textual training data. Abbreviation: EHR, electronic health record.
Figure 3. Analysis of computational resources. (a) Distribution of GPU memory sizes used in model training. (b) Distribution of training resources by training time and number of GPUs; colors denote GPU types, and shapes denote training strategies. Abbreviation: GPU, graphics processing unit.
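As context for why GPU memory is a central constraint in panel a, a common back-of-envelope estimate is roughly 16 bytes per parameter for full fine-tuning with mixed-precision Adam, before counting activations. The sketch below encodes that rule of thumb; the byte counts are standard assumptions, not measurements from the reviewed articles.

# Rough memory estimate for full fine-tuning with mixed-precision Adam
# (fp16 weights and gradients, fp32 master weights and two moment tensors).
def full_finetune_gb(n_params: float) -> float:
    fp16_weights = 2 * n_params   # 2 bytes per parameter
    fp16_grads = 2 * n_params
    fp32_master = 4 * n_params    # optimizer's fp32 copy of the weights
    adam_moments = 8 * n_params   # two fp32 moment tensors
    return (fp16_weights + fp16_grads + fp32_master + adam_moments) / 1e9

print(f"Llama 7B, full fine-tuning: ~{full_finetune_gb(7e9):.0f} GB + activations")
# -> ~112 GB, i.e., several 40-80 GB GPUs; LoRA-style fine-tuning avoids most of this.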
