The Development Landscape of Large Language Models for Biomedical Applications

Zhiyuan Cao et al. Review. Annu Rev Biomed Data Sci. 2025 Aug;8(1):251-274. doi: 10.1146/annurev-biodatasci-102224-074736. Epub 2025 Apr 1.

Abstract

Large language models (LLMs) have become powerful tools for biomedical applications, offering the potential to transform healthcare and medical research. Since the release of ChatGPT in 2022, there has been a surge of LLMs for diverse biomedical applications. This review examines the landscape of text-based biomedical LLM development, analyzing model characteristics (e.g., architecture), development processes (e.g., training strategy), and applications (e.g., chatbots). Following PRISMA guidelines, we selected 82 of 5,512 articles published since 2022 that met our rigorous criteria, including the requirement that biomedical data be used when training the LLM. Our findings highlight the predominant use of decoder-only architectures such as Llama 7B, the prevalence of task-specific fine-tuning, and a reliance on biomedical literature for training. Challenges persist in balancing data openness with privacy concerns and in documenting model development, including the computational resources used. Future efforts would benefit from multimodal integration, LLMs for specialized medical applications, and improved data sharing and model accessibility.
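To make the dominant pattern the review identifies concrete, below is a minimal sketch of task-specific fine-tuning of a decoder-only model on biomedical text using LoRA adapters. This is illustrative only, not code from any reviewed article: the checkpoint name, the two-sentence toy corpus, and all hyperparameters are assumptions.

# A minimal sketch, assuming the common HuggingFace transformers + peft stack:
# parameter-efficient fine-tuning of a decoder-only LLM on biomedical text.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint; access is gated
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA trains a few million low-rank adapter parameters instead of all 7B,
# which is why parameter-efficient strategies fit on modest hardware.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Stand-in corpus; the reviewed studies use PubMed literature, EHR notes, etc.
corpus = Dataset.from_dict({"text": [
    "Metformin is a first-line agent for type 2 diabetes mellitus.",
    "Aspirin irreversibly inhibits cyclooxygenase-1 in platelets.",
]})

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, max_length=512,
                    padding="max_length")
    enc["labels"] = [ids.copy() for ids in enc["input_ids"]]  # next-token loss
    return enc

trainer = Trainer(
    model=model,
    train_dataset=corpus.map(tokenize, batched=True, remove_columns=["text"]),
    args=TrainingArguments(output_dir="biomed-llama-lora",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8,
                           num_train_epochs=1, learning_rate=2e-4,
                           logging_steps=10),
)
trainer.train()
model.save_pretrained("biomed-llama-lora")  # saves only the adapter weights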

Keywords: biomedical applications; clinical NLP; healthcare AI; large language models.

Figures

Figure 1. PRISMA flowchart showing the review process and reasons for exclusion. Abbreviations: B, billion; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses. Figure adapted from a flowchart created with PRISMA 2020 (https://www.prisma-statement.org) (CC BY 4.0).
Figure 2. Analysis of training corpora and domains. (a) Number of articles using each type of training data. Percentages are calculated over all 82 articles; because a single paper may use multiple corpora (e.g., both literature and EHR notes), the totals can exceed 100%. (b) Subcategorization of textual training data. Abbreviation: EHR, electronic health record.
Figure 3. Analysis of computational resources. (a) Distribution of GPU memory sizes used in model training. (b) Distribution of training resources by training time and number of GPUs; colors denote GPU types, and shapes denote training strategies. Abbreviation: GPU, graphics processing unit.
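As context for why GPU memory is a central constraint in panel a, a common back-of-envelope estimate is roughly 16 bytes per parameter for full fine-tuning with mixed-precision Adam, before counting activations. The sketch below encodes that rule of thumb; the byte counts are standard assumptions, not measurements from the reviewed articles.

# Rough memory estimate for full fine-tuning with mixed-precision Adam
# (fp16 weights and gradients, fp32 master weights and two moment tensors).
def full_finetune_gb(n_params: float) -> float:
    fp16_weights = 2 * n_params   # 2 bytes per parameter
    fp16_grads = 2 * n_params
    fp32_master = 4 * n_params    # optimizer's fp32 copy of the weights
    adam_moments = 8 * n_params   # two fp32 moment tensors
    return (fp16_weights + fp16_grads + fp32_master + adam_moments) / 1e9

print(f"Llama 7B, full fine-tuning: ~{full_finetune_gb(7e9):.0f} GB + activations")
# -> ~112 GB, i.e., several 40-80 GB GPUs; LoRA-style fine-tuning avoids most of this.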
