Clinical and Surgical Applications of Large Language Models: A Systematic Review

doi:10.3390/jcm13113041

Review

J Clin Med. 2024 May 22;13(11):3041.

Sophia M Pressman¹, Sahar Borna¹, Cesar A Gomez-Cabello¹, Syed Ali Haider¹, Clifton R Haider², Antonio Jorge Forte¹,³

Affiliations

¹ Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA.
² Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN 55905, USA.
³ Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA.

PMID: 38892752
PMCID: PMC11172607
DOI: 10.3390/jcm13113041

Sophia M Pressman et al. J Clin Med. 2024.

Abstract

Background: Large language models (LLMs) represent a recent advancement in artificial intelligence with medical applications across various healthcare domains. The objective of this review is to highlight how LLMs can be utilized by clinicians and surgeons in their everyday practice.

Methods: A systematic review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Six databases were searched to identify relevant articles. Eligibility criteria emphasized articles focused primarily on clinical and surgical applications of LLMs.

Results: The literature search yielded 333 results, with 34 meeting eligibility criteria. All articles were from 2023. There were 14 original research articles, four letters, one interview, and 15 review articles. These articles covered a wide variety of medical specialties, including various surgical subspecialties.

Conclusions: LLMs have the potential to enhance healthcare delivery. In clinical settings, LLMs can assist in diagnosis, treatment guidance, patient triage, physician knowledge augmentation, and administrative tasks. In surgical settings, LLMs can assist surgeons with documentation, surgical planning, and intraoperative guidance. However, addressing their limitations and concerns, particularly those related to accuracy and biases, is crucial. LLMs should be viewed as tools to complement, not replace, the expertise of healthcare professionals.

Keywords: ChatGPT; artificial intelligence (AI); deep learning; diagnosis; machine learning; management; surgical specialties.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

**Figure 1**
Relationships between AI technologies, including LLMs.

**Figure 2**
Modified 2020 PRISMA flow diagram outlining the article identification and eligibility assessment process for this systematic review.

**Figure 3**
Applications of LLMs within clinical practice. Created with BioRender.com.

**Figure 4**
Applications of LLMs within surgical practice. Created with BioRender.com.

Cited by

Building an intelligent diabetes Q&A system with knowledge graphs and large language models.
Qin Z, Wu D, Zang Z, Chen X, Zhang H, Wong CUI. Front Public Health. 2025 Feb 20;13:1540946. doi: 10.3389/fpubh.2025.1540946. eCollection 2025. PMID: 40051508. Free PMC article.
Large language models for disease diagnosis: a scoping review.
Zhou S, Xu Z, Zhang M, Xu C, Guo Y, Zhan Z, Fang Y, Ding S, Wang J, Xu K, Xia L, Yeung J, Zha D, Cai D, Melton GB, Lin M, Zhang R. NPJ Artif Intell. 2025;1(1):9. doi: 10.1038/s44387-025-00011-z. Epub 2025 Jun 9. PMID: 40607112. Free PMC article. Review.
What is the role of large language models in the management of urolithiasis?: a review.
Ates T, Tamkac N, Sukur IH, Ok F. Urolithiasis. 2025 May 15;53(1):92. doi: 10.1007/s00240-025-01761-w. PMID: 40372452. Review.
Evaluating the Efficacy of Large Language Models in Guiding Treatment Decisions for Pediatric Refractive Error.
Kang D, Wu H, Yuan L, Shen W, Feng J, Zhan J, Grzybowski A, Sun W, Jin K. Ophthalmol Ther. 2025 Apr;14(4):705-716. doi: 10.1007/s40123-025-01105-2. Epub 2025 Feb 22. PMID: 39985747. Free PMC article.
Assessment of artificial intelligence performance in answering questions on onabotulinum toxin and sacral neuromodulation.
Hacibey I, Halis A. Investig Clin Urol. 2025 May;66(3):188-193. doi: 10.4111/icu.20250040. PMID: 40312898. Free PMC article.

Grants and funding

Noaber Foundation (grant number: N/A)
