Comparing large language models for antibiotic prescribing in different clinical scenarios: which performs better?

Andrea De Vito¹, Nicholas Geremia², Davide Fiore Bavaro³, Susan K Seo⁴, Justin Laracy⁴, Maria Mazzitelli⁵, Andrea Marino⁶, Alberto Enrico Maraolo⁷, Antonio Russo⁸, Agnese Colpani⁹, Michele Bartoletti³, Anna Maria Cattelan⁵, Cristina Mussini¹⁰, Saverio Giuseppe Parisi¹¹, Luigi Angelo Vaira¹², Giuseppe Nunnari⁶, Giordano Madeddu⁹

Affiliations

¹ Unit of Infectious Diseases, Department of Medicine, Surgery and Pharmacy, Sassari, Italy. Electronic address: andreadevitoaho@gmail.com.
² Unit of Infectious Diseases, Department of Clinical Medicine, Ospedale dell'Angelo, Venice, Italy; Unit of Infectious Diseases, Department of Clinical Medicine, Ospedale Civile S.S. Giovanni e Paolo, Venice, Italy.
³ Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, Milan, Italy; Infectious Diseases Unit - Department of Biomedical Sciences - Istituti di Ricovero e Cura a Carattere Scientifico (IRCCS) Humanitas Research Hospital, Rozzano, Milan, Italy.
⁴ Infectious Diseases Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
⁵ Infectious and Tropical Diseases Unit, Department of Molecular Medicine, Padua University Hospital, Padua, Italy.
⁶ Unit of Infectious Diseases, Department of Clinical and Experimental Medicine, Azienda Ospedaliera di Rilievo Nazionale e di Alta Specializzazione (ARNAS), Garibaldi Hospital, University of Catania, Catania, Italy.
⁷ Section of Infectious Diseases, Department of Clinical Medicine and Surgery, University of Naples "Federico II," Naples, Italy.
⁸ Department of Mental Health and Public Medicine-Infectious Diseases Unit, University of Campania Luigi Vanvitelli, Naples, Italy.
⁹ Unit of Infectious Diseases, Department of Medicine, Surgery and Pharmacy, Sassari, Italy.
¹⁰ Infectious Diseases Unit, Department of Surgical, Medical, Dental and Morphological Sciences, Azienda Ospedaliera-Universitaria of Modena, University of Modena and Reggio Emilia, Modena, Italy.
¹¹ Department of Molecular Medicine, University of Padua, Padua, Italy.
¹² Maxillofacial Surgery Operative Unit, Department of Medicine, Surgery and Pharmacy, University of Sassari, Sassari, Italy.

PMID: 40113208
DOI: 10.1016/j.cmi.2025.03.002

Free article

Comparative Study

Andrea De Vito et al. Clin Microbiol Infect. 2025 Aug;31(8):1336-1342. doi: 10.1016/j.cmi.2025.03.002. Epub 2025 Mar 19.

Abstract

Objectives: Large language models (LLMs) show promise in clinical decision-making, but comparative evaluations of their antibiotic prescribing accuracy are limited. This study assesses the performance of various LLMs in recommending antibiotic treatments across diverse clinical scenarios.

Methods: Fourteen LLMs, including standard and premium versions of ChatGPT, Claude, Copilot, Gemini, Le Chat, Grok, Perplexity, and Pi.ai, were evaluated using 60 clinical cases with antibiograms covering 10 infection types. A standardized prompt was used for antibiotic recommendations focusing on drug choice, dosage, and treatment duration. Responses were anonymized and reviewed by a blinded expert panel assessing antibiotic appropriateness, dosage correctness, and duration adequacy.

Results: A total of 840 responses were collected and analysed. ChatGPT-o1 demonstrated the highest accuracy in antibiotic prescriptions, with 71.7% (43/60) of its recommendations classified as correct and only one (1.7%) incorrect. Gemini and Claude 3 Opus had the lowest accuracy. Dosage correctness was highest for ChatGPT-o1 (96.7%, 58/60), followed by Claude 3.5 Sonnet (91.7%, 55/60) and Perplexity Pro (90.0%, 54/60). In treatment duration, Gemini provided the most appropriate recommendations (75.0%, 45/60), whereas Claude 3.5 Sonnet tended to over-prescribe duration. Performance declined with increasing case complexity, particularly for difficult-to-treat microorganisms.
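
The percentages above follow directly from the reported counts, and the arithmetic can be checked with a short sketch. The counts are copied from the abstract. The Wilson score interval is not part of the study: it is an assumption added here only to illustrate how much uncertainty a sample of 60 cases per model carries, and the names `reported`, `percent`, and `wilson_interval` are hypothetical.

```python
from math import sqrt

# Counts as reported in the abstract: (numerator, denominator).
reported = {
    "ChatGPT-o1, correct antibiotic": (43, 60),
    "ChatGPT-o1, incorrect antibiotic": (1, 60),
    "ChatGPT-o1, correct dosage": (58, 60),
    "Claude 3.5 Sonnet, correct dosage": (55, 60),
    "Perplexity Pro, correct dosage": (54, 60),
    "Gemini, appropriate duration": (45, 60),
}

# Study design from the Methods: 14 models, each given the same 60 cases.
total_responses = 14 * 60


def percent(k, n):
    """Proportion k/n as a percentage rounded to one decimal place."""
    return round(100 * k / n, 1)


def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion k/n.

    Illustrative assumption only; the abstract reports no intervals.
    """
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    print(f"Total responses: {total_responses}")
    for label, (k, n) in reported.items():
        low, high = wilson_interval(k, n)
        print(f"{label}: {k}/{n} = {percent(k, n)}% "
              f"(Wilson 95% CI {100 * low:.1f}% to {100 * high:.1f}%)")
```

With 60 cases per model the intervals are wide, so small differences between models (for example 55/60 versus 54/60) should be read with caution.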

Discussion: There is significant variability among LLMs in prescribing appropriate antibiotics, dosages, and treatment durations. ChatGPT-o1 outperformed other models, indicating the potential of advanced LLMs as decision-support tools in antibiotic prescribing. However, decreased accuracy in complex cases and inconsistencies among models highlight the need for careful validation before clinical utilization.

Keywords: Antibiotic treatment; Antimicrobial susceptibility testing; ChatGPT-o1; Difficult-to-treat infection; LLMs; Large language models.
