JAMA Intern Med. 2025 Jul 1;185(7):818-825.
doi: 10.1001/jamainternmed.2025.0821.

Physician- and Large Language Model-Generated Hospital Discharge Summaries

Christopher Y K Williams et al. JAMA Intern Med.

Abstract

Importance: High-quality discharge summaries are associated with improved patient outcomes, but contribute to clinical documentation burden. Large language models (LLMs) provide an opportunity to support physicians by drafting discharge summary narratives.

Objective: To determine whether LLM-generated discharge summary narratives are of comparable quality and safety to those of physicians.

Design, setting, and participants: This cross-sectional study conducted at the University of California, San Francisco included 100 randomly selected inpatient hospital medicine encounters of 3 to 6 days' duration between 2019 and 2022. The analysis took place in July 2024.

Exposure: A blinded evaluation of physician- and LLM-generated narratives was performed in duplicate by 22 attending physician reviewers.

Main outcomes and measures: Narratives were reviewed for overall quality, reviewer preference, comprehensiveness, concision, coherence, and 3 error types (inaccuracies, omissions, and hallucinations). Each error individually, and each narrative overall, were assigned potential harmfulness scores ranging from 0 to 7 on an adapted Agency for Healthcare Research and Quality scale.

Results: Across 100 encounters, LLM- and physician-generated narratives were comparable in overall quality on a Likert scale ranging from 1 to 5 (higher scores indicate higher quality; mean [SD] score, 3.67 [0.49] vs 3.77 [0.57]; P = .21) and reviewer preference (χ² = 5.2; P = .27). LLM-generated narratives were more concise (mean [SD] score, 4.01 [0.37] vs 3.70 [0.59]; P < .001) and more coherent (mean [SD] score, 4.16 [0.39] vs 4.01 [0.53]; P = .02) than their physician-generated counterparts, but less comprehensive (mean [SD] score, 3.72 [0.58] vs 4.13 [0.58]; P < .001). LLM-generated narratives contained more unique errors (mean [SD] errors per summary, 2.91 [2.54]) than physician-generated narratives (mean [SD] errors per summary, 1.82 [1.94]). There was no significant difference in the potential for harm between LLM- and physician-generated narratives across individual errors (mean [SD] of 1.35 [1.07] vs 1.34 [1.05]; P = .99), with 6 and 5 individual errors, respectively, with scores of 4 (potential for permanent harm) or greater. Both LLM- and physician-generated narratives had low overall potential for harm (scores <1 on a scale ranging from 0 to 7), with LLM-generated narratives scoring higher than physician narratives (mean [SD] score of 0.84 [0.98] vs 0.36 [0.70]; P < .001) and only 1 LLM-generated narrative (compared with 0 physician-generated narratives) scoring 4 or greater.

Conclusions and relevance: In this cross-sectional study of 100 inpatient hospital medicine encounters, LLM-generated discharge summary narratives were of comparable quality, and were preferred equally, to those generated by physicians. LLM-generated narratives were more likely to contain errors but had low overall harmfulness scores. These results suggest that, in clinical practice, using such narratives after human review may provide a viable option for hospitalists.
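
Illustrative note (not part of the published abstract): the reported means and SDs can be turned into raw and standardized differences to get a rough sense of scale. The sketch below is an assumption-laden reading aid, not the study's analysis. It treats the two groups as independent and equal in size and pools the SDs, which ignores the paired, duplicate-review design. The names `REPORTED` and `standardized_difference` are hypothetical.

```python
from math import sqrt

# Values copied from the abstract's Results:
# (LLM mean, LLM SD, physician mean, physician SD)
REPORTED = {
    "overall quality":    (3.67, 0.49, 3.77, 0.57),
    "concision":          (4.01, 0.37, 3.70, 0.59),
    "coherence":          (4.16, 0.39, 4.01, 0.53),
    "comprehensiveness":  (3.72, 0.58, 4.13, 0.58),
    "errors per summary": (2.91, 2.54, 1.82, 1.94),
    "overall harm score": (0.84, 0.98, 0.36, 0.70),
}


def standardized_difference(mean_a, sd_a, mean_b, sd_b):
    """Mean difference divided by a pooled SD.

    Assumption: equal group sizes and independent groups, so the pooled SD
    is the root mean square of the two SDs. The study compared narratives
    for the same encounters, so this is only a rough approximation.
    """
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


if __name__ == "__main__":
    for name, (m_llm, sd_llm, m_phys, sd_phys) in REPORTED.items():
        raw = m_llm - m_phys
        std = standardized_difference(m_llm, sd_llm, m_phys, sd_phys)
        # Positive values mean the LLM-generated narratives scored higher.
        print(f"{name:20s} raw diff = {raw:+.2f}   standardized = {std:+.2f}")
```

Positive values indicate a higher LLM score. For error counts and harm scores, higher is worse, so the sign should be read accordingly.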

Conflict of interest statement

Conflict of Interest Disclosures: Dr Williams reported holding equity from Quality Health, Inc. Dr Subramanian reported consulting for and having equity in Evidently and receiving personal fees from Ambience Healthcare Inc for work as clinical analyst outside the submitted work. Dr Apolinario reported holding shares in NVIDIA. Dr Rosner reported equity in Kuretic and receiving personal fees from Manos Health and Network of Digital Evidence, a 501(c)3 nonprofit organization, as a consultant outside the submitted work. No other disclosures were reported.
