Improving Large Language Models' Summarization Accuracy by Adding Highlights to Discharge Notes: Comparative Evaluation

doi:10.2196/66476

Comparative Study

JMIR Med Inform. 2025 Jul 24;13:e66476.

Mahshad Koohi Habibi Dehkordi¹, Yehoshua Perl¹, Fadi P Deek², Zhe He³, Vipina K Keloth⁴, Hao Liu⁵, Gai Elhanan⁶, Andrew J Einstein⁷

Affiliations

¹ Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, United States.
² Department of Informatics, New Jersey Institute of Technology, Newark, NJ, United States.
³ School of Information, Florida State University, Tallahassee, FL, United States.
⁴ Department of Medical Informatics, Yale University, New Haven, CT, United States.
⁵ Department of Computer Science, Montclair State University, Montclair, NJ, United States.
⁶ School of Medicine, University of Nevada, Reno, NV, United States.
⁷ Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States.

PMID: 40705416
PMCID: PMC12332456
DOI: 10.2196/66476

Mahshad Koohi Habibi Dehkordi et al. JMIR Med Inform. 2025.

Abstract

Background: The American Medical Association recommends that electronic health record (EHR) notes, often dense and written in nuanced language, be made readable for patients and laypeople, a practice we refer to as the simplification of discharge notes. Our approach to achieving the simplification of discharge notes involves a process of incremental simplification steps to achieve the ideal note. In this paper, we present the first step of this process. Large language models (LLMs) have demonstrated considerable success in text summarization. Such LLM summaries represent the content of EHR notes in an easier-to-read language. However, LLM summaries can also introduce inaccuracies.

Objective: This study aims to test the hypothesis that summaries generated by LLMs from discharge notes in which detailed information is highlighted are more accurate than summaries generated from the original, unhighlighted notes.

Methods: To test our hypothesis, we randomly sampled 15 discharge notes from the MIMIC-III database and highlighted their detailed information using an interface terminology we previously developed with machine learning. This interface terminology was curated to encompass detailed information from the discharge notes. The highlighted discharge notes distinguished detailed information, specifically the concepts present in the aforementioned interface terminology, by applying a blue background. To calibrate the LLMs' summaries for our simplification goal, we chose GPT-4o and used prompt engineering to ensure high-quality prompts and address issues of output inconsistency and prompt sensitivity. We provided both highlighted and unhighlighted versions of each EHR note along with their corresponding prompts to GPT-4o. Each generated summary was manually evaluated to assess its quality using the following evaluation metrics: completeness, correctness, and structural integrity.
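The highlighting step described in the Methods can be illustrated with a minimal sketch. It assumes the interface terminology is available as a plain list of concept strings and that highlights are represented as text markers; the abstract does not state how the blue background was encoded for GPT-4o, so the function name `highlight_concepts`, the `[[ ]]` markers, and the sample note and concepts are all hypothetical.

```python
import re


def highlight_concepts(note: str, concepts: list[str],
                       open_tag: str = "[[", close_tag: str = "]]") -> str:
    """Wrap every occurrence of a terminology concept in marker tags.

    Hypothetical stand-in for the blue-background highlighting; the
    marker syntax is an assumption, not the authors' encoding.
    """
    if not concepts:
        return note
    # Try longer concepts first so a multi-word concept is preferred
    # over a shorter concept contained inside it.
    ordered = sorted(set(concepts), key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(c) for c in ordered) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", note)


# Illustrative data only; not taken from MIMIC-III or the study terminology.
note = "Patient admitted with chest pain; history of atrial fibrillation."
concepts = ["chest pain", "atrial fibrillation", "pain"]
highlighted = highlight_concepts(note, concepts)
print(highlighted)
```

In this sketch, both the marked and the unmarked versions of a note would then be sent to the model with their respective prompts, mirroring the paired comparison the study describes.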

Results: Across the study sample of 15 discharge notes, summaries from highlighted notes (H-summaries) achieved 96% completeness on average, 8% higher than the summaries from unhighlighted notes (U-summaries). H-summaries had higher completeness in 13 notes, and U-summaries had higher or equal completeness in 2 notes, resulting in P=.01, which implied statistical significance. Moreover, H-summaries demonstrated better correctness than U-summaries, with fewer instances of erroneous information (2 vs 3 errors, respectively). The number of improper headers was smaller for H-summaries for 11 notes and U-summaries for 4 notes (P=.03; implying statistical significance). In addition, we identified 8 instances of misplaced information in the U-summaries and only 2 in the H-summaries. These findings support the hypothesis that summarizing highlighted discharge notes improves the accuracy of the summaries.
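The abstract does not give a formula for completeness, but the Figure 2 caption reports it as a ratio of information items (31/37 items, 84%). A minimal sketch under that assumption follows; the function name `completeness` and its argument names are hypothetical.

```python
def completeness(items_in_summary: int, items_in_note: int) -> float:
    """Fraction of the note's information items that the summary retains.

    Assumes completeness is the simple ratio suggested by the
    Figure 2 caption; the study's exact scoring rules may differ.
    """
    if items_in_note <= 0:
        raise ValueError("the note must contain at least one information item")
    if not 0 <= items_in_summary <= items_in_note:
        raise ValueError("summary items must be between 0 and the note's item count")
    return items_in_summary / items_in_note


# Counts taken from the Figure 2 caption: 31 of 37 items retained.
u_summary_score = completeness(31, 37)
print(f"{u_summary_score:.0%}")  # 84%
```

Averaging such per-note scores over the 15 notes would give sample-level figures of the kind reported in the Results, assuming each note is weighted equally.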

Conclusions: Feeding LLMs with highlighted discharge notes, combined with prompt engineering, results in higher-quality summaries in terms of correctness, completeness, and structural integrity compared to unhighlighted discharge notes.

Keywords: AI; ChatGPT; ChatGPT summaries; EHR; EHR summaries; LLM; LLM summaries; accuracy of summaries; artificial intelligence; clinical notes summarization; discharge notes; discharge notes summarization; electronic health record; highlighted EHR notes; large language model.

©Mahshad Koohi Habibi Dehkordi, Yehoshua Perl, Fadi P Deek, Zhe He, Vipina K Keloth, Hao Liu, Gai Elhanan, Andrew J Einstein. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 24.07.2025.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

**Figure 1**
Diagram of constructing CITml. Cardiology interface terminology (CIT) has versions with 2 indices. The first indicates the iteration number, and the second is binary, with 1 following concatenation and 2 following anchoring. CIT_V: updated version of the CIT; EHR: electronic health record; ICIT: initial version of the CIT; ML: machine learning; SNOMED CT: Systematized Nomenclature of Medicine–Clinical Terms.

**Figure 2**
(A) An original highlighted note with 128 words, (B) the summary of the unhighlighted version of the note, with 91 words and 84% (31/37 items of information) completeness, and (C) the summary of the highlighted version of the same note, with 111 words and 100% completeness. The pink highlight in (B) indicates misplaced information. The orange highlight in (B) indicates repetitive information, the first of which is redundant and misplaced. The yellow highlights in (C) indicate information items from the original text that do not appear in (B).

**Figure 3**
(A) A portion of a summary lacking proper headers and (B) the same portion of the summary with proper headers.

**Figure 4**
(A) A portion of a summary having improper headers and (B) the same portion of the summary without improper headers.
