Observational Study
JMIR Cardio. 2025 Jul 8:9:e68817.
doi: 10.2196/68817.

Improving the Readability of Institutional Heart Failure-Related Patient Education Materials Using GPT-4: Observational Study

Ryan C King et al. JMIR Cardio.

Abstract

Background: Heart failure management involves comprehensive lifestyle modifications such as daily weights, fluid and sodium restriction, and blood pressure monitoring. These modifications place additional responsibility on patients and caregivers, and successful adherence often requires extensive counseling and understandable patient education materials (PEMs). Prior research has shown that PEMs related to cardiovascular disease often exceed the American Medical Association's recommended fifth- to sixth-grade reading level. The large language model (LLM) ChatGPT may be a useful tool for improving PEM readability.

Objective: We aim to assess the readability of heart failure-related PEMs from prominent cardiology institutions and evaluate GPT-4's ability to improve these metrics while maintaining accuracy and comprehensiveness.

Methods: A total of 143 heart failure-related PEMs were collected from the websites of the top 10 institutions listed on the 2022-2023 US News & World Report for "Best Hospitals for Cardiology, Heart & Vascular Surgery." PEMs were individually entered into GPT-4 (version updated July 20, 2023), preceded by the prompt, "Please explain the following in simpler terms." Readability was assessed using the Flesch Reading Ease score, Flesch-Kincaid Grade Level (FKGL), Gunning Fog Index, Coleman-Liau Index, Simple Measure of Gobbledygook Index, and Automated Readability Index. The accuracy and comprehensiveness of revised GPT-4 PEMs were assessed by a board-certified cardiologist.
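As an illustration of how two of the listed metrics are computed, the sketch below implements the standard published formulas for the Flesch Reading Ease score and the FKGL. It is not the tool used in the study. The function names, the regex-based sentence and word splitting, and the vowel-group syllable counter are illustrative assumptions; dedicated readability software counts syllables more carefully, so scores may differ slightly.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the study's method):
    count vowel groups and drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) using simple regex splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(sentences: int, words: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(sentences: int, words: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = "Weigh yourself every day. Eat less salt."
    stats = text_stats(sample)
    print("sentences, words, syllables:", stats)
    print("Flesch Reading Ease:", round(flesch_reading_ease(*stats), 1))
    print("FKGL:", round(flesch_kincaid_grade(*stats), 1))
```

Both formulas depend only on average sentence length and average syllables per word, which is why shortening sentences and replacing long words lowers the grade level.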

Results: For 143 institutional heart failure-related PEMs analyzed, the median FKGL was 10.3 (IQR 7.9-13.1; high school sophomore) compared to 7.3 (IQR 6.1-8.5; seventh grade) for GPT-4's revised PEMs (P<.001). Of the 143 institutional PEMs, there were 13 (9.1%) below the sixth-grade reading level, which improved to 33 (23.1%) after revision by GPT-4 (P<.001). No revised GPT-4 PEMs were graded as less accurate or less comprehensive compared to institutional PEMs. A total of 33 (23.1%) GPT-4 PEMs were graded as more comprehensive.
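The reported percentages and the change in median FKGL can be checked with simple arithmetic. The short sketch below uses only the counts and medians stated above; the helper name `percent` is an illustrative assumption.

```python
def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


TOTAL_PEMS = 143

# PEMs below the sixth-grade reading level, before and after GPT-4 revision
before = percent(13, TOTAL_PEMS)
after = percent(33, TOTAL_PEMS)

# Drop in median Flesch-Kincaid Grade Level (institutional vs revised)
median_drop = round(10.3 - 7.3, 1)

print(f"Below sixth grade: {before}% -> {after}%")
print(f"Median FKGL reduction: {median_drop} grade levels")
```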

Conclusions: GPT-4 significantly improved the readability of institutional heart failure-related PEMs. The model may be a promising adjunct to care provided by a licensed health care professional for patients living with heart failure. Further rigorous testing and validation are needed to investigate its safety, efficacy, and impact on patient health literacy.

Keywords: ChatGPT; GPT-4; artificial intelligence; health literacy; heart failure; large language models; patient education; readability.

Conflict of interest statement

Conflicts of Interest: RG is a consultant for Pfizer, Alnylam, and AstraZeneca. None of the other authors have interests to disclose.

Figures

Figure 1. Diagram of institutional heart failure–related PEM curation, revised GPT-4 PEM generation, and subsequent assessment of readability, accuracy, and comprehensiveness. Created in BioRender [19]. FAQ: frequently asked question; PEM: patient education material.
Figure 2. Box-and-whisker plot of median readability scores across 5 metrics (Automated Readability Index, Coleman-Liau Index, Flesch-Kincaid Grade Level, Gunning Fog Index, and Simple Measure of Gobbledygook [SMOG] Index) for institutional and GPT-4’s revised PEMs. PEMs: patient education materials. * P<.05.

