Eye (Lond). 2025 Apr;39(6):1115-1122.
doi: 10.1038/s41433-024-03476-5. Epub 2024 Dec 16.

Leveraging large language models to improve patient education on dry eye disease


Qais A Dihan et al. Eye (Lond). 2025 Apr.

Abstract

Background/objectives: Dry eye disease (DED) is an exceedingly common diagnosis, yet recent analyses have shown patient education materials (PEMs) on DED to be of low quality and readability. Our study evaluated the utility and performance of three large language models (LLMs) in enhancing existing PEMs and generating new PEMs on DED.

Subjects/methods: We evaluated PEMs generated by ChatGPT-3.5, ChatGPT-4, and Gemini Advanced in response to three separate prompts. Prompts A and B asked each model to generate a PEM on DED, with Prompt B additionally specifying a 6th-grade reading level as measured by the SMOG (Simple Measure of Gobbledygook) readability formula. Prompt C asked each model to rewrite existing PEMs at a 6th-grade reading level. Each PEM was assessed for readability (SMOG; Flesch-Kincaid Grade Level, FKGL), quality (Patient Education Materials Assessment Tool, PEMAT; DISCERN), and accuracy (Likert misinformation scale).
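The SMOG and FKGL metrics used above are standard published readability formulas. As a minimal sketch of how such grade levels are computed, the functions below take pre-tallied text counts as inputs (the function names are illustrative; syllable counting itself requires a heuristic or dictionary and is left to the caller):

```python
import math

def smog_grade(polysyllable_words: int, sentences: int) -> float:
    """SMOG grade level per McLaughlin's published formula.
    `polysyllable_words` = count of words with 3+ syllables."""
    return 1.0430 * math.sqrt(polysyllable_words * (30 / sentences)) + 3.1291

def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level per the standard formula."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Illustrative counts, not taken from the study's PEMs:
print(round(smog_grade(30, 30), 2))      # 8.84 (roughly 9th-grade)
print(round(fk_grade(100, 10, 150), 2))  # 6.01 (roughly 6th-grade)
```

A text meeting the study's 6th-grade target would score about 6 on either index.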

Results: All LLM-generated PEMs in response to Prompts A and B were of high quality (median DISCERN = 4), understandable (PEMAT understandability ≥70%), and accurate (Likert misinformation score = 1), but none were actionable (PEMAT actionability <70%). For Prompt C, ChatGPT-4 and Gemini Advanced rewrote existing PEMs from a baseline readability (FKGL: 8.0 ± 2.4; SMOG: 7.9 ± 1.7) down to the targeted 6th-grade reading level, and the rewrites contained little to no misinformation (median Likert misinformation score = 1, range 1-2). However, only ChatGPT-4 maintained high quality and reliability in its rewrites (median DISCERN = 4).

Conclusion: LLMs (notably ChatGPT-4) generated and rewrote PEMs on DED that were readable, accurate, and of high quality. Our study underscores the value of LLMs as supplementary tools for improving PEMs.


Conflict of interest statement

Competing interests: The authors declare no competing interests.

