Tailoring glaucoma education using large language models: Addressing health disparities in patient comprehension

Aidin C Spina¹, Pirooz Fereydouni¹, Jordan N Tang¹, Saman Andalib¹, Bryce G Picton¹, Austin R Fox¹,²

Affiliations

¹ School of Medicine, University of California, Irvine, Irvine, CA.
² School of Medicine, Gavin Herbert Eye Institute at University of California, Irvine, Irvine, CA.

PMID: 39792725
PMCID: PMC11729625
DOI: 10.1097/MD.0000000000041059

Aidin C Spina et al. Medicine (Baltimore). 2025 Jan 10;104(2):e41059. doi: 10.1097/MD.0000000000041059.

Abstract

This study evaluates the efficacy of GPT-4, a large language model, in simplifying medical literature to improve patient comprehension in glaucoma care. GPT-4 was used to transform published abstracts from 3 glaucoma journals (n = 62) and patient education materials (Patient Educational Models [PEMs], n = 9) to a 5th-grade reading level. GPT-4 was also prompted to generate de novo educational outputs at 6 different education levels (5th Grade, 8th Grade, High School, Associate's, Bachelor's, and Doctorate). Readability of both transformed and de novo materials was quantified using the Flesch Kincaid Grade Level (FKGL) and Flesch Reading Ease (FKRE) scores. Latent semantic analysis (LSA) using cosine similarity was applied to assess content consistency in transformed materials. Transformation of abstracts decreased FKGL by an average of 3.21 points (30%, P < .001) and increased FKRE by 28.6 points (66%, P < .001). For PEMs, FKGL decreased by 2.38 points (28%, P = .0272) and FKRE increased by 12.14 points (19%, P = .0459). LSA revealed high semantic consistency, with an average cosine similarity of 0.861 across all abstracts and 0.937 for PEMs, indicating that topical themes were preserved after transformation. This study shows that GPT-4 effectively simplifies medical information about glaucoma, making it more accessible while maintaining textual content. The improved readability scores for both transformed materials and GPT-4 generated content demonstrate its usefulness in patient education across different educational levels.
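The readability and similarity measures named in the abstract can be illustrated with a minimal Python sketch. It uses the standard published Flesch formulas, but everything else is an assumption made for illustration: the abstract does not say which software the authors used, the syllable counter is a rough vowel-group heuristic, and the similarity function compares plain word-count vectors rather than performing a full latent semantic analysis (which would first reduce a term-document matrix, typically with a singular value decomposition). All function names are hypothetical.

```python
import math
import re
from collections import Counter


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def fkgl(text: str) -> float:
    """Flesch Kincaid Grade Level: lower means easier to read."""
    s, w, syl = text_stats(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher means easier to read."""
    s, w, syl = text_stats(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of word-count vectors (a stand-in for LSA, not LSA itself)."""
    va = Counter(re.findall(r"[a-z']+", a.lower()))
    vb = Counter(re.findall(r"[a-z']+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Made-up example sentences, not taken from the study materials.
    original = (
        "Elevated intraocular pressure contributes to progressive "
        "optic neuropathy in glaucoma."
    )
    simplified = (
        "High eye pressure can hurt the nerve in your eye. "
        "This is called glaucoma."
    )
    print("FKGL:", fkgl(original), "->", fkgl(simplified))
    print("FRE: ", flesch_reading_ease(original), "->", flesch_reading_ease(simplified))
    print("Cosine similarity:", cosine_similarity(original, simplified))
```

In this sketch a successful simplification lowers FKGL and raises Flesch Reading Ease, which matches the direction of the changes reported in the abstract. Because the syllable counting and the similarity measure are simplified, scores from this code should not be expected to reproduce the published values.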

Conflict of interest statement

The authors have no funding or conflicts of interest to disclose.

Figures

**Figure 1.**
Transformation of abstracts by GPT-4. (A) Changes in FKGL scores of abstracts pre- and post-transformation, organized by journal (P < .001). (B) Changes in FKRE scores of abstracts pre- and post-transformation, organized by journal (P < .001). FKGL = Flesch Kincaid Grade Level, FKRE = Flesch Reading Ease.

**Figure 2.**
Transformation of PEMs by GPT-4. (A) Changes in FKGL scores for PEMs, pre- and post-transformation (P = .010). (B) Changes in FKRE scores for PEMs, pre- and post-transformation (P = .010). FKGL = Flesch Kincaid Grade Level, FKRE = Flesch Reading Ease, PEMs = Patient Educational Models.

**Figure 3.**
Latent semantic analysis of transformed abstracts and PEMs. (A) Cosine similarity of abstracts pre- and post-transformation. (B) Cosine similarity of PEMs pre- and post-transformation. PEMs = Patient Educational Models.

**Figure 4.**
Readability scores for De Novo GPT-4 outputs. (A) FKGL scores for De Novo outputs (*** P < .001, ** P < .01, * P < .05). (B) FKRE scores for De Novo outputs (*** P < .001, ** P < .01, * P < .05). FKGL = Flesch Kincaid Grade Level, FKRE = Flesch Reading Ease.
