Artificial intelligence chatbots as sources of patient education material for cataract surgery: ChatGPT-4 versus Google Bard
- PMID: 39419585
- PMCID: PMC11487885
- DOI: 10.1136/bmjophth-2024-001824
Abstract
Objective: To conduct a head-to-head comparative analysis of cataract surgery patient education material generated by Chat Generative Pre-trained Transformer (ChatGPT-4) and Google Bard.
Methods and analysis: 98 frequently asked questions on cataract surgery in English were collected in November 2023 from 5 trustworthy online patient information resources. 39 duplicates were excluded and 20 questions were augmented for clarity, yielding 59 curated questions categorised into 3 domains: condition (n=15), preparation for surgery (n=21) and recovery after surgery (n=23). These were formulated into input prompts using 'prompt engineering'. Four ophthalmologists independently graded the ChatGPT-4 and Google Bard responses using the Patient Education Materials Assessment Tool-Printable (PEMAT-P) Auto-Scoring Form. The readability of responses was evaluated using a Flesch-Kincaid calculator, and responses were also subjectively examined for any inaccurate or harmful information.
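For context, the Flesch-Kincaid Grade Level used in the readability evaluation follows the standard published formula; the abstract does not name the specific calculator the authors used. The following is a minimal Python sketch of that formula, with a rough vowel-group syllable heuristic assumed purely for illustration, not a reconstruction of the study's tooling.

```python
# Illustrative sketch only: the study used "a Flesch-Kincaid calculator"; the exact
# tool is not specified. This applies the standard Flesch-Kincaid Grade Level
# formula (Kincaid et al., 1975) with a crude syllable-counting heuristic.
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count groups of consecutive vowels; at least one syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # FKGL = 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

if __name__ == "__main__":
    sample = ("Your surgeon removes the cloudy lens. "
              "A clear artificial lens is placed in your eye. "
              "Most people go home the same day.")
    # FKGL maps roughly onto a US school grade; the study reports mean grades of
    # 5.75 (ChatGPT-4) and 8.02 (Google Bard).
    print(round(flesch_kincaid_grade(sample), 1))
```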
Results: Google Bard had a higher mean overall Flesch-Kincaid Grade Level (8.02) than ChatGPT-4 (5.75) (p<0.001), a difference also noted across all three domains. ChatGPT-4 had a higher overall PEMAT-P understandability score (85.8%) than Google Bard (80.9%) (p<0.001), also noted in the 'preparation for cataract surgery' (85.2% vs 75.7%; p<0.001) and 'recovery after cataract surgery' (86.5% vs 82.3%; p=0.004) domains. There was no statistically significant difference in overall (42.5% vs 44.2%; p=0.344) or individual domain actionability scores (p>0.10). None of the generated material contained dangerous information.
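The PEMAT-P percentages reported above follow the tool's published scoring rule: each item is rated agree (1) or disagree (0), not-applicable items are excluded, and the score is points earned over points possible. A minimal sketch of that calculation, independent of the Auto-Scoring Form spreadsheet the authors used:

```python
# Sketch of PEMAT-P percentage scoring per the published PEMAT instructions;
# not a reproduction of the Auto-Scoring Form used in the study.
def pemat_score(item_ratings):
    """item_ratings: list of 1 (agree), 0 (disagree), or None (not applicable)."""
    applicable = [r for r in item_ratings if r is not None]
    return 100 * sum(applicable) / len(applicable)

# Example: 12 of 14 applicable understandability items agreed -> ~85.7%.
print(round(pemat_score([1] * 12 + [0] * 2 + [None]), 1))
```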
Conclusion: Compared with Google Bard, ChatGPT-4 fared better overall, scoring higher on the PEMAT-P understandability scale and adhering more closely to the prompt-engineering instructions. Because the input prompts may differ from real-world patient searches, follow-up studies with patient participation are required.
Keywords: Cataract; Medical Education; Treatment Surgery.
© Author(s) (or their employer(s)) 2024. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
Conflict of interest statement
Competing interests: None declared.