Comparative Study
BMJ Open Ophthalmol. 2024 Oct 17;9(1):e001824.
doi: 10.1136/bmjophth-2024-001824.

Artificial intelligence chatbots as sources of patient education material for cataract surgery: ChatGPT-4 versus Google Bard

Matthew Azzopardi et al. BMJ Open Ophthalmol. 2024.

Abstract

Objective: To conduct a head-to-head comparative analysis of cataract surgery patient education material generated by Chat Generative Pre-trained Transformer (ChatGPT-4) and Google Bard.

Methods and analysis: In November 2023, 98 frequently asked questions on cataract surgery in English were taken from 5 trustworthy online patient information resources. After curation (39 duplicates excluded and 20 questions augmented for clarity), 59 questions remained and were categorised into 3 domains: condition (n=15), preparation for surgery (n=21) and recovery after surgery (n=23). They were formulated into input prompts using 'prompt engineering'. Four ophthalmologists independently graded the ChatGPT-4 and Google Bard responses using the Patient Education Materials Assessment Tool-Printable (PEMAT-P) Auto-Scoring Form. The readability of responses was evaluated using a Flesch-Kincaid calculator. Responses were also subjectively examined for any inaccurate or harmful information.
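As an illustration of the readability measure mentioned above, the following is a minimal sketch of the standard Flesch-Kincaid Grade Level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The abstract does not say which calculator the authors used, so the function names and the simple vowel-group syllable heuristic here are assumptions for illustration only; dedicated calculators count syllables more carefully and may give slightly different values.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumed heuristic, not the authors' tool):
    count runs of vowels, then discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level of a passage of English text."""
    # Sentences are split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are alphabetic runs, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * (len(words) / len(sentences))
        + 11.8 * (syllables / len(words))
        - 15.59
    )


if __name__ == "__main__":
    simple = "Your eye has a lens. The lens can get cloudy."
    complex_ = (
        "Phacoemulsification involves ultrasonic fragmentation of the "
        "opacified crystalline lens prior to intraocular lens implantation."
    )
    print(round(flesch_kincaid_grade(simple), 2))
    print(round(flesch_kincaid_grade(complex_), 2))
```

A lower grade level indicates text that should be readable by people with fewer years of schooling, so shorter sentences and shorter words both pull the score down.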

Results: Google Bard responses had a higher mean overall Flesch-Kincaid Grade Level (8.02) than ChatGPT-4 responses (5.75) (p<0.001), indicating that they were harder to read; this difference was also noted across all three domains. ChatGPT-4 had a higher overall PEMAT-P understandability score (85.8%) than Google Bard (80.9%) (p<0.001), which was also noted in the 'preparation for cataract surgery' (85.2% vs 75.7%; p<0.001) and 'recovery after cataract surgery' (86.5% vs 82.3%; p=0.004) domains. There was no statistically significant difference in overall (42.5% vs 44.2%; p=0.344) or individual domain actionability scores (p>0.10). None of the generated material contained dangerous information.
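The understandability and actionability percentages above are PEMAT-style scores. As a rough sketch of how such a percentage is usually derived, each PEMAT item is rated Agree (1), Disagree (0) or Not Applicable, and the score is the share of applicable items rated Agree. The function name and the use of `None` for Not Applicable are assumptions made for this example; the abstract does not describe the internals of the Auto-Scoring Form or how scores were averaged across the four graders.

```python
from typing import Optional, Sequence


def pemat_score(ratings: Sequence[Optional[int]]) -> float:
    """Percentage of applicable items rated Agree.

    ratings: 1 = Agree, 0 = Disagree, None = Not Applicable (assumed encoding).
    """
    # Not Applicable items are excluded from both numerator and denominator.
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("at least one applicable item is required")
    if any(r not in (0, 1) for r in applicable):
        raise ValueError("ratings must be 0, 1 or None")
    return 100.0 * sum(applicable) / len(applicable)


if __name__ == "__main__":
    # Hypothetical ratings for one response from one grader.
    example = [1, 1, 0, None, 1]
    print(pemat_score(example))  # 3 of 4 applicable items rated Agree
```

Under this convention, a response can score low on actionability simply because few of the applicable items about concrete steps for the reader are rated Agree, independent of how understandable the text is.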

Conclusion: In comparison to Google Bard, ChatGPT-4 fared better overall, scoring higher on the PEMAT-P understandability scale and exhibiting more faithfulness to the prompt engineering instruction. Since input prompts might vary from real-world patient searches, follow-up studies with patient participation are required.

Keywords: Cataract; Medical Education; Treatment Surgery.

Conflict of interest statement

Competing interests: None declared.

Figures

Figure 1. Question curation flow chart. ChatGPT, Chat Generative Pre-trained Transformer.
Figure 2. A comparison of the large language model chatbot user interface, with an example question prompt: (A) ChatGPT-4; (B) Google Bard. ChatGPT, Chat Generative Pre-trained Transformer.
