Cureus. 2024 Mar 31;16(3):e57348.
doi: 10.7759/cureus.57348. eCollection 2024 Mar.

Performance of Google's Artificial Intelligence Chatbot "Bard" (Now "Gemini") on Ophthalmology Board Exam Practice Questions

Monica Botross et al. Cureus.

Abstract

Purpose: To assess the performance of "Bard," one of ChatGPT's competitors, in answering practice questions for the ophthalmology board certification exam.

Methods: In December 2023, 250 multiple-choice questions from the "BoardVitals" ophthalmology exam question bank were randomly selected and entered into Bard to assess the artificial intelligence chatbot's ability to comprehend, process, and answer complex scientific and clinical ophthalmic questions. A random mix of text-only and image-and-text questions was selected from 10 subsections, with 25 questions per subsection. The percentage of correct responses was calculated per section, and an overall assessment score was determined.

Results: Overall, Bard answered 62.4% (156/250) of questions correctly. The worst performance was 24% (6/25) on "Retina and Vitreous," and the best was 84% (21/25) on "Oculoplastics." While the majority of questions were entered with minimal difficulty, not all questions could be processed by Bard. This was particularly an issue for questions that included human images and multiple visual files. Some vignette-style questions were also not understood by Bard and were therefore omitted. Future investigations will focus on including more questions per subsection to increase available data points.
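
As an illustration of the scoring arithmetic, the percentages reported above can be reproduced from the raw counts. This is a minimal sketch and not the authors' analysis code; the function name `percent_correct` and the dictionary layout are assumptions made for the example, and only the three sets of counts stated in the abstract are included.

```python
def percent_correct(correct: int, total: int) -> float:
    """Return the percentage of questions answered correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# (correct, total) counts as reported in the abstract.
# Hypothetical layout: the other eight subsections are not listed in the abstract.
reported = {
    "Overall": (156, 250),
    "Retina and Vitreous": (6, 25),
    "Oculoplastics": (21, 25),
}

scores = {name: percent_correct(c, t) for name, (c, t) in reported.items()}

for name, score in scores.items():
    print(f"{name}: {score:.1f}%")
```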

Conclusions: While Bard answered the majority of questions correctly and is capable of analyzing vast amounts of medical data, it ultimately lacks the holistic understanding and experience-informed knowledge of an ophthalmologist. An ophthalmologist's ability to synthesize diverse pieces of information and draw from clinical experience to answer complex standardized board questions is at present irreplaceable. Artificial intelligence, in its current form, can be employed as a valuable tool for supplementing clinicians' study methods.

Keywords: american board of ophthalmology; artificial intelligence; bard; chat gpt; gemini; llm; medical education; okap; ophthalmology; wqe.

Conflict of interest statement

The authors have declared that no competing interests exist.
