Adv Med Educ Pract. 2024 May 10;15:393-400.
doi: 10.2147/AMEP.S457408. eCollection 2024.

Comparing the Performance of ChatGPT-4 and Medical Students on MCQs at Varied Levels of Bloom's Taxonomy


Ambadasu Bharatha et al. Adv Med Educ Pract. 2024.

Abstract

Introduction: This research compared the capabilities of ChatGPT-4 with those of medical students in answering multiple-choice questions (MCQs), using the revised Bloom's Taxonomy as a benchmark.

Methods: A cross-sectional study was conducted at The University of the West Indies, Barbados. ChatGPT-4 and medical students were assessed on MCQs from various medical courses using computer-based testing.

Results: The study included 304 MCQs. Students demonstrated good knowledge, with 78% correctly answering at least 90% of the questions. However, ChatGPT-4 achieved a higher overall score (73.7%) than the students (66.7%). Course type significantly affected ChatGPT-4's performance, but revised Bloom's Taxonomy level did not. A test of association between program level and Bloom's Taxonomy level for the questions ChatGPT-4 answered correctly was highly significant (p<0.001), reflecting a concentration of "remember"-level questions in preclinical courses and "evaluate"-level questions in clinical courses.
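
For readers who want to reproduce this kind of analysis: an association between two categorical variables, such as program level and Bloom's Taxonomy level, is conventionally tested with a chi-square test of independence on a contingency table. The sketch below is illustrative only; the counts are hypothetical placeholders, not the study's data, and this is not the authors' code.

    # Chi-square test of independence, a minimal sketch of the kind of
    # association analysis described above, using SciPy.
    # NOTE: the counts below are HYPOTHETICAL placeholders, not study data.
    import numpy as np
    from scipy.stats import chi2_contingency

    # Rows: program level (preclinical, clinical).
    # Columns: revised Bloom's level of MCQs ChatGPT-4 answered correctly
    # (remember, understand, apply, analyze, evaluate).
    observed = np.array([
        [40, 25, 15, 8, 2],    # preclinical: mostly "remember"-level
        [10, 18, 20, 22, 30],  # clinical: mostly higher-order levels
    ])

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4g}")
    # For these placeholder counts the test yields p < 0.001, i.e. a
    # significant association between program level and Bloom's level.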

Discussion: The study highlights ChatGPT-4's proficiency in standardized tests but indicates limitations in clinical reasoning and practical skills. This performance discrepancy suggests that the effectiveness of artificial intelligence (AI) varies based on course content.

Conclusion: While ChatGPT-4 shows promise as an educational tool, its role should be supplementary, with strategic integration into medical education to leverage its strengths and address limitations. Further research is needed to explore AI's impact on medical education and student performance across educational levels and courses.

Keywords: ChatGPT-4; artificial intelligence; interpretation abilities; knowledge; medical students; multiple choice questions.


Conflict of interest statement

Dr. Md Anwarul Azim Majumder is the Editor-in-Chief of Advances in Medical Education and Practice. The other authors report no conflicts of interest in this work.
