Performance of artificial intelligence chatbots in National dental licensing examination
- PMID: 41040620
- PMCID: PMC12485390
- DOI: 10.1016/j.jds.2025.05.012
Abstract
Background/purpose: The Taiwan dental board exams comprehensively assess dental candidates across twenty distinct subjects, spanning foundational knowledge to clinical fields, using multiple-choice single-answer exams with a minimum passing score of 60 %. This study assesses the performance of artificial intelligence (AI)-powered chatbots (specifically ChatGPT3.5, Gemini, and Claude2), categorized as Large Language Models (LLMs), on these exams from 2021 to 2023.
Materials and methods: A total of 2699 multiple-choice questions spanning eight subjects in basic dentistry and twelve in clinical dentistry were analyzed. Questions involving images and tables were excluded. Statistical analyses were conducted using McNemar's test. Furthermore, annual results of LLMs were compared with the qualification rates of human candidates to provide additional context.
Results: Claude2 demonstrated the highest overall accuracy (54.89 %) on the Taiwan national dental licensing examinations, outperforming ChatGPT3.5 (49.33 %) and Gemini (44.63 %), with statistically significant differences in performance across models. In the basic dentistry domain, Claude2 scored 59.73 %, followed by ChatGPT3.5 (54.87 %) and Gemini (47.35 %). Notably, Claude2 excelled in biochemistry (73.81 %) and oral microbiology (88.89 %), while ChatGPT3.5 also performed strongly in oral microbiology (80.56 %). In the clinical dentistry domain, Claude2 led with a score of 52.45 %, surpassing ChatGPT3.5 (46.54 %) and Gemini (43.26 %), and showed strong results in dental public health (65.81 %). Despite these achievements, none of the LLMs attained passing scores overall.
Conclusion: None of the models achieved passing scores, highlighting their strengths in foundational knowledge but limitations in clinical reasoning.
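The paired model comparison named in Materials and methods (McNemar's test) can be illustrated with a minimal sketch. The abstract does not state which variant of the test was used, so the exact binomial form below is an assumption. The function names `discordant_counts` and `mcnemar_exact` and the per-question results are hypothetical and are not data from the study.

```python
from math import comb


def discordant_counts(model_a_correct, model_b_correct):
    """Count questions where exactly one of the two models was correct."""
    if len(model_a_correct) != len(model_b_correct):
        raise ValueError("both models must be scored on the same questions")
    pairs = list(zip(model_a_correct, model_b_correct))
    a_only = sum(1 for a, b in pairs if a and not b)
    b_only = sum(1 for a, b in pairs if b and not a)
    return a_only, b_only


def mcnemar_exact(a_only, b_only):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = a_only + b_only
    if n == 0:
        # No discordant pairs: the models never disagree in correctness.
        return 1.0
    k = min(a_only, b_only)
    # Under the null hypothesis, each discordant pair favours either model
    # with probability 0.5, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical per-question results (True = correct), NOT data from the study.
model_a = [True, True, False, True, False, True, True, False]
model_b = [True, False, False, False, False, True, False, False]

a_only, b_only = discordant_counts(model_a, model_b)
p_value = mcnemar_exact(a_only, b_only)
print(a_only, b_only, p_value)
```

Only the questions on which the two models disagree enter the test. Questions that both models answer correctly, or both answer incorrectly, carry no information about which model is better.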
Keywords: Artificial intelligence; Chatbot; Dental education; Dentistry; Large language models.
© 2025 Association for Dental Sciences of the Republic of China. Publishing services by Elsevier B.V.
Conflict of interest statement
The authors declare no conflict of interest.
