Cureus. 2024 Jul 29;16(7):e65658. doi: 10.7759/cureus.65658. eCollection 2024 Jul.

Assessing the Accuracy, Completeness, and Reliability of Artificial Intelligence-Generated Responses in Dentistry: A Pilot Study Evaluating the ChatGPT Model

Kelly F Molena et al. Cureus. 2024.

Abstract

Background: Artificial intelligence (AI) can serve as a tool for diagnosis and knowledge acquisition, particularly in dentistry, and has sparked debate over its application in clinical decision-making.

Objective: This study aims to evaluate the accuracy, completeness, and reliability of responses generated by the Chat Generative Pre-trained Transformer (ChatGPT) 3.5 in dentistry, using expert-formulated questions.

Materials and methods: Experts were invited to create three questions each, along with answers and supporting references, according to their fields of specialization. A Likert scale was used to rate the level of agreement between expert answers and ChatGPT responses. Statistical analysis compared the descriptive and binary question groups in terms of accuracy and completeness. Questions with low accuracy were re-submitted, and the new responses were compared with the originals to assess improvement. The Wilcoxon test was used (α = 0.05).
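
To illustrate the paired comparison described above, here is a minimal sketch of a Wilcoxon signed-rank test on Likert-scale ratings using SciPy; all scores below are hypothetical placeholders, not the study's data:

    # Paired Wilcoxon signed-rank test on Likert-scale accuracy ratings,
    # mirroring the study's alpha = 0.05 threshold. All scores here are
    # hypothetical; the study's actual data are not reproduced.
    from scipy.stats import wilcoxon

    # Accuracy ratings for the same questions at the initial evaluation (t0)
    # and at the re-evaluation three days later (t3).
    accuracy_t0 = [5, 4, 6, 3, 5, 2, 6, 4, 5, 3]
    accuracy_t3 = [6, 5, 6, 4, 5, 3, 6, 5, 6, 4]

    # wilcoxon() is appropriate for paired ordinal data; zero differences
    # are dropped under its default zero_method.
    statistic, p_value = wilcoxon(accuracy_t0, accuracy_t3)
    print(f"W = {statistic}, p = {p_value:.3f}")
    if p_value < 0.05:
        print("Significant difference between t0 and t3 ratings")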

Results: Ten experts across six dental specialties generated 30 binary and descriptive dental questions with references. Accuracy scores had a median of 5.50 and a mean of 4.17; completeness scores had a median of 2.00 and a mean of 2.07. No difference was observed between descriptive and binary responses for either accuracy or completeness. However, re-evaluated responses showed significant improvement in both accuracy (median 5.50 vs. 6.00; mean 4.17 vs. 4.80; p = 0.042) and completeness (median 2.00 vs. 2.00; mean 2.07 vs. 2.30; p = 0.011). The references provided were more often incorrect than correct, with no difference between descriptive and binary questions.

Conclusions: ChatGPT initially demonstrated good accuracy and completeness, and its responses improved further over time with machine learning (ML). However, some inaccurate answers and references persisted. Human critical discernment remains essential for handling complex clinical cases and for advancing theoretical knowledge and evidence-based practice.

Keywords: ai and machine learning; artificial intelligence in dentistry; chat-gpt; decision-making process; decision-support tools; evidence-based practice; knowledge acquisition.


Conflict of interest statement

Human subjects: Consent was obtained or waived by all participants in this study. The Institutional Research Ethics Committee issued approval 69712923.6.0000.5419 ("The research project was approved").
Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue.
Conflicts of interest: In compliance with the ICMJE uniform disclosure form, all authors declare the following:
Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work.
Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work.
Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.

Figures

Figure 1. Methodology used in this study.
Image Credits: Kelly Fernanda Molena, Author.
Figure 2. Wilcoxon test used to evaluate the accuracy and completeness of responses generated by ChatGPT to expert-formulated questions.
There were no statistical differences between initial and final scores for accuracy (A) and completeness (C). However, when imprecise questions were re-evaluated (B and D), comparing their initial values (t0) with values after three days (t3), responses were more accurate at t3. *Indicates a statistically significant difference between groups.
