ChatGPT in Oral Pathology: Bright Promise or Diagnostic Mirage
- PMID: 41155731
- PMCID: PMC12566193
- DOI: 10.3390/medicina61101744
Abstract
Background and Objectives: Academic interest in the diagnostic capabilities of multimodal large language models such as ChatGPT-4o is growing within the biomedical sciences, yet their ability to interpret oral clinical images remains insufficiently explored. This exploratory pilot study aimed to provide preliminary observations on the diagnostic validity of ChatGPT-4o in identifying oral squamous cell carcinoma (OSCC), oral leukoplakia (OL), and oral lichen planus (OLP) from clinical photographs alone, without additional clinical data.
Materials and Methods: Two general dentists selected 23 images of oral lesions suspected to be OSCC, OL, or OLP. ChatGPT-4o was asked to provide a probable diagnosis for each image on 30 occasions, generating a total of 690 responses. The responses were evaluated against the reference diagnosis established by an expert to calculate sensitivity, specificity, predictive values, and the area under the ROC curve (AUC).
Results: ChatGPT-4o demonstrated high specificity across the three conditions (97.1% for OSCC, 100% for OL, and 96.1% for OLP) and correctly classified 90% of OSCC cases (AUC = 0.81). However, this overall accuracy was largely driven by correct negative classifications; the clinically relevant sensitivity for OSCC was only 65%. Sensitivity for the other conditions was lower still: 60% for OL and just 25% for OLP, which limits the model's usefulness for ruling out these conditions in a clinical setting. The model achieved positive predictive values of 86.7% for OSCC and 100% for OL. Given the small dataset, these findings should be interpreted only as preliminary evidence.
Conclusions: ChatGPT-4o shows potential as a complementary tool for screening OSCC in clinical oral images. However, its sensitivity remains insufficient, as a substantial proportion of true cases were missed, underscoring that the model cannot be relied upon as a standalone diagnostic tool. Moreover, the pilot nature of this study and the small sample size mean that larger, adequately powered studies (with several hundred cases per pathology) are needed to obtain robust and generalizable results.
Keywords: ChatGPT; diagnostic accuracy; multimodal large language models (LLMs); oral leukoplakia (OL); oral lichen planus (OLP); oral pathology; oral squamous cell carcinoma (OSCC).
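The abstract reports per-condition sensitivity, specificity, predictive values, and ROC AUC computed from the model's responses against the expert reference diagnosis. The short Python sketch below illustrates how such metrics can be derived from binary classifications; the diagnostic_metrics helper and the example labels are illustrative only and are not the authors' code or data.

# Minimal sketch (illustrative, not the study's code): deriving sensitivity,
# specificity, PPV, NPV, and AUC for one condition from binary labels.
from sklearn.metrics import roc_auc_score

def diagnostic_metrics(y_true, y_pred):
    """y_true: expert reference (1 = condition present), y_pred: model output."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
    }

# Hypothetical labels for one condition, one entry per model response.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # expert reference diagnosis
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]   # model's per-response classification

print(diagnostic_metrics(y_true, y_pred))
# With a single binary threshold, the AUC equals (sensitivity + specificity) / 2.
print("AUC:", roc_auc_score(y_true, y_pred))

In the study's repeated-query design, each image was classified 30 times, so labels like these would be pooled across responses (or aggregated per image) before computing the per-condition metrics.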
Conflict of interest statement
The authors declare no conflicts of interest.
