Artificial Intelligence in Gastroenterology Education: DeepSeek Passes the Gastroenterology Board Examination and Outperforms Legacy ChatGPT Models

Andrew F Ibrahim¹, Pojsakorn Danpanichkul², Alexander Hayek¹, Edwin Paul¹, Annmarie Farag¹, Masab Mansoor¹, Charat Thongprayoon³, Wisit Cheungpasitporn³, Mohamed O Othman⁴

Affiliations

¹ School of Medicine, Texas Tech University Health Sciences Center, Lubbock, Texas, USA.
² Department of Internal Medicine, Texas Tech University Health Sciences Center, Lubbock, Texas, USA.
³ Department of Internal Medicine, Mayo Clinic, Rochester, Minnesota, USA.
⁴ Department of Gastroenterology and Hepatology, Baylor College of Medicine, Houston, Texas, USA.

PMID: 40392256
DOI: 10.14309/ajg.0000000000003552

Am J Gastroenterol. 2025 May 20. Online ahead of print.

Abstract

Introduction: DeepSeek's performance on the gastroenterology board examination was compared with that of legacy offline ChatGPT models, which had previously failed the examination.

Methods: The DeepSeek R1 base model and the search-augmented R1 model were assessed on American College of Gastroenterology self-assessments (455 questions).

Results: DeepSeek exceeded the passing threshold. Search-augmented DeepSeek scored 81.5% across all questions, and the R1 base model scored 77.1%. Both the search-augmented and offline DeepSeek models surpassed offline ChatGPT-3 (65.1%) and ChatGPT-4 (62.4%) (P < 0.001).
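The comparison of scores above can be illustrated with a small worked calculation. The abstract does not state which statistical test was used or how many questions the ChatGPT models answered, so the sketch below rests on two assumptions: every model is scored on the same 455 questions, and a pooled two-proportion z-test is applied. The function name `two_proportion_z_test` is hypothetical, and a paired test on per-question results would be more appropriate if those data were available.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided pooled z-test for the difference between two proportions.

    p1, p2 are reported accuracies (0..1); n1, n2 are question counts.
    Correct-answer counts are reconstructed by rounding, which is an
    approximation because only percentages are reported.
    """
    x1 = round(p1 * n1)
    x2 = round(p2 * n2)
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    N = 455  # ASSUMPTION: same denominator for every model
    deepseek = {"search-augmented R1": 0.815, "base R1": 0.771}
    chatgpt = {"ChatGPT-3": 0.651, "ChatGPT-4": 0.624}
    for d_name, d_acc in deepseek.items():
        for c_name, c_acc in chatgpt.items():
            z, p = two_proportion_z_test(d_acc, N, c_acc, N)
            print(f"{d_name} vs {c_name}: z = {z:.2f}, p = {p:.2g}")
```

Under these assumptions all four pairwise comparisons give p-values below 0.001, which is consistent with the reported result but does not reproduce the authors' analysis.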

Discussion: DeepSeek achieved a passing score on the gastroenterology board examination, but gaps in niche topics and the exclusion of image-based questions limit its utility. It may supplement education if validated by specialists.

Keywords: artificial intelligence; large language models; medical education.
