Harnessing GPT-4 for automated error detection in pathology reports: Implications for oncology diagnostics
- PMID: 40453047
- PMCID: PMC12123116
- DOI: 10.1177/20552076251346703
Abstract
Objective: Accurate pathology reports are crucial for the diagnosis and treatment planning of cancer patients. However, these reports are prone to errors due to time pressures, subjective interpretation, and inconsistencies among professionals. Addressing these errors is vital for improving oncology care outcomes. Artificial intelligence (AI) systems, such as GPT-4, offer the potential to enhance diagnostic accuracy and efficiency.
Methods: A total of 700 malignant tumor pathology reports were collected from four hospitals. Of these, 350 reports had deliberate errors introduced by a senior pathologist, mimicking real-world reporting challenges. Error detection performance was evaluated by comparing GPT-4 to six human pathologists (two seniors, two attending pathologists, and two residents). Key metrics included error detection rates with Wilson confidence intervals and processing time per report.
Results: GPT-4 detected 88% of errors (350/400; 95% CI: [84, 91]), compared to a 95% detection rate by the top senior pathologist (382/400; 95% CI: [93, 97]). GPT-4 significantly reduced the average processing time to 4.03 seconds per report, compared to 65.64 seconds for the fastest human pathologist. However, GPT-4 had a higher false-positive rate (2.3%; 95% CI: [1.52, 3.01]) than the best-performing senior pathologist (0.3%; 95% CI: [0.01, 0.91]).
Conclusions: GPT-4 demonstrates substantial potential in improving the efficiency and accuracy of pathology error detection, which could accelerate clinical workflows and enhance cancer diagnostics. However, its higher false-positive rate emphasizes the need for human oversight to ensure safe implementation in clinical practice.
Keywords: Large language model; artificial intelligence in oncology; cancer diagnostics workflow; diagnostic accuracy; pathology report error detection.
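The Methods name Wilson confidence intervals for the detection rates. As an illustration only, the following minimal Python sketch computes a standard (uncorrected) Wilson score interval from a count of detected errors and a total. The function name `wilson_interval` and the fixed z value of 1.96 are assumptions made for this example; the abstract does not state which software or Wilson variant the authors used.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Return the (lower, upper) Wilson score interval for a proportion.

    Hypothetical helper for illustration; z = 1.96 approximates a 95% level.
    No continuity correction is applied.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must be between 0 and total")

    p = successes / total                      # observed proportion
    denom = 1 + z**2 / total                   # shared normalising term
    center = (p + z**2 / (2 * total)) / denom  # interval midpoint, shrunk toward 0.5
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Counts taken from the Results: detected errors out of 400 introduced errors.
    for label, detected in [("GPT-4", 350), ("Top senior pathologist", 382)]:
        low, high = wilson_interval(detected, 400)
        print(f"{label}: {detected / 400:.1%} (95% CI: {low:.1%} to {high:.1%})")
```

Small differences from the published bounds can arise from rounding or from the use of a continuity-corrected variant, so the output should be read as a plausibility check rather than a reproduction of the authors' analysis.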
© The Author(s) 2025.
Conflict of interest statement
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.