Assessing the accuracy and clinical utility of GPT-4O in abnormal blood cell morphology recognition
- PMID: 39502485
- PMCID: PMC11536573
- DOI: 10.1177/20552076241298503
Abstract
Objectives: To evaluate the accuracy and clinical utility of GPT-4O in recognizing abnormal blood cell morphology, a critical component of hematologic diagnostics.
Methods: GPT-4O's blood cell morphology recognition capabilities were assessed by comparing its performance with that of hematologists. A total of 70 images from the Chinese National Center for Clinical Laboratories' External Quality Assessment (EQA) program (2022–2024) were analyzed. Two experienced hematology experts evaluated GPT-4O's recognition accuracy using a Likert scale.
Results: GPT-4O achieved an overall accuracy of 70% in blood cell morphology recognition, significantly lower than the hematologists' accuracy of 95.42% (p < 0.05). For peripheral blood smears and bone marrow smears, GPT-4O's accuracy was 77.14% and 62.86%, respectively. Likert scale evaluations revealed further discrepancies, with GPT-4O scoring 288.50 out of 350, below the hematologists' manual scores. GPT-4O accurately recognized certain intracellular inclusions such as Howell-Jolly bodies and Auer rods, but it misidentified fragmented red blood cells as neutrophilic metamyelocytes and oval-shaped red blood cells as sickle cells. Additionally, GPT-4O had difficulty accurately identifying intracellular granules and distinguishing cell nuclei from cytoplasm.
Conclusion: GPT-4O's performance in recognizing abnormal blood cell morphology is currently inadequate compared to hematologists. Despite its potential as a supplementary tool, significant improvements in its recognition algorithms and an expanded dataset are necessary for it to be reliable for clinical use. Future research should focus on enhancing GPT-4O's diagnostic accuracy and addressing its current limitations.
Keywords: ChatGPT; GPT-4O; blood cell morphology; clinical laboratory; morphology recognition.
© The Author(s) 2024.
Conflict of interest statement
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.