Digit Health. 2024 Nov 5:10:20552076241298503.
doi: 10.1177/20552076241298503. eCollection 2024 Jan-Dec.

Assessing the accuracy and clinical utility of GPT-4O in abnormal blood cell morphology recognition


Xinjian Cai et al. Digit Health. 2024.

Abstract

Objectives: To evaluate the accuracy and clinical utility of GPT-4O in recognizing abnormal blood cell morphology, a critical component of hematologic diagnostics.

Methods: GPT-4O's blood cell morphology recognition was assessed by comparing its performance with that of hematologists. A total of 70 images from the 2022 to 2024 External Quality Assessment (EQA) of the Chinese National Center for Clinical Laboratories were analyzed. Two experienced hematology experts evaluated GPT-4O's recognition accuracy using a Likert scale.

Results: GPT-4O achieved an overall accuracy of 70% in blood cell morphology recognition, significantly lower than the 95.42% accuracy of hematologists (p < 0.05). For peripheral blood smears and bone marrow smears, GPT-4O's accuracy was 77.14% and 62.86%, respectively. Likert scale evaluations revealed further discrepancies, with GPT-4O scoring 288.50 out of 350, compared to higher manual scores. GPT-4O accurately recognized certain intracellular inclusions, such as Howell-Jolly bodies and Auer rods, but misidentified fragmented red blood cells as neutrophilic metamyelocytes and oval-shaped red blood cells as sickle cells. GPT-4O also had difficulty identifying intracellular granules and distinguishing cell nuclei from cytoplasm.
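The reported percentages can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the article: it assumes the 70 images were split evenly into 35 peripheral blood smears and 35 bone marrow smears, a split the abstract does not state but which is consistent with the reported figures. The helper name `correct_count` is hypothetical.

```python
# Consistency check of the accuracy figures reported in the Results.
# ASSUMPTION (not stated in the abstract): the 70 images are split evenly,
# 35 peripheral blood smears and 35 bone marrow smears.

TOTAL_IMAGES = 70
IMAGES_PER_SMEAR_TYPE = 35  # assumed split, see note above


def correct_count(accuracy_percent: float, n_images: int) -> int:
    """Return the whole number of correct images implied by a percentage."""
    return round(accuracy_percent / 100 * n_images)


# Counts implied by the reported per-smear accuracies.
peripheral_correct = correct_count(77.14, IMAGES_PER_SMEAR_TYPE)
bone_marrow_correct = correct_count(62.86, IMAGES_PER_SMEAR_TYPE)

# Overall accuracy implied by those counts, in percent.
total_correct = peripheral_correct + bone_marrow_correct
overall_accuracy = 100 * total_correct / TOTAL_IMAGES

# Likert total expressed as a share of the 350-point maximum, in percent.
likert_share = 100 * 288.50 / 350

print(f"Peripheral blood: {peripheral_correct}/{IMAGES_PER_SMEAR_TYPE} correct")
print(f"Bone marrow:      {bone_marrow_correct}/{IMAGES_PER_SMEAR_TYPE} correct")
print(f"Overall accuracy: {overall_accuracy:.2f}%")
print(f"Likert share:     {likert_share:.2f}%")
```

Under that assumed split, the per-smear figures combine to the reported overall accuracy of 70%, and the Likert total corresponds to roughly 82% of the maximum score.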

Conclusion: GPT-4O's performance in recognizing abnormal blood cell morphology is currently inadequate compared to hematologists. Despite its potential as a supplementary tool, significant improvements in its recognition algorithms and an expanded dataset are necessary for it to be reliable for clinical use. Future research should focus on enhancing GPT-4O's diagnostic accuracy and addressing its current limitations.

Keywords: ChatGPT; GPT-4O; blood cell morphology; clinical laboratory; morphology recognition.


Conflict of interest statement

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

Figure 1. Comparison of GPT-4O and hematologists' recognition accuracy.
