Comparative Study
Otolaryngol Head Neck Surg. 2024 Jun;170(6):1519-1526.
doi: 10.1002/ohn.759. Epub 2024 Apr 9.

Performance and Consistency of ChatGPT-4 Versus Otolaryngologists: A Clinical Case Series

Jérôme R Lechien et al. Otolaryngol Head Neck Surg. 2024 Jun.

Abstract

Objective: To study the performance of Chatbot Generative Pretrained Transformer-4 (ChatGPT-4) in the management of cases in otolaryngology-head and neck surgery.

Study design: Prospective case series.

Setting: Multicenter University Hospitals.

Methods: The history, clinical and physical findings, and additional examinations of adult outpatients consulting in the otolaryngology departments of CHU Saint-Pierre and Dour Medical Center were presented to ChatGPT-4, which was queried for differential diagnoses, management, and treatment(s). For each specialty, the ChatGPT-4 responses were assessed by 2 distinct, blinded, board-certified otolaryngologists using the Artificial Intelligence Performance Instrument.

Results: One hundred cases were presented to ChatGPT-4. ChatGPT-4 indicated a mean of 3.34 (95% confidence interval [CI]: 3.09, 3.59) additional examinations per patient versus 2.10 (95% CI: 1.76, 2.34; P = .001) for the practitioners. There was strong consistency (k > 0.600) between otolaryngologists and ChatGPT-4 for the indication of upper aerodigestive tract endoscopy, positron emission tomography and computed tomography, audiometry, tympanometry, and psychophysical evaluations. The primary diagnosis given by ChatGPT-4 was correct in 38% to 86% of cases, depending on subspecialty. Additional examinations indicated by ChatGPT-4 were pertinent and necessary in 8% to 31% of cases, while the treatment regimen was pertinent in 12% to 44% of cases. The performance of ChatGPT-4 was not influenced by the human-reported level of difficulty of clinical cases.
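The consistency threshold above (k > 0.600) can be illustrated with a short calculation. The abstract does not state which agreement statistic was used, so the sketch below assumes Cohen's kappa computed on binary "examination indicated / not indicated" decisions; the function name `cohen_kappa` and the two label lists are hypothetical and are not taken from the study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: this is only one plausible reading of the 'k' reported
    in the abstract; the study's exact statistic is not specified there.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical decisions for 10 cases (1 = examination indicated, 0 = not indicated).
clinician = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
chatbot = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]

kappa = cohen_kappa(clinician, chatbot)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the raters agree on 9 of 10 cases and chance agreement is 0.5, giving a kappa of about 0.8, which would fall above the 0.600 threshold quoted in the results.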

Conclusion: ChatGPT-4 may be a promising adjunctive tool in otolaryngology, providing extensive documentation about additional examinations, primary and differential diagnoses, and treatments. ChatGPT-4 is more effective in providing a primary diagnosis and less effective in selecting additional examinations and treatments.

Keywords: ChatGPT‐4; artificial intelligence; head neck surgery; otolaryngology; performance.
