Evaluation of Oropharyngeal Cancer Information from Revolutionary Artificial Intelligence Chatbot
- PMID: 37983846
- DOI: 10.1002/lary.31191
Abstract
Objective: With the burgeoning popularity of artificial intelligence-based chatbots, oropharyngeal cancer patients now have access to a novel source of medical information. Because chatbot information is not reviewed by experts, we sought to evaluate an artificial intelligence-based chatbot's oropharyngeal cancer-related information for accuracy.
Methods: Fifteen oropharyngeal cancer-related questions were developed and input into ChatGPT version 3.5. Four physician-graders independently assessed accuracy, comprehensiveness, and similarity to a physician response using 5-point Likert scales. Responses graded lower than three were then critiqued by physician-graders. Critiques were analyzed using inductive thematic analysis. Readability of responses was assessed using Flesch Reading Ease (FRE) and Flesch-Kincaid Reading Grade Level (FKRGL) scales.
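For context, both readability scales are simple functions of mean sentence length and mean syllables per word; the constants below are the standard published coefficients. A minimal Python sketch follows (the vowel-group syllable counter is a rough stand-in; the study does not describe which scoring implementation it used):

import re

def count_syllables(word):
    # Rough assumption: count vowel groups as syllables. Dedicated tools use
    # dictionary lookups and handle silent-e endings more accurately.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text):
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0, 0.0
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences  # mean words per sentence
    spw = syllables / len(words)  # mean syllables per word
    fre = 206.835 - 1.015 * wps - 84.6 * spw  # Flesch Reading Ease (higher = easier)
    fkrgl = 0.39 * wps + 11.8 * spw - 15.59   # Flesch-Kincaid grade level
    return fre, fkrgl

Under these formulas, an FKRGL above 11 corresponds to text requiring roughly a high-school senior's reading ability, well above the 6th grade target for patient-facing material.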
Results: Average accuracy, comprehensiveness, and similarity to a physician response scores were 3.88 (SD = 0.99), 3.80 (SD = 1.14), and 3.67 (SD = 1.08), respectively. Responses to posttreatment-related questions were the most accurate, comprehensive, and similar to a physician response, followed by treatment-related and then diagnosis-related questions. Posttreatment-related questions scored significantly higher than diagnosis-related questions in all three domains (p < 0.01). Two themes emerged from the physician critiques: suboptimal educational value and potential to misinform patients. Mean FRE and FKRGL scores both indicated a readability level above the 11th grade, higher than the 6th grade level recommended for patient materials.
Conclusion: ChatGPT responses may not educate patients to an appropriate degree, could outright misinform them, and read at a more difficult grade level than is recommended for patient material. As oropharyngeal cancer patients represent a vulnerable population facing complex, life-altering diagnoses and treatments, they should be cautious when consuming chatbot-generated medical information.
Level of evidence: NA. Laryngoscope, 134:2252-2257, 2024.
Keywords: ChatGPT; artificial intelligence; communication; head and neck cancer.
© 2023 The Authors. The Laryngoscope published by Wiley Periodicals LLC on behalf of The American Laryngological, Rhinological and Otological Society, Inc.
