Use of artificial intelligence chatbots in clinical management of immune-related adverse events

Affiliations

¹ Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
² Department of Oncology, Johns Hopkins University, Baltimore, Maryland, USA.
³ Harold C Simmons Comprehensive Cancer Center, The University of Texas Southwestern Medical Center, Dallas, Texas, USA.
⁴ Department of Medicine, Roswell Park Comprehensive Cancer Center, Buffalo, New York, USA.
⁵ Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
⁶ RCSI Cancer Centre, Beaumont Hospital, Dublin, Ireland.
⁷ Department of Melanoma, Cancer Immunotherapy and Development Therapeutics, Istituto Nazionale Tumori IRCCS Fondazione Pascale, Napoli, Campania, Italy.
⁸ ImmunoOncology Branch (IOB), Developmental Therapeutics Program, Cancer Therapy and Diagnosis Division, National Cancer Institute (NCI), National Institutes of Health, Bethesda, Maryland, USA.
⁹ Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA. Correspondence: douglas.b.johnson@vumc.org

PMID: 38816231
PMCID: PMC11141185
DOI: 10.1136/jitc-2023-008599

Hannah Burnette et al. J Immunother Cancer. 2024 May 30;12(5):e008599. doi: 10.1136/jitc-2023-008599.

Abstract

Background: Artificial intelligence (AI) chatbots have become a major source of general and medical information, though their accuracy and completeness are still being assessed. Their utility in answering questions about immune-related adverse events (irAEs), which are common and potentially dangerous toxicities of cancer immunotherapy, is not well defined.

Methods: We developed 50 distinct questions, with answers available in guidelines, covering 10 irAE categories and posed them to two AI chatbots (ChatGPT and Bard), along with an additional 20 patient-specific scenarios. Experts in irAE management scored answers for accuracy and completeness using a Likert scale ranging from 1 (least accurate/complete) to 4 (most accurate/complete). Answers were compared across categories and across engines.

Results: Overall, both engines scored highly for accuracy (mean scores for ChatGPT and Bard were 3.87 vs 3.5, p<0.01) and completeness (3.83 vs 3.46, p<0.01). Scores of 1-2 (completely or mostly inaccurate or incomplete) were particularly rare for ChatGPT (6/800 answer-ratings, 0.75%). Of the 50 questions, all eight physician raters gave ChatGPT a rating of 4 (fully accurate or complete) for 22 questions (for accuracy) and 16 questions (for completeness). In the 20 patient scenarios, the average accuracy score was 3.725 (median 4) and the average completeness was 3.61 (median 4).
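The denominator of 800 answer-ratings quoted above can be reproduced with a short calculation. The sketch below is illustrative only and rests on an assumption that the abstract does not state outright: that each of the 50 questions was scored by all eight physician raters on both dimensions (accuracy and completeness). The function and constant names are hypothetical.

```python
# Illustrative arithmetic check; not the authors' analysis code.
# Assumption: every question was rated by all 8 physicians on both
# dimensions (accuracy and completeness).

N_QUESTIONS = 50   # distinct guideline-based questions (from the abstract)
N_RATERS = 8       # physician raters (from the abstract)
N_DIMENSIONS = 2   # accuracy and completeness (assumed to be counted together)


def total_answer_ratings(questions: int, raters: int, dimensions: int) -> int:
    """Number of individual ratings if every rater scores every question on every dimension."""
    return questions * raters * dimensions


def percent(count: int, total: int) -> float:
    """Express count as a percentage of total."""
    return 100.0 * count / total


total = total_answer_ratings(N_QUESTIONS, N_RATERS, N_DIMENSIONS)
low_score_pct = percent(6, total)  # 6 ratings of 1-2 for ChatGPT (from the abstract)

print(f"Total answer-ratings: {total}")
print(f"Share of ratings scored 1-2: {low_score_pct:.2f}%")
```

Under this assumption the total matches the reported 800, and 6 low ratings correspond to the reported 0.75%.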

Conclusions: AI chatbots provided largely accurate and complete information regarding irAEs, and wildly inaccurate information ("hallucinations") was uncommon. However, until accuracy and completeness increase further, appropriate guidelines remain the gold standard to follow.

Keywords: Colitis; Immune Checkpoint Inhibitor; Immune related adverse event - irAE; Pneumonitis; Thyroiditis.

Conflict of interest statement

Competing interests: AP reports grants and personal fees from Bristol-Myers Squibb; personal fees from AstraZeneca, Pfizer, Merck, Roche, and Canadian Agency for Drugs and Technologies in Health; and grants from Alberta Cancer Foundation outside the submitted work. IP has served on advisory boards for Nektar, Iovance, Nouscom, I-O Bio and has stock ownership in Ideaya. DBJ has served on advisory boards or as a consultant for BMS, Catalyst Biopharma, Iovance, Mallinckrodt, Merck, Mosaic ImmunoEngineering, Novartis, Pfizer, Targovax, and Teiko, has received research funding from BMS and Incyte, and has patents pending for use of MHC-II as a biomarker for immune checkpoint inhibitor response, and abatacept as treatment for immune-related adverse events. DEG reports research funding from AstraZeneca, BerGenBio, Karyopharm, and Novocure; stock ownership in Gilead; service on advisory boards or consulting for AstraZeneca, Catalyst Pharmaceuticals, Daiichi-Sankyo, Elevation Oncology, Janssen Scientific Affairs, LLC, Jazz Pharmaceuticals, Regeneron Pharmaceuticals, Sanofi; U.S. patent 11,747,345, patent applications 17/045,482, 63/386,387, 63/382,972, 63/382,257; and is Co-founder and Chief Scientific Officer of OncoSeer Diagnostics, LLC.

Grants and funding

R01 CA227481/CA/NCI NIH HHS/United States
