Assessment of Artificial Intelligence Chatbot Responses to Top Searched Queries About Cancer
- PMID: 37615960
- PMCID: PMC10450581
- DOI: 10.1001/jamaoncol.2023.2947
Abstract
Importance: Consumers are increasingly using artificial intelligence (AI) chatbots as a source of information. However, the quality of the cancer information generated by these chatbots has not yet been evaluated using validated instruments.
Objective: To characterize the quality of information and presence of misinformation about skin, lung, breast, colorectal, and prostate cancers generated by 4 AI chatbots.
Design, setting, and participants: This cross-sectional study assessed AI chatbots' text responses to the 5 most commonly searched queries related to the 5 most common cancers using validated instruments. Search data were extracted from the publicly available Google Trends platform, and identical prompts were used to generate responses from 4 AI chatbots: ChatGPT version 3.5 (OpenAI), Perplexity (Perplexity.AI), Chatsonic (Writesonic), and Bing AI (Microsoft).
Exposures: Google Trends' top 5 search queries related to skin, lung, breast, colorectal, and prostate cancer from January 1, 2021, to January 1, 2023, were input into 4 AI chatbots.
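As context for the exposure step, the following is a minimal Python sketch of how the top related queries could be pulled programmatically. It uses the unofficial pytrends client and a hand-written term list, both assumptions for illustration; the study extracted queries from the public Google Trends website.

```python
from pytrends.request import TrendReq

# Unofficial Google Trends client; the study used the public Google Trends
# website directly, so pytrends here is an assumption for illustration only.
pytrends = TrendReq(hl="en-US", tz=0)

cancers = ["skin cancer", "lung cancer", "breast cancer",
           "colorectal cancer", "prostate cancer"]

for term in cancers:
    # Same window as the study: January 1, 2021, to January 1, 2023.
    pytrends.build_payload([term], timeframe="2021-01-01 2023-01-01")
    top = pytrends.related_queries()[term]["top"]  # DataFrame: query, value
    print(term, "->", list(top["query"].head(5)))  # top 5 related queries
```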
Main outcomes and measures: The primary outcomes were the quality of consumer health information based on the validated DISCERN instrument (scores from 1 [low] to 5 [high] for quality of information) and the understandability and actionability of this information based on the understandability and actionability domains of the Patient Education Materials Assessment Tool (PEMAT) (scores of 0%-100%, with higher scores indicating a higher level of understandability and actionability). Secondary outcomes included misinformation scored using a 5-point Likert scale (scores from 1 [no misinformation] to 5 [high misinformation]) and readability assessed using the Flesch-Kincaid Grade Level readability score.
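Of these instruments, only the Flesch-Kincaid Grade Level is fully automatable; DISCERN, PEMAT, and the misinformation ratings were assigned by human reviewers. The sketch below shows the standard Flesch-Kincaid formula and the PEMAT percentage convention (points earned over applicable items, times 100). The naive syllable counter and the example ratings are assumptions; the record does not say what software computed readability.

```python
import re

def flesch_kincaid_grade(text: str) -> float:
    """Standard formula: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    # Crude syllable estimate (vowel groups, at least 1 per word); the
    # record does not name the readability software actually used.
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower())))
                    for w in words)
    n_words = max(1, len(words))
    return 0.39 * (n_words / sentences) + 11.8 * (syllables / n_words) - 15.59

def pemat_domain_score(ratings: list) -> float:
    """PEMAT convention: points earned / applicable items * 100.
    Each item is 1 (agree), 0 (disagree), or None (not applicable)."""
    applicable = [r for r in ratings if r is not None]
    return 100.0 * sum(applicable) / len(applicable)

response = "Melanoma is a serious form of skin cancer. See a dermatologist."
print(round(flesch_kincaid_grade(response), 1))
print(pemat_domain_score([1, 1, 0, None, 1, 0]))  # hypothetical ratings -> 60.0
```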
Results: The analysis included 100 responses from 4 chatbots about the 5 most common search queries for skin, lung, breast, colorectal, and prostate cancer. The quality of text responses generated by the 4 AI chatbots was good (median [range] DISCERN score, 5 [2-5]), and no misinformation was identified. Understandability was moderate (median [range] PEMAT Understandability score, 66.7% [33.3%-90.1%]), and actionability was poor (median [range] PEMAT Actionability score, 20.0% [0%-40.0%]). The responses were written at the college level based on the Flesch-Kincaid Grade Level score.
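For readers unfamiliar with the notation, the "median [range]" summaries above follow the usual convention, sketched here with hypothetical per-response DISCERN scores (the study's raw scores are not included in this record).

```python
from statistics import median

def median_range(scores):
    """Summarize scores as 'median [min-max]', the convention used above."""
    return f"{median(scores)} [{min(scores)}-{max(scores)}]"

# Hypothetical per-response DISCERN scores for illustration only; the
# study's raw scores are not published in this record.
discern = [5, 5, 4, 5, 2, 5, 5, 3, 5]
print("DISCERN:", median_range(discern))  # -> DISCERN: 5 [2-5]
```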
Conclusions and relevance: Findings of this cross-sectional study suggest that AI chatbots generally produce accurate information for the top cancer-related search queries, but the responses are not readily actionable and are written at a college reading level. These limitations suggest that AI chatbots should be used supplementarily and not as a primary source for medical information.
Comment in
- Artificial Intelligence-From Starting Pilots to Scalable Privilege. JAMA Oncol. 2023;9(10):1341-1342. doi:10.1001/jamaoncol.2023.2867. PMID: 37615950.
- Re: Assessment of Artificial Intelligence Chatbot Responses to Top Searched Queries About Cancer. Eur Urol. 2024;86(3):278-279. doi:10.1016/j.eururo.2024.03.033. PMID: 38599992.
References
- Agency for Healthcare Research and Quality. The Patient Education Materials Assessment Tool (PEMAT) and user’s guide. Accessed April 4, 2023. https://www.ahrq.gov/health-literacy/patient-education/pemat.html