J Med Internet Res. 2023 Jun 30:25:e47479.
doi: 10.2196/47479.

Reliability of Medical Information Provided by ChatGPT: Assessment Against Clinical Guidelines and Patient Information Quality Instrument

Harriet Louise Walker et al. J Med Internet Res. 2023.

Abstract

Background: ChatGPT-4 is the latest release of a novel artificial intelligence (AI) chatbot able to answer freely formulated and complex questions. In the near future, ChatGPT could become the new standard for health care professionals and patients to access medical information. However, little is known about the quality of medical information provided by the AI.

Objective: We aimed to assess the reliability of medical information provided by ChatGPT.

Methods: Medical information provided by ChatGPT-4 on the 5 hepato-pancreatico-biliary (HPB) conditions with the highest global disease burden was measured with the Ensuring Quality Information for Patients (EQIP) tool. The EQIP tool is used to measure the quality of internet-available information and consists of 36 items that are divided into 3 subsections. In addition, 5 guideline recommendations per analyzed condition were rephrased as questions and input to ChatGPT, and agreement between the guidelines and the AI answer was measured by 2 authors independently. All queries were repeated 3 times to measure the internal consistency of ChatGPT.

Results: Five conditions were identified (gallstone disease, pancreatitis, liver cirrhosis, pancreatic cancer, and hepatocellular carcinoma). The median EQIP score across all conditions was 16 (IQR 14.5-18) for the total of 36 items. Divided by subsection, median scores for content, identification, and structure data were 10 (IQR 9.5-12.5), 1 (IQR 1-1), and 4 (IQR 4-5), respectively. Agreement between guideline recommendations and answers provided by ChatGPT was 60% (15/25). Interrater agreement as measured by the Fleiss κ was 0.78 (P<.001), indicating substantial agreement. Internal consistency of the answers provided by ChatGPT was 100%.
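The agreement figures reported in the Results can be illustrated with a short Python sketch. It is not the authors' analysis code: the function names are hypothetical, and the small ratings table in the usage note is invented solely to exercise the Fleiss κ formula, since the study's raw ratings are not given in the abstract. Only the counts 15 and 25 come from the text.

```python
from typing import Sequence


def percent_agreement(agreeing: int, total: int) -> float:
    """Share of answers that agreed with the guideline, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * agreeing / total


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")

    n_categories = len(counts[0])

    # Observed agreement: mean over items of the proportion of agreeing rater pairs.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(s * s for s in shares)

    if p_expected == 1.0:
        # All ratings fall in one category, so kappa is undefined.
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Counts taken from the abstract: 15 of 25 answers agreed with the guidelines.
print(percent_agreement(15, 25))  # 60.0
```

As a usage example with made-up data, `fleiss_kappa([[2, 0], [2, 0], [0, 2], [1, 1]])` describes 4 items rated by 2 raters into 2 categories (agree, disagree), where the raters differ on the last item only. These numbers do not reproduce the reported κ of 0.78.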

Conclusions: ChatGPT provides medical information of comparable quality to available static internet information. Although currently of limited quality, large language models could become the future standard for patients and health care professionals to gather medical information.

Keywords: ChatGPT; EQIP tool; artificial intelligence; bile; biliary; chatbot; chatbots; conversational agent; conversational agents; gall; gallstone; hepatic; internal medicine; internet information; liver; medical information; pancreas; pancreatic; pancreatitis; patient information.

Conflict of interest statement

Conflicts of Interest: None declared.
