Mayo Clin Proc Digit Health. 2023 Jun 12;1(3):226-234.
doi: 10.1016/j.mcpdig.2023.05.004. eCollection 2023 Sep.

Learning to Fake It: Limited Responses and Fabricated References Provided by ChatGPT for Medical Questions

Jocelyn Gravel et al. Mayo Clin Proc Digit Health.

Abstract

Objective: To evaluate the quality of the answers and the references provided by ChatGPT for medical questions.

Patients and methods: Three researchers asked ChatGPT 20 medical questions and prompted it to provide the corresponding references. Medical experts rated the quality of each response's content on a verbal numeric scale ranging from 0% to 100%. These experts were the corresponding authors of the 20 articles from which the medical questions were derived. We planned to evaluate 3 references per response for their pertinence, but this plan was amended after preliminary results showed that most references provided by ChatGPT were fabricated. This experimental observational study was conducted in February 2023.

Results: ChatGPT provided responses ranging from 53 to 244 words in length and reported 2 to 7 references per answer. Seventeen of the 20 invited raters provided feedback. The raters reported limited quality of the responses, with a median score of 60% (first and third quartiles: 50% and 85%, respectively). In addition, they identified major (n=5) and minor (n=7) factual errors among the 17 evaluated responses. Of the 59 references evaluated, 41 (69%) were fabricated, although they appeared real. Most fabricated citations used the names of authors with previous relevant publications, a title that seemed pertinent, and a credible journal format.
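
As a quick check on the arithmetic, the share of fabricated references can be recomputed from the counts reported above (41 of 59). The sketch below also computes a 95% Wilson score interval for that proportion. The interval is an illustrative addition and is not reported in the abstract, and the function names are hypothetical.

```python
import math


def percent(count: int, total: int) -> float:
    """Return count/total as a percentage."""
    return 100.0 * count / total


def wilson_interval(count: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Not taken from the study; shown only to illustrate the uncertainty
    around a proportion estimated from 59 references.
    """
    p = count / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return center - half, center + half


fabricated, evaluated = 41, 59  # counts reported in the Results
share = percent(fabricated, evaluated)
low, high = wilson_interval(fabricated, evaluated)
print(f"Fabricated: {share:.1f}% of {evaluated} references")
print(f"Approximate 95% interval: {100 * low:.1f}% to {100 * high:.1f}%")
```

Rounded to the nearest whole percent, the computed share matches the 69% stated in the Results.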

Conclusion: When asked multiple medical questions, ChatGPT provided answers of limited quality for scientific publication. More importantly, ChatGPT provided deceptively real references. Users of ChatGPT should pay particular attention to the references provided before integration into medical manuscripts.


Conflict of interest statement

The authors report no competing interests.

Figures

Figure 1. Screenshot of an example of responses provided by ChatGPT.
Figure 2. Distribution of the references provided by ChatGPT (n=59).
Figure 3. Example of responses from ChatGPT when questioned about the veracity of its references.

