Review
Open Heart. 2023 Nov;10(2):e002455.
doi: 10.1136/openhrt-2023-002455.

Heart-to-heart with ChatGPT: the impact of patients consulting AI for cardiovascular health advice

Anton Danholt Lautrup et al. Open Heart. 2023 Nov.

Abstract

Objectives: The advent of conversational artificial intelligence (AI) systems employing large language models such as ChatGPT has sparked public, professional and academic debates on the capabilities of such technologies. This mixed-methods study sets out to review and systematically explore the capabilities of ChatGPT to adequately provide health advice to patients when prompted regarding four topics from the field of cardiovascular diseases.

Methods: As of 30 May 2023, 528 items on PubMed contained the term ChatGPT in their title and/or abstract, with 258 being classified as journal articles and included in our thematic state-of-the-art review. For the experimental part, we systematically developed and assessed 123 prompts across the four topics based on three classes of users and two languages. Medical and communications experts scored ChatGPT's responses according to the 4Cs of language model evaluation proposed in this article: correct, concise, comprehensive and comprehensible.

Results: The articles reviewed were fairly evenly distributed across three themes: how ChatGPT could be used in medical publishing, in clinical practice, and for the education of medical personnel and/or patients. Quantitative and qualitative assessment of ChatGPT's responses to the 123 prompts showed that, while the responses generally received above-average scores, they span a spectrum from the concise and correct, via the absurd, to what can only be described as hazardously incorrect and incomplete. Prompts formulated at higher levels of health literacy generally yielded higher-quality answers. Counterintuitively, responses in a lower-resource language were often of higher quality.

Conclusions: The results emphasise the relationship between prompt and response quality and hint at potentially concerning futures in personalised medicine. The widespread use of large language models for health advice might amplify existing health inequalities and will increase the pressure on healthcare systems by providing easy access to many seemingly likely differential diagnoses and recommendations for seeing a doctor for even harmless ailments.

Keywords: computer simulation; myocardial infarction; systematic reviews as topic.

Conflict of interest statement

Competing interests: None declared.

Figures

Figure 1
Number of publications mentioning ChatGPT in title and/or abstract by month indexed in the PubMed database.
Figure 2
Upset plot of the distribution of the five main themes found in the 258 articles from PubMed included.
Figure 3
Histogram illustrating the distribution of scores across the four criteria (correct, concise, comprehensive and comprehensible) on the whole dataset of 123 prompt-response pairs.
Figure 4
Clustered relative population size histogram (corrected for class imbalance) visualising how the different characteristics influence the distributions of scores. If the filled and half-filled bars meet at 50%, there is no difference between the two categories; if the boundary tends to one side, one of the categories received this score less often than the other.
Figure 5
Pairwise mutual information of the variables we tracked in the systematic exploration of prompting and scoring the responses. Mutual information measures how much knowing one variable tells us about another variable.
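
The Figure 5 caption refers to pairwise mutual information between tracked variables. As a minimal sketch of how such a quantity can be estimated for discrete variables, the empirical mutual information in bits can be computed from co-occurrence counts. This is not the authors' code: the function names, variable names and toy values below are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two
    equal-length sequences of discrete labels."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("sequences must not be empty")

    count_x = Counter(xs)            # marginal counts of X
    count_y = Counter(ys)            # marginal counts of Y
    count_xy = Counter(zip(xs, ys))  # joint counts of (X, Y)

    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def pairwise_mutual_information(columns):
    """Mutual information for every unordered pair of named columns.

    `columns` maps a variable name to its list of observed labels.
    """
    return {
        (a, b): mutual_information(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


if __name__ == "__main__":
    # Hypothetical toy observations, made up purely for illustration.
    toy = {
        "language": ["lang_a", "lang_a", "lang_b", "lang_b"],
        "literacy": ["low", "high", "low", "high"],
        "score_band": ["low", "high", "low", "high"],
    }
    for pair, value in pairwise_mutual_information(toy).items():
        print(pair, round(value, 3))
```

In this toy example, a pair of variables that always take matching values shares one full bit of information, while a pair that varies independently shares none. That is the sense in which mutual information measures how much knowing one variable tells us about another.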
