J Plast Reconstr Aesthet Surg. 2024 Jan:88:99-108.
doi: 10.1016/j.bjps.2023.10.119. Epub 2023 Oct 29.

Modern Machiavelli? The illusion of ChatGPT-generated patient reviews in plastic and aesthetic surgery based on 9000 review classifications

Samuel Knoedler et al. J Plast Reconstr Aesthet Surg. 2024 Jan.

Abstract

Background: Online patient reviews are crucial in guiding individuals who seek plastic surgery, but artificial chatbots pose a threat of disseminating fake reviews. This study aimed to compare real patient feedback with ChatGPT-generated reviews for the top five US plastic surgery procedures.

Methods: Thirty real patient reviews on rhinoplasty, blepharoplasty, facelift, liposuction, and breast augmentation were collected from RealSelf and used as templates for ChatGPT to generate matching patient reviews. Prolific users (n = 30) assessed 150 pairs of reviews to identify human-written and artificial intelligence (AI)-generated reviews. Patient reviews were further assessed using AI content detector software (Copyleaks AI).
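The study-design arithmetic can be checked with a short sketch. It assumes that "thirty real patient reviews" means thirty per procedure (which matches the 150 pairs reported) and that every participant classified both reviews in every pair; the function and variable names are hypothetical and not taken from the paper.

```python
# Sketch of the study-design arithmetic described in the Methods.
# Assumption: 30 real reviews were collected per procedure (not 30 in total),
# and each rater classified both reviews of every pair.

PROCEDURES = [
    "rhinoplasty",
    "blepharoplasty",
    "facelift",
    "liposuction",
    "breast augmentation",
]
REAL_REVIEWS_PER_PROCEDURE = 30  # assumed reading of "thirty real patient reviews"
RATERS = 30                      # Prolific users (n = 30)


def design_counts(procedures, reviews_per_procedure, raters):
    """Return the number of review pairs, single reviews, and classification tasks."""
    pairs = len(procedures) * reviews_per_procedure  # one AI review per real review
    reviews = pairs * 2                              # human-written + AI-generated
    tasks = reviews * raters                         # each rater labels every review
    return {"pairs": pairs, "reviews": reviews, "classification_tasks": tasks}


counts = design_counts(PROCEDURES, REAL_REVIEWS_PER_PROCEDURE, RATERS)
print(counts)
```

Under these assumptions the totals agree with the 150 pairs and the 9000 classification tasks reported in the abstract.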

Results: Among the 9000 classification tasks, 64.3% and 35.7% of reviews were classified as authentic and fake, respectively. On average, the author (human versus machine) was correctly identified in 59.6% of cases, and this poor classification performance was consistent across all procedures. Patients with prior aesthetic treatment showed poorer classification performance than those without (p < 0.05). The mean character count in human-written reviews was significantly higher (p < 0.001) than that in AI-generated reviews, with a significant correlation between character count and participants' accuracy rate (p < 0.001). The emotional timbre of reviews differed significantly, with "happiness" being more prevalent in human-written reviews (p < 0.001) and "disappointment" being more prevalent in AI reviews (p = 0.005). Copyleaks AI correctly classified 96.7% and 69.3% of human-written and ChatGPT-generated reviews, respectively.

Conclusion: ChatGPT convincingly replicates authentic patient reviews, even deceiving commercial AI detection software. Analyzing emotional tone and review length can help differentiate real from fake reviews, underscoring the need to educate both patients and physicians to prevent misinformation and mistrust.

Keywords: Artificial intelligence; ChatGPT; GPT-4; Large language model; Patient review; Plastic Surgery.

Conflict of interest statement

Declaration of Competing Interest: None declared.
