Three versions of an atopic dermatitis case report written by humans, artificial intelligence, or both: Identification of authorship and preferences
- PMID: 39759954
- PMCID: PMC11699452
- DOI: 10.1016/j.jacig.2024.100373
Abstract
Background: The use of artificial intelligence (AI) in scientific writing is rapidly increasing, raising concerns about authorship identification, content quality, and writing efficiency.
Objectives: This study investigates the real-world impact of ChatGPT, a large language model, on those aspects in a simulated publication scenario.
Methods: Forty-eight individuals representing 3 levels of medical expertise (medical students, residents, and experts in allergy or dermatology) evaluated 3 blinded versions of an atopic dermatitis case report: one human written (HUM), one AI generated (AI), and one written by both in combination (COM). In a survey, participants identified the authorship of each text, ranked the texts by preference, and graded each on 13 quality criteria. The time taken to generate each manuscript was also recorded.
Results: Overall authorship identification accuracy matched chance level at 33%. Experts (50.9%) were significantly more accurate than residents (27.7%) and students (19.6%; P < .001). Participants favored the AI-assisted versions (AI and COM) over HUM (P < .001), with COM receiving the highest quality scores. Compared to HUM, COM and AI reduced writing time by 83.8% and 84.3%, respectively, while improving quality by 13.9% (P < .001) and 11.1% (P < .001), respectively. However, experts assigned the lowest score to the references of the AI manuscript, potentially hindering its publication.
Conclusion: AI can deceptively mimic human writing, particularly for less experienced readers. Although AI-assisted writing is appealing and offers significant time savings, human oversight remains crucial to ensure accuracy, ethical considerations, and optimal quality. These findings underscore the need for transparency in AI use and highlight the potential of human-AI collaboration in the future of scientific writing.
Keywords: ChatGPT; Generative Pre-training Transformer (GPT); artificial intelligence; large language model (LLM); medical survey; scientific writing.
© 2024 The Author(s).
Conflict of interest statement
Declaration of generative AI in scientific writing: AMVD applied Gemini 1.5 Pro and GPT-4o in writing small sections of the introduction and methods after being given a content outline. These models were also used in reviewing the writing of the entire article as well as simulating peer review feedback. After using these tools, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication. Disclosure of potential conflict of interest: M. G. Bianchi reports serving as speaker for AbbVie and Sanofi. P. G. Bianchi reports serving as speaker for AbbVie, AstraZeneca, CSL Behring, GSK, Novartis, Pint Pharma, Sanofi, and Takeda/Shire. The rest of the authors declare that they have no relevant conflicts of interest.