J Allergy Clin Immunol Glob. 2024 Nov 26;4(1):100373.
doi: 10.1016/j.jacig.2024.100373. eCollection 2025 Feb.

Three versions of an atopic dermatitis case report written by humans, artificial intelligence, or both: Identification of authorship and preferences

Mara Giavina Bianchi et al. J Allergy Clin Immunol Glob.

Abstract

Background: The use of artificial intelligence (AI) in scientific writing is rapidly increasing, raising concerns about authorship identification, content quality, and writing efficiency.

Objectives: This study investigates the real-world impact of ChatGPT, a large language model, on those aspects in a simulated publication scenario.

Methods: Forty-eight individuals representing 3 levels of medical expertise (medical students, residents, and experts in allergy or dermatology) evaluated 3 blinded versions of an atopic dermatitis case report: one human written (HUM), one AI generated (AI), and one written by both in combination (COM). In a survey, participants identified the authorship of each text, ranked the versions by preference, and graded each text on 13 quality criteria. The time taken to generate each manuscript was also recorded.

Results: Overall authorship identification accuracy was 33%, matching the odds of guessing among 3 options. Expert participants (50.9%) were significantly more accurate than residents (27.7%) and students (19.6%, P < .001). Participants favored the AI-assisted versions (AI and COM) over HUM (P < .001), with COM receiving the highest quality scores. Compared to HUM, COM and AI reduced writing time by 83.8% and 84.3%, respectively, while improving quality by 13.9% (P < .001) and 11.1% (P < .001), respectively. However, experts assigned their lowest score to the references of the AI manuscript, a weakness that could hinder its publication.
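The two kinds of arithmetic behind these results, a chance baseline for a 3-way authorship guess and a percentage reduction in writing time, can be sketched as follows. This is a minimal illustration only: the function names and the timing values are hypothetical and are not taken from the study, which does not report raw writing times in this abstract.

```python
def chance_accuracy(n_classes: int) -> float:
    """Expected accuracy (%) when guessing uniformly among n_classes labels."""
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    return 100.0 / n_classes


def percent_reduction(baseline: float, new: float) -> float:
    """Percentage reduction of `new` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline * 100.0


# Three possible authorship labels: HUM, AI, COM.
print(round(chance_accuracy(3), 1))

# Hypothetical writing times in minutes (assumed values, NOT study data).
hum_minutes = 100.0
com_minutes = 20.0
print(round(percent_reduction(hum_minutes, com_minutes), 1))
```

With 3 candidate labels, uniform guessing gives about 33%, which is why the overall accuracy is described as matching the odds.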

Conclusion: AI can deceptively mimic human writing, particularly for less experienced readers. Although AI-assisted writing is appealing and offers significant time savings, human oversight remains crucial to ensure accuracy, ethical considerations, and optimal quality. These findings underscore the need for transparency in AI use and highlight the potential of human-AI collaboration in the future of scientific writing.

Keywords: ChatGPT; Generative Pre-training Transformer (GPT); artificial intelligence; large language model (LLM); medical survey; scientific writing.

Conflict of interest statement

Declaration of generative AI in scientific writing: AMVD applied Gemini 1.5 Pro and GPT-4o in writing small sections of the introduction and methods after being given a content outline. These models were also used in reviewing the writing of the entire article as well as simulating peer review feedback. After using this tool/service, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

Disclosure of potential conflict of interest: M. G. Bianchi reports serving as speaker for AbbVie and Sanofi. P. G. Bianchi reports serving as speaker for AbbVie, AstraZeneca, CSL Behring, GSK, Novartis, Pint Pharma, Sanofi, and Takeda/Shire. The rest of the authors declare that they have no relevant conflicts of interest.

Figures

Fig 1. Methodologic overview of authorship models and AI-assisted writing.
Fig 2. Confusion matrix depicting participants’ predictions versus actual authorship of case reports.
Fig 3. Ranking of preferred versions for 48 participants combining 3 different scoring criteria.
