Artificial Intelligence-Generated Editorials in Radiology: Can Expert Editors Detect Them?

Burak Berksu Ozkara et al. AJNR Am J Neuroradiol. 2025 Mar 4;46(3):559-566. doi: 10.3174/ajnr.A8505.

Abstract

Background and purpose: Artificial intelligence is capable of generating complex texts that may be indistinguishable from those written by humans. We aimed to evaluate the ability of GPT-4 to write radiology editorials and to compare these with human-written counterparts, thereby determining their real-world applicability for scientific writing.

Materials and methods: Sixteen editorials from 8 journals were included. To generate the artificial intelligence (AI)-written editorials, summaries of the 16 human-written editorials were fed into GPT-4. Six experienced editors reviewed the articles. First, an unpaired approach was used: the raters evaluated the content of each article on a 1-5 Likert scale across specified metrics and then judged whether each editorial was written by a human or by AI. The articles were then evaluated in pairs to determine which article was generated by AI and which should be published. Finally, the articles were analyzed with an AI detector and checked for plagiarism.
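As an illustration of the generation step, the sketch below shows how a summary of a human-written editorial might be submitted to GPT-4 through the OpenAI Python client. The prompt wording, system message, and model name are assumptions for illustration only; the study does not publish its exact prompt or generation parameters.

    from openai import OpenAI  # assumes the OpenAI Python client (>=1.0) is installed

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Hypothetical placeholder for the summary of one human-written editorial.
    summary = "Summary of the human-written editorial goes here."

    # Ask GPT-4 to draft an editorial from the summary; prompt text is illustrative.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are an experienced radiology editorial writer."},
            {"role": "user", "content": f"Write a radiology journal editorial based on this summary:\n{summary}"},
        ],
    )

    print(response.choices[0].message.content)  # the AI-generated editorial text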

Results: The human-written articles had a median AI probability score of 2.0%, whereas the AI-written articles had a median of 58%. The median similarity score among AI-written articles was 3%. In the unpaired setting, 58% of articles were correctly classified by authorship; accuracy rose to 70% in the paired setting. AI-written articles received slightly higher scores on most metrics, but when stratified by perception, articles perceived as human-written were rated higher in most categories. In the paired setting, raters strongly preferred publishing the article they perceived as human-written (82%).

Conclusions: GPT-4 can write high-quality articles that iThenticate does not flag as plagiarized, that editors may fail to identify as AI-generated, and that AI-detection tools identify only to a limited extent. Editors showed a positive bias toward human-written articles.

