Comparative Study
J Med Internet Res. 2024 Jun 26;26:e52001. doi: 10.2196/52001

Assessing the Reproducibility of the Structured Abstracts Generated by ChatGPT and Bard Compared to Human-Written Abstracts in the Field of Spine Surgery: Comparative Analysis

Hong Jin Kim et al. J Med Internet Res.

Abstract

Background: Due to recent advances in artificial intelligence (AI), language model applications can generate logical text output that is difficult to distinguish from human writing. ChatGPT (OpenAI) and Bard (subsequently rebranded as "Gemini"; Google AI) were developed using distinct approaches, but little is known about how their abstract-generation capabilities differ. The use of AI to write scientific abstracts in the field of spine surgery is the center of much debate and controversy.

Objective: The objective of this study is to assess the reproducibility of the structured abstracts generated by ChatGPT and Bard compared to human-written abstracts in the field of spine surgery.

Methods: In total, 60 abstracts were randomly selected from the spine sections of 7 reputable journals, and their paper titles were supplied to ChatGPT and Bard as input statements to generate abstracts. A total of 174 abstracts, divided into human-written abstracts, ChatGPT-generated abstracts, and Bard-generated abstracts, were evaluated for compliance with the structured format of journal guidelines and consistency of content. The likelihood of plagiarism and AI output was assessed using the iThenticate and ZeroGPT programs, respectively. A total of 8 reviewers in the spinal field evaluated 30 randomly extracted abstracts to determine whether they were produced by AI or human authors.

Results: The proportion of abstracts that met journal formatting guidelines was greater among ChatGPT abstracts (34/60, 56.6%) compared with those generated by Bard (6/54, 11.1%; P<.001). However, a higher proportion of Bard abstracts (49/54, 90.7%) had word counts that met journal guidelines compared with ChatGPT abstracts (30/60, 50%; P<.001). The similarity index was significantly lower among ChatGPT-generated abstracts (20.7%) compared with Bard-generated abstracts (32.1%; P<.001). The AI-detection program predicted that 21.7% (13/60) of the human group, 63.3% (38/60) of the ChatGPT group, and 87% (47/54) of the Bard group were possibly generated by AI, with an area under the curve value of 0.863 (P<.001). The mean detection rate by human reviewers was 53.8% (SD 11.2%), achieving a sensitivity of 56.3% and a specificity of 48.4%. A total of 56.3% (63/112) of the actual human-written abstracts and 55.9% (62/128) of AI-generated abstracts were recognized as human-written and AI-generated by human reviewers, respectively.
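Taking the reported reviewer counts at face value, the sensitivity and specificity figures can be reproduced with a few lines of arithmetic. This is a sketch, not the authors' analysis code; the choice of "human-written" as the positive class and all variable names are assumptions, chosen because they make 63/112 land on the reported 56.3% sensitivity and 62/128 on the reported 48.4% specificity.

```python
# Reviewer performance reconstructed from the counts reported in the
# Results section, treating "human-written" as the positive class
# (an assumption consistent with the reported 56.3% / 48.4% figures).
human_total = 112    # actual human-written abstracts shown to reviewers
human_correct = 63   # judged human-written (true positives)
ai_total = 128       # actual AI-generated abstracts shown to reviewers
ai_correct = 62      # judged AI-generated (true negatives)

sensitivity = human_correct / human_total  # 63/112, about 56.3%
specificity = ai_correct / ai_total        # 62/128, about 48.4%

print(f"sensitivity = {sensitivity:.3f}")
print(f"specificity = {specificity:.3f}")
```

Both rates hover near 50%, which is the basis for the conclusion that human reviewers performed close to chance when distinguishing AI-generated from human-written abstracts.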

Conclusions: Both ChatGPT and Bard can be used to help write abstracts, but most AI-generated abstracts are currently considered unethical due to high plagiarism and AI-detection rates. ChatGPT-generated abstracts appear to be superior to Bard-generated abstracts in meeting journal formatting guidelines. Because humans are unable to accurately distinguish abstracts written by humans from those produced by AI programs, it is crucial to exercise special caution and examine the ethical boundaries of using AI programs, including ChatGPT and Bard.

Keywords: AI; Bard; ChatGPT; abstract; artificial intelligence; chatbot; ethics; formatting guidelines; journal guidelines; language model; orthopedic surgery; plagiarism; scientific abstract; spine; spine surgery; surgery.


Conflict of interest statement

Conflicts of Interest: LGL receives grants from the Setting Scoliosis Straight Foundation, AO Spine, and ISSG; royalties from Medtronic and Acuity Surgical; and consulting fees from Medtronic and Acuity Surgical. GMMJ owns stocks in Nuvasive; reports consulting fees from Orthofix and Carlsmed, Inc; and receives royalties from SI-BONE. The other authors declare no conflicts of interest.

Figures

Figure 1. Flowchart of the study. AI: artificial intelligence.

Figure 2. Assessment of artificial intelligence program detection. Receiver operating characteristic analysis showed an area under the curve (AUC) of 0.863 (95% CI 0.806-0.920; P<.001) and a cutoff value of 52.5%, with 73.7% sensitivity and 85% specificity.
