AUTOGEN: A Personalized Large Language Model for Academic Enhancement-Ethics and Proof of Principle

Sebastian Porsdam Mann¹, Brian D Earp¹, Nikolaj Møller¹, Suren Vynn², Julian Savulescu³

PMID: 37487183
DOI: 10.1080/15265161.2023.2233356

Free article

Sebastian Porsdam Mann et al. Am J Bioeth. 2023 Oct.

Am J Bioeth. 2023 Oct;23(10):28-41.

doi: 10.1080/15265161.2023.2233356. Epub 2023 Jul 24.

Affiliations

¹ University of Oxford.
² Independent Researcher.
³ National University of Singapore.

Abstract

In this article, we explore the potential of enhancing academic prose and idea generation by fine-tuning a large language model (here, GPT-3) on one's own previously published writings: AUTOGEN ("AI Unique Tailored Output GENerator"). We develop, test, and describe three distinct AUTOGEN models trained on the prior scholarly output of three of the current authors (SBM, BDE, JS), with a fourth model trained on the combined works of all three. Our AUTOGEN models demonstrate greater variance in quality than the base GPT-3 model, with many outputs outperforming the base model in format, style, overall quality, and novel idea generation. As proof of principle, we present and discuss examples of AUTOGEN-written sections of existing and hypothetical research papers. We further discuss ethical opportunities, concerns, and open questions associated with personalized academic prose and idea generators. Ethical opportunities for personalized LLMs such as AUTOGEN include increased productivity, preservation of writing styles and cultural traditions, and aiding consensus building. However, ethical concerns arise due to the potential for personalized LLMs to reduce output diversity, violate privacy and intellectual property rights, and facilitate plagiarism or fraud. The use of coauthored or multiple-source trained models further complicates issues surrounding ownership and attribution. Open questions concern a potential credit-blame asymmetry for LLM outputs, the legitimacy of licensing agreements in authorship ascription, and the ethical implications of coauthorship attribution for data contributors. Ensuring the output is sufficiently distinct from the source material is crucial to maintaining ethical standards in academic writing. These opportunities, risks, and open issues highlight the intricate ethical landscape surrounding the use of personalized LLMs in academia. 
We also discuss open technical questions concerning the integration of AUTOGEN-style personalized LLMs with other LLMs, such as GPT-4, for iterative refinement and improvement of generated text. In conclusion, we argue that AUTOGEN-style personalized LLMs offer significant potential benefits in terms of both prose generation and, to a lesser extent, idea generation. If associated ethical issues are appropriately addressed, AUTOGEN alone or in combination with other LLMs can be seen as a potent form of academic enhancement.

Keywords: AUTOGEN; Fine-tuning; bioethics; ethics; large language models (LLM); personalised LLM.

Comment in

Why Personalized Large Language Models Fail to Do What Ethics is All About.
Laacke S, Gauckler C. Am J Bioeth. 2023 Oct;23(10):60-63. doi: 10.1080/15265161.2023.2250292. Epub 2023 Oct 9. PMID: 37812095. No abstract available.
Publish with AUTOGEN or Perish? Some Pitfalls to Avoid in the Pursuit of Academic Enhancement via Personalized Large Language Models.
Erler A. Am J Bioeth. 2023 Oct;23(10):94-96. doi: 10.1080/15265161.2023.2250291. Epub 2023 Oct 9. PMID: 37812096. No abstract available.
The Impact of AUTOGEN and Similar Fine-Tuned Large Language Models on the Integrity of Scholarly Writing.
Resnik DB, Hosseini M. Am J Bioeth. 2023 Oct;23(10):50-52. doi: 10.1080/15265161.2023.2250276. Epub 2023 Oct 9. Update in: Am J Bioeth. 2024 Mar;24(3):W6-W14. doi: 10.1080/15265161.2024.2308175. PMID: 37812101. Free PMC article. Updated. No abstract available.
Reimagining Scholarship: A Response to the Ethical Concerns of AUTOGEN.
Zohny H. Am J Bioeth. 2023 Oct;23(10):96-99. doi: 10.1080/15265161.2023.2250315. Epub 2023 Oct 9. PMID: 37812109. No abstract available.
Generative AI and Ethical Analysis.
McMillan J. Am J Bioeth. 2023 Oct;23(10):42-44. doi: 10.1080/15265161.2023.2249852. Epub 2023 Oct 9. PMID: 37812114. No abstract available.
Meaning by Courtesy: LLM-Generated Texts and the Illusion of Content.
Ostertag G. Am J Bioeth. 2023 Oct;23(10):91-93. doi: 10.1080/15265161.2023.2249851. Epub 2023 Oct 9. PMID: 37812115. No abstract available.
Large Language Models and Inclusivity in Bioethics Scholarship.
Varma S. Am J Bioeth. 2023 Oct;23(10):105-107. doi: 10.1080/15265161.2023.2250286. Epub 2023 Oct 9. PMID: 37812117. No abstract available.
Is Academic Enhancement Possible by Means of Generative AI-Based Digital Twins?
Nyholm S. Am J Bioeth. 2023 Oct;23(10):44-47. doi: 10.1080/15265161.2023.2249846. Epub 2023 Oct 9. PMID: 37812121. No abstract available.
Generative-AI-Generated Challenges for Health Data Research.
Spector-Bagdady K. Am J Bioeth. 2023 Oct;23(10):1-5. doi: 10.1080/15265161.2023.2252311. Epub 2023 Oct 9. PMID: 37831940. Free PMC article. No abstract available.
AUTOGEN and the Ethics of Co-Creation with Personalized LLMs-Reply to the Commentaries.
Porsdam Mann S, Earp BD, Møller N, Suren V, Savulescu J. Am J Bioeth. 2024 Mar;24(3):W6-W14. doi: 10.1080/15265161.2024.2308175. Epub 2024 Feb 12. PMID: 38346141. No abstract available.
