Development of a liver disease-specific large language model chat interface using retrieval-augmented generation

Jin Ge¹, Steve Sun², Joseph Owens², Victor Galvez², Oksana Gologorskaya²,³, Jennifer C Lai¹, Mark J Pletcher⁴, Ki Lai²

Affiliations

¹ Department of Medicine, Division of Gastroenterology and Hepatology, University of California-San Francisco, San Francisco, California, USA.
² UCSF Health Information Technology, University of California-San Francisco, San Francisco, California, USA.
³ Bakar Computational Health Sciences Institute, University of California-San Francisco, San Francisco, California, USA.
⁴ Department of Epidemiology and Biostatistics, University of California-San Francisco, San Francisco, California, USA.

PMID: 38451962
PMCID: PMC11706764
DOI: 10.1097/HEP.0000000000000834

Jin Ge et al. Hepatology. 2024 Nov 1;80(5):1158-1168. doi: 10.1097/HEP.0000000000000834. Epub 2024 Mar 7.

Abstract

Background and aims: Large language models (LLMs) have significant capabilities in clinical information processing tasks. Commercially available LLMs, however, are not optimized for clinical uses and are prone to generating hallucinatory information. Retrieval-augmented generation (RAG) is an enterprise architecture that allows the embedding of customized data into LLMs. This approach "specializes" the LLMs and is thought to reduce hallucinations.

Approach and results: We developed "LiVersa," a liver disease-specific LLM, by using our institution's protected health information-compliant text embedding and LLM platform, "Versa." We applied RAG to 30 publicly available American Association for the Study of Liver Diseases guidance documents to incorporate them into LiVersa. We evaluated LiVersa's performance in 2 rounds of testing. First, we compared LiVersa's outputs with those of trainees from a previously published knowledge assessment. LiVersa answered all 10 questions correctly. Second, we asked 15 hepatologists to evaluate the outputs of 10 hepatology topic questions generated by LiVersa, OpenAI's ChatGPT 4, and Meta's Large Language Model Meta AI 2. LiVersa's outputs were more accurate but were rated less comprehensive and less safe than those of ChatGPT 4.

Conclusions: In this demonstration, we built disease-specific and protected health information-compliant LLMs using RAG. While LiVersa demonstrated higher accuracy in answering questions related to hepatology, there were some deficiencies due to limitations set by the number of documents used for RAG. LiVersa will likely require further refinement before potential live deployment. The LiVersa prototype, however, is a proof of concept for utilizing RAG to customize LLMs for clinical use cases.
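
The retrieval step that RAG adds in front of an LLM can be illustrated with a minimal sketch. This is not the LiVersa or Versa implementation: the abstract does not describe the embedding model, chunking, or prompt, so the sketch substitutes a simple bag-of-words cosine similarity for a learned text embedding. The passages are placeholders rather than real guidance text, and the names `retrieve` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter

# Placeholder passages standing in for chunks of guidance documents
# (assumption: not real guidance content).
PASSAGES = [
    "Placeholder passage about screening intervals for hepatocellular carcinoma.",
    "Placeholder passage about management of ascites.",
    "Placeholder passage about hepatitis B antiviral therapy.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=1):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages, k=1):
    """Prepend the retrieved passages to the question as context for an LLM."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("How is ascites managed?", PASSAGES))
```

In a production system the similarity search would typically run over dense embeddings in a vector store, and the assembled prompt would be sent to the LLM. The sketch stops at prompt construction because the model call is platform-specific.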

Update of

Ge J, Sun S, Owens J, Galvez V, Gologorskaya O, Lai JC, Pletcher MJ, Lai K. Development of a Liver Disease-Specific Large Language Model Chat Interface using Retrieval Augmented Generation. medRxiv [Preprint]. 2023 Nov 10:2023.11.10.23298364. doi: 10.1101/2023.11.10.23298364. Update in: Hepatology. 2024 Nov 1;80(5):1158-1168. doi: 10.1097/HEP.0000000000000834. PMID: 37986764.
