Exp Biol Med (Maywood). 2024 Dec 3:249:10393.
doi: 10.3389/ebm.2024.10393. eCollection 2024.

Enhancing pharmacogenomic data accessibility and drug safety with large language models: a case study with Llama3.1

Dan Li et al. Exp Biol Med (Maywood).

Abstract

Pharmacogenomics (PGx) holds the promise of personalizing medical treatments based on individual genetic profiles, thereby enhancing drug efficacy and safety. However, PGx research is currently hindered by fragmented data sources, time-consuming manual data extraction, and the need for comprehensive, up-to-date information. This study addresses these challenges by evaluating the ability of Large Language Models (LLMs), specifically Llama3.1-70B, to automate and improve the accuracy of PGx information extraction from the FDA Table of Pharmacogenomic Biomarkers in Drug Labeling (FDA PGx Biomarker table), which is well structured, with drug names, biomarkers, therapeutic areas, and related labeling texts. Our primary goal was to test the feasibility of using LLMs to streamline PGx data extraction as an alternative to traditional, labor-intensive approaches. Llama3.1-70B achieved 91.4% accuracy in identifying drug-biomarker pairs from single labeling texts and 82% from mixed texts, with over 85% consistency between the PGx categories extracted from the FDA PGx Biomarker table and those extracted from relevant scientific abstracts, demonstrating its effectiveness for PGx data extraction. By integrating data from diverse sources, including scientific abstracts, this approach can help pharmacologists, regulatory bodies, and healthcare researchers update PGx resources more efficiently, making critical information more accessible for applications in personalized medicine. In addition, this approach shows potential for discovering novel PGx information, particularly for underrepresented minority ethnic groups. This study highlights the ability of LLMs to enhance the efficiency and completeness of PGx research, thus laying a foundation for advancements in personalized medicine by ensuring that drug therapies are tailored to the genetic profiles of diverse populations.
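
As an illustration of how an accuracy figure for drug-biomarker pair identification could be computed, the following is a minimal sketch, not the authors' evaluation code. It assumes accuracy means the fraction of listed pairs that the model also returned, with case-insensitive matching. The function name `pair_accuracy` and the toy data are hypothetical.

```python
def pair_accuracy(listed, extracted):
    """Fraction of listed (drug, biomarker) pairs also found in the extracted set.

    Assumption: matching ignores case and surrounding whitespace.
    """
    def norm(pair):
        drug, biomarker = pair
        return (drug.strip().lower(), biomarker.strip().upper())

    listed_set = {norm(p) for p in listed}
    extracted_set = {norm(p) for p in extracted}
    if not listed_set:
        raise ValueError("listed must contain at least one pair")
    return len(listed_set & extracted_set) / len(listed_set)


# Toy data for illustration only; not taken from the study.
listed = [
    ("abacavir", "HLA-B"),
    ("clopidogrel", "CYP2C19"),
    ("warfarin", "CYP2C9"),
    ("warfarin", "VKORC1"),
]
extracted = [
    ("Abacavir", "hla-b"),
    ("clopidogrel", "CYP2C19"),
    ("warfarin", "CYP2C9"),
]

accuracy = pair_accuracy(listed, extracted)
print(f"{accuracy:.1%}")  # 75.0%
```

In this toy case, three of the four listed pairs are recovered, so the sketch reports 75.0%.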

Keywords: LLMs; biomarker; large language models; minority ethnic groups; pharmacogenomics.

Conflict of interest statement

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

FIGURE 1
(A) The frequency of each Therapeutic Area in the FDA PGx Biomarker table. The majority of the records were related to Oncology. (B) The percentage of listed drug-biomarker pairs identified correctly by the model in structured and mixed texts, respectively. (C) The number and partition of the drug-biomarker identification results in structured texts.
FIGURE 2
PGx categories summarized from the FDA PGx Biomarker table and relevant scientific abstracts. (A) The frequency of predefined PGx categories summarized by Llama3.1-70B for the 178 ethnic records from the FDA PGx Biomarker table. (B) The concordance rate of PGx categories between the FDA PGx Biomarker table and abstracts. The highest rate based on a single abstract and the rate based on an aggregated abstract set were compared. (C) A comparison of the highest and the aggregated concordance rate for each individual record.
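
The comparison in panels (B) and (C) between the highest single-abstract rate and the aggregated rate can be sketched as follows. This is a hedged illustration only: the caption does not define the concordance rate, so the sketch assumes it is the share of a record's table-derived categories that also appear among the abstract-derived categories. The function names and category labels are hypothetical.

```python
def concordance(table_cats, abstract_cats):
    """Share of table-derived categories also present in abstract-derived ones.

    Assumption: this definition is illustrative, not the paper's.
    """
    table_cats = set(table_cats)
    if not table_cats:
        raise ValueError("table_cats must not be empty")
    return len(table_cats & set(abstract_cats)) / len(table_cats)


def highest_and_aggregated(table_cats, per_abstract_cats):
    """Return (best single-abstract rate, rate over all abstracts pooled)."""
    highest = max(
        (concordance(table_cats, cats) for cats in per_abstract_cats),
        default=0.0,
    )
    pooled = set().union(*per_abstract_cats)  # union of every abstract's categories
    return highest, concordance(table_cats, pooled)


# Hypothetical category labels for one record and three abstracts.
table_cats = {"efficacy", "toxicity", "dosing"}
per_abstract_cats = [{"efficacy"}, {"toxicity", "dosing"}, {"metabolism"}]

highest, aggregated = highest_and_aggregated(table_cats, per_abstract_cats)
print(f"highest={highest:.2f} aggregated={aggregated:.2f}")
# highest=0.67 aggregated=1.00
```

Under this definition, pooling abstracts can only match or exceed the best single abstract, because the pooled category set contains every individual set.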
