Front Pharmacol. 2025 Jul 31:16:1617142.
doi: 10.3389/fphar.2025.1617142. eCollection 2025.

PolyLLM: polypharmacy side effect prediction via LLM-based SMILES encodings

Sadra Hakim et al. Front Pharmacol.

Abstract

Polypharmacy, the concurrent use of multiple drugs, is a common approach to treating patients with complex diseases or multiple conditions. Although combining drugs can be beneficial in some cases, it can also lead to unintended drug-drug interactions (DDIs) and increase the risk of adverse side effects. Predicting these side effects with state-of-the-art models such as Large Language Models (LLMs) can greatly assist clinicians. In this study, we assess the impact of using different LLMs to predict polypharmacy side effects. First, the chemical structure of each drug, given as a SMILES string, is vectorized using several LLMs such as ChemBERTa and GPT, and the resulting embeddings are combined into a single representation for each drug pair. The drug-pair representation is then fed into two separate models, a Multilayer Perceptron (MLP) and a Graph Neural Network (GNN), to predict the side effects. Our experimental evaluations show that integrating DeepChem ChemBERTa embeddings with the GNN architecture yields more effective results than the other methods evaluated. Additionally, we demonstrate that using LLMs to predict polypharmacy side effects from the chemical structures of drugs alone can be highly effective, even without incorporating other entities such as proteins or cell lines, which is particularly advantageous when these entities are not available.
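
The step of combining two per-drug embeddings into one drug-pair representation can be illustrated with a minimal sketch. The abstract does not say which fusion operations were used, so the three shown here (concatenation, element-wise sum, element-wise product) are assumptions chosen for illustration. The function names and the toy 3-dimensional vectors are hypothetical stand-ins for real LLM encodings of SMILES strings.

```python
from typing import Callable, Dict, List

Vector = List[float]


def fuse_concat(a: Vector, b: Vector) -> Vector:
    """Concatenate the two embeddings; the result depends on drug order."""
    return a + b


def fuse_sum(a: Vector, b: Vector) -> Vector:
    """Element-wise sum; the result is the same for (a, b) and (b, a)."""
    return [x + y for x, y in zip(a, b)]


def fuse_product(a: Vector, b: Vector) -> Vector:
    """Element-wise product; also independent of drug order."""
    return [x * y for x, y in zip(a, b)]


# Hypothetical registry of fusion strategies (not taken from the paper).
FUSIONS: Dict[str, Callable[[Vector, Vector], Vector]] = {
    "concat": fuse_concat,
    "sum": fuse_sum,
    "product": fuse_product,
}


def drug_pair_representation(emb_a: Vector, emb_b: Vector,
                             strategy: str = "concat") -> Vector:
    """Combine two per-drug embeddings into one drug-pair vector."""
    if len(emb_a) != len(emb_b):
        raise ValueError("both embeddings must have the same dimension")
    return FUSIONS[strategy](emb_a, emb_b)


# Toy embeddings standing in for LLM encodings of two SMILES strings.
emb_drug_a = [0.5, -1.0, 2.0]
emb_drug_b = [1.5, 0.25, -2.0]

pair_concat = drug_pair_representation(emb_drug_a, emb_drug_b, "concat")
pair_sum = drug_pair_representation(emb_drug_a, emb_drug_b, "sum")
pair_product = drug_pair_representation(emb_drug_a, emb_drug_b, "product")
```

One design point this makes visible: concatenation doubles the input dimension and is sensitive to the order of the two drugs, whereas the sum and product give the same vector regardless of order, which may matter because a drug pair has no natural ordering.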

Keywords: drug combination; graph neural networks; large language models; polypharmacy side effect; SMILES.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1. (a) Distribution of SMILES string lengths for 645 drugs in the dataset. (b) Number of side effects associated with each drug pair in the reformatted dataset.
FIGURE 2. Pipeline of PolyLLM. (a) Drug-pair representations created by combining encoded drug features using language models. (b) MLP approach: a Multilayer Perceptron predicts side effects from drug-pair representations. (c) Graph approach: a bipartite graph is constructed with drug-pair nodes and encoded side-effect features to predict whether a drug pair is associated with a side effect, framed as a binary link prediction task.
FIGURE 3. Overview of (a) the Multilayer Perceptron and (b) the Graph Neural Network approaches for predicting drug-pair side effects.
FIGURE 4. AUC scores for various fusion strategies across multiple encoders.
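
The graph approach described in the Figure 2 caption, a bipartite graph between drug-pair nodes and side-effect nodes treated as binary link prediction, can be sketched as follows. This is only an illustration of how such a task might be set up, not the authors' implementation: the helper names, the toy drug and side-effect labels, and the use of random negative sampling are all assumptions.

```python
import random
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]  # (drug-pair node index, side-effect node index)


def build_bipartite_edges(
    pair_side_effects: Dict[Tuple[str, str], Set[str]],
) -> Tuple[Dict[Tuple[str, str], int], Dict[str, int], List[Edge]]:
    """Index both node sets and list the observed (positive) links."""
    pairs = sorted(pair_side_effects)
    effects = sorted({e for s in pair_side_effects.values() for e in s})
    pair_index = {p: i for i, p in enumerate(pairs)}
    effect_index = {e: j for j, e in enumerate(effects)}
    positive = [
        (pair_index[p], effect_index[e])
        for p in pairs
        for e in sorted(pair_side_effects[p])
    ]
    return pair_index, effect_index, positive


def sample_negative_edges(n_pairs: int, n_effects: int,
                          positive: List[Edge], k: int,
                          seed: int = 0) -> List[Edge]:
    """Draw up to k unobserved links to serve as negative examples.

    Random negative sampling is an assumption made for this sketch.
    """
    known = set(positive)
    candidates = [
        (i, j)
        for i in range(n_pairs)
        for j in range(n_effects)
        if (i, j) not in known
    ]
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))


# Toy data with hypothetical drug and side-effect labels.
toy = {
    ("drug_a", "drug_b"): {"nausea", "headache"},
    ("drug_a", "drug_c"): {"nausea"},
}
pair_index, effect_index, positive_edges = build_bipartite_edges(toy)
negative_edges = sample_negative_edges(
    len(pair_index), len(effect_index), positive_edges, k=1
)
```

A link-prediction model would then be trained to score the positive edges above the negative ones, using the drug-pair and side-effect features as node attributes.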
