Explainable drug repurposing via path based knowledge graph completion

doi:10.1038/s41598-024-67163-x

Sci Rep. 2024 Jul 18;14(1):16587.

Ana Jiménez^#¹, María José Merino^#¹, Juan Parras², Santiago Zazo¹

Affiliations

¹ Information Processing and Telecommunications Center, Universidad Politécnica de Madrid, ETSI Telecomunicación, Avda. Complutense, 30, 28040, Madrid, Spain.
² Information Processing and Telecommunications Center, Universidad Politécnica de Madrid, ETSI Telecomunicación, Avda. Complutense, 30, 28040, Madrid, Spain. j.parras@upm.es.

^# Contributed equally.

PMID: 39025897
PMCID: PMC11258358
DOI: 10.1038/s41598-024-67163-x

Ana Jiménez et al. Sci Rep. 2024.

Abstract

Drug repurposing aims to find new therapeutic applications for drugs already on the pharmaceutical market, leading to significant savings in time and cost. Using artificial intelligence and knowledge graphs to propose repurposing candidates facilitates the process, as large amounts of data can be processed. However, the predictions must be explainable so that they can be validated. We propose a general architecture to understand several explainable methods for graph completion based on knowledge graphs and design our own architecture for drug repurposing. We present XG4Repo (eXplainable Graphs for Repurposing), a framework that takes advantage of the connectivity of any biomedical knowledge graph to link compounds to the diseases they can treat. Our method allows metapaths of different types and lengths, which are automatically generated and optimised based on data. XG4Repo focuses on providing meaningful explanations for its predictions, which are based on paths from compounds to diseases. These paths include nodes such as genes, pathways, side effects, or anatomies, so they provide information about the targets and other characteristics of the biomedical mechanisms that link compounds and diseases. Paths make predictions interpretable for experts, who can validate them and use them in further research on drug repurposing. We also describe three use cases in which we analyse new uses for Epirubicin, Paclitaxel, and Prednisone and present the paths that support the predictions.

Keywords: Drug repurposing; Heterogeneous knowledge graphs; Hetionet; Interpretability; Knowledge graph completion; Rule-based link prediction.

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
General architecture to train path-based drug repurposing models. The information is presented in the form of a knowledge graph composed of triples (in yellow). The input of the model is the graph and the query that has to be solved. The model (in blue) generates paths between compounds and different diseases. A score is computed to assess the quality of the paths and diseases that they propose. The model is optimised (green block) so that the ground truth diseases have the highest score.

**Figure 2**
Description of AnyBURL based on our architecture. In this case, the input of the model is the whole graph (training set) including the ground truth. Paths are sampled based on random walk, and they are used to generate rules using a bottom-up approach. Then the confidence of the rule is computed, so only the rules with a confidence higher than a threshold are used for prediction. Rules are applied to the graph to obtain predictions which are ranked using the confidence of the rule.
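The confidence step in this caption can be illustrated with a minimal sketch. It assumes a toy graph of hand-made triples and a single path rule; the triples, the relation names, and the `rule_confidence` helper are hypothetical. The ratio used here (body groundings whose head triple is also in the graph, divided by all body groundings) is a common simplification and is not claimed to be AnyBURL's exact sampled estimate.

```python
from collections import defaultdict

# Toy triples, invented for illustration only (not taken from Hetionet).
TRIPLES = [
    ("c1", "binds", "g1"),
    ("c2", "binds", "g1"),
    ("c3", "binds", "g2"),
    ("g1", "associates", "d1"),
    ("g2", "associates", "d2"),
    ("c1", "treats", "d1"),
    ("c3", "treats", "d2"),
]


def build_index(triples):
    """Map (head, relation) to the set of tails reachable in one hop."""
    index = defaultdict(set)
    for head, relation, tail in triples:
        index[(head, relation)].add(tail)
    return index


def body_groundings(index, entities, body):
    """Return every (start, end) pair connected by the relation sequence `body`."""
    pairs = set()
    for start in entities:
        frontier = {start}
        for relation in body:
            frontier = {t for node in frontier for t in index.get((node, relation), ())}
        pairs.update((start, end) for end in frontier)
    return pairs


def rule_confidence(triples, head_relation, body):
    """Confidence of the rule head_relation(X, Y) <- body path from X to Y."""
    index = build_index(triples)
    entities = {h for h, _, _ in triples} | {t for _, _, t in triples}
    pairs = body_groundings(index, entities, body)
    if not pairs:
        return 0.0, pairs
    correct = sum(1 for s, e in pairs if e in index.get((s, head_relation), ()))
    return correct / len(pairs), pairs


# Rule: treats(X, Y) <- binds(X, G), associates(G, Y)
confidence, pairs = rule_confidence(TRIPLES, "treats", ["binds", "associates"])

THRESHOLD = 0.5  # assumed cut-off; the caption does not state a value
known = {(h, t) for h, r, t in TRIPLES if r == "treats"}
# Keep the rule only if it is confident enough, then propose unseen pairs.
new_predictions = sorted(pairs - known) if confidence > THRESHOLD else []
```

In this toy graph the rule body has three groundings, two of which are already known `treats` triples, so the rule passes the assumed threshold and proposes the remaining pair as a candidate.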

**Figure 3**
Description of MINERVA based on our architecture. The core of the algorithm is the policy generator, which is trained to obtain the best policy through the reward using policy search. Paths are sampled from the generator to obtain candidate diseases which are ranked according to the path that proposes them.

**Figure 4**
Description of RNNLogic based on our architecture. In addition to the graph and the query, there is another input: a set of prior rules used to initialize the generator. The model consists of a rule generator and a reasoning predictor. A set of rules is sampled and used for prediction. During training, the predictor is updated using maximum likelihood estimation (MLE). Combining information from generation and prediction, a score $H(z_{i})$ is computed for each rule and used during the training of the generator, which is based on expectation maximisation.

**Figure 5**
Description of the XG4Repo architecture. The first step of the process is to generate a set of prior rules using the AnyBURL rule miner. These rules are processed and used as priors in the generators. The model is trained using only the triples “compound treats disease”, so the computational complexity is reduced. Once the predictions are made, the rules and their corresponding scores are stored in natural language, so they can be easily understood. Moreover, our framework can generate Cypher queries to obtain the paths in Hetionet given the rules. This adds interpretability to the predictions without adding extra storage requirements.
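As a rough illustration of turning a rule into a Cypher query, the sketch below builds a parameterised `MATCH` pattern from a metapath. The `metapath_to_cypher` helper is hypothetical and is not the framework's own code. The node labels and relationship types in the example are assumptions modelled on Hetionet-style naming and should be checked against the target database.

```python
import re

# Only plain identifiers are interpolated into the query text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(identifier):
    """Reject labels or relationship types that are not plain identifiers."""
    if not _IDENTIFIER.match(identifier):
        raise ValueError(f"unsafe Cypher identifier: {identifier!r}")
    return identifier


def metapath_to_cypher(start_label, steps, limit=25):
    """Build a Cypher query that returns paths following one metapath.

    steps is a list of (relationship_type, direction, node_label) tuples,
    where direction is "out" for -[:R]-> and "in" for <-[:R]-.
    The compound name is passed as the query parameter $compound.
    """
    pattern = f"(n0:{_check(start_label)} {{name: $compound}})"
    for i, (rel_type, direction, label) in enumerate(steps, start=1):
        rel = f"[:{_check(rel_type)}]"
        if direction == "out":
            edge = f"-{rel}->"
        elif direction == "in":
            edge = f"<-{rel}-"
        else:
            raise ValueError(f"direction must be 'out' or 'in', got {direction!r}")
        pattern += f"{edge}(n{i}:{_check(label)})"
    return f"MATCH path = {pattern} RETURN path LIMIT {int(limit)}"


# Metapath: Compound upregulates Gene, expressed by Anatomy, localized to Disease.
# Relationship type names below are assumed, not confirmed by the source.
query = metapath_to_cypher(
    "Compound",
    [
        ("UPREGULATES_CuG", "out", "Gene"),
        ("EXPRESSES_AeG", "in", "Anatomy"),
        ("LOCALIZES_DlA", "in", "Disease"),
    ],
)
print(query)
```

Generating the query on demand, instead of storing every matching path, is one way to read the caption's point about avoiding extra storage: only the rule is kept, and the paths are recovered from the graph database when needed.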

**Figure 6**
Comparison of the mean reciprocal rank (MRR) of different models for “compound treats disease” in Hetionet. The 90% confidence intervals are included. Rule-based models work better than reinforcement learning. Due to the small test set, the confidence intervals for rule-based models overlap, so it is not possible to identify the best-performing one in statistical terms.
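The metric in this caption can be sketched as follows. The MRR is the mean of the reciprocal ranks of the correct answers. The caption does not say how the 90% intervals were obtained, so the percentile bootstrap used here is an assumption, and the ranks are invented for illustration.

```python
import random


def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over all test queries (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def bootstrap_ci(ranks, level=0.90, n_resamples=2000, seed=0):
    """Percentile bootstrap interval for the MRR (assumed method)."""
    rng = random.Random(seed)
    stats = sorted(
        mean_reciprocal_rank(rng.choices(ranks, k=len(ranks)))
        for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = stats[int(alpha * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return lower, upper


# Hypothetical ranks of the true disease for eight test queries.
ranks = [1, 2, 1, 5, 3, 1, 10, 2]
mrr = mean_reciprocal_rank(ranks)
low, high = bootstrap_ci(ranks)
print(f"MRR = {mrr:.3f}, 90% CI = [{low:.3f}, {high:.3f}]")
```

With so few test queries the interval is wide, which mirrors the caption's remark that a small test set makes the intervals of similar models overlap.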

**Figure 7**
Set of paths that represent the triple “Epirubicin treats breast cancer” following the metapath [Compound $\overset{\text{upregulates}}{\longrightarrow}$ Gene $\overset{\text{is expressed by}}{\longrightarrow}$ Anatomy $\overset{\text{is localized to}}{\longrightarrow}$ Disease]. The number of nodes has been limited to facilitate visualization.

**Figure 8**
Set of paths that represent the triple “Epirubicin treats lung cancer” following the metapath [Compound $\overset{\text{upregulates}}{\longrightarrow}$ Gene $\overset{\text{is upregulated by}}{\longrightarrow}$ Compound $\overset{\text{treats}}{\longrightarrow}$ Disease]. The number of nodes has been limited to facilitate visualization.

Cited by

Bind: large-scale biological interaction network discovery through knowledge graph-driven machine learning.
Aamer N, Asim MN, Bhatti AI, Dengel A. J Transl Med. 2025 Jul 31;23(1):856. doi: 10.1186/s12967-025-06789-5. PMID: 40745316. Free PMC article.
Universal multilayer network embedding reveals a causal link between GABA neurotransmitter and cancer.
Pio-Lopez L, Levin M. BMC Bioinformatics. 2025 Jun 2;26(1):149. doi: 10.1186/s12859-025-06158-5. PMID: 40457205. Free PMC article.
