Sci Rep. 2024 Jul 18;14(1):16587.
doi: 10.1038/s41598-024-67163-x.

Explainable drug repurposing via path based knowledge graph completion

Ana Jiménez et al. Sci Rep.

Abstract

Drug repurposing aims to find new therapeutic applications for existing drugs in the pharmaceutical market, leading to significant savings in time and cost. The use of artificial intelligence and knowledge graphs to propose repurposing candidates facilitates the process, as large amounts of data can be processed. However, it is important to pay attention to the explainability needed to validate the predictions. We propose a general architecture to understand several explainable methods for graph completion based on knowledge graphs and design our own architecture for drug repurposing. We present XG4Repo (eXplainable Graphs for Repurposing), a framework that takes advantage of the connectivity of any biomedical knowledge graph to link compounds to the diseases they can treat. Our method allows metapaths of different types and lengths, which are automatically generated and optimised based on data. XG4Repo focuses on providing meaningful explanations for the predictions, which are based on paths from compounds to diseases. These paths include nodes such as genes, pathways, side effects, or anatomies, so they provide information about the targets and other characteristics of the biomedical mechanisms that link compounds and diseases. Paths make predictions interpretable for experts, who can validate them and use them in further research on drug repurposing. We also describe three use cases in which we analyse new uses for Epirubicin, Paclitaxel, and Prednisone and present the paths that support the predictions.

Keywords: Drug repurposing; Heterogeneous knowledge graphs; Hetionet; Interpretability; Knowledge graph completion; Rule-based link prediction.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
General architecture to train path-based drug repurposing models. The information is presented in the form of a knowledge graph composed of triples (in yellow). The input of the model is the graph and the query that has to be solved. The model (in blue) generates paths between compounds and different diseases. A score is computed to assess the quality of the paths and diseases that they propose. The model is optimised (green block) so that the ground truth diseases have the highest score.
Figure 2
Description of AnyBURL based on our architecture. In this case, the input of the model is the whole graph (training set), including the ground truth. Paths are sampled using random walks and used to generate rules with a bottom-up approach. The confidence of each rule is then computed, and only rules with a confidence above a threshold are used for prediction. Rules are applied to the graph to obtain predictions, which are ranked by the confidence of the rule.
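
As a rough illustration of the confidence step described in this caption, the following minimal sketch scores a single path rule on a toy graph. It is a simplification under stated assumptions, not the AnyBURL implementation: the triples, the relation names, and the helpers `build_index`, `body_groundings`, and `rule_confidence` are all hypothetical, and the body groundings are enumerated exhaustively instead of being sampled.

```python
from collections import defaultdict

# Toy triples (hypothetical data, not taken from Hetionet).
TRIPLES = [
    ("c1", "binds", "g1"),
    ("c2", "binds", "g1"),
    ("c3", "binds", "g2"),
    ("g1", "associated_with", "d1"),
    ("g2", "associated_with", "d2"),
    ("c1", "treats", "d1"),
    ("c3", "treats", "d2"),
]


def build_index(triples):
    """Map (head, relation) to the set of tails reachable in one step."""
    index = defaultdict(set)
    for head, relation, tail in triples:
        index[(head, relation)].add(tail)
    return index


def body_groundings(triples, index, body):
    """Return every (start, end) pair linked by following `body` relations in order."""
    # Start from every node that has an outgoing edge of the first body relation.
    frontier = {(h, h) for h, r, _ in triples if r == body[0]}
    for relation in body:
        frontier = {
            (start, nxt)
            for start, node in frontier
            for nxt in index.get((node, relation), ())
        }
    return frontier


def rule_confidence(triples, head_relation, body):
    """Fraction of body groundings for which the head triple is also in the graph."""
    index = build_index(triples)
    pairs = body_groundings(triples, index, body)
    if not pairs:
        return 0.0
    support = sum(1 for s, e in pairs if e in index.get((s, head_relation), ()))
    return support / len(pairs)


# Rule: treats(X, Y) <- binds(X, G), associated_with(G, Y)
confidence = rule_confidence(TRIPLES, "treats", ["binds", "associated_with"])
THRESHOLD = 0.5  # hypothetical cut-off; the caption does not give a value
keep_rule = confidence > THRESHOLD
```

In this toy graph the rule body connects three compound-disease pairs, two of which are known "treats" triples, so the rule would pass the hypothetical threshold.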
Figure 3
Description of MINERVA based on our architecture. The core of the algorithm is the policy generator, which is trained to obtain the best policy through the reward using policy search. Paths are sampled from the generator to obtain candidate diseases which are ranked according to the path that proposes them.
Figure 4
Description of RNNLogic based on our architecture. In addition to the graph and the query, there is another input, a set of prior rules used to initialize the generator. The model consists of a rule generator and a reasoning predictor. A set of rules is sampled and used for prediction. During training, the predictor is updated using maximum likelihood estimation (MLE). Combining information from the generation and the prediction, a score for each rule, H(zi), is computed and used during the training of the generator, which is based on expectation maximisation.
Figure 5
Description of the XG4Repo architecture. The first step of the process is to generate a set of prior rules using the AnyBURL rule miner. These rules are processed and used as priors in the generators. The model is trained using only the triples “compound treats disease”, so the computational complexity is reduced. Once the predictions are made, the rules and corresponding scores are stored in natural language, so they can be easily understood. Moreover, our framework can generate Cypher queries to obtain the paths in Hetionet given the rules. This adds interpretability to the predictions without adding extra storage requirements.
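
To make the rule-to-Cypher step in this caption more concrete, here is a minimal sketch of how a metapath could be turned into a parameterised Cypher query. It is not the XG4Repo code. The function `metapath_to_cypher` is hypothetical, and the node labels and relationship types in the example (`Compound`, `UPREGULATES_CuG`, `EXPRESSES_AeG`, `LOCALIZES_DlA`) are assumptions about how Hetionet is laid out in Neo4j that should be checked against the actual database schema.

```python
def metapath_to_cypher(start_label, steps, limit=25):
    """Build a Cypher MATCH query for a metapath.

    `steps` is a list of (relationship_type, direction, node_label) tuples,
    where direction is "out" (edge points away from the previous node)
    or "in" (edge points towards the previous node).
    """
    pattern = f"(n0:{start_label} {{name: $source}})"
    for i, (rel_type, direction, label) in enumerate(steps, start=1):
        # Only the final node is constrained to the target entity.
        props = " {name: $target}" if i == len(steps) else ""
        node = f"(n{i}:{label}{props})"
        if direction == "out":
            pattern += f"-[:{rel_type}]->{node}"
        elif direction == "in":
            pattern += f"<-[:{rel_type}]-{node}"
        else:
            raise ValueError(f"unknown direction: {direction!r}")
    return f"MATCH path = {pattern} RETURN path LIMIT {int(limit)}"


# Metapath: Compound upregulates Gene, expressed by Anatomy, localized to Disease.
# Relationship type names below are assumed, not confirmed by the source.
query = metapath_to_cypher(
    "Compound",
    [
        ("UPREGULATES_CuG", "out", "Gene"),
        ("EXPRESSES_AeG", "in", "Anatomy"),
        ("LOCALIZES_DlA", "in", "Disease"),
    ],
)
params = {"source": "Epirubicin", "target": "breast cancer"}
```

Passing the compound and disease names as query parameters keeps the query text reusable across predictions. Because only the rule and the query are stored, the supporting paths can be retrieved on demand, which matches the caption's point about avoiding extra storage.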
Figure 6
Comparison of the MRR of different models for “compound treats disease” in Hetionet, with 90% confidence intervals. Rule-based models perform better than reinforcement learning. Because the test set is small, the confidence intervals of the rule-based models overlap, so the best-performing one cannot be identified in statistical terms.
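
For readers unfamiliar with the metric in this caption, the sketch below computes the mean reciprocal rank (MRR) from the rank of the correct disease in each test query and attaches a 90% interval. The percentile bootstrap used for the interval is an assumption made for illustration, since the caption does not say how the intervals were obtained. The function names and the example ranks are hypothetical.

```python
import random


def mean_reciprocal_rank(ranks):
    """MRR over 1-based ranks of the correct answer, one rank per test query."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def bootstrap_interval(ranks, level=0.90, n_resamples=2000, seed=0):
    """Percentile bootstrap interval for the MRR (assumed method, see lead-in)."""
    rng = random.Random(seed)
    stats = sorted(
        mean_reciprocal_rank(rng.choices(ranks, k=len(ranks)))
        for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    low = stats[int(alpha * n_resamples)]
    high = stats[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return low, high


# Hypothetical ranks of the true disease for five test queries.
example_ranks = [1, 2, 4, 1, 10]
mrr = mean_reciprocal_rank(example_ranks)
low, high = bootstrap_interval(example_ranks)
```

With only a handful of test queries, the resampled MRR values vary widely and the interval is broad, which is the effect the caption describes for the small test set.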
Figure 7
Set of paths that represent the triple “Epirubicin treats breast cancer” following the metapath [Compound upregulates Gene is expressed by Anatomy is localized to Disease]. The number of nodes has been limited to facilitate visualization.
Figure 8
Set of paths that represent the triple “Epirubicin treats lung cancer” following the metapath [Compound upregulates Gene is upregulated by Compound treats Disease]. The number of nodes has been limited to facilitate visualization.
