Patterns (N Y). 2021 Jul 13;2(8):100307.
doi: 10.1016/j.patter.2021.100307. eCollection 2021 Aug 13.

HeTDR: Drug repositioning based on heterogeneous networks and text mining

Shuting Jin et al. Patterns (N Y).

Abstract

Using existing knowledge to predict drug-disease associations is a vital method for drug repositioning. However, effectively fusing biomedical text with biological network information remains one of the great challenges for most current drug repositioning methods. In this study, we propose a drug repositioning method based on heterogeneous networks and text mining (HeTDR). The model combines drug features from multiple drug-related networks and disease features from biomedical corpora with the known drug-disease association network to predict correlation scores between drugs and diseases. Experiments demonstrate that HeTDR outperforms state-of-the-art models. We present the top 10 novel HeTDR-predicted approved drugs for five diseases and show that our model is capable of discovering potential candidate drugs for disease indications.

Keywords: drug repositioning; drug-disease associations; feature representation; heterogeneous networks; text mining.

Conflict of interest statement

The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Flowchart of HeTDR. The model consists of three parts. (A) HeTDR integrates nine drug-related networks to obtain global information about drugs. For each heterogeneous interaction network, we first use the Jaccard similarity coefficient to calculate a similarity network. Then, we fuse these drug-related networks into one network by SNF and apply SAE to obtain low-dimensional features of the drugs. (B) HeTDR obtains vector representations of the disease features by text mining biomedical corpora. In the pre-training stage, we directly use the model parameters pre-trained by BioBERT. Then, we select the relation extraction task for fine-tuning. After fine-tuning, we extract the representations of sub-words and use them to obtain the representations of all diseases. (C) HeTDR predicts potential drug-disease associations by an embedding learning method, which captures both the topological structural proximity and the node attribute proximity of the drug-disease association network.
Figure 2
Performance of HeTDR comparing the different features. (A) ROC curves of prediction results by using different features. (B) PR curves of prediction results by using different attributes. (C) F1 scores of prediction results by using different features.
Figure 3
HeTDR outperforms other state-of-the-art methods for drug-disease association prediction. (A) ROC curves of prediction results obtained by applying HeTDR and five previously reported methods in 5-fold cross-validation. (B) PR curves of prediction results obtained by HeTDR and five previously reported methods in 5-fold cross-validation.
Figure 4
Network visualization of the drug-disease associations predicted by HeTDR. The network shows the top 150 novel drug-disease pairs predicted by the model. Node labels give the ID of the drug (Drugbank_ID) or disease (UMLS_ID). Node size denotes the degree. Edge weight denotes the score that HeTDR predicted for the drug-disease pair. The top 150 novel drug-disease pairs can be found in Table S1. This image was generated by Gephi (https://gephi.org).
Figure 5
The interpretability of HeTDR identifies novel associations. (A) The upper nodes are the top 20 most relevant diseases of C0342882. (B) The upper nodes are the top 20 most relevant diseases of C0026705. The edges are the known drug-disease associations, and a darker edge color indicates that the linked disease ranks higher in the top 20. This image was generated by Gephi (https://gephi.org).
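
The Figure 1 caption states that HeTDR uses the Jaccard similarity coefficient to turn each drug-related interaction network into a similarity network. The following is a minimal sketch of that single step, not the authors' implementation. It assumes each drug is described by the set of entities it interacts with; the names `jaccard`, `similarity_matrix`, and the toy drug and target IDs are hypothetical and chosen only for illustration. The later fusion (SNF) and compression (SAE) steps are not shown.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard coefficient |a & b| / |a | b|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similarity_matrix(neighbors: Dict[str, Set[str]]) -> List[List[float]]:
    """Pairwise Jaccard similarity between drugs, with rows and columns in sorted ID order."""
    ids = sorted(neighbors)
    return [[jaccard(neighbors[i], neighbors[j]) for j in ids] for i in ids]


# Hypothetical toy data: each drug maps to the set of targets it interacts with.
toy = {
    "drug_a": {"t1", "t2"},
    "drug_b": {"t1", "t3"},
    "drug_c": {"t4"},
}
matrix = similarity_matrix(toy)
```

In this toy example, `drug_a` and `drug_b` share one of three distinct targets, so their similarity is 1/3, while `drug_c` shares nothing with either and scores 0. Treating two empty neighbor sets as similarity 0.0 is a choice made for this sketch; the source does not say how that case is handled.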
