Integration of various protein similarities using random forest technique to infer augmented drug-protein matrix for enhancing drug-disease association prediction
- PMID: 35801312
- PMCID: PMC10358641
- DOI: 10.1177/00368504221109215
Abstract
Identifying new therapeutic indications for existing drugs is a major challenge in drug repositioning. Most computational drug repositioning methods focus on known targets. Analyzing multiple aspects of protein associations offers an opportunity to discover underlying drug-associated proteins that can improve the performance of drug repositioning approaches. In this study, machine learning models were developed from the similarities of diverse biological features, including protein interaction, network topology, sequence alignment, and biological function, to predict protein pairs associated with the same drugs. A crucial set of features was identified, and protein pair prediction achieved high performance, with an area under the curve (AUC) of more than 93%. The drug similarity levels of the promising protein pairs, computed from drug chemical structures, were then used to quantify the inferred drug-associated proteins. Furthermore, these proteins were used to build an augmented drug-protein matrix to enhance three existing drug repositioning techniques: similarity constrained matrix factorization for drug-disease associations (SCMFDD), an ensemble meta-paths and singular value decomposition (EMP-SVD) model, and a topology similarity and singular value decomposition (TS-SVD) technique. Compared with the original matrix, the augmented matrix improved performance by up to 4% for SCMFDD and EMP-SVD and by about 1% for TS-SVD. In summary, inferring new protein pairs related to the same drugs increases the opportunity to reveal missing drug-associated proteins that are important for drug development via drug repositioning.
Keywords: Protein-protein interaction network; drug repositioning; drug repurposing; drug-protein association; machine learning.
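The abstract does not spell out how the augmented drug-protein matrix is built, so the following is only a minimal sketch under stated assumptions. It assumes that drug similarity is a Tanimoto (Jaccard) coefficient over sets of structural fingerprint bits, and that a predicted protein pair (a, b) lets a drug known for protein a be linked to protein b with a weight equal to its highest similarity to any drug already known for b. The function names, the weighting rule, and the toy data are all hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: the weighting rule, names, and toy data are assumptions,
# not the authors' published procedure.

def tanimoto(bits_x, bits_y):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = bits_x | bits_y
    return len(bits_x & bits_y) / len(union) if union else 0.0


def augment_drug_protein(known, predicted_pairs, drug_sim):
    """Return {(drug, protein): score} with known and inferred associations.

    known           -- dict: drug -> set of proteins with a known association
    predicted_pairs -- iterable of (protein_a, protein_b) predicted to share drugs
    drug_sim        -- callable(drug_x, drug_y) -> similarity in [0, 1]
    """
    # Known associations keep a score of 1.0.
    scores = {(d, p): 1.0 for d, prots in known.items() for p in prots}

    # Invert the mapping: protein -> set of drugs known to associate with it.
    drugs_of = {}
    for d, prots in known.items():
        for p in prots:
            drugs_of.setdefault(p, set()).add(d)

    for a, b in predicted_pairs:
        # Transfer drugs in both directions across the predicted pair.
        for src, dst in ((a, b), (b, a)):
            for d in drugs_of.get(src, ()):
                if dst in known[d]:
                    continue  # already a known association
                # Assumed rule: weight by the most similar drug known for dst.
                weight = max(
                    (drug_sim(d, other) for other in drugs_of.get(dst, ())),
                    default=0.0,
                )
                if weight > 0.0:
                    key = (d, dst)
                    scores[key] = max(scores.get(key, 0.0), weight)
    return scores


# Toy data (invented for illustration only).
fingerprints = {"D1": {1, 2, 3, 4}, "D2": {2, 3, 4, 5}, "D3": {7, 8}}
known = {"D1": {"P1"}, "D2": {"P2"}, "D3": {"P3"}}
predicted_pairs = [("P1", "P2")]

scores = augment_drug_protein(
    known,
    predicted_pairs,
    lambda x, y: tanimoto(fingerprints[x], fingerprints[y]),
)
for (drug, protein), score in sorted(scores.items()):
    print(drug, protein, round(score, 3))
```

In this toy case the predicted pair (P1, P2) adds two inferred entries, D1-P2 and D2-P1, while D3 is untouched because P3 appears in no predicted pair. A real pipeline would feed the resulting scores into the drug-protein input of a downstream repositioning model rather than printing them.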
Conflict of interest statement
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
