Empowering the discovery of novel target-disease associations via machine learning approaches in the open targets platform
- PMID: 35710324
- PMCID: PMC9202116
- DOI: 10.1186/s12859-022-04753-4
Abstract
Background: The Open Targets (OT) Platform integrates a wide range of data sources on target-disease associations to facilitate the identification of potential therapeutic drug targets for human diseases. However, because targets are often functionally pleiotropic and efficacious for multiple indications, identifying novel target-indication associations remains challenging. In particular, there is a persistent need for advanced computational methods that integrate new target-disease association evidence with biological knowledge bases. Such methods promise to increase the power to identify the most promising target-disease pairs for therapeutic development. Here we introduce an approach that integrates additional target-disease features with machine learning models to uncover further druggable target-disease indications.
Results: We derived novel target-disease associations as supplemental features to OT platform-based associations using three data sources: (1) target tissue specificity from GTEx expression profiles; (2) target semantic similarities based on Gene Ontology; and (3) functional interactions among targets, captured by embedding targets from protein-protein interaction (PPI) networks. Machine learning models were applied to evaluate feature importance and to benchmark performance in predicting targets with known drug indications. The evaluation shows that the newly integrated features have higher importance than the current OT features. These features also outperform the association benchmarks and may support the discovery of novel therapeutic indications for highly pursued targets.
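As an illustration of how a tissue-specificity feature could be derived from per-tissue expression profiles, the sketch below computes the tau index, a commonly used specificity score that ranges from 0 (uniform expression) to 1 (expression in a single tissue). The abstract does not state which specificity measure the authors used, so the choice of tau is an assumption, and the gene names and expression values are made up for illustration.

```python
from typing import Dict, Sequence


def tau_specificity(expression: Sequence[float]) -> float:
    """Tau index: 0 = uniformly expressed, 1 = expressed in a single tissue.

    Assumes non-negative expression values, one per tissue.
    """
    if len(expression) < 2:
        raise ValueError("need expression values for at least two tissues")
    peak = max(expression)
    if peak <= 0:
        return 0.0  # no detectable expression anywhere; treat as non-specific
    # Average shortfall of each tissue relative to the peak tissue.
    return sum(1.0 - x / peak for x in expression) / (len(expression) - 1)


# Hypothetical per-tissue expression values (e.g. median TPM); not real GTEx data.
profiles: Dict[str, Sequence[float]] = {
    "GENE_A": [50.0, 50.0, 50.0, 50.0],  # same level in every tissue
    "GENE_B": [0.0, 0.0, 120.0, 0.0],    # expressed in one tissue only
    "GENE_C": [10.0, 40.0, 5.0, 20.0],   # intermediate pattern
}

scores = {gene: tau_specificity(values) for gene, values in profiles.items()}
```

A score like this yields one value per target, which could then be joined to the target-disease table as an extra feature column alongside the existing association scores.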
Conclusion: Our newly generated features capture additional biological relatedness among targets and diseases and improve the performance of advanced machine learning models in predicting novel indications for drug targets. The proposed methodology enables systematic evaluation of drug targets for novel indications.
Keywords: Data Integration; Drug discovery; Drug repurposing; Feature engineering; Machine learning; Open targets; Target indication expansion; XGBoost.
© 2022. The Author(s).
Conflict of interest statement
YH, KK, DKR, CZ, and ET are employees of Sanofi and may hold shares and/or stock options in the company.