Bioinformatics. 2018 Sep 15;34(18):3151-3159.

doi: 10.1093/bioinformatics/bty325.

EMUDRA: Ensemble of Multiple Drug Repositioning Approaches to improve prediction accuracy

Xianxiao Zhou^{1,2}, Minghui Wang^{1,2}, Igor Katsyv^{1,2,3}, Hanna Irie^{4,5}, Bin Zhang^{1,2,5}

Affiliations

¹ Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
² Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
³ Medical Scientist Training Program, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
⁴ Division of Hematology and Medical Oncology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
⁵ Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

PMID: 29688306
PMCID: PMC6138000
DOI: 10.1093/bioinformatics/bty325

Xianxiao Zhou et al. Bioinformatics. 2018.

Abstract

Motivation: The availability of large-scale genomic, epigenetic and proteomic data in complex diseases makes it possible to objectively and comprehensively identify therapeutic targets that can lead to new therapies. The Connectivity Map has been widely used to explore novel indications for existing drugs. However, the prediction accuracy of existing methods, such as the Kolmogorov-Smirnov statistic, remains low. Here we present a novel high-performance drug repositioning approach that improves on state-of-the-art methods.

Results: We first designed an expression weighted cosine (EWCos) method to minimize the influence of uninformative expression changes, and then developed an ensemble approach, termed ensemble of multiple drug repositioning approaches (EMUDRA), to integrate EWCos and three existing state-of-the-art methods. EMUDRA significantly outperformed individual drug repositioning methods when applied to simulated and independent evaluation datasets. Using EMUDRA, we predicted and then experimentally validated the antibiotic rifabutin as an inhibitor of cell growth in triple-negative breast cancer. EMUDRA can identify drugs that more effectively target disease gene signatures and will thus be a useful tool for identifying novel therapies for complex diseases and predicting new indications for existing drugs.

Availability and implementation: The EMUDRA R package is available at doi: 10.7303/syn11510888.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

**Fig. 1.**
**Workflows of EWCos (A) and EMUDRA (B).** (A) To adjust for lowly expressed genes, a logistic function was used to weight drug-induced expression changes. First, weight matrices were calculated for the parameters in the function. Next, for each instance, drug-induced signatures identified from replicates were used to optimize the parameters. Finally, weighted fold changes were used to calculate EWCos scores for a given query signature. (B) Matching scores from EWCos, Cosine, XCor and XSpe were normalized and combined to obtain an ensemble score used to rank-order drugs. GO enrichment analysis was performed on the signature gene sets reversed by the top drugs.
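
The two workflows in Fig. 1 can be illustrated with a minimal sketch. It is not the EMUDRA R package: the function names, the logistic parameters `midpoint` and `steepness`, and the choice of z-score normalization followed by a plain sum are all assumptions made for illustration, and the parameter optimization from replicates described in (A) is omitted.

```python
import math

def logistic_weight(expr, midpoint, steepness):
    """Weight in (0, 1) that grows with expression level (assumed functional form)."""
    return 1.0 / (1.0 + math.exp(-steepness * (expr - midpoint)))

def weighted_cosine(query, drug_fc, weights):
    """Cosine similarity between a query signature and weighted drug fold changes."""
    weighted = [w * fc for w, fc in zip(weights, drug_fc)]
    dot = sum(q * x for q, x in zip(query, weighted))
    norm_q = math.sqrt(sum(q * q for q in query))
    norm_x = math.sqrt(sum(x * x for x in weighted))
    if norm_q == 0.0 or norm_x == 0.0:
        return 0.0
    return dot / (norm_q * norm_x)

def zscores(values):
    """Standardize one method's scores so that methods share a common scale."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if sd == 0.0:
        return [0.0] * len(values)
    return [(v - mean) / sd for v in values]

def ensemble_scores(score_table):
    """Sum normalized scores per drug; score_table maps method name -> scores."""
    normalized = [zscores(scores) for scores in score_table.values()]
    return [sum(per_drug) for per_drug in zip(*normalized)]

# Toy data (hypothetical): three genes, the third one lowly expressed.
query = [1.0, -1.0, 0.5]      # disease signature
drug_fc = [-2.0, 1.5, 3.0]    # drug-induced fold changes
expression = [8.0, 7.5, 1.0]  # baseline expression levels
weights = [logistic_weight(e, midpoint=4.0, steepness=1.0) for e in expression]

plain = weighted_cosine(query, drug_fc, [1.0, 1.0, 1.0])
weighted = weighted_cosine(query, drug_fc, weights)
```

In this toy case the large fold change of the lowly expressed third gene is down-weighted, so the weighted score is more negative (a stronger reversal of the query signature) than the unweighted cosine.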

**Fig. 2.**
**Evaluation of EWCos, EMUDRA and the existing methods based on simulation studies.** (A) For each instance, a drug-induced gene signature was identified from the treatment and its corresponding controls and was used to query the CMap data with each method. Instances treated with the same drug as the query signature were considered positive cases and all other instances negative cases. Performance was evaluated by ROC curves and pAUC at a false positive rate (FPR) of 0.01. (B) Performance for simulated data with random noise from a uniform distribution.
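
As an illustration of the evaluation metric, the following sketch computes a partial AUC up to a false positive rate cutoff from ranked matching scores. It is a simplified stand-in, not the authors' evaluation code: `partial_auc` is a hypothetical name, ties between scores are not treated specially, and the raw area is returned without rescaling (so its maximum equals `max_fpr`).

```python
def partial_auc(scores, labels, max_fpr=0.01):
    """Area under the ROC staircase restricted to FPR <= max_fpr (unnormalized)."""
    n_pos = sum(1 for label in labels if label)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")

    # Rank instances from the strongest match to the weakest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = 0
    fp = 0
    area = 0.0
    for i in order:
        if labels[i]:
            tp += 1  # a positive moves the curve up; no area is added
            continue
        prev_fpr = fp / n_neg
        fp += 1
        new_fpr = min(fp / n_neg, max_fpr)  # clip the step at the cutoff
        area += (new_fpr - prev_fpr) * (tp / n_pos)
        if fp / n_neg >= max_fpr:
            break
    return area

# Toy ranking (hypothetical): two positives ranked above three negatives.
scores = [0.9, 0.8, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 0, 0]
pauc = partial_auc(scores, labels, max_fpr=0.01)
```

Setting `max_fpr=1.0` recovers the ordinary AUC, which makes the function easy to sanity-check against a full ROC analysis.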

**Fig. 3.**
**Performance of EMUDRA, EWCos and the existing drug repositioning approaches based on positive controls determined by ATC codes and the LINCS dataset.** (A) ROC curves and pAUC for the prediction of the 1864 drug pairs sharing at least one ATC code. These drug pairs were taken as positive cases and the remaining drug pairs as negative cases. ROC curves and pAUC were generated with FPR < 0.01. (B) Performance for predicting the drug pairs sharing at least two ATC codes. (C) Performance for predicting positive control drugs from the LINCS data. Twenty-four cell-line-specific drug signatures identified from the LINCS data were used to query the instances in CMap with nine approaches. The instances in CMap with the same drug and cell line as those in a given LINCS signature were set as positive cases, while all other instances were taken as negative cases for prediction.

**Fig. 4.**
**Performance comparison of all possible combinations of the non-ensemble methods.** (A) AUCs of the 255 possible combinations of the 8 non-ensemble methods based on the simulation data with noise. The numbers in the legend give the number of methods combined. (B)–(D) The ensemble rate of individual methods in the simulation, ATC and LINCS datasets. All 247 ensemble and 8 non-ensemble methods were rank-ordered by AUC.
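
The counts in this caption follow from enumerating the non-empty subsets of 8 methods: 2^8 − 1 = 255 combinations, of which 8 are single methods and 247 combine two or more. A short check, using placeholder method names because not all eight methods are named here:

```python
from itertools import combinations

# Placeholder names; only the number of methods matters for the count.
methods = [f"method_{i}" for i in range(1, 9)]

# Every non-empty subset of the 8 methods.
subsets = [
    combo
    for size in range(1, len(methods) + 1)
    for combo in combinations(methods, size)
]

# Ensembles are subsets that combine at least two methods.
ensembles = [combo for combo in subsets if len(combo) >= 2]
singles = [combo for combo in subsets if len(combo) == 1]
```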

**Fig. 5.**
**Rifabutin dose-dependently inhibits growth of TNBC cells in 3D culture.** (A) MDA-MB-231 cells were grown in 3D Matrigel™ and treated every 24–48 h with DMSO or 1, 4.8 or 25 μM rifabutin. Representative fields (5×) are shown. (B) At least four fields each from at least three independent experiments were used for statistical analysis. Error bars represent standard error. Statistical significance of the difference in the proportion of a field containing cells (field cellularity) between DMSO- and rifabutin-treated cells was tested using a one-tailed Student's t-test. (C) Viability of rifabutin- and taxol-treated MDA-MB-231 cells grown in 3D Matrigel and treated every 48 h with media containing 0.4% DMSO, rifabutin or taxol. Luminescence was assayed using CellTiter-Glo 3D (Promega). Error bars represent standard error of the mean (SEM) from three independent experiments. One-sided Student's t-test comparing treatment to DMSO: *P < 0.05; **P < 0.005; ***P < 0.0005.

Cited by

Drug Repurposing Using Modularity Clustering in Drug-Drug Similarity Networks Based on Drug-Gene Interactions.
Groza V, Udrescu M, Bozdog A, Udrescu L. Pharmaceutics. 2021 Dec 8;13(12):2117. doi: 10.3390/pharmaceutics13122117. PMID: 34959398. Free PMC article.
Identification of significant gene expression changes in multiple perturbation experiments using knockoffs.
Zhao T, Zhu G, Dubey HV, Flaherty P. Brief Bioinform. 2023 Mar 19;24(2):bbad084. doi: 10.1093/bib/bbad084. PMID: 36892174. Free PMC article.
Machine and deep learning approaches for cancer drug repurposing.
Issa NT, Stathias V, Schürer S, Dakshanamurthy S. Semin Cancer Biol. 2021 Jan;68:132-142. doi: 10.1016/j.semcancer.2019.12.011. Epub 2020 Jan 3. PMID: 31904426. Free PMC article. Review.
Disentangling the Molecular Pathways of Parkinson's Disease using Multiscale Network Modeling.
Wang Q, Zhang B, Yue Z. Trends Neurosci. 2021 Mar;44(3):182-188. doi: 10.1016/j.tins.2020.11.006. PMID: 33358606. Free PMC article. Review.
Functional genomics pipeline identifies CRL4 inhibition for the treatment of ovarian cancer.
Claridge SE, Nath S, Baum A, Farias R, Cavallo JA, Rizvi NM, De Boni L, Park E, Granados GL, Hauesgen M, Fernandez-Rodriguez R, Kozan EN, Kanshin E, Huynh KQ, Chen PJ, Wu K, Ueberheide B, Mosquera JM, Hirsch FR, DeVita RJ, Elemento O, Pauli C, Pan ZQ, Hopkins BD. Clin Transl Med. 2025 Feb;15(2):e70078. doi: 10.1002/ctm2.70078. PMID: 39856363. Free PMC article.
