BMC Bioinformatics. 2020 Jun 5;21(1):231.
doi: 10.1186/s12859-020-03568-5.

PS4DR: a multimodal workflow for identification and prioritization of drugs based on pathway signatures

Mohammad Asif Emon et al. BMC Bioinformatics.

Abstract

Background: During the last decade, interest in computational drug repositioning has surged, driven by the steadily growing volume of -omics data in biomedical research. While numerous existing methods focus on integrating heterogeneous data to propose candidate drugs, it remains challenging to substantiate their results with mechanistic insight into these candidates. Therefore, there is a need for more innovative and efficient methods that enable better integration of data and knowledge for drug repositioning.

Results: Here, we present a customizable workflow (PS4DR) which not only integrates high-throughput data such as genome-wide association study (GWAS) data and gene expression signatures from disease and drug perturbations but also takes pathway knowledge into consideration to predict drug candidates for repositioning. In this study, we collected and integrated publicly available GWAS data and gene expression signatures for several diseases and for hundreds of drugs that are FDA-approved or in clinical trials. Additionally, different pathway databases were used to integrate mechanistic knowledge into the workflow. Using this systematic consolidation of data and knowledge, the workflow computes pathway signatures that assist in the prediction of new indications for approved and investigational drugs.

Conclusion: We showcase PS4DR with applications demonstrating how this tool can be used for repositioning and identifying new drugs as well as proposing drugs that can simulate disease dysregulations. We were able to validate our workflow by demonstrating its capability to predict FDA-approved drugs for their known indications for several diseases. Further, PS4DR returned many potential drug candidates for repositioning that were backed up by epidemiological evidence extracted from scientific literature. Source code is freely available at https://github.com/ps4dr/ps4dr.

Keywords: Bioinformatics; Drug discovery; Drug repositioning; Multi-omics; Pathways; Software.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
An overview of the PS4DR workflow. The workflow requires three datasets as inputs: (i) disease-perturbed gene expression signatures, (ii) genome-wide association study (GWAS) data, and (iii) drug-perturbed gene expression signatures. The first, optional part of the workflow consists of filtering steps based on gene set intersection operations, which identify genes in the gene expression signatures that have also been identified in a GWAS of the studied disease. To retain maximum flexibility, users can decide which of the filtering steps to apply, if any. The next step uses the transcriptomics datasets, filtered or not, to conduct pathway enrichment analysis and evaluate the direction of perturbation for each affected pathway in a particular disease context. The dotted lines in the figure represent all possible combinations of filtering steps that can lead to the pathway enrichment step, while the solid lines show the option we chose to demonstrate the workflow. Finally, the last step uses the correlation of the pathway scores calculated in the previous step to prioritize drugs that are predicted to invert the pathway signatures observed in a given disease context.
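The optional filtering step described in the Fig. 1 caption can be illustrated with a minimal Python sketch. This is not the PS4DR source code: the function name `filter_signature`, the dictionary layout, and the gene identifiers are hypothetical and chosen only for illustration.

```python
def filter_signature(signature, gwas_genes, apply_filter=True):
    """Keep only the genes of an expression signature that also occur in a GWAS.

    signature:    dict mapping gene ID -> log fold change (assumed layout)
    gwas_genes:   set of gene IDs associated with the disease in a GWAS
    apply_filter: the filtering step is optional, so it can be switched off
    """
    if not apply_filter:
        return dict(signature)
    shared = signature.keys() & gwas_genes  # gene set intersection
    return {gene: signature[gene] for gene in sorted(shared)}


# Made-up placeholder identifiers and values, not real data.
disease_signature = {"GENE_A": 1.8, "GENE_B": -0.9, "GENE_C": 0.4}
gwas_genes = {"GENE_B", "GENE_C", "GENE_Z"}

filtered = filter_signature(disease_signature, gwas_genes)
print(filtered)  # {'GENE_B': -0.9, 'GENE_C': 0.4}
```

Passing `apply_filter=False` returns the signature unchanged, which mirrors the caption's point that the downstream enrichment step accepts either filtered or unfiltered data.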
Fig. 2
Combined scatter plots of the drugs' correlation scores against affected pathways (%) in each disease. The relative number of target pathways affected by the drug in the disease context is plotted along the x-axis and the correlation score along the y-axis. Drugs in the top-right corner of the plot might be interesting for developing in vitro disease models, since these drugs show positive correlation scores while covering a broad range of the affected pathways. The circles represent drugs and the color coding indicates their respective disease indication, as shown at the bottom.
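The two quantities plotted in Fig. 2 can be sketched as follows. The exact formulas used by PS4DR are not given in this abstract, so this example makes two assumptions: pathway directions are encoded as +1 (activated) and -1 (inhibited), and the correlation score is a Pearson correlation over the pathways shared by disease and drug. The function names and pathway labels are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined here; treated as 0 in this sketch
    return cov / sqrt(var_x * var_y)


def score_drug(disease_pathways, drug_pathways):
    """Return (correlation score, percentage of disease pathways the drug affects)."""
    shared = sorted(disease_pathways.keys() & drug_pathways.keys())
    affected_pct = 100.0 * len(shared) / len(disease_pathways)
    if len(shared) < 2:
        return 0.0, affected_pct
    correlation = pearson(
        [disease_pathways[p] for p in shared],
        [drug_pathways[p] for p in shared],
    )
    return correlation, affected_pct


# Made-up pathway directions: +1 = activated, -1 = inhibited (assumed encoding).
disease = {"P1": 1, "P2": -1, "P3": 1, "P4": -1}
drug_inverting = {"P1": -1, "P2": 1, "P3": -1, "P4": 1}
drug_mimicking = {"P1": 1, "P2": -1}

print(score_drug(disease, drug_inverting))  # (-1.0, 100.0)
print(score_drug(disease, drug_mimicking))  # (1.0, 50.0)
```

Under these assumptions, a strongly negative score marks a drug that inverts the disease signature (a repositioning candidate), while a positive score with high pathway coverage corresponds to the top-right region of the plot.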
Fig. 3
Data preprocessing workflow. This workflow describes the preprocessing of gene expression signatures (left side) and GWAS data (right side) to make them interoperable, as well as the primary and final outcomes of the preprocessing. Preprocessing steps include multiple intermediary mappings to obtain common identifiers for genes (Ensembl identifiers), chemicals (ChEMBL identifiers), and diseases (EFO identifiers).
Fig. 4
Distributions of the p-values obtained from SPIA for true and simulated pathways, shown as violin plots for the (a) KEGG, (b) Reactome, and (c) BioCarta pathway databases. A Mann-Whitney U test confirmed that the distributions are significantly different for all three pathway databases (KEGG: p-value = 8.26e-102, Reactome: p-value = 3.05e-114, BioCarta: p-value = 8.01e-09). These results demonstrate that while true pathways yield meaningful results (i.e., lower p-values), simulated pathways are rarely significantly enriched.
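The comparison reported in the Fig. 4 caption can be sketched with a small standard-library implementation of the Mann-Whitney U test. This is an illustration under stated assumptions, not the authors' analysis: it uses the normal approximation without tie or continuity correction, and the input values are made up. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used.

```python
from math import erfc, sqrt


def mann_whitney_u(sample_a, sample_b):
    """Return (U statistic of sample_a, two-sided p-value by normal approximation)."""
    pooled = sorted(
        [(value, "a") for value in sample_a] + [(value, "b") for value in sample_b]
    )
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == "a")
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_two_sided = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u_a, p_two_sided


# Made-up enrichment p-values for illustration only.
true_pathway_p = [0.001, 0.002, 0.01, 0.02, 0.03]
simulated_pathway_p = [0.2, 0.4, 0.5, 0.7, 0.9]

u_stat, p_value = mann_whitney_u(true_pathway_p, simulated_pathway_p)
print(u_stat, p_value)
```

A U statistic of 0 for the first sample means every true-pathway p-value is smaller than every simulated one, which is the extreme form of the separation the caption describes.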
Fig. 5
ROC curve of PS4DR-predicted drugs. The ROC curve with its 95% confidence interval was obtained using existing clinical trials for the predicted drugs as positive labels and correlation scores as the ranking metric.
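The evaluation summarized in the Fig. 5 caption can be sketched as an area-under-the-ROC-curve computation. The example assumes that a more negative correlation score marks a stronger repositioning candidate, so the scores are negated before ranking; that direction, the function name `roc_auc`, and all values are assumptions for illustration. The confidence interval from the caption is not computed here.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up data: 1 = a clinical trial exists for the drug in this disease.
correlation_scores = [-0.9, -0.7, -0.2, 0.1, 0.5, 0.8]
in_clinical_trial = [1, 1, 0, 1, 0, 0]

# Assumption: lower (more negative) correlation means a better candidate.
ranking_scores = [-score for score in correlation_scores]
auc = roc_auc(in_clinical_trial, ranking_scores)
print(round(auc, 3))  # 0.889
```

An AUC of 0.5 would correspond to random ranking, and values approaching 1.0 mean that drugs with existing clinical trials are consistently ranked ahead of the others.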
