Acta Pharmacol Sin. 2014 Mar;35(3):419-31.
doi: 10.1038/aps.2013.153. Epub 2014 Feb 3.

Multi-algorithm and multi-model based drug target prediction and web server

Ying-tao Liu et al. Acta Pharmacol Sin. 2014 Mar.

Abstract

Aim: To develop a reliable computational approach for predicting potential drug targets based merely on protein sequence.

Methods: Drug target and non-target datasets were prepared, and a multi-algorithm and multi-model based strategy using 3 classification algorithms (Support Vector Machine, Neural Network and Decision Tree) was employed to construct models for predicting potential drug targets.

Results: Twenty-one prediction models for each of the 3 algorithms were successfully developed. Our evaluation results showed that ∼30% of human proteins were potential drug targets, and ∼40% of putative targets for the drugs undergoing phase II clinical trials were probably non-targets. A public web server named D3TPredictor (http://www.d3pharma.com/d3tpredictor) was constructed to provide easy access.

Conclusion: Reliable and robust drug target prediction based on protein sequences is achieved using the multi-algorithm and multi-model strategy.

Figures

Figure 1
Dataset preparation flowchart.
Figure 2
Comparison of three algorithms using two descriptor selection methods. FSBM, F-score based modeling; DRM, descriptor randomization modeling.
Figure 3
Evaluation of extensibility of the training sets of the 21 SVM models and the 36 NN models. The X-axis represents all of the performance metrics for the three algorithms, and the Y-axis is the model serial number. (A) Evaluation based on the training and testing sets of the 21 SVM models for the three algorithms. (B) Evaluation based on the training and testing sets of the 36 NN models for the three algorithms.
Figure 4
ANOVA statistical test. Analysis of differences in (A) the accuracies and (B) the AUCs between the DT models based on the training sets of the 21 SVM models and those of the 36 NN models. Analysis of differences in (C) the accuracies and (D) the AUCs between the NN models based on the training sets of the 21 SVM models and those of the 36 NN models. Analysis of differences in (E) the accuracies and (F) the AUCs between the SVM models based on the training sets of the 21 SVM models and those of the 36 NN models.
Figure 5
Receiver operating characteristic curves (ROCs) of the 21 SVM models.
Figure 6
Evaluation of the 21 parallel models against three testing datasets. Evaluation against (A) Dataset I, clinical phase II targets (size: 202), (B) Dataset II, human proteome (size: 20 025), and (C) Dataset III, targets of withdrawn drugs (size: 55). Mean values and standard errors of the 21 models using the 3 algorithms against (D) Dataset I, (E) Dataset II, and (F) Dataset III.
Figure 7
Bar chart of accumulated standard errors (ASE).
Figure 8
Illustration of multi-algorithm and/or multi-model based strategy. A red block represents a predicted non-target; a green block represents a predicted target. Multi-algorithm based strategy: for i (i=1, 2, …, 21), there are three corresponding models: SVM-model-i, NN-model-i, and DT-model-i. If a sequence is predicted as a target by at least 2 of the three models, the sequence is defined as a potential target. Multi-model based strategy: for algorithm j (j=SVM, NN, DT), there are N models (N=1, 2, …, 21). If a sequence is predicted as a target by at least [(N+1)/2] models, the sequence is defined as a potential target. Multi-algorithm and multi-model based strategy: successive combination of multi-algorithm based strategy and multi-model based strategy.
Figure 9
Multi-algorithm and/or multi-model based evaluation. Single-algorithm and single-model based evaluation using (A) the SVM algorithm, (B) the DT algorithm, and (C) the NN algorithm. (D) Multi-algorithm based evaluation. (E) Multi-model based evaluation. (F) Multi-algorithm and multi-model based evaluation.
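
The voting rules described in the Figure 8 caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the bracket in [(N+1)/2] is assumed to mean the integer part, and the combined strategy is assumed to apply the multi-algorithm vote for each model index first and the multi-model vote over those results second (the caption says only "successive combination").

```python
from typing import Sequence


def multi_algorithm_vote(svm: bool, nn: bool, dt: bool) -> bool:
    """Potential target if at least 2 of the 3 algorithms predict 'target'."""
    return (int(svm) + int(nn) + int(dt)) >= 2


def multi_model_vote(predictions: Sequence[bool]) -> bool:
    """Potential target if at least [(N+1)/2] of the N models predict 'target'.

    Assumption: the bracket denotes the integer part (floor).
    """
    n = len(predictions)
    if n == 0:
        raise ValueError("at least one model prediction is required")
    return sum(bool(p) for p in predictions) >= (n + 1) // 2


def combined_vote(
    svm_preds: Sequence[bool],
    nn_preds: Sequence[bool],
    dt_preds: Sequence[bool],
) -> bool:
    """Assumed ordering: vote across algorithms per model index, then across models."""
    if not (len(svm_preds) == len(nn_preds) == len(dt_preds)):
        raise ValueError("each algorithm must supply the same number of models")
    per_index = [
        multi_algorithm_vote(s, n, d)
        for s, n, d in zip(svm_preds, nn_preds, dt_preds)
    ]
    return multi_model_vote(per_index)
```

With 21 models per algorithm, the multi-model threshold is (21+1)/2 = 11, so a sequence needs a "target" call at 11 or more model indices to be labeled a potential target under this reading.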
