Nat Commun. 2025 Jul 26;16(1):6915.
doi: 10.1038/s41467-025-62235-6.

Evidential deep learning-based drug-target interaction prediction

Yanpeng Zhao et al. Nat Commun. 2025.

Abstract

Drug-target interaction (DTI) prediction is a crucial component of drug discovery. Recent deep learning methods show great potential in this field but also encounter substantial challenges. These include generating reliable confidence estimates for predictions, enhancing robustness when handling novel, unseen DTIs, and mitigating the tendency toward overconfident and incorrect predictions. To solve these problems, we propose EviDTI, a novel approach utilizing evidential deep learning (EDL) for uncertainty quantification in neural network-based DTI prediction. EviDTI integrates multiple data dimensions, including drug 2D topological graphs, drug 3D spatial structures, and target sequence features. Through EDL, EviDTI provides uncertainty estimates for its predictions. Experimental results on three benchmark datasets demonstrate the competitiveness of EviDTI against 11 baseline models. In addition, our study shows that EviDTI can calibrate prediction errors. More importantly, well-calibrated uncertainty information enhances the efficiency of drug discovery by prioritizing DTIs with higher-confidence predictions for experimental validation. In a case study focused on tyrosine kinase modulators, uncertainty-guided predictions identify novel potential modulators targeting the tyrosine kinases FAK and FLT3. These results underscore the potential of evidential deep learning as a robust tool for uncertainty quantification in DTI prediction and its broader implications for accelerating drug discovery.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. The flowchart of EviDTI.
For a given drug-target pair, the protein feature encoder employs the pre-trained ProtTrans for initial target representation, further refined by a light attention (LA) module. The drug feature encoder processes the 2D topology and 3D structure representations. The 2D representation is derived from the pre-trained MG-BERT model and processed by 1D CNN. The 3D structure representation is obtained via the GeoGNN. These representations are concatenated and fed into the evidence layer, which outputs parameter α for prediction probability and uncertainty.
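
The caption above says the evidence layer outputs a parameter α from which both the prediction probability and the uncertainty are obtained. A minimal sketch of that mapping follows, assuming the standard evidential deep learning formulation (α = evidence + 1, expected probability α_k / S, uncertainty K / S with S = Σα). The record does not state EviDTI's exact equations, so the function name `dirichlet_prediction` and these formulas are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Tuple


def dirichlet_prediction(evidence: List[float]) -> Tuple[List[float], float]:
    """Map non-negative per-class evidence to class probabilities and uncertainty.

    Assumes the standard EDL parameterisation; EviDTI's own equations may differ.
    """
    if not evidence or any(e < 0 for e in evidence):
        raise ValueError("evidence must be a non-empty list of non-negative values")
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]    # Dirichlet concentration parameters
    strength = sum(alpha)                   # Dirichlet strength S
    probs = [a / strength for a in alpha]   # expected probability of each class
    uncertainty = num_classes / strength    # u = K / S, always in (0, 1]
    return probs, uncertainty


# Binary DTI case: class 0 = no interaction, class 1 = interaction.
no_evidence = dirichlet_prediction([0.0, 0.0])
strong_evidence = dirichlet_prediction([1.0, 7.0])
```

With no evidence the sketch returns probabilities of 0.5 each and the maximum uncertainty of 1.0. With evidence `[1.0, 7.0]`, α = [2, 8] and S = 10, so the interaction probability is 0.8 and the uncertainty is 0.2.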
Fig. 2. The results of two ablation experiments.
a Performance comparison on the DrugBank, KIBA and Davis datasets using single-dimensional features and multidimensional feature fusion strategies. Five independent replications of each method were performed (n = 5). Data are expressed as means ± std. b Performance comparison of feature extraction with and without pre-trained models on DrugBank, KIBA, and Davis datasets. Five independent replications of each method were performed (n = 5). Data are expressed as means ± std. Source data are provided as a Source Data file.
Fig. 3. Evidential deep learning provides a favorable measure of uncertainty.
a A Mann–Whitney test was performed on the error distribution of uncertainty in samples classified as TP, FP, FN, TN for three datasets: DrugBank (n = 3,312 observations), KIBA (n = 11,639 observations), and Davis (n = 2,583 observations). The central line indicates the median, the box bounds indicate the 25th and 75th percentiles, whiskers extend to the minimum and maximum values (within 1.5× interquartile range), and outliers are shown as individual points. All tests were two-sided, with no adjustments made for multiple comparisons. Asterisks indicate statistically significant differences based on Mann–Whitney U test p-values: ****p ≤ 0.0001. The p-values are as follows. For the DrugBank dataset, TP vs. FN has a p-value of 1.055e-10, FP vs. TN has a p-value of 4.954e-74, TP vs. FP has a p-value of 1.546e-51, FN vs. TN has a p-value of 1.895e-26. For the KIBA dataset, TP vs. FN has a p-value of 9.713e-30, FP vs. TN has a p-value of 4.954e-74, TP vs. FP has a p-value of 1.546e-51, FN vs. TN has a p-value of 1.895e-26. For the Davis dataset, TP vs. FN has a p-value of 3.502e-09, FP vs. TN has a p-value of 5.662e-45, TP vs. FP has a p-value of 6.667e-21, FN vs. TN has a p-value of 7.434e-40. b Test data were sorted and divided into 20 confidence intervals based on uncertainty. All tests were two-sided with no adjustments made for multiple comparisons. The ACC was calculated for samples within each confidence interval. Five independent replications (n = 5) were performed on each dataset. Data are presented as mean ± std. Source data for the figure are shown in Supplementary Data. Source data are provided as a Source Data file.
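
The binning procedure in panel b (sort test samples by uncertainty, split them into 20 intervals, compute ACC per interval) can be sketched as below. This is an illustration under assumptions: the caption does not say whether the intervals hold equal numbers of samples or span equal uncertainty widths, so the sketch uses equal-count bins, and the function name `accuracy_by_uncertainty_bin` is hypothetical.

```python
from typing import List, Sequence


def accuracy_by_uncertainty_bin(
    uncertainty: Sequence[float],
    y_true: Sequence[int],
    y_pred: Sequence[int],
    n_bins: int = 20,
) -> List[float]:
    """Accuracy within each equal-count bin, ordered from lowest to highest uncertainty."""
    n = len(uncertainty)
    if not (n == len(y_true) == len(y_pred)):
        raise ValueError("inputs must have the same length")
    if n_bins < 1 or n < n_bins:
        raise ValueError("need at least one sample per bin")
    # Sample indices ordered from most certain to least certain.
    order = sorted(range(n), key=lambda i: uncertainty[i])
    accuracies = []
    for b in range(n_bins):
        # Integer arithmetic spreads any remainder across the bins.
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        correct = sum(1 for i in idx if y_true[i] == y_pred[i])
        accuracies.append(correct / len(idx))
    return accuracies


# Toy data (made up for illustration): the two low-uncertainty samples are correct.
toy_acc = accuracy_by_uncertainty_bin(
    uncertainty=[0.1, 0.9, 0.2, 0.8],
    y_true=[1, 0, 1, 0],
    y_pred=[1, 1, 1, 1],
    n_bins=2,
)
```

On the toy data the low-uncertainty bin has accuracy 1.0 and the high-uncertainty bin 0.0, which is the trend a well-calibrated uncertainty measure is expected to show.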
Fig. 4. Evidential deep learning helps reduce the risk of false predictions in decision-making.
a Comparison of OFR between uncertainty-based and probability-based frameworks on the DrugBank dataset at different thresholds. Five independent replications of each method were performed (n = 5). Data are expressed as means ± std. b Comparison of OFR between uncertainty-based and probability-based frameworks on the KIBA dataset at different thresholds. Five independent replications of each method were performed (n = 5). Data are expressed as means ± std. c Comparison of OFR between uncertainty-based and probability-based frameworks on the Davis dataset at different thresholds. The line represents the mean OFR, and the shaded area indicates the standard deviation. Five independent replications of each method were performed (n = 5). Data are expressed as means ± std. d Hit rates of the top 20 ranked predictions determined by uncertainty ranking strategies and probability ranking strategies. Five independent replications of each method were performed (n = 5). Data are expressed as means ± std. e Case study. The Interaction column gives the true label of each DTI. The Uncertainty-based column gives the predicted probability and predicted label from the uncertainty-based method, with the uncertainty given by the model in parentheses. The Probability-based column gives the predicted probability and predicted label from the probability-based method. Source data are provided as a Source Data file.
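
Panel d compares the hit rate of the top-ranked predictions under two ranking strategies. A minimal sketch of such a comparison follows. The caption does not define the ranking rules, so two assumptions are made here: probability ranking sorts all pairs by descending predicted probability, and uncertainty ranking keeps only pairs predicted as interacting (probability at or above a 0.5 threshold) and sorts them by ascending uncertainty. All function names and the toy numbers are hypothetical.

```python
from typing import List, Sequence


def rank_by_probability(probs: Sequence[float]) -> List[int]:
    """Indices of all pairs, highest predicted probability first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)


def rank_by_uncertainty(
    probs: Sequence[float], uncertainty: Sequence[float], threshold: float = 0.5
) -> List[int]:
    """Indices of predicted positives, lowest uncertainty first (assumed strategy)."""
    positives = [i for i in range(len(probs)) if probs[i] >= threshold]
    return sorted(positives, key=lambda i: uncertainty[i])


def hit_rate(ranked: Sequence[int], labels: Sequence[int], k: int = 20) -> float:
    """Fraction of true interactions among the first k ranked pairs."""
    top = list(ranked[:k])
    if not top:
        raise ValueError("no ranked predictions to evaluate")
    return sum(labels[i] for i in top) / len(top)


# Toy data (made up): pair 0 is a confident-looking but uncertain false positive.
probs = [0.95, 0.90, 0.85, 0.60, 0.30]
uncertainty = [0.40, 0.05, 0.10, 0.50, 0.08]
labels = [0, 1, 1, 0, 0]

prob_hit = hit_rate(rank_by_probability(probs), labels, k=2)
unc_hit = hit_rate(rank_by_uncertainty(probs, uncertainty), labels, k=2)
```

On the toy data the probability ranking places the false positive first and scores a top-2 hit rate of 0.5, while the uncertainty ranking demotes it and scores 1.0.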
Fig. 5. Application of EviDTI in the discovery of multi-target tyrosine kinase modulators.
a The validation framework for multi-target tyrosine kinase modulators. First, validation was carried out using data reported in patents: two lenvatinib analogs with 11 known targets were collected from the patents. Next, validation was carried out using data reported in the literature: the uncertainty score and probability of interactions between 67 tyrosine kinase targets and 51 tyrosine kinase modulators were predicted via EviDTI. Finally, two targets of interest were selected from the 67 targets. Based on the uncertainty for these two targets and the 51 modulators, the lowest-uncertainty interactions between these two targets and seven modulators were validated experimentally. b The 50% effective concentration of Tyrphostin 9, Vodobatinib, Flumatinib and PF-562271 in the FAK kinase ADP-Glo assays. PF-562271 served as the positive control. Mean ± SEM of three independent experiments is shown (n = 3). c The 50% effective concentration of Vodobatinib, Tyrphostin 9 and sorafenib in the FLT3 kinase ADP-Glo assays. Sorafenib served as the positive control. Mean ± SEM of three independent experiments is shown (n = 3). Source data are provided as a Source Data file.
Fig. 6. Visualization of attention scores of all the residues in the four randomly selected drug-target complexes.
a–d The correctly predicted amino acid residues surrounding the corresponding ligands (sticks) are highlighted. Their color indicates the degree of contribution of these residues to the prediction results. 3D representations of all structures were visualized using PyMOL.
