Nat Commun. 2025 May 30;16(1):5021.
doi: 10.1038/s41467-025-59917-6.

DeepDTAGen: a multitask deep learning framework for drug-target affinity prediction and target-aware drugs generation

Pir Masoom Shah et al. Nat Commun. 2025.

Abstract

Identifying novel drugs that can interact with target proteins is a highly challenging, time-consuming, and costly task in drug discovery and development. Numerous machine learning-based models have recently been used to accelerate the drug discovery process. However, these existing methods are primarily single-task, designed either to predict drug-target interaction (DTI) or to generate new drugs. From the perspective of pharmacological research, these tasks are intrinsically interconnected and play a critical role in effective drug development. Learning models should therefore be used in a way that captures the structural properties of drug molecules, the conformational dynamics of proteins, and the bioactivity between drugs and targets. To this end, this paper develops a novel multitask learning framework that predicts drug-target binding affinities and simultaneously generates new target-aware drug variants, using common features for both tasks. In addition, we developed the FetterGrad algorithm to address the optimization challenges associated with multitask learning, particularly those caused by gradient conflicts between distinct tasks. Comprehensive experiments on three real-world datasets demonstrate that the proposed model provides an effective mechanism for predicting drug-target binding affinities and generating novel drugs, thus greatly facilitating the drug discovery process.
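To make the notion of a gradient conflict concrete, the following is a minimal sketch and not the FetterGrad algorithm itself, whose update rule the abstract does not describe. It assumes two hypothetical gradient vectors, `g_affinity` and `g_generation`, taken with respect to the shared parameters. Two task gradients are treated as conflicting when their dot product is negative, and the mitigation shown is the projection step known from PCGrad, swapped in here purely for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; a negative value signals a gradient conflict."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def project_if_conflicting(g_a, g_b):
    """PCGrad-style step (illustrative, not FetterGrad): if g_a conflicts
    with g_b, remove the component of g_a that points against g_b."""
    d = dot(g_a, g_b)
    if d >= 0:
        return list(g_a)  # no conflict, leave the gradient unchanged
    scale = d / dot(g_b, g_b)
    return [a - scale * b for a, b in zip(g_a, g_b)]


# Hypothetical gradients of the two task losses w.r.t. shared parameters.
g_affinity = [1.0, 0.0]
g_generation = [-1.0, 1.0]

similarity = cosine(g_affinity, g_generation)
adjusted = project_if_conflicting(g_affinity, g_generation)

print("cosine similarity:", similarity)
print("adjusted affinity gradient:", adjusted)
print("dot with generation gradient:", dot(adjusted, g_generation))
```

After the projection, the adjusted gradient is orthogonal to the other task's gradient, so a step along it no longer directly opposes that task.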

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Illustration of the proposed model.
A The overall architecture of the proposed model. B The architecture of the standard transformer decoder. In this study, we used eight transformer decoders. C The Encoder and Decoder Modules and the incorporation of Target condition.
Fig. 2. Scatter plots of predicted versus actual affinity values on the KIBA, Davis, and BindingDB test sets.
A The scatter plot of predicted affinities for the KIBA test set, B for Davis, and C for BindingDB.
Fig. 3. Comparison of affinity score distributions: actual affinities of the original drugs and targets, predicted affinities of the original drugs and targets, and predicted affinities of the generated drugs and targets.
A covers the KIBA test set, while B shows the BindingDB test set. The x-axis represents the affinity scores, and the y-axis represents the density of the scores.
Fig. 4. Interaction-based drug generation.
The first column in the figure represents the trained model. The second column shows the PubChem ID for the drugs and the UniProt ID for the targets (both the drugs and targets are used as seeds to generate new SMILES). The third column lists the chemical structure of the seed SMILES. The fourth column shows the chemical structure of the generated SMILES. The fifth column shows the Tanimoto Similarity (TS) between the generated and seed drugs. The sixth column displays the chemical properties of the generated drugs. The last column shows the Docking Scores (DS) for the seed target with the seed drug and the generated drug, where the seed value represents the DS between the seed drug and the seed target, while the generated value represents the DS between the generated drug and the seed target.
Fig. 5. Visualization of the pocket areas of the generated drugs and seed drugs with their corresponding target proteins from Fig. 4.
A–D (labeled “KIBA” in the figure) show the binding sites for the KIBA-generated drugs and their seed drugs, corresponding to rows 1–4 of the table in Fig. 4. A–D (labeled “BindingDB” in the figure) show the binding sites for the BindingDB-generated drugs and their seed drugs, corresponding to rows 5–8. The red folds are the binding sites of the respective targets according to the UniProt database.
Fig. 6. Property distributions of the KIBA test set and of molecules generated using the trained KIBA model.
A The QED, LogP, and SAS property distributions in the original KIBA test set. B The same property distributions for molecules generated by the On SMILES synthesis method. C The same property distributions for molecules generated using the stochastic method. In each panel, μ denotes the mean of that distribution.
Fig. 7. Interaction visualization between the generated drugs and the EGFR protein.
A shows the interaction of the drug generated by the KIBA-trained model, while B shows the interaction of the drug generated by the BindingDB-trained model.
Fig. 8. Polypharmacological druggability of generated drugs.
The first column in the figure shows the PubChem ID for the drugs and the UniProt ID for the targets (both the drugs and targets are used as seeds to generate new SMILES). The second column lists the chemical structure of the seed SMILES. The third column shows the chemical structure of the generated SMILES. The fourth column shows the Tanimoto Similarity (TS) between the generated and seed drugs. The fifth column displays the chemical properties of the generated drugs. The sixth column shows the other active targets against the seed drug. The seventh column represents the Docking Score (DS) between the three corresponding targets and the generated drug. The last column represents the Docking Score (DS) of the seed target with the seed drug and the generated drug.
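
The Tanimoto Similarity (TS) reported in Figs. 4 and 8 compares a generated drug with its seed drug. As a minimal sketch of the standard definition, the ratio of shared to total "on" bits of two molecular fingerprints can be computed as below. The bit sets here are hypothetical, and the captions do not state which fingerprint the authors used, so this illustrates the metric only.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of 'on' bit indices:
    |A intersect B| / |A union B|."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical fingerprints (indices of set bits), for illustration only.
seed_bits = {1, 4, 7, 9}
generated_bits = {1, 4, 8, 9, 12}

ts = tanimoto(seed_bits, generated_bits)
print("Tanimoto similarity:", ts)
```

A value of 1 means the two fingerprints are identical, and a value near 0 means they share almost no features, so a moderate TS indicates a generated molecule that is related to, but distinct from, its seed.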
