Nat Commun. 2025 May 30;16(1):5021.

doi: 10.1038/s41467-025-59917-6.

DeepDTAGen: a multitask deep learning framework for drug-target affinity prediction and target-aware drugs generation

Pir Masoom Shah¹, Huimin Zhu¹, Zhangli Lu¹, Kaili Wang², Jing Tang³,⁴, Min Li⁵,⁶,⁷

Affiliations

¹ School of Computer Science and Engineering, Central South University, Changsha, China.
² School of Computer Science and Technology, Donghua University, Shanghai, China.
³ Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
⁴ Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
⁵ School of Computer Science and Engineering, Central South University, Changsha, China. limin@mail.csu.edu.cn.
⁶ Xiangjiang Laboratory, Changsha, China. limin@mail.csu.edu.cn.
⁷ Furong Laboratory, Central South University, Changsha, China. limin@mail.csu.edu.cn.

PMID: 40447614
PMCID: PMC12125237
DOI: 10.1038/s41467-025-59917-6

Pir Masoom Shah et al. Nat Commun. 2025.

Abstract

Identifying novel drugs that can interact with target proteins is a highly challenging, time-consuming, and costly task in drug discovery and development. Numerous machine learning-based models have recently been used to accelerate the drug discovery process. However, these existing methods are primarily single-task: they are designed either to predict drug-target interaction (DTI) or to generate new drugs. From the perspective of pharmacological research, these tasks are intrinsically interconnected and play a critical role in effective drug development. Learning models should therefore capture the structural properties of drug molecules, the conformational dynamics of proteins, and the bioactivity between drugs and targets. To this end, this paper develops a novel multitask learning framework that predicts drug-target binding affinities and simultaneously generates new target-aware drug variants, using common features for both tasks. In addition, we developed the FetterGrad algorithm to address the optimization challenges associated with multitask learning, particularly those caused by gradient conflicts between distinct tasks. Comprehensive experiments on three real-world datasets demonstrate that the proposed model provides an effective mechanism for predicting drug-target binding affinities and generating novel drugs, thus greatly facilitating the drug discovery process.
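To make the notion of a gradient conflict concrete, the following is a minimal sketch of one well-known mitigation, a projection heuristic in the style of gradient surgery (PCGrad). It is not the FetterGrad algorithm, whose details are not given in this abstract. The function names and the flat-list gradient representation are assumptions made purely for illustration. Two task gradients are treated as conflicting when their dot product is negative, and each conflicting gradient has its component along the other removed before the two are summed.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_conflict(g_task: Sequence[float], g_other: Sequence[float]) -> List[float]:
    """Return g_task with its component along g_other removed, but only
    when the two gradients conflict (negative dot product).

    Illustrative PCGrad-style rule, NOT the paper's FetterGrad.
    """
    d = dot(g_task, g_other)
    norm_sq = dot(g_other, g_other)
    if d >= 0 or norm_sq == 0:
        return list(g_task)  # no conflict: leave the gradient unchanged
    scale = d / norm_sq
    return [a - scale * b for a, b in zip(g_task, g_other)]


def combine_gradients(g_affinity: Sequence[float], g_generation: Sequence[float]) -> List[float]:
    """Sum the two task gradients after resolving any conflict.

    The argument names are hypothetical stand-ins for the gradients of an
    affinity-prediction loss and a drug-generation loss with respect to
    the shared parameters.
    """
    pa = project_conflict(g_affinity, g_generation)
    pg = project_conflict(g_generation, g_affinity)
    return [a + b for a, b in zip(pa, pg)]


if __name__ == "__main__":
    g_a = [1.0, 0.0]   # toy gradient for the affinity task
    g_g = [-1.0, 1.0]  # toy gradient for the generation task (conflicts with g_a)
    print(combine_gradients(g_a, g_g))
```

In this toy case the combined update has a non-negative dot product with both original gradients, so a small step along it does not increase either task's loss to first order.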

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

**Fig. 1. Illustration of the proposed model.**
A The overall architecture of the proposed model. B The architecture of the standard transformer decoder. In this study, we used eight transformer decoders. C The Encoder and Decoder Modules and the incorporation of Target condition.

**Fig. 2. Scatter plots of predicted affinity values against actual affinity values on the KIBA, Davis, and BindingDB test sets.**
A The scatter plot of predicted affinities for the KIBA test set, B for Davis, and C for BindingDB.

**Fig. 3. Comparison of affinity score distributions: the actual affinities of original drugs and targets, the predicted affinities of original drugs and targets, and the predicted affinities of generated drugs and targets.**
A covers the KIBA test set, while B shows the BindingDB test set. The x-axis represents the affinity scores, and the y-axis represents the density of the scores.

**Fig. 4. Interaction-based drug generation.**
The first column in the figure represents the trained model. The second column shows the PubChem ID for the drugs and the UniProt ID for the targets (both the drugs and targets are used as seeds to generate new SMILES). The third column lists the chemical structure of the seed SMILES. The fourth column shows the chemical structure of the generated SMILES. The fifth column shows the Tanimoto Similarity (TS) between the generated and seed drugs. The sixth column displays the chemical properties of the generated drugs. The last column shows the Docking Scores (DS) for the seed target with the seed drug and the generated drug, where the seed value represents the DS between the seed drug and the seed target, while the generated value represents the DS between the generated drug and the seed target.

**Fig. 5. The visualization of pocket areas of the generated drugs and seed drugs with their corresponding target proteins from Fig. 4.**
A–D (labeled “KIBA” in the figure) show the binding sites for the KIBA-generated drugs and their seed drugs, corresponding to rows 1–4 of the table. A–D (labeled “BindingDB” in the figure) show the binding sites for the BindingDB-generated drugs and their seed drugs, corresponding to rows 5–8 of the table. The red folds in the figure are the binding sites for the respective targets according to the UniProt database.

**Fig. 6. Property distributions of the KIBA test set and of molecules generated using the trained KIBA model.**
A The distributions of the QED, LogP, and SAS properties in the original KIBA test set. B The distributions of the same properties in molecules generated by the On SMILES synthesis method. C The distributions in molecules generated using the stochastic method. In each panel, the notation μ represents the mean of that distribution.

**Fig. 7. Interaction visualization between the generated drugs and the EGFR protein.**
A shows the interaction of the drug generated by the KIBA-trained model, while B shows the interaction of the drug generated by the BindingDB-trained model.

**Fig. 8. Polypharmacological druggability of generated drugs.**
The first column in the figure shows the PubChem ID for the drugs and the UniProt ID for the targets (both the drugs and targets are used as seeds to generate new SMILES). The second column lists the chemical structure of the seed SMILES. The third column shows the chemical structure of the generated SMILES. The fourth column shows the Tanimoto Similarity (TS) between the generated and seed drugs. The fifth column displays the chemical properties of the generated drugs. The sixth column shows the other active targets against the seed drug. The seventh column represents the Docking Score (DS) between the three corresponding targets and the generated drug. The last column represents the Docking Score (DS) of the seed target with the seed drug and the generated drug.
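The Tanimoto Similarity (TS) reported in Figs. 4 and 8 can be illustrated with a minimal sketch. It assumes each molecule is represented as the set of "on" bit positions of a binary fingerprint. The fingerprint type used in the paper is not stated in these captions, so the bit sets below are made-up toy values, and the function name is hypothetical.

```python
from typing import Iterable


def tanimoto(bits_a: Iterable[int], bits_b: Iterable[int]) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprints, each given
    as the collection of bit positions that are set.

    TS = |A intersect B| / |A union B|, ranging from 0 (no shared bits)
    to 1 (identical fingerprints).
    """
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / union


if __name__ == "__main__":
    seed_bits = {1, 4, 7, 9}           # toy fingerprint of a seed drug
    generated_bits = {1, 4, 8, 9, 12}  # toy fingerprint of a generated drug
    print(tanimoto(seed_bits, generated_bits))
```

Here the two fingerprints share 3 of 6 distinct bits, giving a similarity of 0.5. In practice the bit sets would come from a cheminformatics toolkit rather than being written by hand.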

Cited by

AI-Driven Polypharmacology in Small-Molecule Drug Discovery.
Abdelsayed M. Int J Mol Sci. 2025 Jul 21;26(14):6996. doi: 10.3390/ijms26146996. PMID: 40725243. Free PMC article. Review.

Grants and funding

No.2225209/National Natural Science Foundation of China (National Science Foundation of China)
