Review
Brief Bioinform. 2024 Jan 22;25(2):bbae081.
doi: 10.1093/bib/bbae081.

Prediction of protein-ligand binding affinity via deep learning models

Huiwen Wang. Brief Bioinform. 2024.

Erratum in

Abstract

Accurately predicting the binding affinity between proteins and ligands is crucial in drug screening and optimization, but it remains a challenge in computer-aided drug design. The recent success of AlphaFold2 in predicting protein structures has raised new hope that deep learning (DL) models can accurately predict protein-ligand binding affinity. However, current DL models are still limited by low-quality databases, inaccurate input representations and inappropriate model architectures. In this work, we review the computational methods, specifically DL-based models, used to predict protein-ligand binding affinity. We start with a brief introduction to protein-ligand binding affinity and the traditional computational methods used to calculate it. We then introduce the basic principles of DL models for predicting protein-ligand binding affinity. Next, we review the commonly used databases, input representations and DL models in this field. Finally, we discuss the potential challenges and future work in accurately predicting protein-ligand binding affinity via DL models.

Keywords: accurate prediction; database; deep learning model; input representation; protein–ligand binding affinity.

Figures

Figure 1. (A) The 2D chemical structure of vemurafenib. (B) The 2D chemical structure of SJF-0628, consisting of vemurafenib, a short linker and a Von Hippel Lindau (VHL)-recruiting ligand. (C) The structure of the BRAF-vemurafenib complex. The BRAF kinase, the ligand vemurafenib and the identified pocket are colored in green, red and yellow, respectively. (D) The interaction between vemurafenib and BRAF kinase, in which the hydrogen bonds and other contacts are shown as blue and purple lines, respectively.

Figure 2. (A) Three overlapping subsets, including the general, refined and core sets, in the PDBbind database. (B) The number of protein–ligand complexes in the PDBbind database from 2002 to 2020. (C) The pie chart shows the distribution of protein–ligand binding affinity values in the PDBbind database. (D) The distribution of 367 human kinases in the Davis database on the human kinome tree, in which each red dot represents a kinase. (E) The pie chart shows the distribution of protein–ligand binding affinity values in the Davis database. The lowest [formula] values ([formula]) constitute 70% of all binding affinity values in the Davis database. (F) The distribution of 216 human kinases in the KIBA database on the human kinome tree, in which each red dot represents a kinase. (G) The pie chart shows the distribution of protein–ligand binding affinity values in the KIBA database.

Figure 3. (A) The binding affinity values of 22 ligands in the Davis database targeting wild-type BRAF and the BRAF V600E mutant. (B–D) The structure diagrams of the 4th, 7th and 8th ligands in Figure 3A, respectively.

Figure 4. An overall flowchart for predicting protein–ligand interactions based on DL models.

Figure 5. (A) Conceptual workflow of interaction-based DL models. Inputs are the pocket–ligand complex structures and their characteristics. (B) Conceptual workflow of interaction-free DL models. Interaction-free models can predict protein–ligand binding affinity without protein–ligand interaction information. Their inputs are ligand SMILES strings and protein sequences, or the 3D structures of the ligand and protein monomers, together with their characteristics.

Figure 6. (A) The binding affinity values of 16 ligands in the Davis database targeting the CDK4-CyclinD1 and CDK4-CyclinD3 complexes. (B) The structures of the CDK4-CyclinD1 and CDK4-CyclinD3 complexes. (C–H) Diagrams of 2D protein–ligand interactions for the 4th, 6th and 15th ligands in Figure 6A targeting the CDK4-CyclinD1 and CDK4-CyclinD3 complexes, respectively. The hydrophobic residues of the protein and the hydrophobic interactions between residues and ligands are colored in red. The hydrogen bonds between residues and ligands are indicated by green lines. The hydrogen-bonding residues are colored in yellow and their names are colored in green.

Figure 7. (A) The accuracies of nine interaction-based models on the PDBbind-2016 core set. (B) The accuracies of 15 interaction-based models on the CASF-2016 set.
