Sci Rep. 2024 Oct 18;14(1):24447.
doi: 10.1038/s41598-024-72784-3.

Ensembling methods for protein-ligand binding affinity prediction

Jiffriya Mohamed Abdul Cader et al. Sci Rep. 2024.

Abstract

Protein-ligand binding affinity prediction is a key element of computer-aided drug discovery. Most of the existing deep learning methods for protein-ligand binding affinity prediction utilize single models and suffer from low accuracy and generalization capability. In this paper, we train 13 deep learning models from combinations of 5 input features. Then, we explore all possible ensembles of the trained models to find the best ensembles. Our deep learning models use cross-attention and self-attention layers to extract short and long-range interactions. Our method is named Ensemble Binding Affinity (EBA). EBA extracts information from various models using different combinations of input features, such as simple 1D sequential and structural features of the protein-ligand complexes rather than 3D complex features. EBA is implemented to accurately predict the binding affinity of a protein-ligand complex. One of our ensembles achieves the highest Pearson correlation coefficient (R) value of 0.914 and the lowest root mean square error (RMSE) value of 0.957 on the well-known benchmark test set CASF2016. Our ensembles show significant improvements of more than 15% in R-value and 19% in RMSE on both well-known benchmark CSAR-HiQ test sets over the second-best predictor named CAPLA. Furthermore, the superior performance of the ensembles across all metrics compared to existing state-of-the-art protein-ligand binding affinity prediction methods on all five benchmark test datasets demonstrates the effectiveness and robustness of our approach. Therefore, our approach to improving binding affinity prediction between proteins and ligands can contribute to improving the success rate of potential drugs and accelerate the drug development process.
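The ensemble search and the two reported metrics can be illustrated with a minimal sketch. The abstract does not state how member predictions are combined or which criterion selects the best ensemble, so the unweighted averaging and the lowest-RMSE selection below are assumptions, as are the function names `pearson_r`, `rmse`, and `best_ensemble` and the synthetic data in the demo. With 13 models there are 2^13 - 1 = 8191 non-empty subsets, so exhaustive enumeration is cheap.

```python
from itertools import combinations

import numpy as np


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (R) between real and predicted affinities."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def rmse(y_true, y_pred):
    """Root mean square error between real and predicted affinities."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def best_ensemble(preds, y_true, min_size=2):
    """Enumerate every subset of models with at least `min_size` members.

    `preds` maps a (hypothetical) model name to its per-complex predictions.
    Each ensemble is scored by the unweighted mean of its members' predictions
    (an assumption; the abstract does not specify the combination rule).
    Returns (member_names, R, RMSE) for the subset with the lowest RMSE.
    """
    names = sorted(preds)
    best = None
    for k in range(min_size, len(names) + 1):
        for combo in combinations(names, k):
            avg = np.mean([preds[n] for n in combo], axis=0)
            err = rmse(y_true, avg)
            if best is None or err < best[2]:
                best = (combo, pearson_r(y_true, avg), err)
    return best


if __name__ == "__main__":
    # Synthetic affinities and noisy "model" outputs, for illustration only.
    rng = np.random.default_rng(0)
    y = rng.uniform(2.0, 12.0, size=200)
    models = {f"m{i}": y + rng.normal(0.0, 1.5, size=y.size) for i in range(5)}
    members, r, err = best_ensemble(models, y)
    print(members, round(r, 3), round(err, 3))
```

In practice the selection would be made on a validation set and the chosen ensemble then evaluated once on held-out test sets such as CASF2016; selecting directly on the test set would inflate the reported scores.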

Conflict of interest statement

The authors have disclosed no conflict of interest.

Figures

Fig. 1. t-SNE visualization of the high dimensional feature representation after self-attention layer in models la (left) and lapst (right) on the test dataset CASF2016_290. The two axes represent the output of the two dimensions by t-SNE.

Fig. 2. Real and predicted binding affinity values and their relative frequencies on Training2016 and Validation2016.

Fig. 3. Real and predicted affinity values and their relative frequencies for the best ensembles on the four test sets.

Fig. 4. Real binding affinity values and their relative frequencies for the datasets.

Fig. 5. Performance of ensembles AX, AY and AZ in screening: ROC curve (left) and EF (right). The ensembles are trained using the training and validation sets Training2020 and Validation2020 from PDBbind 2020 dataset.

Fig. 6. Model architectures for feature combinations (left) la and st; (right) lapst; and (middle) all other combinations.

