Brief Bioinform. 2024 May 23;25(4):bbae297.
doi: 10.1093/bib/bbae297.

GAPS: a geometric attention-based network for peptide binding site identification by the transfer learning approach

Cheng Zhu et al. Brief Bioinform.

Abstract

Protein-peptide interactions (PPepIs) are vital to understanding cellular functions, and this understanding can facilitate the design of novel drugs. As an essential component of a PPepI, protein-peptide binding sites are the basis for understanding the mechanisms involved. Accurately identifying protein-peptide binding sites is therefore a critical task. Traditional experimental methods for studying these binding sites are labor-intensive and time-consuming, and several computational tools have been developed to supplement them. However, these tools are limited in generality or accuracy because they require ligand information, depend on complex feature construction, or model proteins at the level of amino acid residues. To address these drawbacks, we describe a geometric attention-based network for peptide binding site identification (GAPS). The proposed model uses geometric feature engineering to construct atom representations and incorporates multiple attention mechanisms to update relevant biological features. In addition, a transfer learning strategy leverages protein-protein binding site information to enhance protein-peptide binding site recognition, taking into account the common structure and biological bias between proteins and peptides. Consequently, GAPS demonstrates state-of-the-art performance and excellent robustness in this task. Moreover, the model exhibits exceptional performance across several extended experiments, including predicting apo protein-peptide, protein-cyclic peptide, and AlphaFold-predicted protein-peptide binding sites. These results confirm that GAPS is a powerful, versatile, and stable method suitable for diverse binding site predictions.

Keywords: attention mechanism; binding sites; geometric deep learning; transfer learning.

Figures

Figure 1
Overview of GAPS for predicting peptide binding sites. (A) Method used for modeling the spatial structure of the protein receptors. (B) The architecture of GAPS. QS, KS, and VS, respectively, represent the query, key, and value extracted from the residue scalar embedding. QV, KV, and VV, respectively, represent the query, key, and value extracted from the residue vector embedding. (C) The input and output of GAPS. (D) The architecture of the geometric attention mechanism. FeatureS, FeatureV, FeatureN, and FeatureE, respectively, represent the atomic scalar feature, atomic vector feature, nearest neighbors feature, and edge feature. EmbeddingS and EmbeddingV, respectively, represent the atomic scalar embedding and atomic vector embedding. (E) The architecture of the cross-attention mechanism. (F) The various downstream tasks.
Figure 2
Predictive performance of GAPS in the protein–peptide binding site prediction task. (A) An example of binding sites predicted by GAPS (PDB ID: 6R7W). (B) Comparison of GAPS and other methods on TS092 using the AUROC and MCC metrics. (C) Comparison of GAPS and other methods on TS125 using the AUROC and MCC metrics. (D) Comparison of GAPS and BiteNetPp on TS125 using average metric values. The better scores are shown in bold. The results of the baseline methods are taken from the corresponding papers.
Figure 3
Analysis of GAPS’s performance from different aspects. (A) An example of multiple predicted binding sites on a protein (PDB ID: 6TYT). (B) An example of predicted binding sites on a multimer (PDB ID: 6JZD). One chain is rendered more transparent to distinguish it from the other. (C) The clustering results on TS092 using the Foldseek cluster. The size of each circle represents the AUROC score. (D) Distributions of the AUROC and MCC achieved by GAPS on TS092 for proteins with different residue lengths. (E) Distributions of the AUROC and MCC achieved by GAPS on TS092 for proteins with different buried solvent-accessible surface areas.
Figure 4
Analysis of the ablation study. (A) Predictive performance comparison between GAPS and its variants on TS092. w/o stands for without. The best scores are shown in bold. (B) AUROC comparison of GAPS and the various ablation models for each sample within TS092. (C) The t-SNE visualization of GAPS and its variants. (D) The binding sites predicted by GAPS and its variants. (E) Performance comparison between the pre-trained GAPS and AGAT-PPIS in the protein–protein binding site prediction task. (F) Comparison of the sizes of the training and validation sets used by the pre-trained GAPS and PeSTo. (G) Performance comparison between the pre-trained GAPS and PeSTo in the protein–protein binding site prediction task.
Figure 5
Results of the apo protein–peptide and protein–cyclic peptide binding site predictions. (A) An example of predicted binding sites on the apo protein (PDB ID: 2HWQ). The holo protein from the structure of the complex (PDB ID: 2FVJ) is also shown. (B) Comparison of GAPS’s predictive performance between apo and holo proteins. (C) AUROC of GAPS for paired apo and holo proteins on the Testset_holo_apo. (D) AUROC of GAPS in relation to the RMSD between the apo and holo proteins on the Testset_holo_apo. (E) Predictive performance of GAPS in the holo protein–cyclic peptide binding site task. (F) An example of predicted cyclic peptide binding sites on the apo protein (PDB ID: 4KGA). The holo protein from the structure of the complex (PDB ID: 4K1E) is also shown. (G) Comparison of GAPS’s predictive performance between apo and holo proteins for cyclic peptides. (H) AUROC of GAPS for paired apo and holo proteins for cyclic peptides on the Testset_cyclic_holo_apo.
Figure 6
Protein–peptide binding site prediction for AF-predicted structures. (A) The workflow for predicting binding sites with GAPS on AF-predicted proteins. (B) The correlation between RMSD and average pLDDT. (C) An example of predicted binding sites on a protein predicted by AlphaFold. The crystal protein from the structure of the complex (PDB ID: 7CFC) is also shown. (D) Comparison of GAPS’s predictive performance between experimental and AF-predicted proteins. (E) AUROC comparison of experimental and AF-predicted proteins for each sample using GAPS. (F) MCC of GAPS for paired experimental and predicted proteins on the Testset_AFpredicted.
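
The figure captions report AUROC and MCC as the evaluation metrics. The sketch below shows one way these two metrics might be computed for per-residue binding site predictions. It is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = binding, 0 = non-binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(P * N), which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical per-residue labels and predicted binding probabilities.
labels = [1, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.05]

# Assumed decision threshold of 0.5 for turning scores into binary calls.
preds = [1 if s >= 0.5 else 0 for s in scores]

print(f"AUROC = {auroc(labels, scores):.3f}")  # AUROC = 0.933
print(f"MCC   = {mcc(labels, preds):.3f}")     # MCC   = 0.467
```

AUROC is threshold-free and summarizes how well the scores rank binding residues above non-binding ones, whereas MCC depends on the chosen threshold and stays informative when binding residues are a small minority of all residues.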
