This is a preprint.
Sliding Window INteraction Grammar (SWING): a generalized interaction language model for peptide and protein interactions
- PMID: 38746274
- PMCID: PMC11092674
- DOI: 10.1101/2024.05.01.592062
Update in
- Sliding Window Interaction Grammar (SWING): a generalized interaction language model for peptide and protein interactions. Nat Methods. 2025 Aug;22(8):1707-1719. doi: 10.1038/s41592-025-02723-1. Epub 2025 Jul 28. PMID: 40721872. Free PMC article.
Abstract
The explosion of sequence data has enabled the rapid growth of protein language models (pLMs). pLMs have now been employed in many frameworks, including variant-effect and peptide-specificity prediction. Traditionally, for protein-protein or peptide-protein interactions (PPIs), the corresponding sequences are either co-embedded and then integrated post hoc, or concatenated prior to embedding. Interestingly, no method utilizes a language representation of the interaction itself. We developed an interaction LM (iLM), which uses a novel language to represent interactions between protein/peptide sequences. Sliding Window Interaction Grammar (SWING) leverages differences in amino acid properties to generate an interaction vocabulary. This vocabulary is the input to an LM, followed by a supervised prediction step in which the LM's representations are used as features. SWING was first applied to predicting peptide:MHC (pMHC) interactions. SWING not only generated Class I and Class II models with prediction performance comparable to state-of-the-art approaches, but its unique Mixed Class model also jointly predicted both classes. Further, the SWING model trained only on Class I alleles was predictive for Class II, a complex prediction task not attempted by any existing approach. On de novo data, using only Class I or Class II data, SWING also accurately predicted Class II pMHC interactions in murine models of SLE (MRL/lpr model) and T1D (NOD model), which were validated experimentally. To further evaluate SWING's generalizability, we tested its ability to predict the disruption of specific protein-protein interactions by missense mutations. Although modern methods like AlphaMissense and ESM1b can predict interfaces and variant effects/pathogenicity per mutation, they are unable to predict interaction-specific disruptions. SWING accurately predicted the impact of both Mendelian mutations and population variants on PPIs. This is the first generalizable approach that can accurately predict interaction-specific disruptions by missense mutations with only sequence information. Overall, SWING is a first-in-class generalizable zero-shot iLM that learns the language of PPIs.
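As a rough illustration of what "leveraging differences in amino acid properties to generate an interaction vocabulary" could look like, the following minimal sketch slides one sequence along another, encodes the rounded absolute difference in a single physicochemical property at each aligned position as a digit, and splits the resulting string into overlapping k-mers that could serve as "words" for a language model. The abstract does not specify the encoding, so everything here is an assumption made for illustration: the choice of the Kyte-Doolittle hydropathy scale, the rounding to single digits, the k-mer length, and the function names `interaction_string` and `kmers` are all hypothetical and should not be read as the authors' implementation.

```python
# Kyte-Doolittle hydropathy values (a standard published scale).
# Using this particular scale is an assumption for illustration only.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def interaction_string(window: str, target: str) -> str:
    """Slide `window` along `target` one residue at a time.

    At every offset, each aligned residue pair is encoded as one digit:
    the absolute hydropathy difference, rounded half-up and capped at 9.
    (Hypothetical encoding; the abstract does not give the real one.)
    """
    digits = []
    for start in range(len(target) - len(window) + 1):
        for i, residue in enumerate(window):
            diff = abs(KD[residue] - KD[target[start + i]])
            digits.append(str(min(int(diff + 0.5), 9)))
    return "".join(digits)


def kmers(encoded: str, k: int) -> list[str]:
    """Split an encoded interaction string into overlapping k-mers ("words")."""
    return [encoded[i:i + k] for i in range(len(encoded) - k + 1)]


if __name__ == "__main__":
    # Made-up peptide and protein fragment, purely for demonstration.
    encoded = interaction_string("SIINFEKL", "GSHSMRYFYTSVSRPGRGEPRF")
    print(encoded)
    print(kmers(encoded, 5)[:5])
```

In a pipeline of the kind the abstract describes, the k-mer lists would be passed to a language model, and the learned representations would then be used as features for a supervised classifier; those stages are omitted here because the abstract does not name the specific models.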
Conflict of interest statement
The authors declare no conflict of interest.